Bayesian Nonparametric Graph Clustering
Banerjee, Sayantan; Akbani, Rehan; Baladandayuthapani, Veerabhadran
2015-01-01
We present clustering methods for multivariate data exploiting the underlying geometry of the graphical structure between variables. As opposed to standard approaches that assume known graph structures, we first estimate the edge structure of the unknown graph using Bayesian neighborhood selection approaches, wherein we account for the uncertainty of graphical structure learning through model-averaged estimates of the suitable parameters. Subsequently, we develop a nonparametric graph cluster...
A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering
Seldin, Yevgeny
2010-01-01
We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generaliza...
Parallel Local Graph Clustering
Shun, Julian; Roosta-Khorasani, Farbod; Fountoulakis, Kimon; Mahoney, Michael W.
2016-01-01
Graph clustering has many important applications in computing, but due to growing sizes of graph, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest. Motivated partly by this, so-called local algorithms for graph clustering have received significant interest due to the fact that they can find good clusters in a graph with work proportional to the size of the cluster rather than that of the entire graph. T...
Cybis, Gabriela Bettella
2014-01-01
Combining models for phenotypic and molecular evolution can lead to powerful inference tools.Under the flexible framework of Bayesian phylogenetics, I develop statistical methods to address phylodynamic problems in this intersection.First, I present a hierarchical phylogeographic method that combines information across multiple datasets to draw inference on a common geographical spread process. Each dataset represents a parallel realization of this geographic process on a different group of ...
Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures.
Orbanz, Peter; Roy, Daniel M
2015-02-01
The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays. PMID:26353253
Crossings in Clustered Level Graphs
Forster, Michael
2005-01-01
Clustered graphs are an enhanced graph model with a recursive clustering of the vertices according to a given nesting relation. This prime technique for expressing coherence of certain parts of the graph is used in many applications, such as biochemical pathways and UML class diagrams. For directed clustered graphs usually level drawings are used, leading to clustered level graphs. In this thesis we analyze the interrelation of clusters and levels and their influence on edge crossings and clu...
Bootstrap clustering for graph partitioning
Gambette, Philippe; Guénoche, Alain
2011-01-01
Given a simple undirected weighted or unweighted graph, we try to cluster the vertex set into communities and also to quantify the robustness of these clusters. For that task, we propose a new method, called bootstrap clustering which consists in (i) defining a new clustering algorithm for graphs, (ii) building a set of graphs similar to the initial one, (iii) applying the clustering method to each of them, making a profile (set) of partitions, (iv) computing a consensus partition for this pr...
Energy Technology Data Exchange (ETDEWEB)
Winlaw, Manda [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); De Sterck, Hans [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sanders, Geoffrey [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2015-10-26
In very simple terms a network can be de ned as a collection of points joined together by lines. Thus, networks can be used to represent connections between entities in a wide variety of elds including engi- neering, science, medicine, and sociology. Many large real-world networks share a surprising number of properties, leading to a strong interest in model development research and techniques for building synthetic networks have been developed, that capture these similarities and replicate real-world graphs. Modeling these real-world networks serves two purposes. First, building models that mimic the patterns and prop- erties of real networks helps to understand the implications of these patterns and helps determine which patterns are important. If we develop a generative process to synthesize real networks we can also examine which growth processes are plausible and which are not. Secondly, high-quality, large-scale network data is often not available, because of economic, legal, technological, or other obstacles [7]. Thus, there are many instances where the systems of interest cannot be represented by a single exemplar network. As one example, consider the eld of cybersecurity, where systems require testing across diverse threat scenarios and validation across diverse network structures. In these cases, where there is no single exemplar network, the systems must instead be modeled as a collection of networks in which the variation among them may be just as important as their common features. By developing processes to build synthetic models, so-called graph generators, we can build synthetic networks that capture both the essential features of a system and realistic variability. Then we can use such synthetic graphs to perform tasks such as simulations, analysis, and decision making. We can also use synthetic graphs to performance test graph analysis algorithms, including clustering algorithms and anomaly detection algorithms.
Experiments on Graph Clustering Algorithms
Brandes, Ulrik; Gaertler, Marco; Wagner, Dorothea
2003-01-01
A promising approach to graph clustering is based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed no conclusive argument on their appropriateness has been given. As a first step towards understanding the consequences of particular conceptions, we conducted an experimental evaluation of graph clustering approaches. By combining proven techniques f...
Hierarchical clustering for graph visualization
Clémençon, Stéphan; Rossi, Fabrice; Tran, Viet Chi
2012-01-01
This paper describes a graph visualization methodology based on hierarchical maximal modularity clustering, with interactive and significant coarsening and refining possibilities. An application of this method to HIV epidemic analysis in Cuba is outlined.
Median Graph Shift: A New Clustering Algorithm for Graph Domain
Jouili, Salim; Tabbone, Salvatore; Lacroix, Vinciane
2010-01-01
n the context of unsupervised clustering, a new algorithm for the domain of graphs is introduced. In this paper, the key idea is to adapt the mean-shift clustering and its variants proposed for the domain of feature vectors to graph clustering. These algorithms have been applied successfully in image analysis and computer vision domains. The proposed algorithm works in an iterative manner by shifting each graph towards the median graph in a neighborhood. Both the set median graph and the gene...
CORECLUSTER: A Degeneracy Based Graph Clustering Framework
Giatsidis, Christos; Malliaros, Fragkiskos; Thilikos, Dimitrios M.; Vazirgiannis, Michalis
2014-01-01
Graph clustering or community detection constitutes an important task forinvestigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such asspectral methods, typically suffer from high time and space complexity. In thisarticle, we present \\textsc{CoreCluster}, an efficient graph clusteringframework based on the concept of graph degeneracy, that can be used along withany known graph clustering algorithm. Our approa...
Axioms for graph clustering quality functions
Laarhoven, T. van; Marchiori, E.
2013-01-01
We investigate properties that intuitively ought to be satisfied by graph clustering quality functions, that is, functions that assign a score to a clustering of a graph. Graph clustering, also known as network community detection, is often performed by optimizing such a function. Two axioms tailored for graph clustering quality functions are introduced, and the four axioms introduced in previous work on distance based clustering are reformulated and generalized for the graph setting. We show...
Bayesian Agglomerative Clustering with Coalescents
Teh, Yee Whye; Daumé III, Hal; Roy, Daniel
2009-01-01
We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman's coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over others, and demonstrate our approach in document clustering and phylolinguistics.
Optimized Graph Search Using Multi-Level Graph Clustering
Kala, Rahul; Shukla, Anupam; Tiwari, Ritu
Graphs find a variety of use in numerous domains especially because of their capability to model common problems. The social networking graphs that are used for social networking analysis, a feature given by various social networking sites are an example of this. Graphs can also be visualized in the search engines to carry search operations and provide results. Various searching algorithms have been developed for searching in graphs. In this paper we propose that the entire network graph be clustered. The larger graphs are clustered to make smaller graphs. These smaller graphs can again be clustered to further reduce the size of graph. The search is performed on the smallest graph to identify the general path, which may be further build up to actual nodes by working on the individual clusters involved. Since many searches are carried out on the same graph, clustering may be done once and the data may be used for multiple searches over the time. If the graph changes considerably, only then we may re-cluster the graph.
Graph partitioning advance clustering technique
Madhulatha, T Soni
2012-01-01
Clustering is a common technique for statistical data analysis, Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. Often, distance measures are used. Clustering is an unsupervised learning technique, where interesting patterns and structures can be found directly from very large data sets with little or none of the background knowledge. This paper also considers the partitioning of m-dimensional lattice graphs using Fiedler's approach, which requires the determination of the eigenvector belonging to the second smallest Eigenvalue of the Laplacian with K-means partitioning algorithm.
Graph matching and clustering using kernel attributes
Lozano Ortega, Miguel Ángel; Escolano Ruiz, Francisco
2013-01-01
In this paper, we exploit graph kernels for graph matching and clustering. Firstly, we analyze different kinds of graph kernels in order to extract from them attributes to be used as a similarity measure between nodes of non-attributed graphs. Next, such attributes are embedded in a graph-matching cost function, through a probabilistic framework, and we evaluate their performance within a graph-matching algorithm. Secondly, we propose a method for obtaining a representative prototype from a s...
Fowler, Anna; Menon, Vilas; Heard, Nicholas A
2013-10-01
Clusters of time series data may change location and memberships over time; in gene expression data, this occurs as groups of genes or samples respond differently to stimuli or experimental conditions at different times. In order to uncover this underlying temporal structure, we consider dynamic clusters with time-dependent parameters which split and merge over time, enabling cluster memberships to change. These interesting time-dependent structures are useful in understanding the development of organisms or complex organs, and could not be identified using traditional clustering methods. In cell cycle data, these time-dependent structure may provide links between genes and stages of the cell cycle, whilst in developmental data sets they may highlight key developmental transitions. PMID:24131050
Some Applications of Graph Theory to Clustering
Hubert, Lawrence J.
1974-01-01
The connection between graph theory and clustering is reviewed and extended. Major emphasis is on restating, in a graph-theoretic context, selected past work in clustering, and conversely, developing alternative strategies from several standard concepts used in graph theory per se. (Author/RC)
Bayesian Decision Theoretical Framework for Clustering
Chen, Mo
2011-01-01
In this thesis, we establish a novel probabilistic framework for the data clustering problem from the perspective of Bayesian decision theory. The Bayesian decision theory view justifies the important questions: what is a cluster and what a clustering algorithm should optimize. We prove that the spectral clustering (to be specific, the…
Graph Degree Linkage: Agglomerative Clustering on a Directed Graph
Zhang, Wei; Wang, Xiaogang; Zhao, Deli; Tang, Xiaoou
2012-01-01
This paper proposes a simple but effective graph-based agglomerative algorithm, for clustering high-dimensional data. We explore the different roles of two fundamental concepts in graph theory, indegree and outdegree, in the context of clustering. The average indegree reflects the density near a sample, and the average outdegree characterizes the local geometry around a sample. Based on such insights, we define the affinity measure of clusters via the product of average indegree and average o...
Graph coarsening and clustering on the GPU
Fagginger Auer, B.O.; Bisseling, R.H.
2013-01-01
Agglomerative clustering is an effective greedy way to quickly generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained sharedmemory parallel graph co
Graph coarsening and clustering on the GPU
Fagginger Auer, B.O.; Bisseling, R.H.
2013-01-01
Agglomerative clustering is an effective greedy way to quickly generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained sharedmemory parallel graph coarsening algorithm and use this to implement a parallel agglomerative clustering heuristic on both the CPU and the GPU. This heuristic is able to generate clusterings in very little time: a modular...
Topologically Ordered Graph Clustering via Deterministic Annealing
Rossi, Fabrice; Villa-Vialaneix, Nathalie
2009-01-01
This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph partitions that lead to faithful and readable graph representations on a 2 dimensional SOM like planar grid.
Malware Classification based on Call Graph Clustering
Kinable, Joris
2010-01-01
Each day, anti-virus companies receive tens of thousands samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, and enable the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DB...
Graph Clustering with Density-Cut
Shao, Junming; Liu, Jinhu; Kramer, Stefan
2016-01-01
How can we find a good graph clustering of a real-world network, that allows insight into its underlying structure and also potential functions? In this paper, we introduce a new graph clustering algorithm Dcut from a density point of view. The basic idea is to envision the graph clustering as a density-cut problem, such that the vertices in the same cluster are densely connected and the vertices between clusters are sparsely connected. To identify meaningful clusters (communities) in a graph, a density-connected tree is first constructed in a local fashion. Owing to the density-connected tree, Dcut allows partitioning a graph into multiple densely tight-knit clusters directly. We demonstrate that our method has several attractive benefits: (a) Dcut provides an intuitive criterion to evaluate the goodness of a graph clustering in a more natural and precise way; (b) Built upon the density-connected tree, Dcut allows identifying the meaningful graph clusters of densely connected vertices efficiently; (c) The de...
Exploiting Optimization for Local Graph Clustering
Fountoulakis, Kimon; Cheng, Xiang; Shun, Julian; Roosta-Khorasani, Farbod; Mahoney, Michael W.
2016-01-01
Local graph clustering methods aim to identify well-connected clusters around a given "seed set" of reference nodes. The main focus of prior theoretical work has been on worst-case running time properties or on implicit statistical regularization; and the focus of prior empirical work has been to identify structure in large social and information networks. Here, we adopt an optimization perspective on local graph clustering methods. In particular, we clarify the relationship between the local...
Bipartite graph partitioning and data clustering
International Nuclear Information System (INIS)
Many data types arising from data mining applications can be modeled as bipartite graphs, examples include terms and documents in a text corpus, customers and purchasing items in market basket analysis and reviewers and movies in a movie recommender system. In this paper, the authors propose a new data clustering method based on partitioning the underlying biopartite graph. The partition is constructed by minimizing a normalized sum of edge weights between unmatched pairs of vertices of the bipartite graph. They show that an approximate solution to the minimization problem can be obtained by computing a partial singular value decomposition (SVD) of the associated edge weight matrix of the bipartite graph. They point out the connection of their clustering algorithm to correspondence analysis used in multivariate analysis. They also briefly discuss the issue of assigning data objects to multiple clusters. In the experimental results, they apply their clustering algorithm to the problem of document clustering to illustrate its effectiveness and efficiency
Assessing the Quality of Multilevel Graph Clustering
Queyroi F.; Delest M.; Fedou J.-M.; Melancon G.
2014-01-01
Hierarchical clustering of graphs is a useful strategy to mine, explore and visualize graphs. Popular approaches define ad hoc procedures to decide how subgraphs are subdivided or nested. The popularity of graph hierarchies certainly relates to the relevance of multilevel models appearing in the natural and social sciences. For instance, current models in biology (genomics and/or proteomics) try to capture the multilevel nature of networks formed by various biological entities; cities and wor...
ON CLUSTERING TECHNIQUES OF CITATION GRAPHS.
CHIEN, R.T.; PREPARATA, F.P.
ONE OF THE PROBLEMS ENCOUNTERED IN CLUSTERING TECHNIQUES AS APPLIED TO DOCUMENT RETRIEVAL SYSTEMS USING BIBLIOGRAPHIC COUPLING DEVICES IS THAT THE COMPUTATIONAL EFFORT REQUIRED GROWS ROUGHLY AS THE SQUARE OF THE COLLECTION SIZE. IN THIS STUDY GRAPH THEORY IS APPLIED TO THIS PROBLEM BY FIRST MAPPING THE CITATION GRAPH OF THE DOCUMENT COLLECTION…
Eigenvectors for clustering: Unipartite, bipartite, and directed graph cases
Mirzal, Andri; FURUKAWA, MASASHI
2010-01-01
This paper presents a concise tutorial on spectral clustering for broad spectrum graphs which include unipartite (undirected) graph, bipartite graph, and directed graph. We show how to transform bipartite graph and directed graph into corresponding unipartite graph, therefore allowing a unified treatment to all cases. In bipartite graph, we show that the relaxed solution to the $K$-way co-clustering can be found by computing the left and right eigenvectors of the data matrix. This gives a the...
Clustering gene expression data using graph separators.
Kaba, Bangaly; Pinet, Nicolas; Lelandais, Gaëlle; Sigayret, Alain; Berry, Anne
2007-01-01
Recent work has used graphs to modelize expression data from microarray experiments, in view of partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve the classical clustering methods in two ways: first we want to allow an overlap between clusters, as this seems biologically sound, and second we want to be guided by the structure of the graph to define the number of clusters. We test this approach with a well-known yeast database (Saccharomyces cerevisiae). Our results are good, as the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize into another graph the clusters we find, and order them in a fashion which turns out to respect the chronological order defined by the the sporulation process. PMID:18391236
Engineering Graph Clustering : Models and Experimental Evaluation
Brandes, Ulrik; Gaertler, Marco; Wagner, Dorothea
2007-01-01
A promising approach to graph clustering is based on the intuitive notion of intracluster density versus intercluster sparsity. As for the weighted case, clusters should accumulate lots of weight, in contrast to their connection to the remaining graph, which should be light. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. In order to deepen the understanding...
Accelerating semantic graph databases on commodity clusters
Energy Technology Data Exchange (ETDEWEB)
Morari, Alessandro; Castellana, Vito G.; Haglin, David J.; Feo, John T.; Weaver, Jesse R.; Tumeo, Antonino; Villa, Oreste
2013-10-06
We are developing a full software system for accelerating semantic graph databases on commodity cluster that scales to hundreds of nodes while maintaining constant query throughput. Our framework comprises a SPARQL to C++ compiler, a library of parallel graph methods and a custom multithreaded runtime layer, which provides a Partitioned Global Address Space (PGAS) programming model with fork/join parallelism and automatic load balancing over a commodity clusters. We present preliminary results for the compiler and for the runtime.
Local Clustering in Provenance Graphs (Extended Version)
MACKO Peter; Margo, Daniel Wyatt; Seltzer, Margo I.
2013-01-01
Systems that capture and store data provenance, the record of how an object has arrived at its current state, accumulate historical metadata over time, forming a large graph. Local clustering in these graphs, in which we start with a seed vertex and grow a cluster around it, is of paramount importance because it supports critical provenance applications such as identifying semantically meaningful tasks in an object’s history and selecting appropriate truncation points for returning an object’...
Snake graph calculus and cluster algebras from surfaces
Canakci, Ilke; Schiffler, Ralf
2012-01-01
Snake graphs appear naturally in the theory of cluster algebras. For cluster algebras from surfaces, each cluster variable is given by a formula which is parametrized by the perfect matchings of a snake graph. In this paper, we identify each cluster variable with its snake graph, and interpret relations among the cluster variables in terms of these graphs. In particular, we give a new proof of skein relations of two cluster variables.
Graph Approximation and Clustering on a Budget
Fetaya, Ethan; Shamir, Ohad; Ullman, Shimon
2014-01-01
We consider the problem of learning from a similarity matrix (such as spectral clustering and lowd imensional embedding), when computing pairwise similarities are costly, and only a limited number of entries can be observed. We provide a theoretical analysis using standard notions of graph approximation, significantly generalizing previous results (which focused on spectral clustering with two clusters). We also propose a new algorithmic approach based on adaptive sampling, which experimental...
Topological graph clustering with thin position
Johnson, Jesse
2012-01-01
A clustering algorithm partitions a set of data points into smaller sets (clusters) such that each subset is more tightly packed than the whole. Many approaches to clustering translate the vector data into a graph with edges reflecting a distance or similarity metric on the points, then look for highly connected subgraphs. We introduce such an algorithm based on ideas borrowed from the topological notion of thin position for knots and 3-dimensional manifolds.
Epidemics on random graphs with tunable clustering
Britton, Tom; Deijfen, Maria; Lagerås, Andreas Nordvall; Lindholm, Mathias
2008-01-01
In this paper, a branching process approximation for the spread of a Reed-Frost epidemic on a network with tunable clustering is derived. The approximation gives rise to expressions for the epidemic threshold and the probability of a large outbreak in the epidemic. It is investigated how these quantities varies with the clustering in the graph and it turns out for instance that, as the clustering increases, the epidemic threshold decreases. The network is modelled by a random intersection gra...
Bipartite graph partitioning and data clustering
Zha, Hongyuan; He, Xiaofeng; Ding, Chris; Gu, Ming; Simon, Horst D
2001-01-01
Many data types arising from data mining applications can be modeled as bipartite graphs, examples include terms and documents in a text corpus, customers and purchasing items in market basket analysis and reviewers and movies in a movie recommender system. In this paper, we propose a new data clustering method based on partitioning the underlying bipartite graph. The partition is constructed by minimizing a normalized sum of edge weights between unmatched pairs of vertices of the bipartite g...
Malware Classification based on Call Graph Clustering
Kinable, Joris; Kostakis, Orestis
2010-01-01
Each day, anti-virus companies receive tens of thousands samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations...
On the NP-Completeness of Some Graph Cluster Measures
Sima, Jiri; Schaeffer, Satu Elisa
2005-01-01
Graph clustering is the problem of identifying sparsely connected dense subgraphs (clusters) in a given graph. Proposed clustering algorithms usually optimize various fitness functions that measure the quality of a cluster within the graph. Examples of such cluster measures include the conductance, the local and relative densities, and single cluster editing. We prove that the decision problems associated with the optimization tasks of finding the clusters that are optimal with respect to the...
How the result of graph clustering methods depends on the construction of the graph
Maier, Markus; Von Luxburg, Ulrike; Hein, Matthias
2011-01-01
We study the scenario of graph-based clustering algorithms such as spectral clustering. Given a set of data points, one first has to construct a graph on the data points and then apply a graph clustering algorithm to find a suitable partition of the graph. Our main question is if and how the construction of the graph (choice of the graph, choice of parameters, choice of weights) influences the outcome of the final clustering result. To this end we study the convergence of cluster quality meas...
A Nonparametric Bayesian Model for Nested Clustering.
Lee, Juhee; Müller, Peter; Zhu, Yitan; Ji, Yuan
2016-01-01
We propose a nonparametric Bayesian model for clustering where clusters of experimental units are determined by a shared pattern of clustering another set of experimental units. The proposed model is motivated by the analysis of protein activation data, where we cluster proteins such that all proteins in one cluster give rise to the same clustering of patients. That is, we define clusters of proteins by the way that patients group with respect to the corresponding protein activations. This is in contrast to (almost) all currently available models that use shared parameters in the sampling model to define clusters. This includes in particular model based clustering, Dirichlet process mixtures, product partition models, and more. We show results for two typical biostatistical inference problems that give rise to clustering. PMID:26519174
Market Segmentation Using Bayesian Model Based Clustering
Van Hattum, P.
2009-01-01
This dissertation deals with two basic problems in marketing, that are market segmentation, which is the grouping of persons who share common aspects, and market targeting, which is focusing your marketing efforts on one or more attractive market segments. For the grouping of persons who share common aspects a Bayesian model based clustering approach is proposed such that it can be applied to data sets that are specifically used for market segmentation. The cluster algorithm can handle very l...
Seeland, Madeleine
2014-01-01
This thesis focuses on graph clustering. It introduces scalable methods for clustering large databases of small graphs by common scaffolds, i.e., the existence of one sufficiently large subgraph shared by all cluster elements. Further, the thesis studies applications for classification and regression. The experimental results show that it is for the first time possible to cluster millions of graphs within a reasonable time using an accurate scaffold-based similarity measure.
Global cycle properties in graphs with large minimum clustering coefficient
Borchert, Adam; Nicol, Skylar; Oellermann, Ortrud R.
2015-01-01
The clustering coefficient of a vertex in a graph is the proportion of neighbours of the vertex that are adjacent. The minimum clustering coefficient of a graph is the smallest clustering coefficient taken over all vertices. A complete structural characterization of those locally connected graphs, with minimum clustering coefficient 1/2 and maximum degree at most 6, that are fully cycle extendable is given in terms of strongly induced subgraphs with given attachment sets. Moreover, it is show...
Identifying graph clusters using variational inference and links to covariance parametrization.
Barber, David
2009-11-13
Finding clusters of well-connected nodes in a graph is a problem common to many domains, including social networks, the Internet and bioinformatics. From a computational viewpoint, finding these clusters or graph communities is a difficult problem. We use a clique matrix decomposition based on a statistical description that encourages clusters to be well connected and few in number. The formal intractability of inferring the clusters is addressed using a variational approximation inspired by mean-field theories in statistical mechanics. Clique matrices also play a natural role in parametrizing positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parametrize all positive definite matrices restricted according to a decomposable graph and form a structured factor analysis approximation in the non-decomposable case. Extensions to conjugate Bayesian covariance priors and more general non-Gaussian independence models are briefly discussed. PMID:19805451
Chemical graph-theoretic cluster expansions
International Nuclear Information System (INIS)
A general computationally amenable chemico-graph-theoretic cluster expansion method is suggested as a paradigm for incorporation of chemical structure concepts in a systematic manner. The cluster expansion approach is presented in a formalism general enough to cover a variety of empirical, semiempirical, and even ab initio applications. Formally such approaches for the utilization of chemical structure-related concepts may be viewed as discrete analogues of Taylor series expansions. The efficacy of the chemical structure concepts then is simply bound up in the rate of convergence of the cluster expansions. In many empirical applications, e.g., boiling points, chromatographic separation coefficients, and biological activities, this rate of convergence has been observed to be quite rapid. More note will be made here of quantum chemical applications. Relations to questions concerning size extensivity of energies and size consistency of wave functions are addressed
Efficient Eigen-updating for Spectral Graph Clustering
Dhanjal, Charanpal; Gaudel, Romaric; Clémençon, Stéphan
2014-01-01
Partitioning a graph into groups of vertices such that those within each group are more densely connected than vertices assigned to different groups, known as graph clustering, is often used to gain insight into the organisation of large scale networks and for visualisation purposes. Whereas a large number of dedicated techniques have been recently proposed for static graphs, the design of on-line graph clustering methods tailored for evolving networks is a challenging problem, and much less ...
Graph-based k-means clustering: A comparison of the set versus the generalized median graph
Ferrer, Miquel; Valveny, Ernest; Serratosa, Francesc; Bardaji, I.; Bunke, Horst
2009-01-01
In this paper we propose the application of the generalized median graph in a graph-based k-means clustering algorithm. In the graph-based k-means algorithm, the centers of the clusters have been traditionally represented using the set median graph. We propose an approximate method for the generalized median graph computation that allows to use it to represent the centers of the clusters. Experiments on three databases show that using the generalized median graph as the clusters representativ...
Ranking and clustering of search results: Analysis of Similarity graph
Shevchuk, Ksenia Alexander
2008-01-01
Evaluate the clustering of the similarity matrix and confirm that it is high. Compare the ranking results of the eigenvector ranking and the Link Popularity ranking and confirm for the high clustered graph the correlation between those is larger than for the low clustered graph.
Bayesian Estimation of Conditional Independence Graphs Improves Functional Connectivity Estimates.
Directory of Open Access Journals (Sweden)
Max Hinne
2015-11-01
Full Text Available Functional connectivity concerns the correlated activity between neuronal populations in spatially segregated regions of the brain, which may be studied using functional magnetic resonance imaging (fMRI. This coupled activity is conveniently expressed using covariance, but this measure fails to distinguish between direct and indirect effects. A popular alternative that addresses this issue is partial correlation, which regresses out the signal of potentially confounding variables, resulting in a measure that reveals only direct connections. Importantly, provided the data are normally distributed, if two variables are conditionally independent given all other variables, their respective partial correlation is zero. In this paper, we propose a probabilistic generative model that allows us to estimate functional connectivity in terms of both partial correlations and a graph representing conditional independencies. Simulation results show that this methodology is able to outperform the graphical LASSO, which is the de facto standard for estimating partial correlations. Furthermore, we apply the model to estimate functional connectivity for twenty subjects using resting-state fMRI data. Results show that our model provides a richer representation of functional connectivity as compared to considering partial correlations alone. Finally, we demonstrate how our approach can be extended in several ways, for instance to achieve data fusion by informing the conditional independence graph with data from probabilistic tractography. As our Bayesian formulation of functional connectivity provides access to the posterior distribution instead of only to point estimates, we are able to quantify the uncertainty associated with our results. This reveals that while we are able to infer a clear backbone of connectivity in our empirical results, the data are not accurately described by simply looking at the mode of the distribution over connectivity. The implication
Clustering based on Random Graph Model embedding Vertex Features
Zanghi, Hugo; Volant, Stevenn; Ambroise, Christophe
2009-01-01
Large datasets with interactions between objects are common to numerous scientific fields (i.e. social science, internet, biology...). The interactions naturally define a graph and a common way to explore or summarize such dataset is graph clustering. Most techniques for clustering graph vertices just use the topology of connections ignoring informations in the vertices features. In this paper, we provide a clustering algorithm exploiting both types of data based on a statistical model with l...
Graph Clustering Based on Mixing Time of Random Walks
Avrachenkov, Konstantin; El Chamie, Mahmoud; Neglia, Giovanni
2014-01-01
Clustering of a graph is the task of grouping its nodes in such a way that the nodes within the same cluster are well connected, but they are less connected to nodes in different clusters. In this paper we propose a clustering metric based on the random walks' properties to evaluate the quality of a graph clustering. We also propose a randomized algorithm that identifies a locally optimal clustering of the graph according to the metric defined. The algorithm is intrinsically distributed and a...
Gardiner, Eleanor J; Gillet, Valerie J; Willett, Peter; Cosgrove, David A
2007-01-01
Chemical databases are routinely clustered, with the aim of grouping molecules which share similar structural features. Ideally, medicinal chemists are then able to browse a few representatives of the cluster in order to interpret the shared activity of the cluster members. However, when molecules are clustered using fingerprints, it may be difficult to decipher the structural commonalities which are present. Here, we seek to represent a cluster by means of a maximum common substructure based on the shared functionality of the cluster members. Previously, we have used reduced graphs, where each node corresponds to a generalized functional group, as topological molecular descriptors for virtual screening. In this work, we precluster a database using any clustering method. We then represent the molecules in a cluster as reduced graphs. By repeated application of a maximum common edge substructure (MCES) algorithm, we obtain one or more reduced graph cluster representatives. The sparsity of the reduced graphs means that the MCES calculations can be performed in real time. The reduced graph cluster representatives are readily interpretable in terms of functional activity and can be mapped directly back to the molecules to which they correspond, giving the chemist a rapid means of assessing potential activities contained within the cluster. Clusters of interest are then subject to a detailed R-group analysis using the same iterated MCES algorithm applied to the molecular graphs. PMID:17309248
Document clustering using graph based document representation with constraints
Rafi, Muhammad; Amin, Farnaz; Shaikh, Mohammad Shahid
2014-01-01
Document clustering is an unsupervised approach in which a large collection of documents (corpus) is subdivided into smaller, meaningful, identifiable, and verifiable sub-groups (clusters). Meaningful representation of documents and implicitly identifying the patterns, on which this separation is performed, is the challenging part of document clustering. We have proposed a document clustering technique using graph based document representation with constraints. A graph data structure can easi...
Euclidean Distances, soft and spectral Clustering on Weighted Graphs
Bavaud, François
2010-01-01
We define a class of Euclidean distances on weighted graphs, enabling to perform thermodynamic soft graph clustering. The class can be constructed form the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.
Rossi, Fabrice; Villa-Vialaneix, Nathalie
2010-01-01
This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph clusterings that lead to faithful and readable graph representations based on clustering induced graphs. Topographic graph clustering provides an alternative to more classical solutions in which a standard graph clustering method is applied to build a simpler graph that is then represen...
Graph Theory In Protein Sequence Clustering And Tertiary Structural Matching
Abdullah, Rosni; Rashid, Nur'Aini Abdul; Othman, Fazilah
2008-01-01
The principle of graph theory which has been widely used in computer networks is now being adopted for work in protein clustering, protein structural matching, and protein folding and modeling. In this work, we present two case studies on the use of graph theory for protein clustering and tertiary structural matching. In protein clustering, we extended a clustering algorithm based on a maximal clique while in the protein tertiary structural matching we explored the bipartite graph matching algorithm. The results obtained in both the case studies will be presented.
Statistical mechanics of semi-supervised clustering in sparse graphs
International Nuclear Information System (INIS)
We theoretically study semi-supervised clustering in sparse graphs in the presence of pair-wise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster and between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine the impact of pair-wise constraints on the clustering accuracy. Our results suggest that the addition of constraints does not provide automatic improvement over the unsupervised case. When the density of the constraints is sufficiently small, their only impact is to shift the detection threshold while preserving the criticality. Conversely, if the density of (hard) constraints is above the percolation threshold, the criticality is suppressed and the detection threshold disappears
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms. PMID:27046838
Limited Random Walk Algorithm for Big Graph Data Clustering
Zhang, Honglei; Kiranyaz, Serkan; Gabbouj, Moncef
2016-01-01
Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attracting vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrass...
Graph-based clustering and data visualization algorithms
Vathy-Fogarassy, Ágnes
2013-01-01
This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on
Degree and clustering coefficient in sparse random intersection graphs
Bloznelis, Mindaugas
2013-01-01
We establish asymptotic vertex degree distribution and examine its relation to the clustering coefficient in two popular random intersection graph models of Godehardt and Jaworski [Electron. Notes Discrete Math. 10 (2001) 129–132]. For sparse graphs with a positive clustering coefficient, we examine statistical dependence between the (local) clustering coefficient and the degree. Our results are mathematically rigorous. They are consistent with the empirical observation of Foudalis et al. [In...
Accelerated spectral clustering using graph filtering of random signals
Tremblay, Nicolas; Puy, Gilles; Borgnat, Pierre; Gribonval, Rémi; Vandergheynst, Pierre
2016-01-01
We build upon recent advances in graph signal processing to propose a faster spectral clustering algorithm. Indeed, classical spectral clustering is based on the computation of the first $k$ eigenvectors of the similarity matrix' Laplacian, whose computation cost, even for sparse matrices, becomes prohibitive for large datasets. We show that we can estimate the spectral clustering distance matrix without computing these eigenvectors: by graph filtering random signals. Also, we take ad...
Linear and optimization Hamiltonians in clustered exponential random graph modeling
International Nuclear Information System (INIS)
Exponential random graph theory is the complex network analog of the canonical ensemble theory from statistical physics. While it has been particularly successful in modeling networks with specified degree distributions, a naïve model of a clustered network using a graph Hamiltonian linear in the number of triangles has been shown to undergo an abrupt transition into an unrealistic phase of extreme clustering via triangle condensation. Here we study a nonlinear graph Hamiltonian that explicitly forbids such a condensation and show numerically that it generates an equilibrium phase with specified intermediate clustering
Bayesian Network Based Fault Prognosis via Bond Graph Modeling of High-Speed Railway Traction Device
Directory of Open Access Journals (Sweden)
Yunkai Wu
2015-01-01
component-level faults accurately for a high-speed railway traction system, a fault prognosis approach via Bayesian network and bond graph modeling techniques is proposed. The inherent structure of a railway traction system is represented by bond graph model, based on which a multilayer Bayesian network is developed for fault propagation analysis and fault prediction. For complete and incomplete data sets, two different parameter learning algorithms such as Bayesian estimation and expectation maximization (EM algorithm are adopted to determine the conditional probability table of the Bayesian network. The proposed prognosis approach using Pearl’s polytree propagation algorithm for joint probability reasoning can predict the failure probabilities of leaf nodes based on the current status of root nodes. Verification results in a high-speed railway traction simulation system can demonstrate the effectiveness of the proposed approach.
Czech Academy of Sciences Publication Activity Database
Studený, Milan
Granada : DESCAI, University of Granada, 2012, s. 307-314. ISBN 978-84-15536-57-4. [6th European Workshop on Probabilistic Graphical Models (PGM). Granada (ES), 19.09.2012-21.09.2012] R&D Projects: GA ČR GA201/08/0539 Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * characteristic imset * essential graph Subject RIV: BA - General Mathematics http://library.utia.cas.cz/separaty/2012/MTR/studeny-integer linear programming approach to learning Bayesian network structure towards the essential graph.pdf
Bayesian Network Structure Learning from Limited Datasets through Graph Evolution
Tonda, Alberto; Lutton, Evelyne; Reuillon, Romain; Squillero, Giovanni; Wuillemin, Pierre-Henri
2012-01-01
Bayesian networks are stochastic models, widely adopted to encode knowledge in several fields. One of the most interesting features of a Bayesian network is the possibility of learning its structure from a set of data, and subsequently use the resulting model to perform new predictions. Structure learning for such models is a NP-hard problem, for which the scientific community developed two main approaches: score-and-search metaheuristics, often evolutionary-based, and dependency-analysis det...
Clustering and Classification in Text Collections Using Graph Modularity
Pivovarov, Grigory; Trunov, Sergei
2011-01-01
A new fast algorithm for clustering and classification of large collections of text documents is introduced. The new algorithm employs the bipartite graph that realizes the word-document matrix of the collection. Namely, the modularity of the bipartite graph is used as the optimization functional. Experiments performed with the new algorithm on a number of text collections had shown a competitive quality of the clustering (classification), and a record-breaking speed.
Divisive Betweenness Centrality Clustering on Graphs Weighted by Timestamps
Friberg, Oscar; Englesson, Björn
2015-01-01
Data visualization is important to obtain a better understanding of many networks. However, for larger networks it is hard to perceive the visualization of the network. In order to simplify such visualizations of networks, graph clustering algorithms may be used to identify subgraphs of close vertices and group them together. In this paper we study graph clustering algorithms applied to event structures from continuous integration infrastructures. These event structures are special in the sen...
GraphAlignment: Bayesian pairwise alignment of biological networks
Directory of Open Access Journals (Sweden)
Kolář Michal
2012-11-01
Full Text Available Abstract Background With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. Results We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical case complexity grows approximately as O(N2.6. On empirical bacterial protein-protein interaction networks (PIN and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. Conclusions The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.
Graph partitions and cluster synchronization in networks of oscillators
Schaub, Michael T; Billeh, Yazan N; Delvenne, Jean-Charles; Lambiotte, Renaud; Barahona, Mauricio
2016-01-01
Synchronization over networks depends strongly on the structure of the coupling between the oscillators.When the coupling presents certain regularities, the dynamics can be coarse-grained into clusters by means of External Equitable Partitions of the network graph and their associated quotient graphs. We exploit this graph-theoretical concept to study the phenomenon of cluster synchronization, in which different groups of nodes converge to distinct behaviors. We derive conditions and properties of networks in which such clustered behavior emerges, and show that the ensuing dynamics is the result of the localization of the eigenvectors of the associated graph Laplacians linked to the existence of invariant subspaces. The framework is applied to both linear and non-linear models, first for the standard case of networks with positive edges, before being generalized to the case of signed networks with both positive and negative interactions. We illustrate our results with examples of both signed and unsigned grap...
Distributed Implementation of SIGNAL: Scheduling & Graph Clustering
Maffeis, Olivier; Le Guernic, Paul
1994-01-01
This paper introduces the scheduling strategy and some key tools which have been designed for the distributed implementation of Signal, a real-time synchronous dataflow language. First, we motivate a scheduling strategy with respect to the reactivity and time-predictability requirements bound to real-time computing. Then, several key tools to implement this scheduling strategy are described. These tools are acting on the concept of Synchronous-Flow Dependence Graph (SFD Graph) which defines a...
The clustering coefficient of a scale-free random graph
Eggemann, N.; Noble, S D
2009-01-01
We consider a random graph process in which, at each time step, a new vertex is added with m out-neighbours, chosen with probabilities proportional to their degree plus a strictly positive constant. We show that the expectation of the clustering coefficient of the graph process is asymptotically proportional to log n/n. Bollob\\'as and Riordan have previously shown that when the constant is zero, the same expectation is asymptotically proportional to ((log n)^2)/n.
Graph Connectivity in Noisy Sparse Subspace Clustering
Wang, Yining; Wang, Yu-Xiang; Singh, Aarti
2015-01-01
Subspace clustering is the problem of clustering data points into a union of low-dimensional linear/affine subspaces. It is the mathematical abstraction of many important problems in computer vision, image processing and machine learning. A line of recent work (4, 19, 24, 20) provided strong theoretical guarantee for sparse subspace clustering (4), the state-of-the-art algorithm for subspace clustering, on both noiseless and noisy data sets. It was shown that under mild conditions, with high ...
Bayesian Inference of Kinematics and Memberships of Open Cluster
Shao, Z. Y.; Chen, L.; Zhong, J.; Hou, J. L.
2014-07-01
Based on the Bayesian Inference (BI) method, the Multiple-modelling approach is improved to combine coordinative positions, proper motions (PM) and radial velocities (RV), to separate the motion of the open cluster from field stars, as well as to describe the intrinsic kinematic status of the cluster.
R/BHC: fast Bayesian hierarchical clustering for microarray data
Directory of Open Access Journals (Sweden)
Grant Murray
2009-08-01
Full Text Available Abstract Background Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained. Results We present an R/Bioconductor port of a fast novel algorithm for Bayesian agglomerative hierarchical clustering and demonstrate its use in clustering gene expression microarray data. The method performs bottom-up hierarchical clustering, using a Dirichlet Process (infinite mixture to model uncertainty in the data and Bayesian model selection to decide at each step which clusters to merge. Conclusion Biologically plausible results are presented from a well studied data set: expression profiles of A. thaliana subjected to a variety of biotic and abiotic stresses. Our method avoids several limitations of traditional methods, for example how many clusters there should be and how to choose a principled distance metric.
van Laarhoven, Twan; Marchiori, Elena
2014-01-01
Many graph clustering quality functions suffer from a resolution limit, the inability to find small clusters in large graphs. So called resolution-limit-free quality functions do not have this limit. This property was previously introduced for hard clustering, that is, graph partitioning. We investigate the resolution-limit-free property in the context of Non-negative Matrix Factorization (NMF) for hard and soft graph clustering. To use NMF in the hard clustering setting, a common approach is...
An Empirical Comparison of the Summarization Power of Graph Clustering Methods
Liu, Yike; Shah, Neil; Koutra, Danai
2015-01-01
How do graph clustering techniques compare with respect to their summarization power? How well can they summarize a million-node graph with a few representative structures? Graph clustering or community detection algorithms can summarize a graph in terms of coherent and tightly connected clusters. In this paper, we compare and contrast different techniques: METIS, Louvain, spectral clustering, SlashBurn and KCBC, our proposed k-core-based clustering method. Unlike prior work that focuses on v...
Learning Bayesian network structure: towards the essential graph by integer linear programming tools
Czech Academy of Sciences Publication Activity Database
Studený, Milan; Haws, D.
2014-01-01
Roč. 55, č. 4 (2014), s. 1043-1071. ISSN 0888-613X R&D Projects: GA ČR GA13-20012S Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * integer linear programming * characteristic imset * essential graph Subject RIV: BA - General Mathematics Impact factor: 2.451, year: 2014 http://library.utia.cas.cz/separaty/2014/MTR/studeny-0427002.pdf
A hypergraph-based model for graph clustering: application to image indexing
Jouili, Salim; Tabbone, Salvatore
2009-01-01
Version finale disponible : www.springerlink.com International audience In this paper, we introduce a prototype-based clustering algorithm dealing with graphs. We propose a hypergraph-based model for graph data sets by allowing clusters overlapping. More precisely, in this representation one graph can be assigned to more than one cluster. Using the concept of the graph median and a given threshold, the proposed algorithm detects automatically the number of classes in the graph database....
GPU Acceleration of Graph Matching, Clustering, and Partitioning
Fagginger Auer, B.O.
2013-01-01
We consider sequential algorithms for hypergraph partitioning and GPU (i.e., fine-grained shared-memory parallel) algorithms for graph partitioning and clustering. Our investigation into sequential hypergraph partitioning is concerned with the efficient construction of high-quality matchings for hyp
Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs
Energy Technology Data Exchange (ETDEWEB)
Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad; Ning, Xia; Agarwal, Khushbu; Purohit, Sumit; Pesantez, Paola
2016-02-01
Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for prediction task and utilize Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-art approaches. Furthermore, we also study the performance of the link prediction algorithm in terms of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.
Non-parametric Bayesian graph models reveal community structure in resting state fMRI
DEFF Research Database (Denmark)
Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman;
2014-01-01
Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... Diagonal Model (IDM). The models define probabilities of generating links within and between clusters and the difference between the models lies in the restrictions they impose upon the between-cluster link probabilities. IRM is the most flexible model with no restrictions on the probabilities of links...
A cluster expansion approach to exponential random graph models
International Nuclear Information System (INIS)
The exponential family of random graphs are among the most widely studied network models. We show that any exponential random graph model may alternatively be viewed as a lattice gas model with a finite Banach space norm. The system may then be treated using cluster expansion methods from statistical mechanics. In particular, we derive a convergent power series expansion for the limiting free energy in the case of small parameters. Since the free energy is the generating function for the expectations of other random variables, this characterizes the structure and behavior of the limiting network in this parameter region
Exploration and Exploitation Operators for Genetic Graph Clustering Algorithm
Czech Academy of Sciences Publication Activity Database
Kohout, J.; Neruda, Roman
Berlin : Springer, 2012 - (Chen, L.; Felfernig, A.; Liu, J.; Ras, Z.), s. 87-92 ISBN 978-3-642-34623-1. ISSN 0302-9743. - (Lecture Notes in Computer Science. 7661). [ISMIS 2012. International Symposium on Methodologies for Intelligent System /20./. Macau (MO), 05.12.2012-07.12.2012] R&D Projects: GA ČR GAP202/11/1368 Institutional support: RVO:67985807 Keywords : genetic algorithms * graph clustering * social networks Subject RIV: IN - Informatics, Computer Science
Graphic Symbol Recognition using Graph Based Signature and Bayesian Network Classifier
Luqman, Muhammad Muzzamil; Ramel, Jean-Yves
2010-01-01
We present a new approach for recognition of complex graphic symbols in technical documents. Graphic symbol recognition is a well known challenge in the field of document image analysis and is at heart of most graphic recognition systems. Our method uses structural approach for symbol representation and statistical classifier for symbol recognition. In our system we represent symbols by their graph based signatures: a graphic symbol is vectorized and is converted to an attributed relational graph, which is used for computing a feature vector for the symbol. This signature corresponds to geometry and topology of the symbol. We learn a Bayesian network to encode joint probability distribution of symbol signatures and use it in a supervised learning scenario for graphic symbol recognition. We have evaluated our method on synthetically deformed and degraded images of pre-segmented 2D architectural and electronic symbols from GREC databases and have obtained encouraging recognition rates.
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler.
Jin, Ick Hoon; Yuan, Ying; Liang, Faming
2013-10-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency. PMID:24653788
On the NP-Completeness of Some Graph Cluster Measures
Czech Academy of Sciences Publication Activity Database
Šíma, Jiří; Schaeffer, S.E.
Berlin : Springer, 2006 - (Wiedermann, J.; Tel, G.; Pokorný, J.; Bieliková, M.; Štuller, J.), s. 530-537 ISBN 3-540-31198-X. - (Lecture Notes in Computer Science. 3831). [SOFSEM 2006. Conference on Current Trends in Theory and Practice of Computer Science /32./. Měřín (CZ), 21.01.2006-27.01.2006] R&D Projects: GA MŠk(CZ) 1M0545 Grant ostatní: Academy of Finland(FI) 126235 Institutional research plan: CEZ:AV0Z10300504 Keywords : graph clustering * conductance * density * cluster editing * NP-completeness Subject RIV: BA - General Mathematics
Exact solution of random graphs for cluster fragmentation
International Nuclear Information System (INIS)
We present the exact solution of a combinatorial fragmentation model and we show how it can be used as a touchstone for the fragmentation of atomic clusters. This model, random graphs (RG), also called mean field percolation, displays a phase transition. In this model, the clusters are solely described as connected entities called nodes. The connections, called bonds, can be active of broken. We have established the algebraic formulas of the probability of all the fragmentation channels. The results depend on the number of nodes and of the number of broken bonds. Using RG, we show example where information was deduced from fragmentation of systems consisting of finite sets of nodes.
Application of Bayesian graphs to SN Ia data analysis and compression
Ma, Con; Bassett, Bruce A
2016-01-01
Bayesian graphical models are an efficient tool for modelling complex data and derive self-consistent expressions of the posterior distribution of model parameters. We apply Bayesian graphs to perform statistical analyses of Type Ia supernova (SN Ia) luminosity distance measurements from the Joint Light-curve Analysis (JLA) dataset (Betoule et al. 2014, arXiv:1401.4064). In contrast to the $\\chi^2$ approach used in previous studies, the Bayesian inference allows us to fully account for the standard-candle parameter dependence of the data covariance matrix. Comparing with $\\chi^2$ analysis results we find a systematic offset of the marginal model parameter bounds. We demonstrate that the bias is statistically significant in the case of the SN Ia standardization parameters with a maximal $6\\sigma$ shift of the SN light-curve colour correction. In addition, we find that the evidence for a host galaxy correction is now only $2.4\\sigma$. Systematic offsets on the cosmological parameters remain small, but may incre...
Atomic cluster and graph states. An engineering proposal
International Nuclear Information System (INIS)
We suggest a simple method to generate cluster states of two-level atoms. The protocol utilizes minimum cavity/atomic resources and is based upon standard cavity QED tools along with Ramsey technique. Two identical atoms, each initially prepared in coherent superposition, interact simultaneously with an initially vacuum state cavity for a specified time. One atom has resonant interactions while at the same time the other one interacts dispersively. When, after completion of the interaction, cavity is again left into vacuum then atoms are shown to be entangled in bipartite cluster state. The method is also generalized to generate multi-partite linear cluster states as well as the atomic graph states. (author)
Event Coreference Resolution using Mincut based Graph Clustering
Directory of Open Access Journals (Sweden)
Sangeetha.S
2012-10-01
Full Text Available To extract participants of an event instance, it is necessary to identify all the sentences that describe the event instance. The set of all sentences referring to the same event instance are said to be corefering each other. Our proposed approach formulates the event coreference resolution as a graph based clustering model. It identifies the corefering sentences using minimum cut (mincut based on similarity score between each pair of sentences at various levels such as trigger word similarity, time stamp similarity, entity similarity and semantic similarity. It achieves good B-Cubed F-measure score with some loss in recall.
Two-Phase Genetic Algorithm for Social Network Graphs Clustering
Czech Academy of Sciences Publication Activity Database
Kohout, J.; Neruda, Roman
Los Alamitos: IEEE Computer Society, 2013 - (Barolli, L.; Xhafa, F.; Takizawa, M.; Enokido, T.; Hsu, H.), s. 197-202 ISBN 978-0-7695-4952-1. [WAINA 2013. International Conference on Advanced Information Networking and Applications Workshops /27./. Barcelona (ES), 25.03.2013-28.03.2013] R&D Projects: GA ČR GAP202/11/1368 Grant ostatní: European Office of Aerospace Research and Development(XE) FA8655-11-3035 Institutional support: RVO:67985807 Keywords : clustering * genetic algorithms * graph Subject RIV: IN - Informatics, Computer Science
Bayesian inference of mass segregation of open clusters
Shao, Zhengyi; Chen, Li; Lin, Chien-Cheng; Zhong, Jing; Hou, Jinliang
2015-08-01
Based on the Bayesian inference (BI) method, the mixture-modeling approach is improved to combine all kinematic data, including the coordinative position, proper motion (PM) and radial velocity (RV), to separate the motion of the cluster from field stars in its area, as well as to describe the intrinsic kinematic status. Meanwhile, the membership probabilities of individual stars are determined as by product results. This method has been testified by simulation of toy models and it was found that the joint usage of multiple kinematic data can significantly reduce the missing rate of membership determination, say from ~15% for single data type to 1% for using all position, proper motion and radial velocity data.By combining kinematic data from multiple sources of photometric and redshift surveys, such as WIYN and APOGEE, M67 and NGC188 are revisited. Mass segregation is identified clearly for both of these two old open clusters, either in position or in PM spaces, since the Bayesian evidence (BE) of the model, which includes the segregation parameters, is much larger than that without it. The ongoing work is applying this method to the LAMOST released data which contains a large amount of RVs cover ~200 nearby open clusters. If the coming GAIA data can be used, the accuracy of tangential velocity will be largely improved and the intrinsic kinematics of open clusters can be well investigated, though they are usually less than 1 km/s.
Detecting Galaxy Clusters in the DLS and CARS: a Bayesian Cluster Finder
Ascaso, Begoña; Benítez, Narciso
2010-01-01
The detection of galaxy clusters in present and future surveys enables measuring mass-to-light ratios, clustering properties or galaxy cluster abundances and therefore, constraining cosmological parameters. We present a new technique for detecting galaxy clusters, which is based on the Matched Filter Algorithm from a Bayesian point of view. The method is able to determine the position, redshift and richness of the cluster through the maximization of a filter depending on galaxy luminosity, density and photometric redshift combined with a galaxy cluster prior. We tested the algorithm through realistic mock galaxy catalogs, revealing that the detections are 100% complete and 80% pure for clusters up to z 25 (Abell Richness > 0). We applied the algorithm to the CFHTLS Archive Research Survey (CARS) data, recovering similar detections as previously published using the same data plus additional clusters that are very probably real. We also applied this algorithm to the Deep Lens Survey (DLS), obtaining the first ...
An agglomerative hierarchical approach to visualization in Bayesian clustering problems.
Dawson, K J; Belkhir, K
2009-07-01
Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals--the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. As the number of possible partitions grows very rapidly with the sample size, we cannot visualize this probability distribution in its entirety, unless the sample is very small. As a solution to this visualization problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package PartitionView. The exact linkage algorithm takes the posterior co-assignment probabilities as input and yields as output a rooted binary tree, or more generally, a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306
Dorow, Beate; Widdows, Dominic; Ling, Katarina; Eckmann, Jean-Pierre; Sergi, Danilo; Moses, Elisha
2004-01-01
We introduce two different approaches for clustering semantically similar words. We accommodate ambiguity by allowing a word to belong to several clusters. Both methods use a graph-theoretic representation of words and their paradigmatic relationships. The first approach is based on the concept of curvature and divides the word graph into classes of similar words by removing words of low curvature which connect several dispersed clusters. The second method, instead of clustering the nodes, cl...
K-means based approaches to clustering nodes in annotated graphs
Witsenburg, Tijn; Blockeel, Hendrik
2011-01-01
The goal of clustering is to form groups of similar elements. Quality criteria for clusterings, as well as the notion of similarity, depend strongly on the application domain, which explains the existence of many different clustering algorithms and similarity measures. In this paper we focus on the problem of clustering annotated nodes in a graph, when the similarity between nodes depends on both their annotations and their context in the graph ("hybrid" similarity), using k-means-like cluste...
Low-Complexity Bayesian Estimation of Cluster-Sparse Channels
Ballal, Tarig
2015-09-18
This paper addresses the problem of channel impulse response estimation for cluster-sparse channels under the Bayesian estimation framework. We develop a novel low-complexity minimum mean squared error (MMSE) estimator by exploiting the sparsity of the received signal profile and the structure of the measurement matrix. It is shown that due to the banded Toeplitz/circulant structure of the measurement matrix, a channel impulse response, such as underwater acoustic channel impulse responses, can be partitioned into a number of orthogonal or approximately orthogonal clusters. The orthogonal clusters, the sparsity of the channel impulse response and the structure of the measurement matrix, all combined, result in a computationally superior realization of the MMSE channel estimator. The MMSE estimator calculations boil down to simpler in-cluster calculations that can be reused in different clusters. The reduction in computational complexity allows for a more accurate implementation of the MMSE estimator. The proposed approach is tested using synthetic Gaussian channels, as well as simulated underwater acoustic channels. Symbol-error-rate performance and computation time confirm the superiority of the proposed method compared to selected benchmark methods in systems with preamble-based training signals transmitted over clustersparse channels.
Bayesian Analysis of Multiple Populations in Galactic Globular Clusters
Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan
2016-01-01
We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way.Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.
Wagner-Kaiser, R; Sarajedini, A; von Hippel, T; van Dyk, D A; Robinson, E; Stein, N; Jefferys, W H
2016-01-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic Globular Clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in CNO are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster and also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed g...
Clustering and Bayesian network for image of faces classification
Jayech, Khlifia
2012-01-01
In a content based image classification system, target images are sorted by feature similarities with respect to the query (CBIR). In this paper, we propose to use new approach combining distance tangent, k-means algorithm and Bayesian network for image classification. First, we use the technique of tangent distance to calculate several tangent spaces representing the same image. The objective is to reduce the error in the classification phase. Second, we cut the image in a whole of blocks. For each block, we compute a vector of descriptors. Then, we use K-means to cluster the low-level features including color and texture information to build a vector of labels for each image. Finally, we apply five variants of Bayesian networks classifiers (Na\\"ive Bayes, Global Tree Augmented Na\\"ive Bayes (GTAN), Global Forest Augmented Na\\"ive Bayes (GFAN), Tree Augmented Na\\"ive Bayes for each class (TAN), and Forest Augmented Na\\"ive Bayes for each class (FAN) to classify the image of faces using the vector of labels. ...
A SURVEY ON CLUSTERING TECHNIQUE FOR DATASETS USING EFFICIENT GRAPH STRUCTURES
K. Rajendra Prasad,; Dr. P.Govinda Rajulu
2010-01-01
Clustering is an unsupervised learning datamining technique. The principle of clustering depends on the concept of Distance Metric or Similarity Metric. Because of the variety of feature types and scales, the distance measure (or measures) must be chosen carefully. Edge weight in the graph finds the information for dissimilarity value between two data objects in the given database. A survey on clustering schema using minimum spanning tree (MST) approach is discussed in this paper. A graph-bas...
Hypergraph Modelling and Graph Clustering Process Applied to Co-word Analysis
Polanco, Xavier; San Juan, Eric
2007-01-01
We argue that any document set can be modelled as a hypergraph, and we apply a graph clustering process as a way of analysis. A variant of the single link clustering is presented, and we assert that it is better suited to extract interesting clusters formed along easily interpretable paths of associated items than algorithms based on detecting high density regions. We propose a methodology that involves the extraction of similarity graphs from the indexed-dataset represented as a hypergraph. ...
Cluster expansion in the truncated bootstrap model and linear graphs theory
International Nuclear Information System (INIS)
Using the formalism of linear graphs theory, we obtain the cluster expansion for the grand potential of interacting hadronic systems in the framework of the truncated bootstrap model. We show that the coefficients of the expansion are constructed from two classes of Cayley-tree graphs contributing with opposite sign, related to the two-phase nature of the truncated bootstrap model. (orig.)
A note on cluster expansions, tree graph identities, extra 1/N factors
International Nuclear Information System (INIS)
We draw attention to a new tree graph identity which substantially improves on the usual tree graph method of proving convergence of cluster expansions in statistical mechanics and quantum field theory. We can control expansions that could not be controlled before. (orig.)
Degree-degree distribution in a power law random intersection graph with clustering
Bloznelis, Mindaugas
2014-01-01
The bivariate distribution of degrees of adjacent vertices (degree-degree distribution) is an important network characteristic defining the statistical dependencies between degrees of adjacent vertices. We show the asymptotic degree-degree distribution of a sparse inhomogeneous random intersection graph and discuss its relation to the clustering and power law properties of the graph.
Clustering cliques for graph-based summarization of the biomedical research literature
DEFF Research Database (Denmark)
Zhang, Han; Fiszman, Marcelo; Shin, Dongwook;
2013-01-01
Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts).Results: SemRep is u...
Spectral clustering and biclustering learning large graphs and contingency tables
Bolla, Marianna
2013-01-01
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, mult
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Voltage Graphs and Cluster Consensus with Point Group Symmetries
Chen, Xudong; Belabbas, Mohamed-Ali; Basar, Tamer
2016-01-01
A cluster consensus system is a multi-agent system in which the autonomous agents communicate to form multiple clusters, with each cluster of agents asymptotically converging to the same clustering point. We introduce in this paper a special class of cluster consensus dynamics, termed the $G$-clustering dynamics for $G$ a point group, whereby the autonomous agents can form as many as $|G|$ clusters, and moreover, the associated $|G|$ clustering points exhibit a geometric symmetry induced by t...
Self-similar non-clustered planar graphs as models for complex networks
International Nuclear Information System (INIS)
In this paper we introduce a family of planar, modular and self-similar graphs which has small-world and scale-free properties. The main parameters of this family are comparable to those of networks associated with complex systems, and therefore the graphs are of interest as mathematical models for these systems. As the clustering coefficient of the graphs is zero, this family is an explicit construction that does not match the usual characterization of hierarchical modular networks, namely that vertices have clustering values inversely proportional to their degrees
Feroz, F.; Hobson, M. P.; Zwart, J T L; Saunders, R. D. E.; Grainge, K. J. B.
2008-01-01
We present a Bayesian approach to modelling galaxy clusters using multi-frequency pointed observations from telescopes that exploit the Sunyaev--Zel'dovich effect. We use the recently developed MultiNest technique (Feroz, Hobson & Bridges, 2008) to explore the high-dimensional parameter spaces and also to calculate the Bayesian evidence. This permits robust parameter estimation as well as model comparison. Tests on simulated Arcminute Microkelvin Imager observations of a cluster, in the prese...
Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures
Raymond, J.W.; Blankley, C.J.; Willett, P.
2003-01-01
This paper compares several published methods for clustering chemical structures, using both graph- and fingerprint-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods which employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size...
Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
Directory of Open Access Journals (Sweden)
Theis Fabian J
2010-10-01
Full Text Available Abstract Background Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning
A framework for graph-based synthesis, analysis, and visualization of HPC cluster job data.
Energy Technology Data Exchange (ETDEWEB)
Mayo, Jackson R.; Kegelmeyer, W. Philip, Jr.; Wong, Matthew H.; Pebay, Philippe Pierre; Gentile, Ann C.; Thompson, David C.; Roe, Diana C.; De Sapio, Vincent; Brandt, James M.
2010-08-01
The monitoring and system analysis of high performance computing (HPC) clusters is of increasing importance to the HPC community. Analysis of HPC job data can be used to characterize system usage and diagnose and examine failure modes and their effects. This analysis is not straightforward, however, due to the complex relationships that exist between jobs. These relationships are based on a number of factors, including shared compute nodes between jobs, proximity of jobs in time, etc. Graph-based techniques represent an approach that is particularly well suited to this problem, and provide an effective technique for discovering important relationships in job queuing and execution data. The efficacy of these techniques is rooted in the use of a semantic graph as a knowledge representation tool. In a semantic graph job data, represented in a combination of numerical and textual forms, can be flexibly processed into edges, with corresponding weights, expressing relationships between jobs, nodes, users, and other relevant entities. This graph-based representation permits formal manipulation by a number of analysis algorithms. This report presents a methodology and software implementation that leverages semantic graph-based techniques for the system-level monitoring and analysis of HPC clusters based on job queuing and execution data. Ontology development and graph synthesis is discussed with respect to the domain of HPC job data. The framework developed automates the synthesis of graphs from a database of job information. It also provides a front end, enabling visualization of the synthesized graphs. Additionally, an analysis engine is incorporated that provides performance analysis, graph-based clustering, and failure prediction capabilities for HPC systems.
KmsGC: An Unsupervised Color Image Segmentation Algorithm Based on K-Means Clustering and Graph Cut
Binmei Liang; Jianzhou Zhang
2014-01-01
For unsupervised color image segmentation, we propose a two-stage algorithm, KmsGC, that combines K-means clustering with graph cut. In the first stage, K-means clustering algorithm is applied to make an initial clustering, and the optimal number of clusters is automatically determined by a compactness criterion that is established to find clustering with maximum intercluster distance and minimum intracluster variance. In the second stage, a multiple terminal vertices weighted graph is constr...
Cluster Consensus of Nonlinearly Coupled Multi-Agent Systems in Directed Graphs
Lu, Xiao-Qing; Francis, Austin; Chen, Shi-Hua
2010-05-01
We investigate the cluster consensus problem in directed networks of nonlinearly coupled multi-agent systems by using pinning control. Depending on the community structure generated by the group partition of the underlying digraph, various clusters can be made coherently independent by applying feedback injections to a fraction of the agents. Sufficient conditions for cluster consensus are obtained using algebraic graph theory and matrix theory and some simulations results are included to illustrate the method.
Constructing the L2-Graph for Robust Subspace Learning and Subspace Clustering
Peng, Xi; Yu, Zhiding; Tang, Huajin; Yi, Zhang
2012-01-01
Under the framework of graph-based learning, the key to robust subspace clustering and subspace learning is to obtain a good similarity graph that eliminates the effects of errors and retains only connections between the data points from the same subspace (i.e., intra-subspace data points). Recent works achieve good performance by modeling errors into their objective functions to remove the errors from the inputs. However, these approaches face the limitations that the structure of errors sho...
Image Segmentation by Size-Dependent Single Linkage Clustering of a Watershed Basin Graph
Zlateski, Aleksandar; Seung, H. Sebastian
2015-01-01
We present a method for hierarchical image segmentation that defines a disaffinity graph on the image, over-segments it into watershed basins, defines a new graph on the basins, and then merges basins with a modified, size-dependent version of single linkage clustering. The quasilinear runtime of the method makes it suitable for segmenting large images. We illustrate the method on the challenging problem of segmenting 3D electron microscopic brain images.
Shortest-path constraints for 3D multiobject semiautomatic segmentation via clustering and Graph Cut
Kéchichian, Razmig; Valette, Sébastien; Desvignes, Michel; Prost, Rémy
2013-01-01
We derive shortest-path constraints from graph models of structure adjacency relations and introduce them in a joint centroidal Voronoi image clustering and Graph Cut multiobject semiautomatic segmentation framework. The vicinity prior model thus defined is a piecewise-constant model incurring multiple levels of penalization capturing the spatial configuration of structures in multiobject segmentation. Qualitative and quantitative analyses and comparison with a Potts prior-based approach and ...
Markov clustering versus affinity propagation for the partitioning of protein interaction graphs
Directory of Open Access Journals (Sweden)
Vlasblom James
2009-03-01
Full Text Available Abstract Background Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL, which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interactions graphs. A novel promising clustering procedure termed Affinity Propagation (AP was recently shown to be particularly effective, and much faster than other methods for a variety of problems, but has not yet been applied to partition protein interaction graphs. Results In this work we compare the performance of the Affinity Propagation (AP and Markov Clustering (MCL procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cervisiae hand curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added or removed to the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome scale purification experiments and to versions of
Network cluster detecting in associated bi-graph picture
He, Zhe; Xu, Rui-Jie; Wang, Bing-Hong; Ou-Yang, Zhong-Can
2014-01-01
We find that there is a close relationship between the associated bigraph and the clustering. the imbedding of the bigraph into some space can identify the clusters. Thus, we propose a new method for network cluster detecting through associated bigraph,of which the physical meaning is clear and the time complexity is acceptable. These characteristics help people to understand the structure and character of networks. We uncover the clusters on serval real networks in this paper as examples. The Zachary Network, which presents the structure of a karate club,can be partation into two clusters correctly by this method. And the Dolphin network is partitioned reasonably.
Directory of Open Access Journals (Sweden)
Guillaume Marrelec
Full Text Available The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shrinks the empirical covariance matrix towards a matrix set a priori (e.g., the identity, provides an automated stopping rule, and corrects for dimensionality using a term that scales up the measure as a function of the dimensionality of the variables. Also, the resulting log Bayes factor is asymptotically proportional to the plug-in estimate of mutual information, with an additive correction for dimensionality in agreement with the Bayesian information criterion. We investigated the behavior of these Bayesian alternatives (in exact and asymptotic forms to mutual information on simulated and real data. An encouraging result was first derived on simulations: the hierarchical clustering based on the log Bayes factor outperformed off-the-shelf clustering techniques as well as raw and normalized mutual information in terms of classification accuracy. On a toy example, we found that the Bayesian approaches led to results that were similar to those of mutual information clustering techniques, with the advantage of an automated thresholding. On real functional magnetic resonance imaging (fMRI datasets measuring brain activity, it identified clusters consistent with the established outcome of standard procedures. On this application, normalized mutual information had a highly atypical behavior, in the sense that it systematically favored very large clusters. These initial experiments suggest that the proposed Bayesian alternatives to mutual information are a useful new tool for hierarchical clustering.
Learning with $\\ell^{0}$-Graph: $\\ell^{0}$-Induced Sparse Subspace Clustering
Yang, Yingzhen; Feng, Jiashi; Yang, Jianchao; Huang, Thomas S.
2015-01-01
Sparse subspace clustering methods, such as Sparse Subspace Clustering (SSC) \\cite{ElhamifarV13} and $\\ell^{1}$-graph \\cite{YanW09,ChengYYFH10}, are effective in partitioning the data that lie in a union of subspaces. Most of those methods use $\\ell^{1}$-norm or $\\ell^{2}$-norm with thresholding to impose the sparsity of the constructed sparse similarity graph, and certain assumptions, e.g. independence or disjointness, on the subspaces are required to obtain the subspace-sparse representatio...
Bond percolation on a class of correlated and clustered random graphs
International Nuclear Information System (INIS)
We introduce a formalism for computing bond percolation properties of a class of correlated and clustered random graphs. This class of graphs is a generalization of the configuration model where nodes of different types are connected via different types of hyperedges, edges that can link more than two nodes. We argue that the multitype approach coupled with the use of clustered hyperedges can reproduce a wide spectrum of complex patterns, and thus enhances our capability to model real complex networks. As an illustration of this claim, we use our formalism to highlight unusual behaviours of the size and composition of the components (small and giant) in a synthetic, albeit realistic, social network. (paper)
Identifying industry clusters in Colombia based on graph theory
Juan C. Duque; Rey, S.J.; Gómez, D. A.
2009-01-01
This paper presents a new way to identify and understand the industry clusters in the Colombian economy. The analysis relies on a recent methodology proposed by Duque and Rey (2008) where intricate input-output linkages between industries are simplified using network analysis. In addition to other techniques for cluster identification available in the literature, this novel methodology allows us not only to classify each industry in a given cluster, but also to understand how industries are r...
International Nuclear Information System (INIS)
The energy levels in a delocalized two- or three-dimensional chemical structure are related to the eigenvalues of the graph representing the corresponding bonding topology. Such relatively crude but computationally undemanding graph theory-derived models provide a clear demonstration of the close relationship between two-dimensional aromatic systems such as benzene and three-dimensional aromatic systems such as deltahedral boranes, carboranes, and metal clusters. The basic building blocks for the three-dimensional aromatic systems are deltahedra, having no degree 3 vertices. Delocalized bonding in such systems having v vertices requires two electrons for a multicenter core bond as well as 2v electrons for pairwise surface bonding. A problem of particular interest is how metal cluster polyhedra can fuse together, leading ultimately to the infinite structures of the bulk metals. As a model for such processes the fusion of rhodium carbonyl octahedra is examined using graph theory
An effective trust-based recommendation method using a novel graph clustering algorithm
Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin
2015-10-01
Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.
Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold
2014-12-01
In order to deal with the sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximation policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings. PMID:24802018
Comparisons of Graph-structure Clustering Methods for Gene Expression Data
Institute of Scientific and Technical Information of China (English)
Zhuo FANG; Lei LIU; Jiong YANG; Qing-Ming LUO; Yi-Xue LI
2006-01-01
Although many numerical clustering algorithms have been applied to gene expression data analysis, the essential step is still biological interpretation by manual inspection. The correlation between genetic co-regulation and affiliation to a common biological process is what biologists expect. Here, we introduce some clustering algorithms that are based on graph structure constituted by biological knowledge. After applying a widely used dataset, we compared the result clusters of two of these algorithms in terms of the homogeneity of clusters and coherence of annotation and matching ratio. The results show that the clusters of knowledge-guided analysis are the kernel parts of the clusters of Gene Ontology (GO)-Cluster software, which contains the genes that are most expression correlative and most consistent with biological functions. Moreover, knowledge-guided analysis seems much more applicable than GO-Cluster in a larger dataset.
On the critical parameters of the q ≤ 4 random-cluster model on isoradial graphs
Beffara, V.; Duminil-Copin, H.; Smirnov, S.
2015-12-01
The critical surface for the random-cluster model with cluster-weight q≥slant 4 on isoradial graphs is identified using parafermionic observables. Correlations are also shown to decay exponentially fast in the subcritical regime. While this result is restricted to random-cluster models with q≥slant 4, it extends the recent theorem of (Beffara and Duminil-Copin 2012 Probl. Theory Relat. Fields 153 511-42) to a large class of planar graphs. In particular, the anisotropic random-cluster model on the square lattice is shown to be critical if \\frac{{p}{{v}}{p}{{h}}}{(1-{p}{{v}})(1-{p}{{h}})}=q, where p v and p h denote the horizontal and vertical edge-weights respectively. We also mention consequences for Potts models.
The Study about the Analysis of Responsiveness Pair Clustering to Social Network Bipartite Graph
Otsuki, Akira; Kawamura, Masayoshi
2013-01-01
In this study, regional (cities, towns and villages) data and tweet data are obtained from Twitter, and extract information of purchase information (Where and what bought) from the tweet data by morphological analysis and rule-based dependency analysis. Then, the "The regional information" and "The information of purchase history (Where and what bought information)" are captured as bipartite graph, and Responsiveness Pair Clustering analysis (a clustering using correspondence analysis as simi...
On the Eigenspaces of Lamplighter Random Walks and Percolation Clusters on Graphs
Lehner, Franz
2008-01-01
We show that the Plancherel measure of the lamplighter random walk on a graph coincides with the expected spectral measure of the absorbing random walk on the Bernoulli percolation clusters. In the subcritical regime the spectrum is pure point and we construct a complete orthonormal basis of finitely supported eigenfunctions.
GraphCrunch 2: Software tool for network modeling, alignment and clustering
Directory of Open Access Journals (Sweden)
Hayes Wayne
2011-01-01
Full Text Available Abstract Background Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. Results We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL" for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other
Pitman Yor Diffusion Trees for Bayesian Hierarchical Clustering.
Knowles, David A; Ghahramani, Zoubin
2015-02-01
In this paper we introduce the Pitman Yor Diffusion Tree (PYDT), a Bayesian non-parametric prior over tree structures which generalises the Dirichlet Diffusion Tree [30] and removes the restriction to binary branching structure. The generative process is described and shown to result in an exchangeable distribution over data points. We prove some theoretical properties of the model including showing its construction as the continuum limit of a nested Chinese restaurant process model. We then present two alternative MCMC samplers which allow us to model uncertainty over tree structures, and a computationally efficient greedy Bayesian EM search algorithm. Both algorithms use message passing on the tree structure. The utility of the model and algorithms is demonstrated on synthetic and real world data, both continuous and binary. PMID:26353241
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Banerjee, B.; Krishna Moohan, B.
2014-11-01
This paper addresses the problem of unsupervised land-cover classification of multi-spectral remotely sensed images in the context of self-learning by exploring different graph based clustering techniques hierarchically. The only assumption used here is that the number of land-cover classes is known a priori. Object based image analysis paradigm which processes a given image at different levels, has emerged as a popular alternative to the pixel based approaches for remote sensing image segmentation considering the high spatial resolution of the images. A graph based fuzzy clustering technique is proposed here to obtain a better merging of an initially oversegmented image in the spectral domain compared to conventional clustering techniques. Instead of using Euclidean distance measure, the cumulative graph edge weight is used to find the distance between a pair of points to better cope with the topology of the feature space. In order to handle uncertainty in assigning class labels to pixels, which is not always a crisp allocation for remote sensing data, fuzzy set theoretic technique is incorporated to the graph based clustering. Minimum Spanning Tree (MST) based clustering technique is used to over-segment the image at the first level. Furthermore, considering that the spectral signature of different land-cover classes may overlap significantly, a self-learning based Maximum Likelihood (ML) classifier coupled with the Expectation Maximization (EM) based iterative unsupervised parameter retraining scheme is used to generate the final land-cover classification map. Results on two medium resolution images establish the superior performance of the proposed technique in comparison to the traditional fuzzy c-means clustering technique.
Kaushik Kishore Phukon; Hemanta .K. Baruah
2013-01-01
Clustering techniques are mostly unsupervised methods that can be used to organize data into groups based on similarities among the individual data items. Fuzzy c-means (FCM) clustering is one of well known unsupervised clustering techniques, which can also be used for unsupervised web document clustering. In this chapter we will introduce a modified method of clustering where the data to be clustered will be represented by graphs instead of vectors or other models. Specifically, we will exte...
Directory of Open Access Journals (Sweden)
Kaushik Kishore Phukon
2013-12-01
Full Text Available Clustering techniques are mostly unsupervised methods that can be used to organize data into groups based on similarities among the individual data items. Fuzzy c-means (FCM clustering is one of well known unsupervised clustering techniques, which can also be used for unsupervised web document clustering. In this chapter we will introduce a modified method of clustering where the data to be clustered will be represented by graphs instead of vectors or other models. Specifically, we will extend the classical FCM clustering algorithm to work with graphs that represent web documents [1,2,3]. We wish to use graphs because they can allow us to retain information which is often discarded in simpler models.
Banerjee, B.; Krishna Moohan, B.
2014-01-01
This paper addresses the problem of unsupervised land-cover classification of multi-spectral remotely sensed images in the context of self-learning by exploring different graph based clustering techniques hierarchically. The only assumption used here is that the number of land-cover classes is known a priori. Object based image analysis paradigm which processes a given image at different levels, has emerged as a popular alternative to the pixel based approaches for remote sensing ima...
Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC
Directory of Open Access Journals (Sweden)
Hongyun Gao
2012-01-01
Full Text Available Esophageal squamous cell carcinoma (ESCC is one of the most malignant gastrointestinal cancers and occurs at a high frequency rate in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, whereby the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective for both identifying differentially expressed genes, and reflecting on their functions in ESCC.
Metabolomic correlation-network modules in Arabidopsis based on a graph-clustering approach
Directory of Open Access Journals (Sweden)
Redestig Henning
2011-01-01
Full Text Available Abstract Background Deciphering the metabolome is essential for a better understanding of the cellular metabolism as a system. Typical metabolomics data show a few but significant correlations among metabolite levels when data sampling is repeated across individuals grown under strictly controlled conditions. Although several studies have assessed topologies in metabolomic correlation networks, it remains unclear whether highly connected metabolites in these networks have specific functions in known tissue- and/or genotype-dependent biochemical pathways. Results In our study of metabolite profiles we subjected root tissues to gas chromatography-time-of-flight/mass spectrometry (GC-TOF/MS and used published information on the aerial parts of 3 Arabidopsis genotypes, Col-0 wild-type, methionine over-accumulation 1 (mto1, and transparent testa4 (tt4 to compare systematically the metabolomic correlations in samples of roots and aerial parts. We then applied graph clustering to the constructed correlation networks to extract densely connected metabolites and evaluated the clusters by biochemical-pathway enrichment analysis. We found that the number of significant correlations varied by tissue and genotype and that the obtained clusters were significantly enriched for metabolites included in biochemical pathways. Conclusions We demonstrate that the graph-clustering approach identifies tissue- and/or genotype-dependent metabolomic clusters related to the biochemical pathway. Metabolomic correlations complement information about changes in mean metabolite levels and may help to elucidate the organization of metabolically functional modules.
The Study About the Analysis of Responsiveness Pair Clustering Tosocial Network Bipartite Graph
Directory of Open Access Journals (Sweden)
Akira Otsuki
2013-11-01
Full Text Available In this study, regional (cities, towns and villages data and tweet data are obtained from Twitter, andextract information of "purchase information (Whereand what bought" from the tweet data bymorphological analysis and rule-based dependency analysis. Then, the "The regional information" and the"Theinformation of purchase history (Where and what bought information" are captured as bipartitegraph, and Responsiveness Pair Clustering analysis(a clustering using correspondence analysis assimilarity measure is conducted. In this study, since it was found to be difficult to analyze a network suchas bipartite graph having limitations in links by using modularity Q, responsiveness is used instead ofmodularity Q as similarity measure. As a result ofthis analysis, "regional information cluster" whichrefersto similar "Theinformation of purchase history" nodes group is generated. Finally, similar regions arevisualized by mapping the regional information cluster on the map. This visualization system is expected tocontribute as an analytical tool for customers’ purchasing behaviour and so on.
Euler, Christoph
2015-01-01
Using Monte Carlo simulations of globular clusters we developed a method separating metallicity effects from age effects on observed integrated ugriz colors. We demonstrate that these colors do not evolve with time significantly after an age of 4 Gyr and use Bayesian statistics to calculate a probability distribution function of the metallicity. We tested the method using the M31 globular cluster system and then applied to explain the observed color bimodality in globular cluster sets and tidal effects on it. We show that the color bimodality is an effect of a nonlinearity in the color-metallicity relation caused by stellar dynamics on the Giant Branch, that colors including only the UV show a weaker bimodality than those subtracting from visual bands and that cluster sets with a distinct bimodality are in principle older than those with only a weak bimodal distribution. Furthermore a bimodal color distribution of coeval clusters implies a bimodal metallicity distribution, but a unimodal color distribution do...
Abdlwafa, Alan; Edman, Henrik
2015-01-01
Distributed data mining is a relatively new area within computer science that is steadily growing, emerging from the demands of being able to gather and process various distributed data by utilising clusters. This report presents the properties of graph structured data and what paradigms to use for efficiently processing the data type, based on comprehensive theoretical studies applied on practical tests performed on a single node cluster. The results in the study showcase the various perform...
Bonomo, Flavia; Duran, Guillermo; Napoli, Amedeo; Valencia-Pabon, Mario
2015-01-01
In this note we show a one-to-one correspondence between potentially optimal solutions to the cluster deletion problem in a graph G and potentially optimal solutions for the minimum sum coloring problem in G (i.e. the complement graph of G). We apply this correspondence to polynomially solve the cluster deletion problem in a subclass of P 4 -sparse graphs that strictly includes P 4 -reducible graphs.
Emerging Pattern-Based Clustering of Web Users Utilizing a Simple Page-Linked Graph
Xiuming Yu; Meijing Li; Kyung Ah Kim; Jimoon Chung; Keun Ho Ryu
2016-01-01
Web usage mining is a popular research area in data mining. With the extensive use of the Internet, it is essential to learn about the favorite web pages of its users and to cluster web users in order to understand the structural patterns of their usage behavior. In this paper, we propose an efficient approach to determining favorite web pages by generating large web pages, and emerging patterns of generated simple page-linked graphs. We identify the favorite web pages of each user by elimina...
Stenning, D. C.; Wagner-Kaiser, R.; Robinson, E.; van Dyk, D. A.; von Hippel, T.; Sarajedini, A.; Stein, N.
2016-07-01
We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations. Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties—age, metallicity, helium abundance, distance, absorption, and initial mass—are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and also show how model misspecification can potentially be identified. As a proof of concept, we analyze the two stellar populations of globular cluster NGC 5272 using our model and methods. (BASE-9 is available from GitHub: https://github.com/argiopetech/base/releases).
Emerging Pattern-Based Clustering of Web Users Utilizing a Simple Page-Linked Graph
Directory of Open Access Journals (Sweden)
Xiuming Yu
2016-03-01
Full Text Available Web usage mining is a popular research area in data mining. With the extensive use of the Internet, it is essential to learn about the favorite web pages of its users and to cluster web users in order to understand the structural patterns of their usage behavior. In this paper, we propose an efficient approach to determining favorite web pages by generating large web pages, and emerging patterns of generated simple page-linked graphs. We identify the favorite web pages of each user by eliminating noise due to overall popular pages, and by clustering web users according to the generated emerging patterns. Afterwards, we label the clusters by using Term Frequency-Inverse Document Frequency (TF-IDF. In the experiments, we evaluate the parameters used in our proposed approach, discuss the effect of the parameters on generating emerging patterns, and analyze the results from clustering web users. The results of the experiments prove that the exact patterns generated in the emerging-pattern step eliminate the need to consider noise pages, and consequently, this step can improve the efficiency of subsequent mining tasks. Our proposed approach is capable of clustering web users from web log data.
KmsGC: An Unsupervised Color Image Segmentation Algorithm Based on K-Means Clustering and Graph Cut
Directory of Open Access Journals (Sweden)
Binmei Liang
2014-01-01
Full Text Available For unsupervised color image segmentation, we propose a two-stage algorithm, KmsGC, that combines K-means clustering with graph cut. In the first stage, K-means clustering algorithm is applied to make an initial clustering, and the optimal number of clusters is automatically determined by a compactness criterion that is established to find clustering with maximum intercluster distance and minimum intracluster variance. In the second stage, a multiple terminal vertices weighted graph is constructed based on an energy function, and the image is segmented according to a minimum cost multiway cut. A large number of performance evaluations are carried out, and the experimental results indicate the proposed approach is effective compared to other existing image segmentation algorithms on the Berkeley image database.
Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation
Srivastava, Akash; Zou, James; Adams, Ryan P.; Sutton, Charles
2016-01-01
A good clustering can help a data analyst to explore and understand a data set, but what constitutes a good clustering may depend on domain-specific and application-specific criteria. These criteria can be difficult to formalize, even when it is easy for an analyst to know a good clustering when they see one. We present a new approach to interactive clustering for data exploration called TINDER, based on a particularly simple feedback mechanism, in which an analyst can reject a given clusteri...
Bayesian clustering of fuzzy feature vectors using a quasi-likelihood approach.
Marttinen, Pekka; Tang, Jing; De Baets, Bernard; Dawyndt, Peter; Corander, Jukka
2009-01-01
Bayesian model-based classifiers, both unsupervised and supervised, have been studied extensively and their value and versatility have been demonstrated on a wide spectrum of applications within science and engineering. A majority of the classifiers are built on the assumption of intrinsic discreteness of the considered data features or on the discretization of them prior to the modeling. On the other hand, Gaussian mixture classifiers have also been utilized to a large extent for continuous features in the Bayesian framework. Often the primary reason for discretization in the classification context is the simplification of the analytical and numerical properties of the models. However, the discretization can be problematic due to its \\textit{ad hoc} nature and the decreased statistical power to detect the correct classes in the resulting procedure. We introduce an unsupervised classification approach for fuzzy feature vectors that utilizes a discrete model structure while preserving the continuous characteristics of data. This is achieved by replacing the ordinary likelihood by a binomial quasi-likelihood to yield an analytical expression for the posterior probability of a given clustering solution. The resulting model can be justified from an information-theoretic perspective. Our method is shown to yield highly accurate clusterings for challenging synthetic and empirical data sets. PMID:19029547
Recognition of Crowd Behavior from Mobile Sensors with Pattern Analysis and Graph Clustering Methods
Roggen, Daniel; Tröster, Gerhard; Helbing, Dirk; 10.3934/nhm.2011.6.521
2011-01-01
Mobile on-body sensing has distinct advantages for the analysis and understanding of crowd dynamics: sensing is not geographically restricted to a specific instrumented area, mobile phones offer on-body sensing and they are already deployed on a large scale, and the rich sets of sensors they contain allows one to characterize the behavior of users through pattern recognition techniques. In this paper we present a methodological framework for the machine recognition of crowd behavior from on-body sensors, such as those in mobile phones. The recognition of crowd behaviors opens the way to the acquisition of large-scale datasets for the analysis and understanding of crowd dynamics. It has also practical safety applications by providing improved crowd situational awareness in cases of emergency. The framework comprises: behavioral recognition with the user's mobile device, pairwise analyses of the activity relatedness of two users, and graph clustering in order to uncover globally, which users participate in a gi...
Graph Mining Algorithms for Memory Leak Diagnosis and Biological Database Clustering
Maxwell, Evan Kyle
2010-01-01
Large graph-based datasets are common to many applications because of the additional structure provided to data by graphs. Patterns extracted from graphs must adhere to these structural properties, making them a more complex class of patterns to identify. The role of graph mining is to efficiently extract these patterns and quantify their significance. In this thesis, we focus on two application domains and demonstrate the design of graph mining algorithms in these domains. First, we inve...
Bayesian Investigation of Isochrone Consistency Using the Old Open Cluster NGC 188
Hills, Shane; Courteau, Stephane; Geller, Aaron M
2015-01-01
This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color-magnitude diagrams depending on the filters and models used. We examine the consistency and reliability of fitting three widely-used stellar evolution models to fifteen combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of the theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to the present study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities which enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically-favored three band combinations of UBV and VRI can be meaningfully inconsistent with each other and with lo...
Fang, Jun; Zhang, Lizao; Duan, Huiping; Huang, Lei; Li, Hongbin
2016-05-01
The application of sparse representation to SAR/ISAR imaging has attracted much attention over the past few years. This new class of sparse representation based imaging methods present a number of unique advantages over conventional range-Doppler methods, the basic idea behind these works is to formulate SAR/ISAR imaging as a sparse signal recovery problem. In this paper, we propose a new two-dimensional pattern-coupled sparse Bayesian learning(SBL) method to capture the underlying cluster patterns of the ISAR target images. Based on this model, an expectation-maximization (EM) algorithm is developed to infer the maximum a posterior (MAP) estimate of the hyperparameters, along with the posterior distribution of the sparse signal. Experimental results demonstrate that the proposed method is able to achieve a substantial performance improvement over existing algorithms, including the conventional SBL method.
Wagner-Kaiser, R.; Stenning, D. C.; Robinson, E.; von Hippel, T.; Sarajedini, A.; van Dyk, D. A.; Stein, N.; Jefferys, W. H.
2016-07-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival Advanced Camera for Surveys Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from ˜0.05 to 0.11 for these three clusters. Model grids with solar α-element abundances ([α/Fe] = 0.0) and enhanced α-elements ([α/Fe] = 0.4) are adopted.
Wagner-Kaiser, R; Robinson, E; von Hippel, T; Sarajedini, A; van Dyk, D A; Stein, N; Jefferys, W H
2016-01-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from $\\sim$0.05 to 0.11 for these three clusters. Model grids with solar $\\alpha$-element abundances ([$\\alpha$/Fe] =0.0) and enhanced $\\alpha$-elements ([$\\alpha$/Fe]=0.4) are adopted.
Wagner-Kaiser, R.; Stenning, D. C.; Robinson, E.; von Hippel, T.; Sarajedini, A.; van Dyk, D. A.; Stein, N.; Jefferys, W. H.
2016-07-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival Advanced Camera for Surveys Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from ∼0.05 to 0.11 for these three clusters. Model grids with solar α-element abundances ([α/Fe] = 0.0) and enhanced α-elements ([α/Fe] = 0.4) are adopted.
Generating Realistic Labelled, Weighted Random Graphs
Davis, Michael Charles; Liu, Weiru; Miller, Paul; Hunter, Ruth; Kee, Frank
2015-01-01
Generative algorithms for random graphs have yielded insights into the structure and evolution of real-world networks. Most networks exhibit a well-known set of properties, such as heavy-tailed degree distributions, clustering and community formation. Usually, random graph models consider only structural information, but many real-world networks also have labelled vertices and weighted edges. In this paper, we present a generative model for random graphs with discrete vertex labels and numeric edge weights. The weights are represented as a set of Beta Mixture Models (BMMs) with an arbitrary number of mixtures, which are learned from real-world networks. We propose a Bayesian Variational Inference (VI) approach, which yields an accurate estimation while keeping computation times tractable. We compare our approach to state-of-the-art random labelled graph generators and an earlier approach based on Gaussian Mixture Models (GMMs). Our results allow us to draw conclusions about the contribution of vertex labels a...
Bayesian investigation of isochrone consistency using the old open cluster NGC 188
Energy Technology Data Exchange (ETDEWEB)
Hills, Shane; Courteau, Stéphane [Department of Physics, Engineering Physics and Astronomy, Queen’s University, Kingston, ON K7L 3N6 Canada (Canada); Von Hippel, Ted [Department of Physical Sciences, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114 (United States); Geller, Aaron M., E-mail: shane.hills@queensu.ca, E-mail: courteau@astro.queensu.ca, E-mail: ted.vonhippel@erau.edu, E-mail: a-geller@northwestern.edu [Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States)
2015-03-01
This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color–magnitude diagrams (CMDs) depending on the filters and models used. We examine the consistency and reliability of fitting three widely used stellar evolution models to 15 combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to this study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities that enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically favored three-band combinations of UBV and VRI can be meaningfully inconsistent with each other and with longer baseline data sets such as UBVRIJHK{sub S}. Differences among model sets can also be substantial. For instance, fitting Yi et al. (2001) and Dotter et al. (2008) models to UBVRIJHK{sub S} photometry for NGC 188 yields the following cluster parameters: age = (5.78 ± 0.03, 6.45 ± 0.04) Gyr, [Fe/H] = (+0.125 ± 0.003, −0.077 ± 0.003) dex, (m−M){sub V} = (11.441 ± 0.007, 11.525 ± 0.005) mag, and A{sub V} = (0.162 ± 0.003, 0.236 ± 0.003) mag, respectively. Within the formal fitting errors, these two fits are substantially and statistically different. Such differences among fits using different filters and models are a cautionary tale regarding our current ability to fit star cluster CMDs. Additional modeling of this kind, with more models and star clusters, and future Gaia parallaxes are critical for isolating and quantifying the most relevant uncertainties in stellar evolutionary models.
Bayesian investigation of isochrone consistency using the old open cluster NGC 188
International Nuclear Information System (INIS)
This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color–magnitude diagrams (CMDs) depending on the filters and models used. We examine the consistency and reliability of fitting three widely used stellar evolution models to 15 combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to this study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities that enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically favored three-band combinations of UBV and VRI can be meaningfully inconsistent with each other and with longer baseline data sets such as UBVRIJHKS. Differences among model sets can also be substantial. For instance, fitting Yi et al. (2001) and Dotter et al. (2008) models to UBVRIJHKS photometry for NGC 188 yields the following cluster parameters: age = (5.78 ± 0.03, 6.45 ± 0.04) Gyr, [Fe/H] = (+0.125 ± 0.003, −0.077 ± 0.003) dex, (m−M)V = (11.441 ± 0.007, 11.525 ± 0.005) mag, and AV = (0.162 ± 0.003, 0.236 ± 0.003) mag, respectively. Within the formal fitting errors, these two fits are substantially and statistically different. Such differences among fits using different filters and models are a cautionary tale regarding our current ability to fit star cluster CMDs. Additional modeling of this kind, with more models and star clusters, and future Gaia parallaxes are critical for isolating and quantifying the most relevant uncertainties in stellar evolutionary models.
Kayser, K; Sandau, K; Paul, J; Weisse, G
1992-02-01
An approach based on graph theory is described for detecting clusters of cells in tissue specimens (two-dimensional space). With a set of discrete basic elements (cell nuclei) having several measurable features (area, surface, main and minor axis of best-fitting ellipses) a graph is defined as having attributes associated with edges. Different minimum spanning trees (MSTs) can be constructed using different weight functions on the attributes (attributed MST). Analysis of the MST and of an attributed MST by use of a decomposition function allows detection of image areas with similar local properties. These clusters, which are then clusters of the tree, describe, for example, partial growth in different directions in a case of a human fibrosarcoma assuming that tumour cell nuclei are homogeneous with respect to their configuration and size. The model allows the separation of clusters of tumour cells growing in different directions and the approximation of the different growth angles. This decomposition also allows us to create new (higher) orders of structure (cluster tree). PMID:1564724
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes and...... largely due to the availability of efficient inference algorithms for answering probabilistic queries about the states of the variables in the network. Furthermore, to support the construction of Bayesian network models, learning algorithms are also available. We give an overview of the Bayesian network...
Optimal Clustering in Graphs with Weighted Edges: A Unified Approach to the Threshold Problem.
Goetschel, Roy; Voxman, William
1987-01-01
Relations on a finite set V are viewed as weighted graphs. Using the language of graph theory, two methods of partitioning V are examined: selecting threshold values and applying them to a maximal weighted spanning forest, and using a parametric linear program to obtain a most adhesive partition. (Author/EM)
Czech Academy of Sciences Publication Activity Database
Valečková, Markéta; Kárný, Miroslav; Sutanto, E. L.
2001-01-01
Roč. 37, č. 6 (2001), s. 1071-1078. ISSN 0005-1098 R&D Projects: GA ČR GA102/99/1564 Grant ostatní: IST(XE) 1999/12058 Institutional research plan: AV0Z1075907 Keywords : Markov chain * clustering * Bayesian mixture estimation Subject RIV: BC - Control Systems Theory Impact factor: 1.449, year: 2001
Zhang, Zhen; Lim, Chae Young; Maiti, Tapabrata; Kato, Seiji
2016-01-01
In climate change study, the infrared spectral signatures of climate change have recently been conceptually adopted, and widely applied to identifying and attributing atmospheric composition change. We propose a Bayesian hierarchical model for spatial clustering of the high-dimensional functional data based on the effects of functional covariates and local features. We couple the functional mixed-effects model with a generalized spatial partitioning method for: (1) producing spatially contigu...
Review on Graph Clustering and Subgraph Similarity Based Analysis of Neurological Disorders
Directory of Open Access Journals (Sweden)
Jaya Thomas
2016-06-01
Full Text Available How can complex relationships among molecular or clinico-pathological entities of neurological disorders be represented and analyzed? Graphs seem to be the current answer to the question no matter the type of information: molecular data, brain images or neural signals. We review a wide spectrum of graph representation and graph analysis methods and their application in the study of both the genomic level and the phenotypic level of the neurological disorder. We find numerous research works that create, process and analyze graphs formed from one or a few data types to gain an understanding of specific aspects of the neurological disorders. Furthermore, with the increasing number of data of various types becoming available for neurological disorders, we find that integrative analysis approaches that combine several types of data are being recognized as a way to gain a global understanding of the diseases. Although there are still not many integrative analyses of graphs due to the complexity in analysis, multi-layer graph analysis is a promising framework that can incorporate various data types. We describe and discuss the benefits of the multi-layer graph framework for studies of neurological disease.
Review on Graph Clustering and Subgraph Similarity Based Analysis of Neurological Disorders.
Thomas, Jaya; Seo, Dongmin; Sael, Lee
2016-01-01
How can complex relationships among molecular or clinico-pathological entities of neurological disorders be represented and analyzed? Graphs seem to be the current answer to the question no matter the type of information: molecular data, brain images or neural signals. We review a wide spectrum of graph representation and graph analysis methods and their application in the study of both the genomic level and the phenotypic level of the neurological disorder. We find numerous research works that create, process and analyze graphs formed from one or a few data types to gain an understanding of specific aspects of the neurological disorders. Furthermore, with the increasing number of data of various types becoming available for neurological disorders, we find that integrative analysis approaches that combine several types of data are being recognized as a way to gain a global understanding of the diseases. Although there are still not many integrative analyses of graphs due to the complexity in analysis, multi-layer graph analysis is a promising framework that can incorporate various data types. We describe and discuss the benefits of the multi-layer graph framework for studies of neurological disease. PMID:27258269
Directory of Open Access Journals (Sweden)
Ali Dashti
Full Text Available This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU. The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
Clusters and Maps of Science Journals Based on Bi-connected Graphs in the Journal Citation Reports
Leydesdorff, Loet
2009-01-01
The aggregated journal-journal citation matrix derived from the Journal Citation Reports 2001 can be decomposed into a unique subject classification by using the graph-analytical algorithm of bi-connected components. This technique was recently incorporated in software tools for social network analysis. The matrix can be assessed in terms of its decomposability using articulation points which indicate overlap between the components. The articulation points of this set did not exhibit a next-order network of 'general science' journals. However, the clusters differ in size and in terms of the internal density of their relations. A full classification of the journals is provided in an Appendix. The clusters can also be extracted and mapped for the visualization.
Goetschel, Roy, Jr.
1987-01-01
Multivalent relations, inferred as relationships with an added dimension of discernment, are realized as weighted graphs with multivalued edges. A unified treatment of the threshold problem is discussed and a reliability measure is produced to judge various partitions. (Author/EM)
A BIPARTITE GRAPH PARTITION AND LINK BASED APPROACH FOR SOLVING CATEGORICAL DATA CLUSTERING
E. Dhivyalakshmi; Ramakrishnan, R; K. Umapathy
2013-01-01
Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. The project presents an analysis that suggests this problem degrades the quality of the clustering result, and it presents a new link-based approach, which improves the conventional matrix by discovering unknown entries through similarity between clusters in an ensemble. A critical problem in cluster ensemble research is how to combine multiple c...
Leal, Wilmer; Llanos, Eugenio J.; RESTREPO Guillermo; Carlos F Suárez; Patarroyo, Manuel Elkin
2016-01-01
Background Hierarchical cluster analysis (HCA) is a widely used classificatory technique in many areas of scientific knowledge. Applications usually yield a dendrogram from an HCA run over a given data set, using a grouping algorithm and a similarity measure. However, even when such parameters are fixed, ties in proximity (i.e. two equidistant clusters from a third one) may produce several different dendrograms, having different possible clustering patterns (different classifications). This s...
Jelil, Mahmutjan; Abaydulla, Alimjan
2015-07-28
A graph theoretical procedure to generate all the possible topology-distinct structures for hydrogen fluoride (HF) clusters is presented in this work. The hydrogen bond matrix is defined and used to enumerate the topology-distinct structures of hydrogen fluoride (HF)n (n = 2-8) clusters. From close investigation of the structural patterns obtained, several restrictions that should be satisfied for a structure of the HF clusters to be stable are found. The corresponding digraphs of generated hydrogen bond matrices are used as the theoretical framework to obtain all the topology-distinct local minima for (HF)n (n ≤ 6), at the level of MP2/6-31G**(d, p) of ab initio MO method and B3LYP/6-31G**(d, p) of density functional theory method. For HF clusters up to tetramers, the local minimum structures that we generated are same as those in the literature. For HF pentamers and hexamers, we found some new local minima structures which had not been obtained previously. PMID:26233123
Jelil, Mahmutjan; Abaydulla, Alimjan
2015-07-01
A graph theoretical procedure to generate all the possible topology-distinct structures for hydrogen fluoride (HF) clusters is presented in this work. The hydrogen bond matrix is defined and used to enumerate the topology-distinct structures of hydrogen fluoride (HF)n (n = 2-8) clusters. From close investigation of the structural patterns obtained, several restrictions that should be satisfied for a structure of the HF clusters to be stable are found. The corresponding digraphs of generated hydrogen bond matrices are used as the theoretical framework to obtain all the topology-distinct local minima for (HF)n (n ≤ 6), at the level of MP2/6-31G**(d, p) of ab initio MO method and B3LYP/6-31G**(d, p) of density functional theory method. For HF clusters up to tetramers, the local minimum structures that we generated are same as those in the literature. For HF pentamers and hexamers, we found some new local minima structures which had not been obtained previously.
Graph based Approach and Clustering of Patterns (GACP) for Sequential Pattern Mining
Ashish Patel ,; Amisha Patel
2011-01-01
The sequential pattern mining generates the sequential patterns. It can be used as the input of another program for retrieving the information from the large collection of data. It requires a largeamount of memory as well as numerous I/O operations. Multistage operations reduce the efficiency of the algorithm. The given GACP is based on graph representation and avoids recursively reconstructingintermediate trees during the mining process. The algorithm also eliminates the need of repeatedly s...
Graph-Based Approaches to Clustering Network-Constrained Trajectory Data
EL MAHRSI, Mohamed Khalil; Rossi, Fabrice
2013-01-01
Even though clustering trajectory data attracted considerable attention in the last few years, most of prior work assumed that moving objects can move freely in an euclidean space and did not consider the eventual presence of an underlying road network and its influence on evaluating the similarity between trajectories. In this paper, we present two approaches to clustering network-constrained trajectory data. The first approach discovers clusters of trajectories that traveled along the same ...
Regularity in Vague Intersection Graphs and Vague Line Graphs
Muhammad Akram; Wieslaw A. Dudek; M. Murtaza Yousaf
2014-01-01
Fuzzy graph theory is commonly used in computer science applications, particularly in database theory, data mining, neural networks, expert systems, cluster analysis, control theory, and image capturing. A vague graph is a generalized structure of a fuzzy graph that gives more precision, flexibility, and compatibility to a system when compared with systems that are designed using fuzzy graphs. In this paper, we introduce the notion of vague line graphs, and certain types of vague line graphs ...
Graph Construction for Learning with Unbalanced Data
Qian, Jing; Saligrama, Venkatesh; Zhao, Manqi
2011-01-01
Unbalanced data arises in many learning tasks such as clustering of multi-class data, hierarchical divisive clustering and semisupervised learning. Graph-based approaches are popular tools for these problems. Graph construction is an important aspect of graph-based learning. We show that graph-based algorithms can fail for unbalanced data for many popular graphs such as k-NN, \\epsilon-neighborhood and full-RBF graphs. We propose a novel graph construction technique that encodes global statist...
Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
Trivedi, Shubhendu; Pardos, Zachary A.; Sarkozy, Gabor N.; Heffernan, Neil T.
2012-01-01
Learning a more distributed representation of the input feature space is a powerful method to boost the performance of a given predictor. Often this is accomplished by partitioning the data into homogeneous groups by clustering so that separate models could be trained on each cluster. Intuitively each such predictor is a better representative of…
The Study About the Analysis of Responsiveness Pair Clustering Tosocial Network Bipartite Graph
Akira Otsuki,; Masayoshi Kawamura
2013-01-01
In this study, regional (cities, towns and villages) data and tweet data are obtained from Twitter, andextract information of "purchase information (Whereand what bought)" from the tweet data bymorphological analysis and rule-based dependency analysis. Then, the "The regional information" and the"Theinformation of purchase history (Where and what bought information)" are captured as bipartitegraph, and Responsiveness Pair Clustering analysis(a clustering using correspondence analysis assimila...
Sun, Xun; Lall, Upmanu; Merz, Bruno; Dung, Nguyen Viet
2015-08-01
Especially for extreme precipitation or floods, there is considerable spatial and temporal variability in long term trends or in the response of station time series to large-scale climate indices. Consequently, identifying trends or sensitivity of these extremes to climate parameters can be marked by high uncertainty. When one develops a nonstationary frequency analysis model, a key step is the identification of potential trends or effects of climate indices on the station series. An automatic clustering procedure that effectively pools stations where there are similar responses is desirable to reduce the estimation variance, thus improving the identification of trends or responses, and accounting for spatial dependence. This paper presents a new hierarchical Bayesian approach for exploring homogeneity of response in large area data sets, through a multicomponent mixture model. The approach allows the reduction of uncertainties through both full pooling and partial pooling of stations across automatically chosen subsets of the data. We apply the model to study the trends in annual maximum daily stream flow at 68 gauges over Germany. The effects of changing the number of clusters and the parameters used for clustering are demonstrated. The results show that there are large, mainly upward trends in the gauges of the River Rhine Basin in Western Germany and along the main stream of the Danube River in the south, while there are also some small upward trends at gauges in Central and Northern Germany.
Generating Realistic Labelled, Weighted Random Graphs
Directory of Open Access Journals (Sweden)
Michael Charles Davis
2015-12-01
Full Text Available Generative algorithms for random graphs have yielded insights into the structure and evolution of real-world networks. Most networks exhibit a well-known set of properties, such as heavy-tailed degree distributions, clustering and community formation. Usually, random graph models consider only structural information, but many real-world networks also have labelled vertices and weighted edges. In this paper, we present a generative model for random graphs with discrete vertex labels and numeric edge weights. The weights are represented as a set of Beta Mixture Models (BMMs with an arbitrary number of mixtures, which are learned from real-world networks. We propose a Bayesian Variational Inference (VI approach, which yields an accurate estimation while keeping computation times tractable. We compare our approach to state-of-the-art random labelled graph generators and an earlier approach based on Gaussian Mixture Models (GMMs. Our results allow us to draw conclusions about the contribution of vertex labels and edge weights to graph structure.
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. PMID:24650600
Directory of Open Access Journals (Sweden)
Jing Chen
2015-06-01
Full Text Available This study takes the concept of food logistics distribution as the breakthrough point, by means of the aim of optimization of food logistics distribution routes and analysis of the optimization model of food logistics route, as well as the interpretation of the genetic algorithm, it discusses the optimization of food logistics distribution route based on genetic and cluster scheme algorithm.
A Latent Variable Bayesian Approach to Spatial Clustering with Background Noise
Kayabol, K.
2011-01-01
We propose a finite mixture model for clustering of the spatial data patterns. The model is based on the spatial distances between the data locations in such a way that both the distances of the points to the cluster centers and the distances of a given point to its neighbors within a defined window
Bayesian History Reconstruction of Complex Human Gene Clusters on a Phylogeny
Vinař, Tomáš; Song, Giltae; Siepel, Adam
2009-01-01
Clusters of genes that have evolved by repeated segmental duplication present difficult challenges throughout genomic analysis, from sequence assembly to functional analysis. Improved understanding of these clusters is of utmost importance, since they have been shown to be the source of evolutionary innovation, and have been linked to multiple diseases, including HIV and a variety of cancers. Previously, Zhang et al. (2008) developed an algorithm for reconstructing parsimonious evolutionary histories of such gene clusters, using only human genomic sequence data. In this paper, we propose a probabilistic model for the evolution of gene clusters on a phylogeny, and an MCMC algorithm for reconstruction of duplication histories from genomic sequences in multiple species. Several projects are underway to obtain high quality BAC-based assemblies of duplicated clusters in multiple species, and we anticipate that our method will be useful in analyzing these valuable new data sets.
Shuai, Hong-Han; Yu, Philip S; Shen, Chih-Ya; Chen, Ming-Syan
2013-01-01
The importance of graph mining has been widely recognized thanks to a large variety of applications in many areas, while real datasets always play important roles to examine the solution quality and efficiency of a graph mining algorithm. Nevertheless, the size of a real dataset is usually fixed and constrained according to the available resources, such as the efforts to crawl an on-line social network. In this case, employing a synthetic graph generator is a possible way to generate a massive graph (e.g., billions nodes) for evaluating the scalability of an algorithm, and current popular statistical graph generators are properly designed to maintain statistical metrics such as total node degree, degree distribution, diameter, and clustering coefficient of the original social graphs. Nevertheless, in addition to the above metrics, recent studies on graph mining point out that graph frequent patterns are also important to provide useful implications for the corresponding social networking applications, but thi...
A Clustering Method of Highly Dimensional Patent Data Using Bayesian Approach
Sunghae Jun
2012-01-01
Patent data have diversely technological information of any technology field. So, many companies have managed the patent data to build their RD policy. Patent analysis is an approach to the patent management. Also, patent analysis is an important tool for technology forecasting. Patent clustering is one of the works for patent analysis. In this paper, we propose an efficient clustering method of patent documents. Generally, patent data are consisted of text document. The patent documents have...
Clique graphs and overlapping communities
International Nuclear Information System (INIS)
It is shown how to construct a clique graph in which properties of cliques of a fixed order in a given graph are represented by vertices in a weighted graph. Various definitions and motivations for these weights are given. The detection of communities or clusters is used to illustrate how a clique graph may be exploited. In particular a benchmark network is shown where clique graphs find the overlapping communities accurately while vertex partition methods fail
Clique graphs and overlapping communities
Evans, T. S.
2010-12-01
It is shown how to construct a clique graph in which properties of cliques of a fixed order in a given graph are represented by vertices in a weighted graph. Various definitions and motivations for these weights are given. The detection of communities or clusters is used to illustrate how a clique graph may be exploited. In particular a benchmark network is shown where clique graphs find the overlapping communities accurately while vertex partition methods fail.
Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data
Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.
2016-01-01
We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires much fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872
Graph Embedding for Pattern Analysis
Ma, Yunqian
2013-01-01
Graph Embedding for Pattern Analysis covers theory methods, computation, and applications widely used in statistics, machine learning, image processing, and computer vision. This book presents the latest advances in graph embedding theories, such as nonlinear manifold graph, linearization method, graph based subspace analysis, L1 graph, hypergraph, undirected graph, and graph in vector spaces. Real-world applications of these theories are spanned broadly in dimensionality reduction, subspace learning, manifold learning, clustering, classification, and feature selection. A selective group of experts contribute to different chapters of this book which provides a comprehensive perspective of this field.
Quinn, Christopher; Coleman, Todd P
2012-01-01
We propose two graphical models to represent a concise description of the causal statistical dependence structure between a group of coupled stochastic processes. The first, minimum generative model graphs, is motivated by generative models. The second, directed information graphs, is motivated by Granger causality. We show that under mild assumptions, the graphs are identical. In fact, these are analogous to Bayesian and Markov networks respectively, in terms of Markov blankets and I-map properties. Furthermore, the underlying variable dependence structure is the unique causal Bayesian network. Lastly, we present a method using minimal-dimension statistics to identify the structure when upper bounds on the in-degrees are known. Simulations show the effectiveness of the approach.
Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.
Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka
2014-02-01
In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping needs typically to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM-algorithm with a greedy search. This is demonstrated to be faster and yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain. PMID:24246289
Aldecoa, Rodrigo; Orsini, Chiara; Krioukov, Dmitri
2015-11-01
Networks representing many complex systems in nature and society share some common structural properties like heterogeneous degree distributions and strong clustering. Recent research on network geometry has shown that those real networks can be adequately modeled as random geometric graphs in hyperbolic spaces. In this paper, we present a computer program to generate such graphs. Besides real-world-like networks, the program can generate random graphs from other well-known graph ensembles, such as the soft configuration model, random geometric graphs on a circle, or Erdős-Rényi random graphs. The simulations show a good match between the expected values of different network structural properties and the corresponding empirical values measured in generated graphs, confirming the accurate behavior of the program.
Bayesian networks and food security - An introduction
Stein, A.
2004-01-01
This paper gives an introduction to Bayesian networks. Networks are defined and put into a Bayesian context. Directed acyclical graphs play a crucial role here. Two simple examples from food security are addressed. Possible uses of Bayesian networks for implementation and further use in decision sup
Directory of Open Access Journals (Sweden)
Arturo Medrano-Soto
2004-12-01
Full Text Available Based on mixture models, we present a Bayesian method (called BClass to classify biological entities (e.g. genes when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability of each entry to belong to each element (group in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of each entry to belong to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked plots and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.
Directory of Open Access Journals (Sweden)
Urbi Garay
2016-03-01
Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.
Learning Probabilistic Decision Graphs
DEFF Research Database (Denmark)
Jaeger, Manfred; Dalgaard, Jens; Silander, Tomi
2004-01-01
Probabilistic decision graphs (PDGs) are a representation language for probability distributions based on binary decision diagrams. PDGs can encode (context-specific) independence relations that cannot be captured in a Bayesian network structure, and can sometimes provide computationally more...... efficient representations than Bayesian networks. In this paper we present an algorithm for learning PDGs from data. First experiments show that the algorithm is capable of learning optimal PDG representations in some cases, and that the computational efficiency of PDG models learned from real-life data...
DEFF Research Database (Denmark)
Hartnell, B.L.; Vestergaard, Preben Dahl
graph that remains can still be decomposed (such graphs are called or ). In this paper we consider the follwing variation. Given a fixed graph H, determine which graphs (call them ) have the property that every edge disjoint packing with H is maximum. In the case that the graph H is isomorphic to the...
S-PowerGraph: Streaming Graph Partitioning for Natural Graphs by Vertex-Cut
Xie, Cong; Li, Wu-Jun; Zhang, Zhihua
2015-01-01
One standard solution for analyzing large natural graphs is to adopt distributed computation on clusters. In distributed computation, graph partitioning (GP) methods assign the vertices or edges of a graph to different machines in a balanced way so that some distributed algorithms can be adapted for. Most of traditional GP methods are offline, which means that the whole graph has been observed before partitioning. However, the offline methods often incur high computation cost. Hence, streamin...
Higher-order graph wavelets and sparsity on circulant graphs
Kotzagiannidis, Madeleine S.; Dragotti, Pier Luigi
2015-08-01
The notion of a graph wavelet gives rise to more advanced processing of data on graphs due to its ability to operate in a localized manner, across newly arising data-dependency structures, with respect to the graph signal and underlying graph structure, thereby taking into consideration the inherent geometry of the data. In this work, we tackle the problem of creating graph wavelet filterbanks on circulant graphs for a sparse representation of certain classes of graph signals. The underlying graph can hereby be data-driven as well as fixed, for applications including image processing and social network theory, whereby clusters can be modelled as circulant graphs, respectively. We present a set of novel graph wavelet filter-bank constructions, which annihilate higher-order polynomial graph signals (up to a border effect) defined on the vertices of undirected, circulant graphs, and are localised in the vertex domain. We give preliminary results on their performance for non-linear graph signal approximation and denoising. Furthermore, we provide extensions to our previously developed segmentation-inspired graph wavelet framework for non-linear image approximation, by incorporating notions of smoothness and vanishing moments, which further improve performance compared to traditional methods.
Random intersection graph process
Bloznelis, Mindaugas; Karonski, Michal
2013-01-01
We introduce a random intersection graph process aimed at modeling sparse evolving affiliation networks that admit tunable (power law) degree distribution and assortativity and clustering coefficients. We show the asymptotic degree distribution and provide explicit asymptotic formulas for assortativity and clustering coefficients.
Hendrix, William; Jenkins, John; Padmanabhan, Kanchana; Chakraborty, Arpan
2014-01-01
Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs. Hands-On Application of Graph Data Mining Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks. De...
DEFF Research Database (Denmark)
Vestergaard, Preben Dahl; Hartnell, Bert L.
2006-01-01
graph that remains can still be decomposed (such graphs are called randomly packable or randomly decomposable). In this paper we consider the following variation. Given a fixed graph H, determine which graphs (call them equipackable) have the property that every maximal edge disjoint packing with H is...
Current trends in Bayesian methodology with applications
Upadhyay, Satyanshu K; Dey, Dipak K; Loganathan, Appaia
2015-01-01
Collecting Bayesian material scattered throughout the literature, Current Trends in Bayesian Methodology with Applications examines the latest methodological and applied aspects of Bayesian statistics. The book covers biostatistics, econometrics, reliability and risk analysis, spatial statistics, image analysis, shape analysis, Bayesian computation, clustering, uncertainty assessment, high-energy astrophysics, neural networking, fuzzy information, objective Bayesian methodologies, empirical Bayes methods, small area estimation, and many more topics.Each chapter is self-contained and focuses on
Managing and Mining Graph Data
Aggarwal, Charu C
2010-01-01
Managing and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, chemical and biological data. The chapters are written by well known researchers in the field, and provide a broad perspective of the area. This is the first comprehensive survey book in t
GRAPH DATABASES AND GRAPH VIZUALIZATION
Klančar, Jure
2013-01-01
The thesis presents graph databases. Graph databases are a part of NoSQL databases, which is why this thesis presents basics of NoSQL databases as well. We have focused on advantages of graph databases compared to rela- tional databases. We have used one of native graph databases (Neo4j), to present more detailed processing of graph databases. To get more acquainted with graph databases and its principles, we developed a simple application that uses a Neo4j graph database to...
Revisiting k-means: New Algorithms via Bayesian Nonparametrics
Kulis, Brian; Jordan, Michael I.
2011-01-01
Bayesian models offer great flexibility for clustering applications---Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Bayesian nonparametric viewpoint. Inspired by the asymptotic connection between k-means and mixtures...
Parameterized Complexity Results for Exact Bayesian Network Structure Learning
Sebastian Ordyniak; Stefan Szeider
2014-01-01
Bayesian network structure learning is the notoriously difficult problem of discovering a Bayesian network that optimally represents a given set of training data. In this paper we study the computational worst-case complexity of exact Bayesian network structure learning under graph theoretic restrictions on the (directed) super-structure. The super-structure is an undirected graph that contains as subgraphs the skeletons of solution networks. We introduce the directed super-structure as a nat...
Algorithms and Complexity Results for Exact Bayesian Structure Learning
Sebastian Ordyniak; Stefan Szeider
2012-01-01
Bayesian structure learning is the NP-hard problem of discovering a Bayesian network that optimally represents a given set of training data. In this paper we study the computational worst-case complexity of exact Bayesian structure learning under graph theoretic restrictions on the super-structure. The super-structure (a concept introduced by Perrier, Imoto, and Miyano, JMLR 2008) is an undirected graph that contains as subgraphs the skeletons of solution networks. Our results apply to severa...
Ramon, Jan
2013-01-01
Graph mining is the study of how to perform data mining and machine learning on data represented with graphs. One can distinguish between, on the one hand, transactional graph mining, where a database of separate, independent graphs is considered (such as databases of molecules and databases of images), and, on the other hand, large network analysis, where a single large network is considered (such as chemical interaction networks and concept networks).
Application of Fuzzy Graph in Cluster Analys is of Species%模糊图在生物群落聚类中的应用
Institute of Scientific and Technical Information of China (English)
赵学靖; 苏锦霞; 杨凤翔
2001-01-01
基于模糊图的聚类方法，对Tanganyika湖的生物群落作了一动态聚类分析．在考虑捕食及被捕食强度基础上，作出如下假设：若某些种群捕食食饵的强度和被天敌捕食的强度相同，则这些种群可视为同一种群，从而对原始种群作一初步分类．基于捕食关系，在定义了相似系数及相似矩阵的基础上研究了种群的最大树关系图及动态聚类．%Based on the method of c luster analysis in fuzzy graph, adynamic cluster on Lake Tanganyika is brought in. Thinking of the strength of predator, the following supposition is made: so me species can be looked upon as the same species if the strength is equal. So t he original community can be clustered into some species classes. Also, the coef ficient and the matrix of analogy are induced, and the maximum tree and the dyn amic cluster based on the ration of species are researched.
Fotuhi Siahpirani, Alireza; Ay, Ferhat; Roy, Sushmita
2016-01-01
Chromosome conformation capture methods are being increasingly used to study three-dimensional genome architecture in multiple cell types and species. An important challenge is to examine changes in three-dimensional architecture across cell types and species. We present Arboretum-Hi-C, a multi-task spectral clustering method, to identify common and context-specific aspects of genome architecture. Compared to standard clustering, Arboretum-Hi-C produced more biologically consistent patterns of conservation. Most clusters are conserved and enriched for either high- or low-activity genomic signals. Most genomic regions diverge between clusters with similar chromatin state except for a few that are associated with lamina-associated domains and open chromatin. PMID:27233632
FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs
Zheng, Da; Mhembere, Disa; Burns, Randal; Vogelstein, Joshua; Carey E. Priebe; Szalay, Alexander S.
2014-01-01
Graph analysis performs many random reads and writes, thus, these workloads are typically performed in memory. Traditionally, analyzing large graphs requires a cluster of machines so the aggregate memory exceeds the graph size. We demonstrate that a multicore server can process graphs with billions of vertices and hundreds of billions of edges, utilizing commodity SSDs with minimal performance loss. We do so by implementing a graph-processing engine on top of a user-space SSD file system desi...
Fortunato, Santo
2010-02-01
The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i.e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Such clusters, or communities, can be considered as fairly independent compartments of a graph, playing a similar role like, e.g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We will attempt a thorough exposition of the topic, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.
Sahasranand, K R
2010-01-01
Almost all known secret sharing schemes work on numbers. Such methods will have difficulty in sharing graphs since the number of graphs increases exponentially with the number of nodes. We propose a secret sharing scheme for graphs where we use graph intersection for reconstructing the secret which is hidden as a sub graph in the shares. Our method does not rely on heavy computational operations such as modular arithmetic or polynomial interpolation but makes use of very basic operations like assignment and checking for equality, and graph intersection can also be performed visually. In certain cases, the secret could be reconstructed using just pencil and paper by authorised parties but cannot be broken by an adversary even with unbounded computational power. The method achieves perfect secrecy for (2, n) scheme and requires far fewer operations compared to Shamir's algorithm. The proposed method could be used to share objects such as matrices, sets, plain text and even a heterogeneous collection of these. S...
Vishwanathan, S V N; Kondor, Imre Risi; Schraudolph, Nicol N
2008-01-01
We present a unified framework to study graph kernels, special cases of which include the random walk graph kernel \\citep{GaeFlaWro03,BorOngSchVisetal05}, marginalized graph kernel \\citep{KasTsuIno03,KasTsuIno04,MahUedAkuPeretal04}, and geometric kernel on graphs \\citep{Gaertner02}. Through extensions of linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) and reduction to a Sylvester equation, we construct an algorithm that improves the time complexity of kernel computation from $O(n^6)$ to $O(n^3)$. When the graphs are sparse, conjugate gradient solvers or fixed-point iterations bring our algorithm into the sub-cubic domain. Experiments on graphs from bioinformatics and other application domains show that it is often more than a thousand times faster than previous approaches. We then explore connections between diffusion kernels \\citep{KonLaf02}, regularization on graphs \\citep{SmoKon03}, and graph kernels, and use these connections to propose new graph kernels. Finally, we show that rational kernels ...
一种基于非参数贝叶斯模型的聚类算法%Data Clustering via Nonparametric Bayesian Models
Institute of Scientific and Technical Information of China (English)
张媛媛
2013-01-01
鉴于聚类分析是机器学习和数据挖掘领域的一项重要技术，并且与监督学习不同的是聚类分析中没有类别或标签的指导信息，所以如何选择合适的聚类个数(即模型选择)一直是聚类分析中的难点。由此提出了一种基于Dirichlet过程混合模型的聚类算法，并用collapsed Gibbs采样算法对混合模型的参数进行估计。新算法基于非参数贝叶斯模型的框架，能够在不断的采样过程中优化模型参数并形成合适的聚类个数。在人工合成数据集和真实数据集上的聚类实验结果表明：基于Dirichlet过程混合模型的聚类算法不但能够自动确定聚类个数，而且具有较强灵活性和鲁棒性。%Clustering is one of the most useful techniques in machine learning and data mining. In cluster analysis, model selection concerning how to determine the number of clusters is an important issue. Unlike supervised learning, there are no class labels and criteria to guide the search, so the model for clustering is always difficult to select. To tackle this problem, we present the concept of nonparametric clustering approach based on Dirichlet process mixture model (DPMM), and apply a collapsed Gibbs sampling technique to sample the posterior distribution. The proposed clustering algorithm follows the Bayesian nonparametric framework and can optimize the number of components and the parameters of the model. The experimental result of clustering shows that this Bayes model has promising properties and robust performance.
Directory of Open Access Journals (Sweden)
Peixin Zhao
2014-01-01
Full Text Available This paper suggests a novel clustering method for analyzing the National Incident-Based Reporting System (NIBRS data, which include the determination of correlation of different crime types, the development of a likelihood index for crimes to occur in a jurisdiction, and the clustering of jurisdictions based on crime type. The method was tested by using the 2005 assault data from 121 jurisdictions in Virginia as a test case. The analyses of these data show that some different crime types are correlated and some different crime parameters are correlated with different crime types. The analyses also show that certain jurisdictions within Virginia share certain crime patterns. This information assists with constructing a pattern for a specific crime type and can be used to determine whether a jurisdiction may be more likely to see this type of crime occur in their area.
Spectral Convergence Rate of Graph Laplacian
Wang, Xu
2015-01-01
Laplacian Eigenvectors of the graph constructed from a data set are used in many spectral manifold learning algorithms such as diffusion maps and spectral clustering. Given a graph constructed from a random sample of a $d$-dimensional compact submanifold $M$ in $\\mathbb{R}^D$, we establish the spectral convergence rate of the graph Laplacian. It implies the consistency of the spectral clustering algorithm via a standard perturbation argument. A simple numerical study indicates the necessity o...
Graph Mining Sub Domains and a Framework for Indexing – A Graphical Approach
K.Vivekanandan; A. Pankaj Moses Monickaraj; D. Ramya Chithra
2013-01-01
Graphs are one of the popular models for effective representation of complex structured huge data and the similarity search for graphs has become a fundamental research problem in Graph Mining. In this paper initially, the preliminary graph related basic theorems are brushed and showcased on with various research sub domains such as Graph Classification, Graph Searching, Graph Indexing, and Graph Clustering. These are discussed with few of the most dominant algorithms in their respective sub ...
Görke, Robert; Meyer-Bäse, Anke; Plant, Claudia; He, Huan; Emmett, Mark R.; Nilsson, Carol; Colman, Howard; Conrad, Charles A.
2011-06-01
Cancer stem cells (CSC) represent a very small percentage of the total tumor population however they pose a big challenge in treating cancer. Glycans play a key role in cancer therapeutics since overexpression of them depending on the glycan type can lead either to cell death or more invasive metastasis. Two major components, fetal bovine serum (FBS) and STAT3, are known to up- or down-regulate certain glycolipid or phospholipid compositions found in glioblastoma CSCs. The analysis and the understanding of the global interactional behavior of lipidomic networks remains a challenging task and can not be accomplished solely based on intuitive reasoning. The present contribution aims at applying graph clustering networks to analyze the functional aspects of certain activators or inhibitors at the molecular level in glioblastoma stem cells (GSCs). This research enhances our understanding of the differences in phenotype changes and determining the responses of glycans to certain treatments for the aggressive GSCs, and represents together with a quantitative phosphoproteomic study1 the most detailed systems biology study of GSCs differentiation known so far. Thus, these new paradigms are providing unique understanding of the mechanisms involved in GSC maintenance and tumorigenicity and are thus opening a new window to biomedical frontiers.
Vishwanathan, S. V. N.; Borgwardt, Karsten M.; Kondor, Imre Risi; Schraudolph, Nicol N.
2010-01-01
We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3). We find a spectral decomposition approach even more efficient when computing entire kernel matric...
Energy Technology Data Exchange (ETDEWEB)
Sanfilippo, Antonio P.
2005-12-27
Graph theory is a branch of discrete combinatorial mathematics that studies the properties of graphs. The theory was pioneered by the Swiss mathematician Leonhard Euler in the 18th century, commenced its formal development during the second half of the 19th century, and has witnessed substantial growth during the last seventy years, with applications in areas as diverse as engineering, computer science, physics, sociology, chemistry and biology. Graph theory has also had a strong impact in computational linguistics by providing the foundations for the theory of features structures that has emerged as one of the most widely used frameworks for the representation of grammar formalisms.
Brokered Graph State Quantum Computing
Benjamin, Simon C.; Browne, Dan E.; Fitzsimons, Joe; Morton, John J. L.
2005-01-01
We describe a procedure for graph state quantum computing that is tailored to fully exploit the physics of optically active multi-level systems. Leveraging ideas from the literature on distributed computation together with the recent work on probabilistic cluster state synthesis, our model assigns to each physical system two logical qubits: the broker and the client. Groups of brokers negotiate new graph state fragments via a probabilistic optical protocol. Completed fragments are mapped from...
Effective resistance on graphs and the Epidemic quasimetric
Ericson, Josh; Poggi-Corradini, Pietro; Zhang, Hainan
2012-01-01
We introduce the epidemic quasimetric on graphs and study its behavior with respect to clustering techniques. In particular we compare its behavior to known objects such as the graph distance, effective resistance, and modulus of path families.
Abusing a hypergraph partitioner for unweighted graph partitioning
Fagginger Auer, B.O.; Bisseling, R.H.
2013-01-01
We investigate using the Mondriaan matrix partitioner for unweighted graph partitioning in the communication volume and edgecut metrics. By converting the unweighted graphs to appropriate matrices, we measure Mondriaan’s performance as a graph partitioner for the 10th DIMACS challenge on graph partitioning and clustering. We find that Mondriaan can effectively be used as a graph partitioner: w.r.t. the edge-cut metric, Mondriaan’s average results are within 21% of the best known results as li...
NXgraph: An Efficient Graph Processing System on a Single Machine
Chi, Yuze; Dai, Guohao; Wang, Yu; Sun, Guangyu; Li, Guoliang; Yang, Huazhong
2015-01-01
Recent studies show that graph processing systems on a single machine can achieve competitive performance compared with cluster-based graph processing systems. In this paper, we present NXgraph, an efficient graph processing system on a single machine. With the abstraction of vertex intervals and edge sub-shards, we propose the Destination-Sorted Sub-Shard (DSSS) structure to store a graph. By dividing vertices and edges into intervals and sub-shards, NXgraph ensures graph data access localit...
Trudeau, Richard J
1994-01-01
Preface1. Pure Mathematics Introduction; Euclidean Geometry as Pure Mathematics; Games; Why Study Pure Mathematics?; What's Coming; Suggested Reading2. Graphs Introduction; Sets; Paradox; Graphs; Graph diagrams; Cautions; Common Graphs; Discovery; Complements and Subgraphs; Isomorphism; Recognizing Isomorphic Graphs; Semantics The Number of Graphs Having a Given nu; Exercises; Suggested Reading3. Planar Graphs Introduction; UG, K subscript 5, and the Jordan Curve Theorem; Are there More Nonplanar Graphs?; Expansions; Kuratowski's Theorem; Determining Whether a Graph is Planar or
Brody, Samuel; Lapata, Mirella
2009-01-01
Sense induction seeks to automatically identify word senses directly from a corpus. A key assumption underlying previous work is that the context surrounding an ambiguous word is indicative of its meaning. Sense induction is thus typically viewed as an unsupervised clustering problem where the aim is to partition a word’s contexts into different classes, each representing a word sense. Our work places sense induction in a Bayesian context by modeling the contexts of the ambiguous word as samp...
Diestel, Reinhard
2000-01-01
This book is a concise, yet carefully written, introduction to modern graph theory, covering all its major recent developments. It can be used both as a reliable textbook for an introductory course and as a graduate text: on each topic it covers all the basic material in full detail, and adds one or two deeper results (again with detailed proofs) to illustrate the more advanced methods of that field. This second edition extends the first in two ways. It offers a thoroughly revised and updated chapter on graph minors, which now includes full new proofs of two of the central Robertson-Seymour theorems (as well as a detailed sketch of the entire proof of their celebrated Graph Minor Theorem). Second, there is now a section of hints for all the exercises, to enhance their value for both individual study and classroom use.
Graph hierarchies for phylogeography.
Cybis, Gabriela B; Sinsheimer, Janet S; Lemey, Philippe; Suchard, Marc A
2013-03-19
Bayesian phylogeographic methods simultaneously integrate geographical and evolutionary modelling, and have demonstrated value in assessing spatial spread patterns of measurably evolving organisms. We improve on existing phylogeographic methods by combining information from multiple phylogeographic datasets in a hierarchical setting. Consider N exchangeable datasets or strata consisting of viral sequences and locations, each evolving along its own phylogenetic tree and according to a conditionally independent geographical process. At the hierarchical level, a random graph summarizes the overall dispersion process by informing which migration rates between sampling locations are likely to be relevant in the strata. This approach provides an efficient and improved framework for analysing inherently hierarchical datasets. We first examine the evolutionary history of multiple serotypes of dengue virus in the Americas to showcase our method. Additionally, we explore an application to intrahost HIV evolution across multiple patients. PMID:23382428
Indian Academy of Sciences (India)
RAJAN SHRIVASTAVA; AVIJIT RAKSHIT; SUDHANSHU SHANKER; LOVEKESH VIG; PRADIPTA BANDYOPADHYAY
2016-09-01
The knowledge of degree of completeness of energy landscape search by stochastic algorithms is often lacking. A graph theory based method is used to investigate the completeness of search performed by Monte Carlo Temperature Basin Paving (MCTBP) algorithm for (H₂O)n, (n=6, 7, and 20). In the second part of the work, a combination of MCTBP and graph theory was used to devise a new algorithm for finding low energy structures of (H₂O)n, (n=21-25), where input structures for (H₂O)n comes from the graphs of (H₂O)n−1. The new algorithm can be a complementary tool to the MCTBP method.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Clustering with Spectral Methods
Gaertler, Marco
2002-01-01
Grouping and sorting are problems with a great tradition in the history of mankind. Clustering and cluster analysis is a small aspect in the wide spectrum. But these topics have applications in most scientific disciplines. Graph clustering is again a little fragment in the clustering area. Nevertheless it has the potential for new pioneering and innovative methods. One such method is the Markov Clustering presented by van Dongen in 'Graph Clustering by Flow Simulation'. We investigated the qu...
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Directory of Open Access Journals (Sweden)
Liangdong Hu
Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Draper, D.
2001-01-01
© 2012 Springer Science+Business Media, LLC. All rights reserved. Article Outline: Glossary Definition of the Subject and Introduction The Bayesian Statistical Paradigm Three Examples Comparison with the Frequentist Statistical Paradigm Future Directions Bibliography
Beeken, Paul
2014-01-01
Graphing is an essential skill that forms the foundation of any physical science. Understanding the relationships between measurements ultimately determines which modeling equations are successful in predicting observations. Over the years, science and math teachers have approached teaching this skill with a variety of techniques. For secondary…
Antipodal Interval-Valued Fuzzy Graphs
Rashmanlou, Hossein; Pal, Madhumangal
2014-01-01
Concepts of graph theory have applications in many areas of computer science including data mining, image segmentation, clustering, image capturing, networks, etc . An interval-valued fuzzy set is a generalization of the notion of a fuzzy set. Interval-valued fuzzy models give more precision, flexibility and compatibility to the system as compared to the fuzzy models. In this paper, we introduce the concept of antipodal interval - valued fuzzy graph and self median interval-valued fuzzy graph...
Graphs in machine learning: an introduction
Latouche, Pierre; Rossi, Fabrice
2015-01-01
Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields from computer science to historical sciences. In this paper, we give an introduction to some methods relying on graphs for learning. This includes both unsupervised and supervised methods. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes. Both focus on ext...
Betweenness Centrality in Graphs
Gago Álvarez, Silvia; Coronicová Hurajová, Jana; Madaras, Tomas
2014-01-01
The first book devoted exclusively to quantitative graph theory, Quantitative Graph Theory: Mathematical Foundations and Applications presents and demonstrates existing and novel methods for analyzing graphs quantitatively. Incorporating interdisciplinary knowledge from graph theory, information theory, measurement theory, and statistical techniques, this book covers a wide range of quantitative-graph theoretical concepts and methods, including those pertaining to real and random graphs such ...
Gould, Ronald
2012-01-01
This introduction to graph theory focuses on well-established topics, covering primary techniques and including both algorithmic and theoretical problems. The algorithms are presented with a minimum of advanced data structures and programming details. This thoroughly corrected 1988 edition provides insights to computer scientists as well as advanced undergraduates and graduate students of topology, algebra, and matrix theory. Fundamental concepts and notation and elementary properties and operations are the first subjects, followed by examinations of paths and searching, trees, and networks. S
Fission of Halving Edges Graphs
Khovanova, Tanya; Yang, Dai
2013-01-01
In this paper we discuss an operation on halving edges graph that we call fission. Fission replaces each point in a given configuration with a small cluster of k points. The operation interacts nicely with halving edges, so we examine its properties in detail.
Distributed Parallel Inference on Large Factor Graphs
Gonzalez, Joseph E.; Low, Yucheng; Guestrin, Carlos E.; O'Hallaron, David
2012-01-01
As computer clusters become more common and the size of the problems encountered in the field of AI grows, there is an increasing demand for efficient parallel inference algorithms. We consider the problem of parallel inference on large factor graphs in the distributed memory setting of computer clusters. We develop a new efficient parallel inference algorithm, DBRSplash, which incorporates over-segmented graph partitioning, belief residual scheduling, and uniform work Splash operations. We e...
Diestel, Reinhard
2012-01-01
HauptbeschreibungThis standard textbook of modern graph theory, now in its fourth edition, combinesthe authority of a classic with the engaging freshness of style that is the hallmarkof active mathematics. It covers the core material of the subject with concise yetreliably complete proofs, while offering glimpses of more advanced methodsin each field by one or two deeper results, again with proofs given in full detail.The book can be used as a reliable text for an introductory course, as a graduatetext, and for self-study. Rezension"Deep, clear, wonderful. This is a serious book about the
Graphs in machine learning: an introduction
Latouche, Pierre
2015-01-01
Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields from computer science to historical sciences. In this paper, we give an introduction to some methods relying on graphs for learning. This includes both unsupervised and supervised methods. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes. Both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they could be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes using both the graph and, when they are available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globall...
Parallel Graph Partitioning for Complex Networks
Meyerhenke, Henning; Schulz, Christian
2014-01-01
Processing large complex networks like social networks or web graphs has recently attracted considerable interest. In order to do this in parallel, we need to partition them into pieces of about equal size. Unfortunately, previous parallel graph partitioners originally developed for more regular mesh-like networks do not work well for these networks. This paper addresses this problem by parallelizing and adapting the label propagation technique originally developed for graph clustering. By introducing size constraints, label propagation becomes applicable for both the coarsening and the refinement phase of multilevel graph partitioning. We obtain very high quality by applying a highly parallel evolutionary algorithm to the coarsened graph. The resulting system is both more scalable and achieves higher quality than state-of-the-art systems like ParMetis or PT-Scotch. For large complex networks the performance differences are very big. For example, our algorithm can partition a web graph with 3.3 billion edges ...
Directory of Open Access Journals (Sweden)
Jinfei Liu
2013-04-01
Full Text Available DBSCAN is a well-known density-based clustering algorithm which offers advantages for finding clusters of arbitrary shapes compared to partitioning and hierarchical clustering methods. However, there are few papers studying the DBSCAN algorithm under the privacy preserving distributed data mining model, in which the data is distributed between two or more parties, and the parties cooperate to obtain the clustering results without revealing the data at the individual parties. In this paper, we address the problem of two-party privacy preserving DBSCAN clustering. We first propose two protocols for privacy preserving DBSCAN clustering over horizontally and vertically partitioned data respectively and then extend them to arbitrarily partitioned data. We also provide performance analysis and privacy proof of our solution..
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Spectral bounds for percolation on directed and undirected graphs
Hamilton, Kathleen E.; Pryadko, Leonid P.
2015-01-01
We give several algebraic bounds for percolation on directed and undirected graphs: proliferation of strongly-connected clusters, proliferation of in- and out-clusters, and the transition associated with the number of giant components.
Graph-Theoretic Solutions to Computational Geometry Problems
Eppstein, David
2009-01-01
Many problems in computational geometry are not stated in graph-theoretic terms, but can be solved efficiently by constructing an auxiliary graph and performing a graph-theoretic algorithm on it. Often, the efficiency of the algorithm depends on the special properties of the graph constructed in this way. We survey the art gallery problem, partition into rectangles, minimum-diameter clustering, rectilinear cartogram construction, mesh stripification, angle optimization in tilings, and metric ...
Linear embeddings of graphs and graph limits
Chuangpishit, Huda; Ghandehari, Mahya; Hurshman, Matt; Janssen, Jeannette; Kalyaniwalla, Nauzer
2012-01-01
Consider a random graph process where vertices are chosen from the interval $[0,1]$, and edges are chosen independently at random, but so that, for a given vertex $x$, the probability that there is an edge to a vertex $y$ decreases as the distance between $x$ and $y$ increases. We call this a random graph with a linear embedding. We define a new graph parameter $\\Gamma^*$, which aims to measure the similarity of the graph to an instance of a random graph with a linear embedding. For a graph $...
An enhanced classical approach to graph isomorphism using continuous-time quantum walk
International Nuclear Information System (INIS)
Studies on graph isomorphism play an important role in graph research, and graph isomorphism algorithms have a wide range of applications in image matching, pattern recognition, computer vision, biochemistry and other fields. Previous research demonstrated that involving discrete-time quantum walk in the graph isomorphism algorithm could achieve complexity O(N7) for general graphs, since quantum walk could be utilized as a new toolbox for solving graph problems. We develop an enhanced classical approach to graph isomorphism using continuous-time quantum walk, which has lower complexity O(N5) and can effectively distinguish the graphs that are generally considered difficult. In addition, we define a graph similarity measure based on the proposed algorithm, which can be used for graph isomorphism and graph clustering. In the experiment, we test a wide variety of classes of graphs; the results show that the algorithm has a wide range of applications rather than being limited to a specific type of graph. (paper)
Preserving Differential Privacy in Degree-Correlation based Graph Generation.
Wang, Yue; Wu, Xintao
2013-08-01
Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then generate the graphs for releasing using the graph model with the private parameters. In particular, we develop a differential privacy preserving graph generator based on the dK-graph generation model. We first derive from the original graph various parameters (i.e., degree correlations) used in the dK-graph model, then enforce edge differential privacy on the learned parameters, and finally use the dK-graph model with the perturbed parameters to generate graphs. For the 2K-graph model, we enforce the edge differential privacy by calibrating noise based on the smooth sensitivity, rather than the global sensitivity. By doing this, we achieve the strict differential privacy guarantee with smaller magnitude noise. We conduct experiments on four real networks and compare the performance of our private dK-graph models with the stochastic Kronecker graph generation model in terms of utility and privacy tradeoff. Empirical evaluations show the developed private dK-graph generation models significantly outperform the approach based on the stochastic Kronecker generation model. PMID:24723987
Hamiltonian Strongly Regular Graphs
Brouwer, A.E.; Haemers, W.H.
2008-01-01
We give a sufficient condition for a distance-regular graph to be Hamiltonian. In particular, the Petersen graph is the only connected non-Hamiltonian strongly regular graph on fewer than 99 vertices.
Brandenburg, Franz J.; Brandes, Ulrik; Eades, Peter; Marks, Joe
2002-01-01
This report describes the Tenth Annual Graph Drawing Contest, held in conjunction with the 2003 Graph Drawing Symposium in Perugia, Italy. The purpose of the contest is to monitor and challenge the current state of the graph-drawing technology.
Alshehri, Noura O.; Muhammad Akram
2013-01-01
We introduce the concept of Cayley bipolar fuzzy graphs and investigate some of their properties. We present some interesting properties of bipolar fuzzy graphs in terms of algebraic structures. We also discuss connectedness in Cayley bipolar fuzzy graphs.
Recursive graphs with small-world scale-free properties
Comellas, Francesc; Fertin, Guillaume; Raspaud, André
2004-03-01
We discuss a category of graphs, recursive clique trees, which have small-world and scale-free properties and allow a fine tuning of the clustering and the power-law exponent of their discrete degree distribution. We determine relevant characteristics of those graphs: the diameter, degree distribution, and clustering parameter. The graphs have also an interesting recursive property, and generalize recent constructions with fixed degree distributions.
Spanning Tree Based Attribute Clustering
DEFF Research Database (Denmark)
Zeng, Yifeng; Jorge, Cordero Hernandez
2009-01-01
inconsistent edges from a maximum spanning tree by starting appropriate initial modes, therefore generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering...
DEFF Research Database (Denmark)
Dashab, Golam Reza; Kadri, Naveen Kumar; Mahdi Shariati, Mohammad;
2012-01-01
) Mixed model analysis (MMA), 2) Random haplotype model (RHM), 3) Genealogy-based mixed model (GENMIX), and 4) Bayesian variable selection (BVS). The data consisted of phenotypes of 2000 animals from 20 sire families and were genotyped with 9990 SNPs on five chromosomes. Results: Out of the eight...
Tan, Yong
2013-01-01
In this paper, author uses set theory to construct a logic model of abstract figure from binary relation. Based on the uniform quantified structure, author gives two logic system for graph traversal and graph coloring respectively, moreover shows a new method of cutting graph. Around this model, there are six algorithms in this paper including exact graph traversal, Algebra calculation of natural number, graph partition and graph coloring.
Kirstein, Roland
2005-01-01
This paper presents a modification of the inspection game: The ?Bayesian Monitoring? model rests on the assumption that judges are interested in enforcing compliant behavior and making correct decisions. They may base their judgements on an informative but imperfect signal which can be generated costlessly. In the original inspection game, monitoring is costly and generates a perfectly informative signal. While the inspection game has only one mixed strategy equilibrium, three Perfect Bayesia...
Pattern vectors from algebraic graph theory.
Wilson, Richard C; Hancock, Edwin R; Luo, Bin
2005-07-01
Graph structures have proven computationally cumbersome for pattern analysis. The reason for this is that, before graphs can be converted to pattern vectors, correspondences must be established between the nodes of structures which are potentially of different size. To overcome this problem, in this paper, we turn to the spectral decomposition of the Laplacian matrix. We show how the elements of the spectral matrix for the Laplacian can be used to construct symmetric polynomials that are permutation invariants. The coefficients of these polynomials can be used as graph features which can be encoded in a vectorial manner. We extend this representation to graphs in which there are unary attributes on the nodes and binary attributes on the edges by using the spectral decomposition of a Hermitian property matrix that can be viewed as a complex analogue of the Laplacian. To embed the graphs in a pattern space, we explore whether the vectors of invariants can be embedded in a low-dimensional space using a number of alternative strategies, including principal components analysis (PCA), multidimensional scaling (MDS), and locality preserving projection (LPP). Experimentally, we demonstrate that the embeddings result in well-defined graph clusters. Our experiments with the spectral representation involve both synthetic and real-world data. The experiments with synthetic data demonstrate that the distances between spectral feature vectors can be used to discriminate between graphs on the basis of their structure. The real-world experiments show that the method can be used to locate clusters of graphs. PMID:16013758
Brokered Graph State Quantum Computing
Benjamin, S C; Fitzsimons, J; Morton, J J L; Benjamin, Simon C.; Browne, Dan E.; Fitzsimons, Joe; Morton, John J. L.
2005-01-01
We describe a procedure for graph state quantum computing that is tailored to fully exploit the physics of optically active multi-level systems. Leveraging ideas from the literature on distributed computation together with the recent work on probabilistic cluster state synthesis, our model assigns to each physical system two logical qubits: the broker and the client. Groups of brokers negotiate new graph state fragments via a probabilistic optical protocol. Completed fragments are mapped from broker to clients via a simple state transition and measurement. The clients, whose role is to store the nascent graph state long term, remain entirely insulated from failures during the brokerage. We describe an implementation in terms of NV-centres in diamond, where brokers and clients are very naturally embodied as electron and nuclear spins.
Models of random graph hierarchies
Paluch, Robert; Suchecki, Krzysztof; Hołyst, Janusz A.
2015-10-01
We introduce two models of inclusion hierarchies: random graph hierarchy (RGH) and limited random graph hierarchy (LRGH). In both models a set of nodes at a given hierarchy level is connected randomly, as in the Erdős-Rényi random graph, with a fixed average degree equal to a system parameter c. Clusters of the resulting network are treated as nodes at the next hierarchy level and they are connected again at this level and so on, until the process cannot continue. In the RGH model we use all clusters, including those of size 1, when building the next hierarchy level, while in the LRGH model clusters of size 1 stop participating in further steps. We find that in both models the number of nodes at a given hierarchy level h decreases approximately exponentially with h. The height of the hierarchy H, i.e. the number of all hierarchy levels, increases logarithmically with the system size N, i.e. with the number of nodes at the first level. The height H decreases monotonically with the connectivity parameter c in the RGH model and it reaches a maximum for a certain c max in the LRGH model. The distribution of separate cluster sizes in the LRGH model is a power law with an exponent about - 1.25. The above results follow from approximate analytical calculations and have been confirmed by numerical simulations.
Preserving Differential Privacy in Degree-Correlation based Graph Generation
Directory of Open Access Journals (Sweden)
Yue Wang
2013-08-01
Full Text Available Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then generate the graphs for releasing using the graph model with the private parameters. In particular, we develop a differential privacy preserving graph generator based on the dK-graph generation model. We first derive from the original graph various parameters (i.e., degree correlations used in the dK-graph model, then enforce edge differential privacy on the learned parameters, and finally use the dKgraph model with the perturbed parameters to generate graphs. For the 2K-graph model, we enforce the edge differential privacy by calibrating noise based on the smooth sensitivity, rather than the global sensitivity. By doing this, we achieve the strict differential privacy guarantee with smaller magnitude noise. We conduct experiments on four real networks and compare the performance of our private dK-graph models with the stochastic Kronecker graph generation model in terms of utility and privacy tradeoff. Empirical evaluations show the developed private dK-graph generation models significantly outperform the approach based on the stochastic Kronecker generation model.
Operations on Graphs Increasing Some Graph Parameters
Kelmans, Alexander
2011-01-01
In this partly expository paper we discuss and describe some of our old and recent results on partial orders on the set (m,n)-graphs (i.e. graphs with n vertices and m edges) and some operations on graphs that are monotone with respect to these partial orders. The partial orders under consideration include those related with some Laplacian characteristics of graphs as well as with some probabilistic characteristics of graphs with randomly deleted edges. Section 2 provides some notions, notation, and simple observations. Section 3 contains some basic facts on the Laplacian polynomial of a graph. Section 4 describes various graph operation and their properties. In Section 5 we introduce some partial orders on the set of (m,n)-graphs related, in particular, with the graph Laplacian and the graph reliability (Laplacian posets and reliability posets}). Section 6 contains some old and recent results on the monotonicity of some graph operations with respect to Laplacian posets. Section 7 and 8 include some old and r...
Diot, Emilie; Gavoille, Cyril
In this paper we investigate the structural properties of k-path separable graphs, that are the graphs that can be separated by a set of k shortest paths. We identify several graph families having such path separability, and we show that this property is closed under minor taking. In particular we establish a list of forbidden minors for 1-path separable graphs.
Khovanova, Tanya
2008-01-01
I show how to associate a Clifford algebra to a graph. I describe the structure of these Clifford graph algebras and provide many examples and pictures. I describe which graphs correspond to isomorphic Clifford algebras and also discuss other related sets of graphs. This construction can be used to build models of representations of simply-laced compact Lie groups.
Irregular Bipolar Fuzzy Graphs
Samanta, Sovan; Pal, Madhumangal
2012-01-01
In this paper, we define irregular bipolar fuzzy graphs and its various classifications. Size of regular bipolar fuzzy graphs is derived. The relation between highly and neighbourly irregular bipolar fuzzy graphs are established. Some basic theorems related to the stated graphs have also been presented.
A model of language inflection graphs
Fukś, Henryk; Cao, Yi
2015-01-01
Inflection graphs are highly complex networks representing relationships between inflectional forms of words in human languages. For so-called synthetic languages, such as Latin or Polish, they have particularly interesting structure due to abundance of inflectional forms. We construct the simplest form of inflection graphs, namely a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. We then study projection of this graph on the set of headwords. The projection decomposes into a large number of connected components, to be called word groups. Distribution of sizes of word group exhibits some remarkable properties, resembling cluster distribution in a lattice percolation near the critical point. We propose a simple model which produces graphs of this type, reproducing the desired component distribution and other topological features.
Bessiere, Pierre; Ahuactzin, Juan Manuel; Mekhnacha, Kamel
2013-01-01
Probability as an Alternative to Boolean LogicWhile logic is the mathematical foundation of rational reasoning and the fundamental principle of computing, it is restricted to problems where information is both complete and certain. However, many real-world problems, from financial investments to email filtering, are incomplete or uncertain in nature. Probability theory and Bayesian computing together provide an alternative framework to deal with incomplete and uncertain data. Decision-Making Tools and Methods for Incomplete and Uncertain DataEmphasizing probability as an alternative to Boolean
Join-Graph Propagation Algorithms
Mateescu, Robert; Kask, Kalev; Gogate, Vibhav; Dechter, Rina
2014-01-01
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl's belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), that combines both iteration and bounded inference. Algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that allowed connections with a...
Estrada, Ernesto
2015-01-01
A generalization of the random geometric graph (RGG) model is proposed by considering a set of points uniformly and independently distributed on a rectangle of unit area instead of on a unit square \\left[0,1\\right]^{2}. The topological properties, such as connectivity, average degree, average path length and clustering, of the random rectangular graphs (RRGs) generated by this model are then studied as a function of the rectangle sides lengths a and b=1/a, and the radius r used to connect the nodes. When a=1 we recover the RGG, and when a\\rightarrow\\infty the very elongated rectangle generated resembles a one-dimensional RGG. We provided computational and analytical evidence that the topological properties of the RRG differ significantly from those of the RGG. The connectivity of the RRG depends not only on the number of nodes as in the case of the RGG, but also on the side length of the rectangle. As the rectangle is more elongated the critical radius for connectivity increases following first a power-law an...
Trotignon, Nicolas
2013-01-01
Perfect graphs were defined by Claude Berge in the 1960s. They are important objects for graph theory, linear programming and combinatorial optimization. Claude Berge made a conjecture about them, that was proved by Chudnovsky, Robertson, Seymour and Thomas in 2002, and is now called the strong perfect graph theorem. This is a survey about perfect graphs, mostly focused on the strong perfect graph theorem.
Models of random graph hierarchies
Paluch, Robert; Holyst, Janusz
2015-01-01
We introduce two models of inclusion hierarchies: Random Graph Hierarchy (RGH) and Limited Random Graph Hierarchy (LRGH). In both models a set of nodes at a given hierarchy level is connected randomly, as in the Erd\\H{o}s-R\\'{e}nyi random graph, with a fixed average degree equal to a system parameter $c$. Clusters of the resulting network are treated as nodes at the next hierarchy level and they are connected again at this level and so on, until the process cannot continue. In the RGH model we use all clusters, including those of size $1$, when building the next hierarchy level, while in the LRGH model clusters of size $1$ stop participating in further steps. We find that in both models the number of nodes at a given hierarchy level $h$ decreases approximately exponentially with $h$. The height of the hierarchy $H$, i.e. the number of all hierarchy levels, increases logarithmically with the system size $N$, i.e. with the number of nodes at the first level. The height $H$ decreases monotonically with the conne...
Isometry on Interval-valued Fuzzy Graphs
Rashmanlou, Hossein; Pal, Madhumangal
2014-01-01
Especially in research areas of computer science such as data mining, image segmentation, clustering image capturing and networking. The interval-valued fuzzy graphs are more flexible and compatible than fuzzy graphs due to the fact that they allowed the degree of membership of a vertex to an edge to be represented by interval valued in [0,1] rather than the crisp real values between 0 and 1.
Scale-invariant geometric random graphs
Xie, Zheng; Rogers, Tim
2016-03-01
We introduce and analyze a class of growing geometric random graphs that are invariant under rescaling of space and time. Directed connections between nodes are drawn according to influence zones that depend on node position in space and time, mimicking the heterogeneity and increased specialization found in growing networks. Through calculations and numerical simulations we explore the consequences of scale invariance for geometric random graphs generated this way. Our analysis reveals a dichotomy between scale-free and Poisson distributions of in- and out-degree, the existence of a random number of hub nodes, high clustering, and unusual percolation behavior. These properties are similar to those of empirically observed web graphs.
Characteristic imsets for learning Bayesian network structure
Czech Academy of Sciences Publication Activity Database
Hemmecke, R.; Lindner, S.; Studený, Milan
2012-01-01
Roč. 53, č. 9 (2012), s. 1336-1349. ISSN 0888-613X R&D Projects: GA MŠk(CZ) 1M0572; GA ČR GA201/08/0539 Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * essential graph * standard imset * characteristic imset * LP relaxation of a polytope Subject RIV: BA - General Mathematics Impact factor: 1.729, year: 2012 http://library.utia.cas.cz/separaty/2012/MTR/studeny-0382596.pdf
Applications of Graph Theory in Computer Science
Directory of Open Access Journals (Sweden)
U. Sekar
2013-11-01
Full Text Available The field of mathematics plays vital role in various fields. One of the important areas in mathematics is graph theory which is used in structural models. This structural arrangements of various objects or technologies lead to new inventions and modifications in the existing environment for enhancement in those fields. The field graph theory started its journey from the problem of Konigsberg Bridge in 1735. This paper gives an overview of the applications of graph theory in heterogeneous fields to some extent but mainly focuses on the computer science applications that uses graph theoretical concepts. Various papers based on graph theory have been studied related to scheduling concepts, computer science applications and an overview has been presented here.Graph theoretical ideas are highly utilized by computer science applications. Especially in research areas of computer science such data mining, image segmentation, clustering, image capturing, networking etc., For example a data structure can be designed in the form of tree which in turn utilized vertices and edges. Similarly modeling of network topologies can be done using graph concepts. In the same way the most important concept of graph coloring is utilized in resource allocation, scheduling. Also, paths, walks and circuits in graph theory are used in tremendous applications say traveling salesman problem, database design concepts, resource networking. This leads to the development of new algorithms and new theorems that can be used in tremendous applications. First section gives the historical background of graph theory and some applications in scheduling. Second section emphasizes how graph theory is utilized in various computer applications.
Bapat, Ravindra B
2014-01-01
This new edition illustrates the power of linear algebra in the study of graphs. The emphasis on matrix techniques is greater than in other texts on algebraic graph theory. Important matrices associated with graphs (for example, incidence, adjacency and Laplacian matrices) are treated in detail. Presenting a useful overview of selected topics in algebraic graph theory, early chapters of the text focus on regular graphs, algebraic connectivity, the distance matrix of a tree, and its generalized version for arbitrary graphs, known as the resistance matrix. Coverage of later topics include Laplacian eigenvalues of threshold graphs, the positive definite completion problem and matrix games based on a graph. Such an extensive coverage of the subject area provides a welcome prompt for further exploration. The inclusion of exercises enables practical learning throughout the book. In the new edition, a new chapter is added on the line graph of a tree, while some results in Chapter 6 on Perron-Frobenius theory are reo...
Bayesian Methods for Radiation Detection and Dosimetry
Groer, Peter G
2002-01-01
We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed comp...
Learning Bayesian Networks from Data by Particle Swarm Optimization
Institute of Scientific and Technical Information of China (English)
无
2006-01-01
Learning Bayesian network is an NP-hard problem. When the number of variables is large, the process of searching optimal network structure could be very time consuming and tends to return a structure which is local optimal. The particle swarm optimization (PSO) was introduced to the problem of learning Bayesian networks and a novel structure learning algorithm using PSO was proposed. To search in directed acyclic graphs spaces efficiently, a discrete PSO algorithm especially for structure learning was proposed based on the characteristics of Bayesian networks. The results of experiments show that our PSO based algorithm is fast for convergence and can obtain better structures compared with genetic algorithm based algorithms.
Bolta, Sandra
2014-01-01
In this BSc thesis we deal with matrix graph theory. We are interested primarily in the eigenvalues of the so-called adjacency matrix of a given graph. Because of that, we present the basic concepts and some basic results from linear algebra and a short introduction to a graph theory. We introduce the concepts of adjacency matrices, eigenvalues and the spectrum of a given graph. We investigate how the properties of a given graph reflect on its spectrum. For the well-known families of graphs w...
Directory of Open Access Journals (Sweden)
C. Dalfo
2015-10-01
Full Text Available We study a family of graphs related to the $n$-cube. The middle cube graph of parameter k is the subgraph of $Q_{2k-1}$ induced by the set of vertices whose binary representation has either $k-1$ or $k$ number of ones. The middle cube graphs can be obtained from the well-known odd graphs by doubling their vertex set. Here we study some of the properties of the middle cube graphs in the light of the theory of distance-regular graphs. In particular, we completely determine their spectra (eigenvalues and their multiplicities, and associated eigenvectors.
Dalfo, C.; Fiol, M.A.; Mitjana, M.
2015-01-01
We study a family of graphs related to the $n$-cube. The middle cube graph of parameter k is the subgraph of $Q_{2k-1}$ induced by the set of vertices whose binary representation has either $k-1$ or $k$ number of ones. The middle cube graphs can be obtained from the well-known odd graphs by doubling their vertex set. Here we study some of the properties of the middle cube graphs in the light of the theory of distance-regular graphs. In particular, we completely determine their spectra (eigenv...
Diaconis, Persi; Janson, Svante
2011-01-01
We work out the graph limit theory for dense interval graphs. The theory developed departs from the usual description of a graph limit as a symmetric function $W(x,y)$ on the unit square, with $x$ and $y$ uniform on the interval $(0,1)$. Instead, we fix a $W$ and change the underlying distribution of the coordinates $x$ and $y$. We find choices such that our limits are continuous. Connections to random interval graphs are given, including some examples. We also show a continuity result for the chromatic number and clique number of interval graphs. Some results on uniqueness of the limit description are given for general graph limits.
Approximation methods for efficient learning of Bayesian networks
Riggelsen, C
2008-01-01
This publication offers and investigates efficient Monte Carlo simulation methods in order to realize a Bayesian approach to approximate learning of Bayesian networks from both complete and incomplete data. For large amounts of incomplete data when Monte Carlo methods are inefficient, approximations are implemented, such that learning remains feasible, albeit non-Bayesian. The topics discussed are: basic concepts about probabilities, graph theory and conditional independence; Bayesian network learning from data; Monte Carlo simulation techniques; and, the concept of incomplete data. In order to provide a coherent treatment of matters, thereby helping the reader to gain a thorough understanding of the whole concept of learning Bayesian networks from (in)complete data, this publication combines in a clarifying way all the issues presented in the papers with previously unpublished work.
Pancyclic and bipancyclic graphs
George, John C; Wallis, W D
2016-01-01
This book is focused on pancyclic and bipancyclic graphs and is geared toward researchers and graduate students in graph theory. Readers should be familiar with the basic concepts of graph theory, the definitions of a graph and of a cycle. Pancyclic graphs contain cycles of all possible lengths from three up to the number of vertices in the graph. Bipartite graphs contain only cycles of even lengths, a bipancyclic graph is defined to be a bipartite graph with cycles of every even size from 4 vertices up to the number of vertices in the graph. Cutting edge research and fundamental results on pancyclic and bipartite graphs from a wide range of journal articles and conference proceedings are composed in this book to create a standalone presentation. The following questions are highlighted through the book: - What is the smallest possible number of edges in a pancyclic graph with v vertices? - When do pancyclic graphs exist with exactly one cycle of every possible length? - What is the smallest possible number of...
Graph limits and exchangeable random graphs
Diaconis, Persi; Janson, Svante
2007-01-01
We develop a clear connection between deFinetti's theorem for exchangeable arrays (work of Aldous--Hoover--Kallenberg) and the emerging area of graph limits (work of Lovasz and many coauthors). Along the way, we translate the graph theory into more classical probability.
Continuous Time Group Discovery in Dynamic Graphs
Energy Technology Data Exchange (ETDEWEB)
Miller, K; Eliassi-Rad, T
2010-11-04
With the rise in availability and importance of graphs and networks, it has become increasingly important to have good models to describe their behavior. While much work has focused on modeling static graphs, we focus on group discovery in dynamic graphs. We adapt a dynamic extension of Latent Dirichlet Allocation to this task and demonstrate good performance on two datasets. Modeling relational data has become increasingly important in recent years. Much work has focused on static graphs - that is fixed graphs at a single point in time. Here we focus on the problem of modeling dynamic (i.e. time-evolving) graphs. We propose a scalable Bayesian approach for community discovery in dynamic graphs. Our approach is based on extensions of Latent Dirichlet Allocation (LDA). LDA is a latent variable model for topic modeling in text corpora. It was extended to deal with topic changes in discrete time and later in continuous time. These models were referred to as the discrete Dynamic Topic Model (dDTM) and the continuous Dynamic Topic Model (cDTM), respectively. When adapting these models to graphs, we take our inspiration from LDA-G and SSN-LDA, applications of LDA to static graphs that have been shown to effectively factor out community structure to explain link patterns in graphs. In this paper, we demonstrate how to adapt and apply the cDTM to the task of finding communities in dynamic networks. We use link prediction to measure the quality of the discovered community structure and apply it to two different relational datasets - DBLP author-keyword and CAIDA autonomous systems relationships. We also discuss a parallel implementation of this approach using Hadoop. In Section 2, we review LDA and LDA-G. In Section 3, we review the cDTM and introduce cDTMG, its adaptation to modeling dynamic graphs. We discuss inference for the cDTM-G and details of our parallel implementation in Section 4 and present its performance on two datasets in Section 5 before concluding in
DNA甲基化微阵列的非参数贝叶斯聚类算法%Nonparametric Bayesian Clustering Methods of DNA Methylation Microarray
Institute of Scientific and Technical Information of China (English)
张林; 刘辉
2012-01-01
A model based clustering algorithm for Illumina GoldenGate microarray data is proposed in this paper. By infinite beta mixture model and by adopting Dirichlet process as prior knowledge, the cluster structure can be denned based on model and data. Simulation results demonstrate that this methodology can estimate the number of clusters, the cluster mixing weight and the own characteristic of each cluster, and can reach relatively ideal clustering effect.%面向Illumina GoldenGate甲基化微阵列数据提出了一种基于模型的聚类算法.算法通过建立贝塔无限混合模型,采用Dirichlet过程作为先验,实现了基于数据和模型的聚类结构的建立,实验结果表明该算法能够有效估计出聚类类别个数、每个聚类类别的混合权重、每个聚类类别的特征等信息,达到比较理想的聚类效果.
Homomorphisms of connectome graphs
Daugulis, Peteris
2014-01-01
We propose to study homomorphisms of connectome graphs. Homomorphisms can be studied as sequences of elementary homomorphisms - folds, which identify pairs of vertices. Several fold types are defined. Initial computation results for some connectome graphs are described.
Kandasamy, W B Vasantha
2009-01-01
For the first time we represent every finite group in the form of a graph in this book. The authors choose to call these graphs as identity graph, since the main role in obtaining the graph is played by the identity element of the group. This study is innovative because through this description one can immediately look at the graph and say the number of elements in the group G which are self-inversed. Also study of different properties, like the subgroups of a group, normal subgroups of a group, p-sylow subgroups of a group and conjugate elements of a group are carried out using the identity graph of the group in this book. This book has four chapters. The first chapter is introductory. The second chapter represents groups as graphs. In the third chapter, we have defined similar types of graphs for algebraic structures like commutative semigroups, loops, commutative groupoids and commutative rings. The final chapter poses 52 problems.
Preserving Differential Privacy in Degree-Correlation based Graph Generation
Yue Wang; Xintao Wu
2013-01-01
Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then genera...
Network evolution driven by dynamics applied to graph coloring
International Nuclear Information System (INIS)
An evolutionary network driven by dynamics is studied and applied to the graph coloring problem. From an initial structure, both the topology and the coupling weights evolve according to the dynamics. On the other hand, the dynamics of the network are determined by the topology and the coupling weights, so an interesting structure-dynamics co-evolutionary scheme appears. By providing two evolutionary strategies, a network described by the complement of a graph will evolve into several clusters of nodes according to their dynamics. The nodes in each cluster can be assigned the same color and nodes in different clusters assigned different colors. In this way, a co-evolution phenomenon is applied to the graph coloring problem. The proposed scheme is tested on several benchmark graphs for graph coloring
Network evolution driven by dynamics applied to graph coloring
Institute of Scientific and Technical Information of China (English)
Wu Jian-She; Li Li-Guang; Wang Xiao-Hua; Yu Xin; Jiao Li-Cheng
2013-01-01
An evolutionary network driven by dynamics is studied and applied to the graph coloring problem.From an initial structure,both the topology and the coupling weights evolve according to the dynamics.On the other hand,the dynamics of the network are determined by the topology and the coupling weights,so an interesting structure-dynamics co-evolutionary scheme appears.By providing two evolutionary strategies,a network described by the complement of a graph will evolve into several clusters of nodes according to their dynamics.The nodes in each cluster can be assigned the same color and nodes in different clusters assigned different colors.In this way,a co-evolution phenomenon is applied to the graph coloring problem.The proposed scheme is tested on several benchmark graphs for graph coloring.
Conlon, David; Fox, Jacob
2012-01-01
The graph removal lemma states that any graph on n vertices with o(n^{v(H)}) copies of a fixed graph H may be made H-free by removing o(n^2) edges. Despite its innocent appearance, this lemma and its extensions have several important consequences in number theory, discrete geometry, graph theory and computer science. In this survey we discuss these lemmas, focusing in particular on recent improvements to their quantitative aspects.
Evolutionary Graph Drawing Algorithms
Institute of Scientific and Technical Information of China (English)
Huang Jing-wei; Wei Wen-fang
2003-01-01
In this paper, graph drawing algorithms based on genetic algorithms are designed for general undirected graphs and directed graphs. As being shown, graph drawing algorithms designed by genetic algorithms have the following advantages: the frames of the algorithms are unified, the method is simple, different algorithms may be attained by designing different objective functions, therefore enhance the reuse of the algorithms. Also, aesthetics or constrains may be added to satisfy different requirements.
Graph algorithms for bioinformatics
Profiti, Giuseppe
2015-01-01
Biological data are inherently interconnected: protein sequences are connected to their annotations, the annotations are structured into ontologies, and so on. While protein-protein interactions are already represented by graphs, in this work I am presenting how a graph structure can be used to enrich the annotation of protein sequences thanks to algorithms that analyze the graph topology. We also describe a novel solution to restrict the data generation needed for building such a graph, than...
Intuitionistic Fuzzy Planar Graphs
Noura Alshehri; Muhammad Akram
2014-01-01
Graph theory has numerous applications in modern sciences and technology. Atanassov introduced the concept of intuitionistic fuzzy sets as a generalization of fuzzy sets. Intuitionistic fuzzy set has shown advantages in handling vagueness and uncertainty compared to fuzzy set. In this paper, we apply the concept of intuitionistic fuzzy sets to multigraphs, planar graphs, and dual graphs. We introduce the notions of intuitionistic fuzzy multigraphs, intuitionistic fuzzy planar graphs, and intu...
Engstrom, Alexander; Noren, Patrik
2010-01-01
In combinatorial commutative algebra and algebraic statistics many toric ideals are constructed from graphs. Keeping the categorical structure of graphs in mind we give previous results a more functorial context and generalize them by introducing the ideals of graph homomorphisms. For this new class of ideals we investigate how the topology of the graphs influence the algebraic properties. We describe explicit Grobner bases for several classes, generalizing results by Hibi, Sturmfels and Sull...
Time Varying Undirected Graphs
Zhou, Shuheng; Lafferty, John; Wasserman, Larry
2008-01-01
Undirected graphs are often used to describe high dimensional distributions. Under sparsity conditions, the graph can be estimated using $\\ell_1$ penalization methods. However, current methods assume that the data are independent and identically distributed. If the distribution, and hence the graph, evolves over time then the data are not longer identically distributed. In this paper, we show how to estimate the sequence of graphs for non-identically distributed data, where the distribution e...
Bollobas, Bela
2010-01-01
We show that if G is a graph of sufficiently large order n containing as many r-cliques as the r-partite Turan graph of order n; then for some C>0 G has more than Cn^(r-1) (r+1)-cliques sharing a common edge unless G is isomorphic to the the r-partite Turan graph of order n. This structural result generalizes a previous result that has been useful in extremal graph theory.
Cho, Minsu; Alahari, Karteek; Ponce, Jean
2013-01-01
International audience Many tasks in computer vision are formulated as graph matching problems. Despite the NP-hard nature of the problem, fast and accurate approximations have led to significant progress in a wide range of applications. Learning graph models from observed data, however, still remains a challenging issue. This paper presents an effective scheme to parameterize a graph model, and learn its structural attributes for visual object matching. For this, we propose a graph repres...
Thulasiraman, K
2011-01-01
This adaptation of an earlier work by the authors is a graduate text and professional reference on the fundamentals of graph theory. It covers the theory of graphs, its applications to computer networks and the theory of graph algorithms. Also includes exercises and an updated bibliography.
Sudakov, Benny; Vu, Van
2007-01-01
In this paper, we initiate a systematic study of graph resilience. The (local) resilience of a graph G with respect to a property P measures how much one has to change G (locally) in order to destroy P. Estimating the resilience leads to many new and challenging problems. Here we focus on random and pseudo-random graphs and prove several sharp results.
Bound entanglement in tree graphs
International Nuclear Information System (INIS)
In this paper, we discuss the entanglement properties of graph-diagonal states, with particular emphasis on calculating the threshold for the transition between the presence and absence of entanglement (i.e. the separability point). Special consideration is made of the thermal states of trees, including the linear cluster state. We characterize the type of entanglement present, and describe the optimal entanglement witnesses and their implementation on a quantum computer, up to an additive approximation. In the case of general graphs, we invoke a relation with the partition function of the classical Ising model, thereby intimating a connection to computational complexity theoretic tasks. Finally, we show that the entanglement is robust to some classes of local perturbations.
The weighted random graph model
Garlaschelli, Diego
2009-07-01
We introduce the weighted random graph (WRG) model, which represents the weighted counterpart of the Erdos-Renyi random graph and provides fundamental insights into more complicated weighted networks. We find analytically that the WRG is characterized by a geometric weight distribution, a binomial degree distribution and a negative binomial strength distribution. We also characterize exactly the percolation phase transitions associated with edge removal and with the appearance of weighted subgraphs of any order and intensity. We find that even this completely null model displays a percolation behaviour similar to what is observed in real weighted networks, implying that edge removal cannot be used to detect community structure empirically. By contrast, the analysis of clustering successfully reveals different patterns between the WRG and real networks.
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
Rezaei, Seyed Saeed Changiz
2013-01-01
The entropy of a graph is a functional depending both on the graph itself and on a probability distribution on its vertex set. This graph functional originated from the problem of source coding in information theory and was introduced by J. K\\"{o}rner in 1973. Although the notion of graph entropy has its roots in information theory, it was proved to be closely related to some classical and frequently studied graph theoretic concepts. For example, it provides an equivalent definition for a gra...
Da Lozzo, Giordano; Rutter, Ignaz
2015-01-01
In this paper we introduce a notion of planarity for graphs that are presented in a streaming fashion. A $\\textit{streamed graph}$ is a stream of edges $e_1,e_2,...,e_m$ on a vertex set $V$. A streamed graph is $\\omega$-$\\textit{stream planar}$ with respect to a positive integer window size $\\omega$ if there exists a sequence of planar topological drawings $\\Gamma_i$ of the graphs $G_i=(V,\\{e_j \\mid i\\leq j < i+\\omega\\})$ such that the common graph $G^{i}_\\cap=G_i\\cap G_{i+1}$ is drawn the sa...
Gross, Jonathan L
2003-01-01
The Handbook of Graph Theory is the most comprehensive single-source guide to graph theory ever published. Best-selling authors Jonathan Gross and Jay Yellen assembled an outstanding team of experts to contribute overviews of more than 50 of the most significant topics in graph theory-including those related to algorithmic and optimization approaches as well as ""pure"" graph theory. They then carefully edited the compilation to produce a unified, authoritative work ideal for ready reference.Designed and edited with non-experts in mind, the Handbook of Graph Theory makes information easy to fi
Lo, Allan
2010-01-01
The main focus of this thesis is to evaluate $k_r(n,\\delta)$, the minimal number of $r$-cliques in graphs with $n$ vertices and minimum degree~$\\delta$. A fundamental result in Graph Theory states that a triangle-free graph of order $n$ has at most $n^2/4$ edges. Hence, a triangle-free graph has minimum degree at most $n/2$, so if $k_3(n,\\delta) =0$ then $\\delta \\le n/2$. For $n/2 \\leq \\delta \\leq 4n/5$, I have evaluated $k_r(n,\\delta)$ and determined the structures of the extremal graphs. F...
Simplicial complexes of graphs
Jonsson, Jakob
2008-01-01
A graph complex is a finite family of graphs closed under deletion of edges. Graph complexes show up naturally in many different areas of mathematics, including commutative algebra, geometry, and knot theory. Identifying each graph with its edge set, one may view a graph complex as a simplicial complex and hence interpret it as a geometric object. This volume examines topological properties of graph complexes, focusing on homotopy type and homology. Many of the proofs are based on Robin Forman's discrete version of Morse theory. As a byproduct, this volume also provides a loosely defined toolbox for attacking problems in topological combinatorics via discrete Morse theory. In terms of simplicity and power, arguably the most efficient tool is Forman's divide and conquer approach via decision trees; it is successfully applied to a large number of graph and digraph complexes.
Symmetrical extensions of graphs
International Nuclear Information System (INIS)
We study symmetrical extensions of graphs, with special emphasis on symmetrical and Aut0(Λd)-symmetrical extensions of d-dimensional grids Λd by finite graphs. These topics are of interest in group theory and graph theory and possibly also in crystallography and some branches of physics. We prove the existence of a connected locally finite graph admitting infinitely many symmetrical extensions by a fixed finite graph. On the other hand, we prove that the number of symmetrical and Aut0(Λd)-symmetrical extensions of the d-dimensional grid Λd by a finite graph is finite in several interesting cases. Moreover, for every positive integer d we construct all Aut0(Λd)-symmetrical extensions of the d-dimensional grid Λd by two-vertex graphs
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
面向个性化推荐的两层混合图模型%Hybrid Graph Model with Two Layers for Personalized Recommendation
Institute of Scientific and Technical Information of China (English)
张少中; 陈德人
2009-01-01
A hybrid graph model for personalized recom-mendation,which is based on small world network and Bayesian network,is presented.The hybrid graph model has two-layers.The bottom level means user's layer and the upper one means merchandise's layer.The user's layer is an undirected arcs graph,which describes the relation of the user's nodes by small world network.The undirected arcs inside the connected nodes of user's layer mean the similarity of the preference of users.These arcs are weighted by relational strength.The weight represents node's similarity or link's strength and intensity.Nodes in the same group are more similar to each other or more strongly connected.Users in a produce to others.It is connected by directed links,which means an implicated definition among merchandises,a user that purchase certain merchandise also tends to purchase another.The properties and content of merchandise can be used to show the similarity of the merchandise.The relations between user's layer and merchandise's layer are connected by directed links.The start nede of the directed links is a user node in user's layer belonging to some node group,which is gained by small world network.The end node of links is the node of some merchandise of the merchandise's layer.The directed links between the user's layer and the merchandise's layer are connected based on trade information of users.The strength of the relation between users and merchandises can be denoted by the probability parameter.The probability parameter shows a possibility of some users selecting for some merchandises. Firstly,algorithms for users clustering and for analysis of new user interest are presented to construct a hybrid graph model.Two important characteristic parameters,which are in small-world network,are introduced.These are characteristic path length and clustering coefficient.New user interest analysis is to judge which clustering group is the best match by calculating the distance of the new user node to
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCEProbability and InferenceSingle-Parameter Models Introduction to Multiparameter Models Asymptotics and Connections to Non-Bayesian ApproachesHierarchical ModelsFUNDAMENTALS OF BAYESIAN DATA ANALYSISModel Checking Evaluating, Comparing, and Expanding ModelsModeling Accounting for Data Collection Decision AnalysisADVANCED COMPUTATION Introduction to Bayesian Computation Basics of Markov Chain Simulation Computationally Efficient Markov Chain Simulation Modal and Distributional ApproximationsREGRESSION MODELS Introduction to Regression Models Hierarchical Linear
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Learning Bayesian Networks from Correlated Data
Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola
2016-05-01
Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
Use of Clustering to Assist Recognition in Computer Vision
Grashei, Ole Kristian Braut
2013-01-01
In computer vision many problems are of non-deterministic polynomial time complexity. One of these problems is graph matching. Suboptimal solutions have been proposed to efficiently do graph matching. This thesis investigates the use of unsupervised learning to cluster structured graph data in polynomial time. Clustering was done on attributed graph nodes and attributed graph node-arc-node triplets, and meaningful results were demonstrated. Self-organizing maps and the minimum message length ...
Bayesian Games with Intentions
Bjorndahl, Adam; Halpern, Joseph Y.; Pass, Rafael
2016-01-01
We show that standard Bayesian games cannot represent the full spectrum of belief-dependent preferences. However, by introducing a fundamental distinction between intended and actual strategies, we remove this limitation. We define Bayesian games with intentions, generalizing both Bayesian games and psychological games, and prove that Nash equilibria in psychological games correspond to a special class of equilibria as defined in our setting.
Feige, Uriel
2012-01-01
The factor graph of an instance of a symmetric constraint satisfaction problem on n Boolean variables and m constraints (CSPs such as k-SAT, k-AND, k-LIN) is a bipartite graph describing which variables appear in which constraints. The factor graph describes the instance up to the polarity of the variables, and hence there are up to 2km instances of the CSP that share the same factor graph. It is well known that factor graphs with certain structural properties make the underlying CSP easier to either solve exactly (e.g., for tree structures) or approximately (e.g., for planar structures). We are interested in the following question: is there a factor graph for which if one can solve every instance of the CSP with this particular factor graph, then one can solve every instance of the CSP regardless of the factor graph (and similarly, for approximation)? We call such a factor graph universal. As one needs different factor graphs for different values of n and m, this gives rise to the notion of a family of unive...
DEFF Research Database (Denmark)
Böcker, S.; Baumbach, Jan
2013-01-01
The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side. The...... algorithms for biological problems. © 2013 Springer-Verlag....... problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications of these...
GRAD: On Graph Database Modeling
Ghrab, Amine; Romero, Oscar; Skhiri, Sabri; Vaisman, Alejandro; Zimányi, Esteban
2016-01-01
Graph databases have emerged as the fundamental technology underpinning trendy application domains where traditional databases are not well-equipped to handle complex graph data. However, current graph databases support basic graph structures and integrity constraints with no standard algebra. In this paper, we introduce GRAD, a native and generic graph database model. GRAD goes beyond traditional graph database models, which support simple graph structures and constraints. Instead, GRAD pres...
Learning Local Components to Understand Large Bayesian Networks
DEFF Research Database (Denmark)
Zeng, Yifeng; Xiang, Yanping; Cordero, Jorge;
2009-01-01
(domain experts) to extract accurate information from a large Bayesian network due to dimensional difficulty. We define a formulation of local components and propose a clustering algorithm to learn such local components given complete data. The algorithm groups together most inter-relevant attributes...... in a domain. We evaluate its performance on three benchmark Bayesian networks and provide results in support. We further show that the learned components may represent local knowledge more precisely in comparison to the full Bayesian networks when working with a small amount of data.......Bayesian networks are known for providing an intuitive and compact representation of probabilistic information and allowing the creation of models over a large and complex domain. Bayesian learning and reasoning are nontrivial for a large Bayesian network. In parallel, it is a tough job for users...
Fujie, Futaba
2014-01-01
Covering Walks in Graphs is aimed at researchers and graduate students in the graph theory community and provides a comprehensive treatment on measures of two well studied graphical properties, namely Hamiltonicity and traversability in graphs. This text looks into the famous Kӧnigsberg Bridge Problem, the Chinese Postman Problem, the Icosian Game and the Traveling Salesman Problem as well as well-known mathematicians who were involved in these problems. The concepts of different spanning walks with examples and present classical results on Hamiltonian numbers and upper Hamiltonian numbers of graphs are described; in some cases, the authors provide proofs of these results to illustrate the beauty and complexity of this area of research. Two new concepts of traceable numbers of graphs and traceable numbers of vertices of a graph which were inspired by and closely related to Hamiltonian numbers are introduced. Results are illustrated on these two concepts and the relationship between traceable concepts and...
Arrighi, Pablo
2012-01-01
We generalize the theory of Cellular Automata to arbitrary, time-varying graphs. In other words we formalize, and prove theorems about, the intuitive idea of a labelled graph which evolves in time - but under the natural constraint that information can only ever be transmitted at a bounded speed, with respect to the distance given by the graph. The notion of translation-invariance is also generalized. The definition we provide for these `causal graph dynamics' is simple and axiomatic. The theorems we provide also show that it is robust. For instance, causal graph dynamics are stable under composition and under restriction to radius one. In the finite case some fundamental facts of Cellular Automata theory carry through: causal graph dynamics admit a characterization as continuous functions and they are stable under inversion. The provided examples suggest a wide range of applications of this mathematical object, from complex systems science to theoretical physics. Keywords: Dynamical networks, Boolean network...
Applications of graph theory in protein structure identification.
Yan, Yan; Zhang, Shenggui; Wu, Fang-Xiang
2011-01-01
There is a growing interest in the identification of proteins on the proteome wide scale. Among different kinds of protein structure identification methods, graph-theoretic methods are very sharp ones. Due to their lower costs, higher effectiveness and many other advantages, they have drawn more and more researchers' attention nowadays. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing and so on. This paper reviews several methods in solving protein structure identification problems using graph theory. We mainly introduce classical methods and mathematical models including homology modeling based on clique finding, identification of side-chain clusters in protein structures upon graph spectrum, and de novo peptide sequencing via tandem mass spectrometry using the spectrum graph model. In addition, concluding remarks and future priorities of each method are given. PMID:22165974
International Nuclear Information System (INIS)
We consider the scattering theory for the Schroedinger operator -Dx2+V(x) on the graphs made of one-dimensional wires connected to external leads. We derive two expressions for the scattering matrix on arbitrary graphs. One involves matrices that couple arcs (oriented bonds), the other involves matrices that couple vertices. We discuss a simple way to tune the coupling between the graph and the leads. The efficiency of the formalism is demonstrated with a few known examples. (author)
Cho, Ilwoo
2006-01-01
In this paper, we defined three kinds of measures depending on the given finite directed graphs. For the given finite directed graph, we can construct the free semigroupoid, the diagram set and the reduced diagram set, as algebraic structures determined by the admissibility on the graph. The power sets of such structures are sigma algebras of them, respectively. On them, we can define a measure. The pain purpose of this paper is to introduce those three measures and observe some properties of...
Chartrand, Gary
1984-01-01
Graph theory is used today in the physical sciences, social sciences, computer science, and other areas. Introductory Graph Theory presents a nontechnical introduction to this exciting field in a clear, lively, and informative style. Author Gary Chartrand covers the important elementary topics of graph theory and its applications. In addition, he presents a large variety of proofs designed to strengthen mathematical techniques and offers challenging opportunities to have fun with mathematics. Ten major topics - profusely illustrated - include: Mathematical Models, Elementary Concepts of Grap
Directory of Open Access Journals (Sweden)
Radu Buzatu
2015-11-01
Full Text Available We study some properties of minimum convex covers and minimum convex partitions of simple graphs. We establish existence of graphs with fixed number of minimum convex covers and minimum convex partitions. It is known that convex $p$-cover problem is NP-complete for $p\\geq3$ [5]. We prove that this problem is NP-complete in the case $p=2$. Also, we study covers and partitions of graphs when respective sets are nontrivial convex.
Product of bipolar fuzzy graphs and their degree
Rashmanlou, Hossein; Samanta, Sovan; Pal, Madhumangal; Borzooei, Rajab Ali
2016-01-01
The concepts of graph theory are applied in many areas of computer science including data mining, image segmentation, clustering, image capturing and networking. It is also known that lots of uncertainties occur in these areas. To handle the uncertainty that occurs in graph theory, fuzzy graph theory is successfully used in many problems. A bipolar fuzzy set is a generalization of the fuzzy set. In this paper, two new operations on bipolar fuzzy graphs, viz. normal product and tensor product, are defined. Also, the degrees of the vertices of the resultant graphs which are obtained from two given bipolar fuzzy graphs ? and ? using the operations Cartesian product, composition, tensor and normal product are determined.
Directory of Open Access Journals (Sweden)
Alberto Apostolico
2009-08-01
Full Text Available The Web Graph is a large-scale graph that does not fit in main memory, so that lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval for the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so that it may be applied to more general graphs. Tests on some datasets of use achieve space savings of about 10% over existing methods.
Graph separators, with applications
Rosenberg, Arnold L
2001-01-01
This text is devoted to techniques for obtaining upper and lower bounds on the sizes of graph separators - upper bounds being obtained via decomposition algorithms. The book surveys the main approaches to obtaining good graph separations, while its main focus is on techniques for deriving lower bounds on the sizes of graph separators. This asymmetry in focus reflects the perception that the work on upper bounds, or algorithms, for graph separation is much better represented in the standard theory literature than is the work on lower bounds, which we perceive as being much more scattered throug
Bradford, Robert; Chmutov, Sergei
2011-01-01
We introduce an additional structure on ribbon graphs, arrow structure. We extend the Bollob\\'as-Riordan polynomial to ribbon graph with this structure. The extended polynomial satisfies the contraction-deletion relations and naturally behaves with respect to the partial duality of ribbon graphs. We construct an arrow ribbon graph from a virtual link whose extended Bollob\\'as-Riordan polynomial specializes to the arrow polynomial of the virtual link recently introduced by H.Dye and L.Kauffman. This result generalizes the classical Thistlethwaite theorem to the arrow polynomial of virtual links.
Gelfand, I M; Shnol, E E
2002-01-01
The second in a series of systematic studies by a celebrated mathematician I. M. Gelfand and colleagues, this volume presents students with a well-illustrated sequence of problems and exercises designed to illuminate the properties of functions and graphs. Since readers do not have the benefit of a blackboard on which a teacher constructs a graph, the authors abandoned the customary use of diagrams in which only the final form of the graph appears; instead, the book's margins feature step-by-step diagrams for the complete construction of each graph. The first part of the book employs simple fu
Energy Technology Data Exchange (ETDEWEB)
Lothian, Josh [ORNL; Powers, Sarah S [ORNL; Sullivan, Blair D [ORNL; Baker, Matthew B [ORNL; Schrock, Jonathan [ORNL; Poole, Stephen W [ORNL
2013-12-01
The benchmarking effort within the Extreme Scale Systems Center at Oak Ridge National Laboratory seeks to provide High Performance Computing benchmarks and test suites of interest to the DoD sponsor. The work described in this report is a part of the effort focusing on graph generation. A previously developed benchmark, SystemBurn, allowed the emulation of dierent application behavior profiles within a single framework. To complement this effort, similar capabilities are desired for graph-centric problems. This report examines existing synthetic graph generator implementations in preparation for further study on the properties of their generated synthetic graphs.
Diaconis, Persi; Holmes, Susan; Janson, Svante
2012-01-01
We work out the graph limit theory for dense interval graphs. The theory developed departs from the usual description of a graph limit as a symmetric function $W(x,y)$ on the unit square, with $x$ and $y$ uniform on the interval $(0,1)$. Instead, we fix a $W$ and change the underlying distribution of the coordinates $x$ and $y$. We find choices such that our limits are continuous. Connections to random interval graphs are given, including some examples. We also show a continuity result for th...
DEFF Research Database (Denmark)
Thomassen, Carsten
2014-01-01
We prove a general result on graph factors modulo k . A special case says that, for each natural number k , every (12k−7)-edge-connected graph with an even number of vertices contains a spanning subgraph in which each vertex has degree congruent to k modulo 2k.......We prove a general result on graph factors modulo k . A special case says that, for each natural number k , every (12k−7)-edge-connected graph with an even number of vertices contains a spanning subgraph in which each vertex has degree congruent to k modulo 2k....
Creating more effective graphs
Robbins, Naomi B
2012-01-01
A succinct and highly readable guide to creating effective graphs The right graph can be a powerful tool for communicating information, improving a presentation, or conveying your point in print. If your professional endeavors call for you to present data graphically, here's a book that can help you do it more effectively. Creating More Effective Graphs gives you the basic knowledge and techniques required to choose and create appropriate graphs for a broad range of applications. Using real-world examples everyone can relate to, the author draws on her years of experience in gr
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Clique percolation in random graphs
Li, Ming; Deng, Youjin; Wang, Bing-Hong
2015-10-01
As a generation of the classical percolation, clique percolation focuses on the connection of cliques in a graph, where the connection of two k cliques means that they share at least l graphs, which gives not only the exact solutions of the critical point, but also the corresponding order parameter. Based on this, we prove theoretically that the fraction ψ of cliques in the giant clique cluster always makes a continuous phase transition as the classical percolation. However, the fraction ϕ of vertices in the giant clique cluster for l >1 makes a step-function-like discontinuous phase transition in the thermodynamic limit and a continuous phase transition for l =1 . More interesting, our analysis shows that at the critical point, the order parameter ϕc for l >1 is neither 0 nor 1, but a constant depending on k and l . All these theoretical findings are in agreement with the simulation results, which give theoretical support and clarification for previous simulation studies of clique percolation.
Kim, Suh-Ryung; Park, Boram; Sano, Yoshio
2011-01-01
The competition graph of a digraph $D$ is a (simple undirected) graph which has the same vertex set as $D$ and has an edge between $x$ and $y$ if and only if there exists a vertex $v$ in $D$ such that $(x,v)$ and $(y,v)$ are arcs of $D$. For any graph $G$, $G$ together with sufficiently many isolated vertices is the competition graph of some acyclic digraph. The competition number $k(G)$ of $G$ is the smallest number of such isolated vertices. In general, it is hard to compute the competition number $k(G)$ for a graph $G$ and it has been one of the important research problems in the study of competition graphs. Opsut~[1982] suggested that the edge clique cover number $\\theta_E(G)$ should be closely related to $k(G)$ by showing $\\theta_E(G)-|V(G)|+2 \\leq k(G) \\leq \\theta_E(G)$. In this note, we study on these inequalities. We first show that for any positive integer $m$ satisfying $2 \\leq m \\leq |V(G)|$, there is a graph $G$ satisfying $k(G)=\\theta_E(G)-|V(G)|+m$ and characterize a graph $G$ satisfying $k(G)=\\...
Belkhechine, Houmem; Elayech, Mohamed Baka
2010-01-01
Given a (directed) graph G=(V,A), a subset X of V is an interval of G provided that for any a, b\\in X and x\\in V-X, (a,x)\\in A if and only if (b,x)\\in A and (x,a)\\in A if and only if (x,b)\\in A. For example, \\emptyset, \\{x\\} (x \\in V) and V are intervals of G, called trivial intervals. A graph, all the intervals of which are trivial, is indecomposable; otherwise, it is decomposable. A vertex x of an indecomposable graph is critical if G-x is decomposable. In 1993, J.H. Schmerl and W.T. Trotter characterized the indecomposable graphs, all the vertices of which are critical, called critical graphs. In this article, we characterize the indecomposable graphs which admit a single non critical vertex, that we call (-1)-critical graphs.} This gives an answer to a question asked by Y. Boudabbous and P. Ille in a recent article studying the critical vertices in an indecomposable graph.
Bollobas, Bela; Nikiforov, Vladimir
2004-01-01
A book of size $q$ is a set of $q$ triangles sharing a common edge. We study the size of the maximal book in a graph as a function of the number of its edges. In particular, we answer two questions of Erdos about graphs that are union of triangles.
Johnson, Millie
1997-01-01
Graphs from media sources and questions developed from them can be used in the middle school mathematics classroom. Graphs depict storage temperature on a milk carton; air pressure measurements on a package of shock absorbers; sleep-wake patterns of an infant; a dog's breathing patterns; and the angle, velocity, and radius of a leaning bicyclist…
Perepelitsa, VA; Sergienko, [No Value; Kochkarov, AM
1999-01-01
Definitions of prefractal and fractal graphs are introduced, and they are used to formulate mathematical models in different fields of knowledge. The topicality of fractal-graph recognition from the point of view, of fundamental improvement in the efficiency of the solution of algorithmic problems i
Moment graphs and representations
DEFF Research Database (Denmark)
Jantzen, Jens Carsten
2012-01-01
Moment graphs and sheaves on moment graphs are basically combinatorial objects that have be used to describe equivariant intersectiion cohomology. In these lectures we are going to show that they can be used to provide a direct link from this cohomology to the representation theory of simple Lie...
DEFF Research Database (Denmark)
Husfeldt, Thore
2015-01-01
This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available...... techniques and is organized by algorithmic paradigm....
Graphs: Associated Markov Chains
Murthy, Garimella Rama
2012-01-01
In this research paper, weighted / unweighted, directed / undirected graphs are associated with interesting Discrete Time Markov Chains (DTMCs) as well as Continuous Time Markov Chains (CTMCs). The equilibrium / transient behaviour of such Markov chains is studied. Also entropy dynamics (Shannon entropy) of certain structured Markov chains is investigated. Finally certain structured graphs and the associated Markov chains are studied.
Shaw, Jean M.
1984-01-01
Reasons for having students make graphs are noted. Then specific graphing topics and materials appropriate for young learners are presented, including life-sized, floor, clothespin, felt-face, block, and magnetic graphs, and polls of pupils. (MNS)
Isomorphism on fuzzy soft graphs
Ćelik, Yıldıray
2016-04-01
In this paper, we define the notion of fuzzy soft graph and then introduce the concepts of homomorphism, isomorphism, weak isomorphism, co-weak isomorphism of fuzzy soft graphs. Also, we study some properties of isomorphism on fuzzy soft graphs.
Directory of Open Access Journals (Sweden)
Muhammad Akram
2015-12-01
Full Text Available Mathematical modelling, analysis and computing of problems with uncertainty is one of the hottest areas in interdisciplinary research involving applied mathematics, computational intelligence and decision sciences. It is worth noting that uncertainty arises from various domains has very different nature and cannot be captured within a single mathematical framework. Molodtsov’s soft sets provide us a new way of coping with uncertainty from the viewpoint of parameterization. In this paper, we introduce the concepts of soft graphs, vertex-induced soft graphs, edge-induced soft graphs and describe some operations on soft graphs by presenting several examples to demonstrate these new concepts. Finally, we investigate some fundamental properties of soft graphs.
Generalized connectivity of graphs
Li, Xueliang
2016-01-01
Noteworthy results, proof techniques, open problems and conjectures in generalized (edge-) connectivity are discussed in this book. Both theoretical and practical analyses for generalized (edge-) connectivity of graphs are provided. Topics covered in this book include: generalized (edge-) connectivity of graph classes, algorithms, computational complexity, sharp bounds, Nordhaus-Gaddum-type results, maximum generalized local connectivity, extremal problems, random graphs, multigraphs, relations with the Steiner tree packing problem and generalizations of connectivity. This book enables graduate students to understand and master a segment of graph theory and combinatorial optimization. Researchers in graph theory, combinatorics, combinatorial optimization, probability, computer science, discrete algorithms, complexity analysis, network design, and the information transferring models will find this book useful in their studies.
Coloring fuzzy circular interval graphs
Eisenbrand, Friedrich; Niemeier, Martin
2012-01-01
Computing the weighted coloring number of graphs is a classical topic in combinatorics and graph theory. Recently these problems have again attracted a lot of attention for the class of quasi-line graphs and more specifically fuzzy circular interval graphs. The problem is NP-complete for quasi-line graphs. For the subclass of fuzzy circular interval graphs however, one can compute the weighted coloring number in polynomial time using recent results of Chudnovsky and Ovetsky and of King and Re...
Algorithms and Complexity Results for Exact Bayesian Structure Learning
Ordyniak, Sebastian
2012-01-01
Bayesian structure learning is the NP-hard problem of discovering a Bayesian network that optimally represents a given set of training data. In this paper we study the computational worst-case complexity of exact Bayesian structure learning under graph theoretic restrictions on the super-structure. The super-structure (a concept introduced by Perrier, Imoto, and Miyano, JMLR 2008) is an undirected graph that contains as subgraphs the skeletons of solution networks. Our results apply to several variants of score-based Bayesian structure learning where the score of a network decomposes into local scores of its nodes. Results: We show that exact Bayesian structure learning can be carried out in non-uniform polynomial time if the super-structure has bounded treewidth and in linear time if in addition the super-structure has bounded maximum degree. We complement this with a number of hardness results. We show that both restrictions (treewidth and degree) are essential and cannot be dropped without loosing uniform ...
Scale-invariant geometric random graphs
Xie, Zheng
2015-01-01
We introduce and analyze a class of growing geometric random graphs that are invariant under rescaling of space and time. Directed connections between nodes are drawn according to an influence zone that depends on node position in space and time, capturing the heterogeneity and increased specialization found in growing networks. Through calculations and numerical simulations we explore the consequences of scale-invariance for geometric graphs generated this way. Our analysis reveals a dichotomy between scale-free and Poisson distributions of in- and out-degree, the existence of a random number of hub nodes, high clustering, and unusual percolation behaviour. Moreover, we show how these properties provide a good fit to those of empirically observed web graphs.
Application of Graph Coloring to Biological Networks
Khor, Susan
2009-01-01
We explore the application of graph coloring to biological networks, specifically protein-protein interaction (PPI) networks. First, we find that given similar conditions (i.e. number of nodes, number of links, degree distribution and clustering), fewer colors are needed to color disassortative (high degree nodes tend to connect to low degree nodes and vice versa) than assortative networks. Fewer colors create fewer independent sets which in turn imply higher concurrency potential for a network. Since PPI networks tend to be disassortative, we suggest that in addition to functional specificity and stability proposed previously by Maslov and Sneppen (Science 296, 2002), the disassortative nature of PPI networks may promote the ability of cells to perform multiple, crucial and functionally diverse tasks concurrently. Second, since graph coloring is closely related to the presence of cliques in a graph, the significance of node coloring information to the problem of identifying protein complexes, i.e. dense subg...
Rubin, Donald B.
1981-01-01
The Bayesian bootstrap is the Bayesian analogue of the bootstrap. Instead of simulating the sampling distribution of a statistic estimating a parameter, the Bayesian bootstrap simulates the posterior distribution of the parameter; operationally and inferentially the methods are quite similar. Because both methods of drawing inferences are based on somewhat peculiar model assumptions and the resulting inferences are generally sensitive to these assumptions, neither method should be applied wit...
Coenocline reconstruction using graph theory and Bayesian probability data generator
Czech Academy of Sciences Publication Activity Database
Čejchan, Petr
Krakow : Royal Society, 2007. s. 1-2. [United Kingdom -Visegrad Frontiers of Science Symposium /4./. 21.02.2007-23.02.2007, Krakow] Institutional research plan: CEZ:AV0Z30130516 Keywords : coenocline * gradient anaysis * palaeoecology Subject RIV: DB - Geology ; Mineralogy
Bayesian statistics an introduction
Lee, Peter M
2012-01-01
Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Bidimensionality and Geometric Graphs
Fomin, Fedor V; Saurabh, Saket
2011-01-01
In this paper we use several of the key ideas from Bidimensionality to give a new generic approach to design EPTASs and subexponential time parameterized algorithms for problems on classes of graphs which are not minor closed, but instead exhibit a geometric structure. In particular we present EPTASs and subexponential time parameterized algorithms for Feedback Vertex Set, Vertex Cover, Connected Vertex Cover, Diamond Hitting Set, on map graphs and unit disk graphs, and for Cycle Packing and Minimum-Vertex Feedback Edge Set on unit disk graphs. Our results are based on the recent decomposition theorems proved by Fomin et al [SODA 2011], and our algorithms work directly on the input graph. Thus it is not necessary to compute the geometric representations of the input graph. To the best of our knowledge, these results are previously unknown, with the exception of the EPTAS and a subexponential time parameterized algorithm on unit disk graphs for Vertex Cover, which were obtained by Marx [ESA 2005] and Alber and...
New concepts of fuzzy planar graphs
Sovan Samanta; Anita Pal; Madhumangal Pal
2014-01-01
Fuzzy planar graph is an important subclass of fuzzy graph. Fuzzy planar graphs and its several properties are presented. A very close association of fuzzy planar graph is fuzzy dual graph. This is also defined and several properties of it are studied. Isomorphism on fuzzy graphs are well defined in literature. Isomorphic relation between fuzzy planar graph and its dual graph are established.
Feature Reduction in Graph Analysis
Directory of Open Access Journals (Sweden)
Punpiti Piamsa-nga
2008-08-01
Full Text Available A common approach to improve medical image classification is to add more features to the classifiers; however, this increases the time required for preprocessing raw data and training the classifiers, and the increase in features is not always beneficial. The number of commonly used features in the literature for training of image feature classifiers is over 50. Existing algorithms for selecting a subset of available features for image analysis fail to adequately eliminate redundant features. This paper presents a new selection algorithm based on graph analysis of interactions among features and between features to classifier decision. A modification of path analysis is done by applying regression analysis, multiple logistic and posterior Bayesian inference in order to eliminate features that provide the same contributions. A database of 113 mammograms from the Mammographic Image Analysis Society was used in the experiments. Tested on two classifiers Ã¢Â€Â“ ANN and logistic regression Ã¢Â€Â“ cancer detection accuracy (true positive and false-positive rates using a 13-feature set selected by our algorithm yielded substantially similar accuracy as using a 26-feature set selected by SFS and results using all 50-features. However, the 13-feature greatly reduced the amount of computation needed.
D'Hoker, Eric; Gurdogan, Omer; Vanhove, Pierre
2015-01-01
We consider properties of modular graph functions, which are non-holomorphic modular functions associated with the Feynman graphs for a conformal scalar field theory on a two-dimensional torus. Such functions arise, for example, in the low energy expansion of genus-one Type II superstring amplitudes. We demonstrate that these functions are sums, with rational coefficients, of special values of single-valued elliptic multiple polylogarithms, which will be introduced in this paper. This insight suggests the many interrelations between these modular graph functions (a few of which were motivated in an earlier paper) may be obtained as a consequence of identities involving elliptic polylogarithms.
Eliahou, Shalom; Harary, Frank; Kauffman, Louis H.
2005-01-01
This paper is an exploration of simple four-regular graphs in the plane (i.e. loopless and with no more than one edge between any two nodes). Such graphs are fundamental to the theory of knots and links in three dimensional space, and their planar diagrams. We dedicate this paper to Frank Harary (1921 -- 2005) whose fascination with graphs of knots inspired this work and with whom we had the pleasure of developing this paper. We prove that for v (the number of nodes) greater than or equal to ...
Bollobas, Bela
2004-01-01
The ever-expanding field of extremal graph theory encompasses a diverse array of problem-solving methods, including applications to economics, computer science, and optimization theory. This volume, based on a series of lectures delivered to graduate students at the University of Cambridge, presents a concise yet comprehensive treatment of extremal graph theory.Unlike most graph theory treatises, this text features complete proofs for almost all of its results. Further insights into theory are provided by the numerous exercises of varying degrees of difficulty that accompany each chapter. A
Hegedüs, Gábor
2014-01-01
Explicit construction of Ramsey graphs has remained a challenging open problem for a long time. Frankl--Wilson \\cite{FW}, Alon \\cite{A} and Grolmusz \\cite{G2} gave the best explicit constructions of graphs on $m$ vertices with no clique or independent set of size $m^{(1+o(1))\\frac{1}{4}\\frac{\\log m}{\\log \\log m}}$. We describe here an explicit construction which produces for every integer $m>1$ a graph on at least $m^{(1+o(1))\\frac{1}{3}\\frac{\\log m}{\\log \\log m}}$ vertices containing neither...
Gross, Jonathan L; Zhang, Ping
2013-01-01
In the ten years since the publication of the best-selling first edition, more than 1,000 graph theory papers have been published each year. Reflecting these advances, Handbook of Graph Theory, Second Edition provides comprehensive coverage of the main topics in pure and applied graph theory. This second edition-over 400 pages longer than its predecessor-incorporates 14 new sections. Each chapter includes lists of essential definitions and facts, accompanied by examples, tables, remarks, and, in some cases, conjectures and open problems. A bibliography at the end of each chapter provides an ex
Martin, Jeremy L.
2003-01-01
A picture P of a graph G = (V,E) consists of a point P(v) for each vertex v in V and a line P(e) for each edge e in E, all lying in the projective plane over a field k and subject to containment conditions corresponding to incidence in G. A graph variety is an algebraic set whose points parametrize pictures of G. We consider three kinds of graph varieties: the picture space X(G) of all pictures, the picture variety V(G), an irreducible component of X(G) of dimension 2|V|, defined as the closu...
Exner, Pavel
2012-01-01
We discuss ways in which momentum operators can be introduced on an oriented metric graph. A necessary condition appears to the balanced property, or a matching between the numbers of incoming and outgoing edges; we show that a graph without an orientation, locally finite and at most countably infinite, can made balanced oriented \\emph{iff} the degree of each vertex is even. On such graphs we construct families of momentum operators; we analyze their spectra and associated unitary groups. We also show that the unique continuation principle does not hold here.
Rodriguez, Marko A.; Neubauer, Peter
2010-01-01
A graph is a structure composed of a set of vertices (i.e.nodes, dots) connected to one another by a set of edges (i.e.links, lines). The concept of a graph has been around since the late 19$^\\text{th}$ century, however, only in recent decades has there been a strong resurgence in both theoretical and applied graph research in mathematics, physics, and computer science. In applied computing, since the late 1960s, the interlinked table structure of the relational database has been the predomin...
Application of graph colouring to biological networks.
Khor, S
2010-05-01
The author explores the application of graph colouring to biological networks, specifically protein-protein interaction (PPI) networks. First, the author finds that given similar conditions (i.e. graph size, degree distribution and clustering), fewer colours are needed to colour disassortative than assortative networks. Fewer colours create fewer independent sets which in turn imply higher concurrency potential for a network. Since PPI networks tend to be disassortative, the author suggests that in addition to functional specificity and stability proposed previously by Maslov and Sneppen (Science, 296, 2002), the disassortative nature of PPI networks may promote the ability of cells to perform multiple, crucial and functionally diverse tasks concurrently. Second, because graph colouring is closely related to the presence of cliques in a graph, the significance of node colouring information to the problem of identifying protein complexes (dense subgraphs in PPI networks), is investigated. The author finds that for PPI networks where 1-11% of nodes participate in at least one identified protein complex, such as H. sapien, DSATUR (a well-known complete graph colouring algorithm) node colouring information can improve the quality (homogeneity and separation) of initial candidate complexes. This finding may help improve existing protein complex detection methods, and/or suggest new methods. [Includes supplementary material]. PMID:20499999
Inductive construction of 2-connected graphs for calculating the virial coefficients
International Nuclear Information System (INIS)
In this paper we give a method for constructing systematically all simple 2-connected graphs with n vertices from the set of simple 2-connected graphs with n - 1 vertices, by means of two operations: subdivision of an edge and addition of a vertex. The motivation of our study comes from the theory of non-ideal gases and, more specifically, from the virial equation of state. It is a known result of statistical mechanics that the coefficients in the virial equation of state are sums over labeled 2-connected graphs. These graphs correspond to clusters of particles. Thus, theoretically, the virial coefficients of any order can be calculated by means of 2-connected graphs used in the virial coefficient of the previous order. Our main result gives a method for constructing inductively all simple 2-connected graphs, by induction on the number of vertices. Moreover, the two operations we are using maintain the correspondence between graphs and clusters of particles.
Application of Bayesian Network Learning Methods to Land Resource Evaluation
Institute of Scientific and Technical Information of China (English)
HUANG Jiejun; HE Xiaorong; WAN Youchuan
2006-01-01
Bayesian network has a powerful ability for reasoning and semantic representation, which combined with qualitative analysis and quantitative analysis, with prior knowledge and observed data, and provides an effective way to deal with prediction, classification and clustering. Firstly, this paper presented an overview of Bayesian network and its characteristics, and discussed how to learn a Bayesian network structure from given data, and then constructed a Bayesian network model for land resource evaluation with expert knowledge and the dataset. The experimental results based on the test dataset are that evaluation accuracy is 87.5%, and Kappa index is 0.826. All these prove the method is feasible and efficient, and indicate that Bayesian network is a promising approach for land resource evaluation.
Line-graphs of cubic graphs are normal
Patakfalvi, Zsolt
2006-01-01
A graph is called normal if its vertex set can be covered by cliques and also by stable sets, such that every such clique and stable set have non-empty intersection. This notion is due to Korner, who introduced the class of normal graphs as an extension of the class of perfect graphs. Normality has also relevance in information theory. Here we prove, that the line graphs of cubic graphs are normal.
An approximate algorithm for median graph computation using graph embedding
Ferrer Sumsi, Miquel; Valveny, Ernest; Serratosa Casanelles, Francesc; Riesen, Kaspar; Bunke, Horst
2008-01-01
Graphs are powerful data structures that have many attractive properties for object representation. However, some basic operations are difficult to define and implement, for instance, how to obtain a representative of a set of graphs. The median graph has been defined for that purpose, but existing algorithms are computationally complex and have a very limited applicability. In this paper we propose a new approach for the computation of the median graph based on graph embedding in vector spac...
House of Graphs: a database of interesting graphs
Brinkmann, Gunnar; Coolsaet, Kris; Goedgebeur, Jan; Melot, Hadrien
2012-01-01
In this note we present House of Graphs (http://hog.grinvin.org) which is a new database of graphs. The key principle is to have a searchable database and offer -- next to complete lists of some graph classes -- also a list of special graphs that already turned out to be interesting and relevant in the study of graph theoretic problems or as counterexamples to conjectures. This list can be extended by users of the database.
Directory of Open Access Journals (Sweden)
Haynes Teresa W.
2014-08-01
Full Text Available A path π = (v1, v2, . . . , vk+1 in a graph G = (V,E is a downhill path if for every i, 1 ≤ i ≤ k, deg(vi ≥ deg(vi+1, where deg(vi denotes the degree of vertex vi ∈ V. The downhill domination number equals the minimum cardinality of a set S ⊆ V having the property that every vertex v ∈ V lies on a downhill path originating from some vertex in S. We investigate downhill domination numbers of graphs and give upper bounds. In particular, we show that the downhill domination number of a graph is at most half its order, and that the downhill domination number of a tree is at most one third its order. We characterize the graphs obtaining each of these bounds
Extremal Infinite Graph Theory
Stein, Maya
2011-01-01
We survey various aspects of infinite extremal graph theory and prove several new results. The lead role play the parameters connectivity and degree. This includes the end degree. Many open problems are suggested.
Ballif, Serge C
2011-01-01
We generalize the notion of orthogonal latin squares to colorings of simple graphs. We define two $n$-colorings of a graph to be \\emph{orthogonal} if no ordered pair of colors occurs more than once when the two colorings of each vertex are listed as an ordered pair. We show that the usual bounds on the maximum size of a certain set of orthogonal latin structures such as latin squares, row latin squares, equi-$n$ squares, single diagonal latin squares, double diagonal latin squares, and sudoku squares are a special cases of bounds on orthogonal colorings of graphs. We also show that the problem of finding a transversal in a latin square of order $n$ is equivalent to finding an $n$-clique in a particular graph.
Alspach, BR
1985-01-01
This volume deals with a variety of problems involving cycles in graphs and circuits in digraphs. Leading researchers in this area present here 3 survey papers and 42 papers containing new results. There is also a collection of unsolved problems.
Frühwirth-Schnatter, Sylvia
1990-01-01
In the paper at hand we apply it to Bayesian statistics to obtain "Fuzzy Bayesian Inference". In the subsequent sections we will discuss a fuzzy valued likelihood function, Bayes' theorem for both fuzzy data and fuzzy priors, a fuzzy Bayes' estimator, fuzzy predictive densities and distributions, and fuzzy H.P.D .-Regions. (author's abstract)
Yuan, Ying; MacKinnon, David P.
2009-01-01
In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
Categorical constructions in graph theory
Dana May Latch; Richard T. Bumby
1986-01-01
This paper presents some graph-theoretic questions from the viewpoint of the portion of category theory which has become common knowledge. In particular, the reader is encouraged to consider whether there is only one natural category of graphs and how theories of directed graphs and undirected graphs are related.
Categorical constructions in graph theory
Directory of Open Access Journals (Sweden)
Dana May Latch
1986-03-01
Full Text Available This paper presents some graph-theoretic questions from the viewpoint of the portion of category theory which has become common knowledge. In particular, the reader is encouraged to consider whether there is only one natural category of graphs and how theories of directed graphs and undirected graphs are related.
Local Interaction on Random Graphs
Directory of Open Access Journals (Sweden)
Hans Haller
2010-08-01
Full Text Available We analyze dynamic local interaction in population games where the local interaction structure (modeled as a graph can change over time: A stochastic process generates a random sequence of graphs. This contrasts with models where the initial interaction structure (represented by a deterministic graph or the realization of a random graph cannot change over time.
Spectral characterizations of propeller graphs
Liu, Xiaogang; Zhou, Sanming
2012-01-01
A propeller graph is obtained from an $\\infty$-graph by attaching a path to the vertex of degree four, where an $\\infty$-graph consists of two cycles with precisely one common vertex. In this paper, we prove that all propeller graphs are determined by their Laplacian spectra as well as their signless Laplacian spectra.
A Semantic Graph Query Language
Energy Technology Data Exchange (ETDEWEB)
Kaplan, I L
2006-10-16
Semantic graphs can be used to organize large amounts of information from a number of sources into one unified structure. A semantic query language provides a foundation for extracting information from the semantic graph. The graph query language described here provides a simple, powerful method for querying semantic graphs.
SOMBI: Bayesian identification of parameter relations in unstructured cosmological data
Frank, Philipp; Enßlin, Torsten A
2016-01-01
This work describes the implementation and application of a correlation determination method based on Self Organizing Maps and Bayesian Inference (SOMBI). SOMBI aims to automatically identify relations between different observed parameters in unstructured cosmological or astrophysical surveys by automatically identifying data clusters in high-dimensional datasets via the Self Organizing Map neural network algorithm. Parameter relations are then revealed by means of a Bayesian inference within respective identified data clusters. Specifically such relations are assumed to be parametrized as a polynomial of unknown order. The Bayesian approach results in a posterior probability distribution function for respective polynomial coefficients. To decide which polynomial order suffices to describe correlation structures in data, we include a method for model selection, the Bayesian Information Criterion, to the analysis. The performance of the SOMBI algorithm is tested with mock data. As illustration we also provide ...
Estrada Roger, Ernesto; Gago Álvarez, Silvia
2009-01-01
The concept of golden spectral graphs is introduced and some of their general properties reported. Golden spectral graphs are those having a golden proportion for the spectral ratios defined on the basis of the spectral gap, spectral spread and the difference between the second largest and the smallest eigenvalue of the adjacency matrix. They are good expanders and display excellent synchronizability. Here we report some new construction methods as well as several of their topological pa...
Quinn, Christopher J.; Kiyavash, Negar; Coleman, Todd P.
2012-01-01
We propose a graphical model for representing networks of stochastic processes, the minimal generative model graph. It is based on reduced factorizations of the joint distribution over time. We show that under appropriate conditions, it is unique and consistent with another type of graphical model, the directed information graph, which is based on a generalization of Granger causality. We demonstrate how directed information quantifies Granger causality in a particular sequential prediction s...
Interaction Graphs: Exponentials
Seiller, Thomas
2013-01-01
This paper is the fourth of a series exposing a systematic combinatorial approach to Girard's Geometry of Interaction program. This program aims at obtaining particular realizability models for linear logic that accounts for the dynamics of cut-elimination. This fourth paper tackles the complex issue of defining exponential connectives in this framework. In order to succeed in this, we use the notion of graphings, a generalization of graphs which was defined in earlier work. We explain how we...
Cascades on clique-based graphs
Hackett, Adam
2013-01-01
We present an analytical approach to determining the expected cascade size in a broad range of dynamical models on the class of highly-clustered random graphs introduced in [J. P. Gleeson, Phys. Rev. E 80, 036107 (2009)]. A condition for the existence of global cascades is also derived. Applications of this approach include analyses of percolation, and Watts's model. We show how our techniques can be used to study the effects of in-group bias in cascades on social networks.
An algebraic analysis of the graph modularity
Fasino, Dario; Tudisco, Francesco
2013-01-01
One of the most relevant tasks in network analysis is the detection of community structures, or clustering. Most popular techniques for community detection are based on the maximization of a quality function called modularity, which in turn is based upon particular quadratic forms associated to a real symmetric modularity matrix $M$, defined in terms of the adjacency matrix and a rank one null model matrix. That matrix could be posed inside the set of relevant matrices involved in graph theor...
Bollobás, Béla
1998-01-01
The time has now come when graph theory should be part of the education of every serious student of mathematics and computer science, both for its own sake and to enhance the appreciation of mathematics as a whole. This book is an in-depth account of graph theory, written with such a student in mind; it reflects the current state of the subject and emphasizes connections with other branches of pure mathematics. The volume grew out of the author's earlier book, Graph Theory -- An Introductory Course, but its length is well over twice that of its predecessor, allowing it to reveal many exciting new developments in the subject. Recognizing that graph theory is one of several courses competing for the attention of a student, the book contains extensive descriptive passages designed to convey the flavor of the subject and to arouse interest. In addition to a modern treatment of the classical areas of graph theory such as coloring, matching, extremal theory, and algebraic graph theory, the book presents a detailed ...
Arrighi, Pablo
2016-01-01
Consider a graph having quantum systems lying at each node. Suppose that the whole thing evolves in discrete time steps, according to a global, unitary causal operator. By causal we mean that information can only propagate at a bounded speed, with respect to the distance given by the graph. Suppose, moreover, that the graph itself is subject to the evolution, and may be driven to be in a quantum superposition of graphs---in accordance to the superposition principle. We show that these unitary causal operators must decompose as a finite-depth circuit of local unitary gates. This unifies a result on Quantum Cellular Automata with another on Reversible Causal Graph Dynamics. Along the way we formalize a notion of causality which is valid in the context of quantum superpositions of time-varying graphs, and has a number of good properties. Keywords: Quantum Lattice Gas Automata, Block-representation, Curtis-Hedlund-Lyndon, No-signalling, Localizability, Quantum Gravity, Quantum Graphity, Causal Dynamical Triangula...
Parsing optical scanned 3D data by Bayesian inference
Xiong, Hanwei; Xu, Jun; Xu, Chenxi; Pan, Ming
2015-10-01
Optical devices are always used to digitize complex objects to get their shapes in form of point clouds. The results have no semantic meaning about the objects, and tedious process is indispensable to segment the scanned data to get meanings. The reason for a person to perceive an object correctly is the usage of knowledge, so Bayesian inference is used to the goal. A probabilistic And-Or-Graph is used as a unified framework of representation, learning, and recognition for a large number of object categories, and a probabilistic model defined on this And-Or-Graph is learned from a relatively small training set per category. Given a set of 3D scanned data, the Bayesian inference constructs a most probable interpretation of the object, and a semantic segment is obtained from the part decomposition. Some examples are given to explain the method.
Strong Domination in Fuzzy Graphs
O.T. Manjusha; M.S. Sunitha
2015-01-01
In this paper, the concept of strong domination number is introduced by using membership values of strong arcs in fuzzy graphs. The strong domination number γs of complete fuzzy graph and complete bipartite fuzzy graph is determined and bounds is obtained for the strong domination number of fuzzy graphs. Also the relationship between the strong domination number of a fuzzy graph and that of its complement are discussed.
Strong Domination in Fuzzy Graphs
Directory of Open Access Journals (Sweden)
O.T. Manjusha
2015-09-01
Full Text Available In this paper, the concept of strong domination number is introduced by using membership values of strong arcs in fuzzy graphs. The strong domination number γs of complete fuzzy graph and complete bipartite fuzzy graph is determined and bounds is obtained for the strong domination number of fuzzy graphs. Also the relationship between the strong domination number of a fuzzy graph and that of its complement are discussed.
Weighted graph algorithms with Python
Kapanowski, A.; Gałuszka, Ł.
2015-01-01
Python implementation of selected weighted graph algorithms is presented. The minimal graph interface is defined together with several classes implementing this interface. Graph nodes can be any hashable Python objects. Directed edges are instances of the Edge class. Graphs are instances of the Graph class. It is based on the adjacency-list representation, but with fast lookup of nodes and neighbors (dict-of-dict structure). Other implementations of this class are also possible. In this work,...
Greedy Friensdhip Decompositions of Graphs
Teresa Sousa
2011-01-01
A graph that consists of t cliques sharing a vertex v is said to be a t-friendship graph with center v. A friendship graph is a graph that is t-friendship for some . We solve the problem of finding the best upper bound for the size of a greedy 2-friendship decomposition and a greedy friendship decomposition of graphs of order n.
Applications of graph theory in protein structure identification
Zhang Shenggui; Yan Yan; Wu Fang-Xiang
2011-01-01
Abstract There is a growing interest in the identification of proteins on the proteome wide scale. Among different kinds of protein structure identification methods, graph-theoretic methods are very sharp ones. Due to their lower costs, higher effectiveness and many other advantages, they have drawn more and more researchers’ attention nowadays. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing and so ...
Extending partial representations of function graphs and permutation graphs
Klavík, Pavel; Krawczyk, Tomasz; Walczak, Bartosz
2012-01-01
Function graphs are graphs representable by intersections of continuous real-valued functions on the interval [0,1] and are known to be exactly the complements of comparability graphs. As such they are recognizable in polynomial time. Function graphs generalize permutation graphs, which arise when all functions considered are linear. We focus on the problem of extending partial representations, which generalizes the recognition problem. We observe that for permutation graphs an easy extension of Golumbic's comparability graph recognition algorithm can be exploited. This approach fails for function graphs. Nevertheless, we present a polynomial-time algorithm for extending a partial representation of a graph by functions defined on the entire interval [0,1] provided for some of the vertices. On the other hand, we show that if a partial representation consists of functions defined on subintervals of [0,1], then the problem of extending this representation to functions on the entire interval [0,1] becomes NP-comp...
Bayesian biclustering of gene expression data
Liu Jun S; Gu Jiajun
2008-01-01
Abstract Background Biclustering of gene expression data searches for local patterns of gene expression. A bicluster (or a two-way cluster) is defined as a set of genes whose expression profiles are mutually similar within a subset of experimental conditions/samples. Although several biclustering algorithms have been studied, few are based on rigorous statistical models. Results We developed a Bayesian biclustering model (BBC), and implemented a Gibbs sampling procedure for its statistical in...
Eksentrik Digraph Dari Graph Bintang, Graph Bintang Rangkap Dua, Dan Graph Bipartit Lengkap.
Pasaribu, Sri Agustina
2011-01-01
Graph theory represents the science branch getting a lot of attention because representing especial items in researching where the models from graph theory are a lot of used in so many applications like internal issue transportation, communications network, electrics network, computer science, mapping problem of road in a town, etc. Eccentric Digraph of graph G , ED(G) defined as a graph having same vertex with the vertex gathering of graph G or V(ED(G)) = V (G) , where there a...
Bayesian Methods for Radiation Detection and Dosimetry
International Nuclear Information System (INIS)
We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed compartmental activities. From the estimated probability densities of the model parameters we were able to derive the densities for compartmental activities for a two compartment catenary model at different times. We also calculated the average activities and their standard deviation for a simple two compartment model
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Directory of Open Access Journals (Sweden)
Vassilis Giakoumakis
1997-12-01
Full Text Available We study the P 4-tidy graphs, a new class defined by Rusu [30] in order to illustrate the notion of P 4-domination in perfect graphs. This class strictly contains the P 4-extendible graphs and the P 4-lite graphs defined by Jamison & Olariu in [19] and [23] and we show that the P 4-tidy graphs and P 4-lite graphs are closely related. Note that the class of P 4-lite graphs is a class of brittle graphs strictly containing the P 4-sparse graphs defined by Hoang in [14]. McConnel & Spinrad [2] and independently Cournier & Habib [5] have shown that the modular decomposition tree of any graph is computable in linear time. For recognizing in linear time P 4-tidy graphs, we apply a method introduced by Giakoumakis in [9] and Giakoumakis & Fouquet in [6] using modular decomposition of graphs and we propose linear algorithms for optimization problems on such graphs, as clique number, stability number, chromatic number and scattering number. We show that the Hamiltonian Path Problem is linear for this class of graphs. Our study unifies and generalizes previous results of Jamison & Olariu ([18], [21], [22], Hochstattler & Schindler[16], Jung [25] and Hochstattler & Tinhofer [15].
Quantitative graph theory mathematical foundations and applications
Dehmer, Matthias
2014-01-01
The first book devoted exclusively to quantitative graph theory, Quantitative Graph Theory: Mathematical Foundations and Applications presents and demonstrates existing and novel methods for analyzing graphs quantitatively. Incorporating interdisciplinary knowledge from graph theory, information theory, measurement theory, and statistical techniques, this book covers a wide range of quantitative-graph theoretical concepts and methods, including those pertaining to real and random graphs such as:Comparative approaches (graph similarity or distance)Graph measures to characterize graphs quantitat
Hyperbolicity in Median Graphs
Indian Academy of Sciences (India)
José M Sigarreta
2013-11-01
If is a geodesic metric space and $x_1,x_2,x_3\\in X$, a geodesic triangle $T=\\{x_1,x_2,x_3\\}$ is the union of the three geodesics $[x_1 x_2],[x_2 x_3]$ and $[x_3 x_1]$ in . The space is -hyperbolic (in the Gromov sense) if any side of is contained in a -neighborhood of the union of the two other sides, for every geodesic triangle in . If is hyperbolic, we denote by () the sharp hyperbolicity constant of , i.e.,$(X)=\\inf\\{≥ 0: X \\quad\\text{is}\\quad -\\text{hyperbolic}\\}$. In this paper we study the hyperbolicity of median graphs and we also obtain some results about general hyperbolic graphs. In particular, we prove that a median graph is hyperbolic if and only if its bigons are thin.
Iacovacci, Jacopo
2015-01-01
Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of visibility graph motifs, smaller substructures that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated to general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable to distinguish among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification a...
White, AT
1985-01-01
The field of topological graph theory has expanded greatly in the ten years since the first edition of this book appeared. The original nine chapters of this classic work have therefore been revised and updated. Six new chapters have been added, dealing with: voltage graphs, non-orientable imbeddings, block designs associated with graph imbeddings, hypergraph imbeddings, map automorphism groups and change ringing.Thirty-two new problems have been added to this new edition, so that there are now 181 in all; 22 of these have been designated as ``difficult'''' and 9 as ``unsolved''''. Three of the four unsolved problems from the first edition have been solved in the ten years between editions; they are now marked as ``difficult''''.
Roux, Stéphane Le
2007-01-01
The quest for optimal/stable paths in graphs has gained attention in a few practical or theoretical areas. To take part in this quest this chapter adopts an equilibrium-oriented approach that is abstract and general: it works with (quasi-arbitrary) arc-labelled digraphs, and it assumes very little about the structure of the sought paths and the definition of equilibrium, \\textit{i.e.} optimality/stability. In this setting, this chapter presents a sufficient condition for equilibrium existence for every graph; it also presents a necessary condition for equilibrium existence for every graph. The necessary condition does not imply the sufficient condition a priori. However, the chapter pinpoints their logical difference and thus identifies what work remains to be done. Moreover, the necessary and the sufficient conditions coincide when the definition of optimality relates to a total order, which provides a full-equivalence property. These results are applied to network routing.
Energy Technology Data Exchange (ETDEWEB)
Petkova, V.B. [Technische Univ. Clausthal, Clausthal-Zellerfeld (Germany). Arnold-Sommerfeld Inst. fuer Mathematische Physik (ASI)]|[Institute for Nuclear Research and Nuclear Energy, Tzarigradsko Chaussee 72, 1784 Sofia (Bulgaria); Zuber, J.B. [CEA, Service de Physique Theorique de Saclay, F-91191 Gif sur Yvette Cedex (France)
1996-03-18
In this paper, we pursue the discussion of the connections between rational conformal field theories (CFT) and graphs. We generalise our recent work on the relations of operator product algebra (OPA) structure constants of sl(2) theories with the Pasquier algebra attached to the graph. We show that in a variety of CFT`s built on sl(n) (typically conformal embeddings and orbifolds), similar considerations enable one to write a linear system satisfied by the matrix elements of the Pasquier algebra in terms of conformal data (quantum dimensions and fusion coefficients). In some cases this provides sufficient information for the determination of all the eigenvectors of an adjacency matrix, and hence of a graph. (orig.).
Erickson, Lindsay
2010-01-01
The game of Nim as played on graphs was introduced in Nim on Graphs I and extended in Nim on Graphs II by Masahiko Fukuyama. His papers detail the calculation of Grundy numbers for graphs under specific circumstances. We extend these results and introduce the strategy for even cycles. This paper examines a more general class of graphs by restricting the edge weight to one. We provide structural conditions for which there exist a winning strategy. This yields the solution for the complete graph.
Feynman motives of banana graphs
Aluffi, Paolo
2008-01-01
We consider the infinite family of Feynman graphs known as the ``banana graphs'' and compute explicitly the classes of the corresponding graph hypersurfaces in the Grothendieck ring of varieties as well as their Chern--Schwartz--MacPherson classes, using the classical Cremona transformation and the dual graph, and a blowup formula for characteristic classes. We outline the interesting similarities between these operations and we give formulae for cones obtained by simple operations on graphs. We formulate a positivity conjecture for characteristic classes of graph hypersurfaces and discuss briefly the effect of passing to noncommutative spacetime.
Graph theory and interconnection networks
Hsu, Lih-Hsing
2008-01-01
The advancement of large scale integrated circuit technology has enabled the construction of complex interconnection networks. Graph theory provides a fundamental tool for designing and analyzing such networks. Graph Theory and Interconnection Networks provides a thorough understanding of these interrelated topics. After a brief introduction to graph terminology, the book presents well-known interconnection networks as examples of graphs, followed by in-depth coverage of Hamiltonian graphs. Different types of problems illustrate the wide range of available methods for solving such problems. The text also explores recent progress on the diagnosability of graphs under various models.
Market Segmentation Using Bayesian Model Based Clustering
Van Hattum, P.
2009-01-01
This dissertation deals with two basic problems in marketing, that are market segmentation, which is the grouping of persons who share common aspects, and market targeting, which is focusing your marketing efforts on one or more attractive market segments. For the grouping of persons who share commo
Clustering in Complex Directed Networks
Fagiolo, Giorgio
2006-01-01
Many empirical networks display an inherent tendency to cluster, i.e. to form circles of connected nodes. This feature is typically measured by the clustering coefficient (CC). The CC, originally introduced for binary, undirected graphs, has been recently generalized to weighted, undirected networks. Here we extend the CC to the case of (binary and weighted) directed networks and we compute its expected value for random graphs. We distinguish between CCs that count all directed triangles in t...
Kisku, Dakshina Ranjan; Sing, Jamuna Kanta
2010-01-01
This paper presents a feature level fusion approach which uses the improved K-medoids clustering algorithm and isomorphic graph for face and palmprint biometrics. Partitioning around medoids (PAM) algorithm is used to partition the set of n invariant feature points of the face and palmprint images into k clusters. By partitioning the face and palmprint images with scale invariant features SIFT points, a number of clusters is formed on both the images. Then on each cluster, an isomorphic graph is drawn. In the next step, the most probable pair of graphs is searched using iterative relaxation algorithm from all possible isomorphic graphs for a pair of corresponding face and palmprint images. Finally, graphs are fused by pairing the isomorphic graphs into augmented groups in terms of addition of invariant SIFT points and in terms of combining pair of keypoint descriptors by concatenation rule. Experimental results obtained from the extensive evaluation show that the proposed feature level fusion with the improve...
Granade, Christopher; Cory, D G
2015-01-01
In recent years, Bayesian methods have been proposed as a solution to a wide range of issues in quantum state and process tomography. State-of- the-art Bayesian tomography solutions suffer from three problems: numerical intractability, a lack of informative prior distributions, and an inability to track time-dependent processes. Here, we solve all three problems. First, we use modern statistical methods, as pioneered by Husz\\'ar and Houlsby and by Ferrie, to make Bayesian tomography numerically tractable. Our approach allows for practical computation of Bayesian point and region estimators for quantum states and channels. Second, we propose the first informative priors on quantum states and channels. Finally, we develop a method that allows online tracking of time-dependent states and estimates the drift and diffusion processes affecting a state. We provide source code and animated visual examples for our methods.
Graph-based knowledge representation computational foundations of conceptual graphs
Chein, Michel; Chein, Michel
2008-01-01
In addressing the question of how far it is possible to go in knowledge representation and reasoning through graphs, the authors cover basic conceptual graphs, computational aspects, and kernel extensions. The basic mathematical notions are summarized.
SOME RESULTS ON CIRCULAR PERFECT GRAPHS AND PERFECT GRAPHS
Institute of Scientific and Technical Information of China (English)
XU Baogang
2005-01-01
An r-circular coloring of a graph G is a map f from V(G) to the set of open unit intervals of an Euclidean circle of length r,such that f(u) ∩ f(v) = φ whenever uv ∈ E(G).Circular perfect graphs are defined analogously to perfect graphs by means of two parameters,the circular chromatic number and the circular clique number.In this paper,we study the properties of circular perfect graphs.We give (1) a necessary condition for a graph to be circular perfect,(2) some circular critical imperfect graphs,and (3) a characterization of graphs with the property that each of their induced subgraphs has circular clique number the same as its clique number,and then the two conjectures that are equivalent to the perfect graph conjecture.
Biniaz, Ahmad; Maheshwari, Anil; Smid, Michiel
2014-01-01
Given a set $P$ of $n$ points in the plane, the order-$k$ Gabriel graph on $P$, denoted by $k$-$GG$, has an edge between two points $p$ and $q$ if and only if the closed disk with diameter $pq$ contains at most $k$ points of $P$, excluding $p$ and $q$. We study matching problems in $k$-$GG$ graphs. We show that a Euclidean bottleneck perfect matching of $P$ is contained in $10$-$GG$, but $8$-$GG$ may not have any Euclidean bottleneck perfect matching. In addition we show that $0$-$GG$ has a m...
Stevanovic, Dragan
2014-01-01
Spectral Radius of Graphs provides a thorough overview of important results on the spectral radius of adjacency matrix of graphs that have appeared in the literature in the preceding ten years, most of them with proofs, and including some previously unpublished results of the author. The primer begins with a brief classical review, in order to provide the reader with a foundation for the subsequent chapters. Topics covered include spectral decomposition, the Perron-Frobenius theorem, the Rayleigh quotient, the Weyl inequalities, and the Interlacing theorem. From this introduction, the
Bayesian exploratory factor analysis
Gabriella Conti; Sylvia Frühwirth-Schnatter; James Heckman; Rémi Piatek
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identifi cation criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study c...
Bayesian Exploratory Factor Analysis
Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James J.; Piatek, Rémi
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study co...
Bayesian Exploratory Factor Analysis
Gabriella Conti; Sylvia Fruehwirth-Schnatter; Heckman, James J.; Remi Piatek
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on \\emph{ad hoc} classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo s...
Bayesian exploratory factor analysis
Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James J.; Piatek, Rémi
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo st...
Bayesian exploratory factor analysis
Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James; Piatek, Rémi
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study co...
Carbonetto, Peter; Kisynski, Jacek; De Freitas, Nando; Poole, David L
2012-01-01
The Bayesian Logic (BLOG) language was recently developed for defining first-order probability models over worlds with unknown numbers of objects. It handles important problems in AI, including data association and population estimation. This paper extends BLOG by adopting generative processes over function spaces - known as nonparametrics in the Bayesian literature. We introduce syntax for reasoning about arbitrary collections of objects, and their properties, in an intuitive manner. By expl...
Bayesian default probability models
Andrlíková, Petra
2014-01-01
This paper proposes a methodology for default probability estimation for low default portfolios, where the statistical inference may become troublesome. The author suggests using logistic regression models with the Bayesian estimation of parameters. The piecewise logistic regression model and Box-Cox transformation of credit risk score is used to derive the estimates of probability of default, which extends the work by Neagu et al. (2009). The paper shows that the Bayesian models are more acc...
Topics in graph theory graphs and their Cartesian product
Imrich, Wilfried; Rall, Douglas F
2008-01-01
From specialists in the field, you will learn about interesting connections and recent developments in the field of graph theory by looking in particular at Cartesian products-arguably the most important of the four standard graph products. Many new results in this area appear for the first time in print in this book. Written in an accessible way, this book can be used for personal study in advanced applications of graph theory or for an advanced graph theory course.
Handbook of graph grammars and computing by graph transformation
Engels, G; Kreowski, H J; Rozenberg, G
1999-01-01
Graph grammars originated in the late 60s, motivated by considerations about pattern recognition and compiler construction. Since then, the list of areas which have interacted with the development of graph grammars has grown quite impressively. Besides the aforementioned areas, it includes software specification and development, VLSI layout schemes, database design, modeling of concurrent systems, massively parallel computer architectures, logic programming, computer animation, developmental biology, music composition, visual languages, and many others.The area of graph grammars and graph tran
Analyzing and adapting graph algorithms for large persistent graphs
Larsson, Patrik
2008-01-01
In this work, the graph database Neo4j developed by Neo Technology is presented together with some of it's functionality when it comes to accessing data as a graph. This type of data access brings the possibility to implement common graph algorithms on top of Neo4j. Examples of such algorithms are presented together with their theoretical backgrounds. These are mainly algorithms for finding shortest paths and algorithms for different graph measures such as centrality measures. The implementat...
Handbook of graph grammars and computing by graph transformations
Ehrig, H J Kreowski
1999-01-01
Graph grammars originated in the late 60s, motivated by considerations about pattern recognition and compiler construction. Since then, the list of areas which have interacted with the development of graph grammars has grown quite impressively. Besides the aforementioned areas, it includes software specification and development, VLSI layout schemes, database design, modeling of concurrent systems, massively parallel computer architectures, logic programming, computer animation, developmental biology, music composition, visual languages, and many others.The area of graph grammars and graph tran
Multiplicative Zagreb indices and coindices of some derived graphs
Bommanahal Basavanagoud; Shreekant Patil
2016-01-01
In this note, we obtain the expressions for multiplicative Zagreb indices and coindices of derived graphs such as a line graph, subdivision graph, vertex-semitotal graph, edge-semitotal graph, total graph and paraline graph.
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
Supervised learning on graphs of spatio-temporal similarity in satellite image sequences
Héas, Patrick
2007-01-01
High resolution satellite image sequences are multidimensional signals composed of spatio-temporal patterns associated to numerous and various phenomena. Bayesian methods have been previously proposed in (Heas and Datcu, 2005) to code the information contained in satellite image sequences in a graph representation using Bayesian methods. Based on such a representation, this paper further presents a supervised learning methodology of semantics associated to spatio-temporal patterns occurring in satellite image sequences. It enables the recognition and the probabilistic retrieval of similar events. Indeed, graphs are attached to statistical models for spatio-temporal processes, which at their turn describe physical changes in the observed scene. Therefore, we adjust a parametric model evaluating similarity types between graph patterns in order to represent user-specific semantics attached to spatio-temporal phenomena. The learning step is performed by the incremental definition of similarity types via user-prov...
Zeta functions of quantum graphs
Harrison, J M
2009-01-01
Spectral problems on quantum graphs are a topic of current interest. Progress has been made with questions of spectral statistics, the spectral determinant, inverse problems, the relationship with operators on thin manifolds, Anderson localization, manipulation of the graph spectrum and many other important areas. In this article, however, we turn to a relatively untouched area of the spectral theory of quantum graphs and construct and analyze the spectral zeta function. Understanding the spectral zeta function has been a notable omission from the analysis of quantum graphs which is particularly striking as the Ihara-Selberg zeta function is known to play a fundamental role in the understanding of the spectral theory of combinatorial graphs and it is known that quantum and combinatorial graph spectra are related. In this article we construct zeta functions of quantum graphs using a contour integral technique based on the argument principle. We start by considering the special case of the star graph with Neuma...
Bayesian network learning for natural hazard assessments
Vogel, Kristin
2016-04-01
Even though quite different in occurrence and consequences, from a modelling perspective many natural hazards share similar properties and challenges. Their complex nature as well as lacking knowledge about their driving forces and potential effects make their analysis demanding. On top of the uncertainty about the modelling framework, inaccurate or incomplete event observations and the intrinsic randomness of the natural phenomenon add up to different interacting layers of uncertainty, which require a careful handling. Thus, for reliable natural hazard assessments it is crucial not only to capture and quantify involved uncertainties, but also to express and communicate uncertainties in an intuitive way. Decision-makers, who often find it difficult to deal with uncertainties, might otherwise return to familiar (mostly deterministic) proceedings. In the scope of the DFG research training group „NatRiskChange" we apply the probabilistic framework of Bayesian networks for diverse natural hazard and vulnerability studies. The great potential of Bayesian networks was already shown in previous natural hazard assessments. Treating each model component as random variable, Bayesian networks aim at capturing the joint distribution of all considered variables. Hence, each conditional distribution of interest (e.g. the effect of precautionary measures on damage reduction) can be inferred. The (in-)dependencies between the considered variables can be learned purely data driven or be given by experts. Even a combination of both is possible. By translating the (in-)dependences into a graph structure, Bayesian networks provide direct insights into the workings of the system and allow to learn about the underlying processes. Besides numerous studies on the topic, learning Bayesian networks from real-world data remains challenging. In previous studies, e.g. on earthquake induced ground motion and flood damage assessments, we tackled the problems arising with continuous variables
Coloring geographical threshold graphs
Energy Technology Data Exchange (ETDEWEB)
Bradonjic, Milan [Los Alamos National Laboratory; Percus, Allon [Los Alamos National Laboratory; Muller, Tobias [EINDHOVEN UNIV. OF TECH
2008-01-01
We propose a coloring algorithm for sparse random graphs generated by the geographical threshold graph (GTG) model, a generalization of random geometric graphs (RGG). In a GTG, nodes are distributed in a Euclidean space, and edges are assigned according to a threshold function involving the distance between nodes as well as randomly chosen node weights. The motivation for analyzing this model is that many real networks (e.g., wireless networks, the Internet, etc.) need to be studied by using a 'richer' stochastic model (which in this case includes both a distance between nodes and weights on the nodes). Here, we analyze the GTG coloring algorithm together with the graph's clique number, showing formally that in spite of the differences in structure between GTG and RGG, the asymptotic behavior of the chromatic number is identical: {chi}1n 1n n / 1n n (1 + {omicron}(1)). Finally, we consider the leading corrections to this expression, again using the coloring algorithm and clique number to provide bounds on the chromatic number. We show that the gap between the lower and upper bound is within C 1n n / (1n 1n n){sup 2}, and specify the constant C.
Beckmann, Charlene E.; Rozanski, Kara
1999-01-01
Presents a lesson that uses a motion detector in order for students to experience the interplay between motion and its graphical representation of the slope. Focuses on the change in the appearance of the graph with regard to changing speed. (ASK)
Domination Game Critical Graphs
Directory of Open Access Journals (Sweden)
Bujtás Csilla
2015-11-01
Full Text Available The domination game is played on a graph G by two players who alternately take turns by choosing a vertex such that in each turn at least one previously undominated vertex is dominated. The game is over when each vertex becomes dominated. One of the players, namely Dominator, wants to finish the game as soon as possible, while the other one wants to delay the end. The number of turns when Dominator starts the game on G and both players play optimally is the graph invariant γg(G, named the game domination number. Here we study the γg-critical graphs which are critical with respect to vertex predomination. Besides proving some general properties, we characterize γg-critical graphs with γg = 2 and with γg = 3, moreover for each n we identify the (infinite class of all γg-critical ones among the nth powers CnN of cycles. Along the way we determine γg(CnN for all n and N. Results of a computer search for γg-critical trees are presented and several problems and research directions are also listed.
Czech Academy of Sciences Publication Activity Database
Markl, Martin; Voronov, A.A.
Boston : Birkhäuser, 2009 - (Tschinkel, Y.; Zahrin, Y.), s. 249-282 ISBN 978-0-8176-4746-9. - (Progress in Mathematics. Vol. 270) R&D Projects: GA ČR GA201/02/1390 Institutional research plan: CEZ:AV0Z10190503 Keywords : PROP * graph * complex * cohomology Subject RIV: BA - General Mathematics
Action recognition in video using a spatial-temporal graph-based feature representation
Jargalsaikhan, Iveel; Little, Suzanne; Trichet, Remi; O'Connor, Noel E.
2015-01-01
We propose a video graph based human action recognition framework. Given an input video sequence, we extract spatio-temporal local features and construct a video graph to incorporate appearance and motion constraints to reflect the spatio-temporal dependencies among features. them. In particular, we extend a popular dbscan density-based clustering algorithm to form an intuitive video graph. During training, we estimate a linear SVM classifier using the standard Bag-of-words method. Duri...
On converting community detection algorithms for fuzzy graphs in Neo4j
Drakopoulos, Georgios; Kanavos, Andreas; Makris, Christos; Megalooikonomou, Vasileios
2016-01-01
An essential feature of large scale free graphs, such as the Web, protein-to-protein interaction, brain connectivity, and social media graphs, is that they tend to form recursive communities. The latter are densely connected vertex clusters exhibiting quick local information dissemination and processing. Under the fuzzy graph model vertices are fixed while each edge exists with a given probability according to a membership function. This paper presents Fuzzy Walktrap and Fuzzy Newman-Girvan, ...
Bayesian feature weighting for unsupervised learning, with application to object recognition
Carbonetto, Peter; De Freitas, Nando; Gustafson, Paul; Thompson, Natalie
2003-01-01
We present a method for variable selection/weighting in an unsupervised learning context using Bayesian shrinkage. The basis for the model parameters and cluster assignments can be computed simultaneous using an efficient EM algorithm. Applying our Bayesian shrinkage model to a complex problem in object recognition (Duygulu, Barnard, de Freitas and Forsyth 2002), our experiments yied good results.
An Analytical Study of Computation and Communication Tradeoffs in Distributed Graph
Directory of Open Access Journals (Sweden)
Amirreza Abdolrashidi
2015-12-01
Full Text Available Distributed vertex-centric graph processing systems such as Pregel, Giraph and GPS have acquired significant popularity in recent years. Although the manner in which graph data is partitioned and placed on the computational nodes has considerable impact on the performance of the vertex-centric graph processing cluster, there are very few comprehensive studies on this topic. Towards enhancing our understanding of this important factor, in this paper, we propose a novel model for analyzing the performance of such clusters. Using three graph algorithms as case studies, we also characterize the inherent tradeoff between the computational load distribution and the communication overheads of a BSP cluster. This paper also reports a detailed experimental study investigating the performance of commonly-used graph partitioning mechanisms with respect to their computational load distribution characteristics and the associated communication overheads.
Presentations of Graph Braid Groups
Farley, Daniel; Sabalka, Lucas
2009-01-01
Let G be a graph. The (unlabeled) configuration space of n points on G is the space of all n-element subsets of G. The fundamental group of such a configuration space is called a graph braid group. We use a version of discrete Morse theory to compute presentations of all graph braid groups, for all finite connected graphs G and all natural numbers n.
Omid Zahiri; Rajab Ali Borzooei
2012-01-01
We associate a graph to any subset Y of a BCI-algebra X and denote it by G(Y). Then we find the set of all connected components of G(X) and verify the relation between X and G(X), when X is commutative BCI-algebra or G(X) is complete graph or n-star graph. Finally, we attempt to investigate the relation between some operations on graph and some operations on BCI-algebras.
Logspace computations in graph products
Diekert, Volker; Kausch, Jonathan
2013-01-01
We consider three important and well-studied algorithmic problems in group theory: the word, geodesic, and conjugacy problem. We show transfer results from individual groups to graph products. We concentrate on logspace complexity because the challenge is actually in small complexity classes, only. The most difficult transfer result is for the conjugacy problem. We have a general result for graph products, but even in the special case of a graph group the result is new. Graph groups are close...
Performance Introspection of Graph Databases
Macko, Peter; Margo, Daniel Wyatt; Seltzer, Margo I.
2013-01-01
The explosion of graph data in social and biological networks, recommendation systems, provenance databases, etc. makes graph storage and processing of paramount importance. We present a performance introspection framework for graph databases, PIG, which provides both a toolset and methodology for understanding graph database performance. PIG consists of a hierarchical collection of benchmarks that compose to produce performance models; the models provide a way to illuminate the strengths and...
Temporal Representation in Semantic Graphs
Energy Technology Data Exchange (ETDEWEB)
Levandoski, J J; Abdulla, G M
2007-08-07
A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.
Neural networks and graph theory
Institute of Scientific and Technical Information of China (English)
许进; 保铮
2002-01-01
The relationships between artificial neural networks and graph theory are considered in detail. The applications of artificial neural networks to many difficult problems of graph theory, especially NP-complete problems, and the applications of graph theory to artificial neural networks are discussed. For example graph theory is used to study the pattern classification problem on the discrete type feedforward neural networks, and the stability analysis of feedback artificial neural networks etc.
Directory of Open Access Journals (Sweden)
Kumar Abhishek
2012-08-01
Full Text Available A {it set-indexer} of a given graph $G = (V, E$ is an assignment $f$ of distinct nonempty subsets of a finite nonempty 'ground set' $X$ of cardinality $n$ to the vertices of $G$ so that the values $f^{oplus}(e, e=uv in E,$ obtained as the symmetric differences $f(u oplus f(v$ of the subsets $f(u$ and $f(v$ of $X,$ are all distinct. A set-indexer $f$ of a graph $G,$ is called a {it segregation} of $X$ on $G$ if the sets $f(V(G = {f(u: u in V(G}$ and $f^{oplus}(E(G = {f^{oplus}(e: e in E(G}$ are disjoint, and if, in addition, their union is the set $Y(X = mathcal{P}(X-{emptyset}$ of all the nonempty subsets of $X$ where $mathcal{P}(X$ denotes the power set of $X,$ then $f$ is called a {it set-sequential labeling} of $G.$ A graph is hence called {it set-sequential} if it admits a set-sequential labeling with respect to some 'ground set' $X.$ A set-indexer $f$ of a $(p, q$-graph $G = (V, E$ is called a {it set-graceful labeling} of $G$ if there exists nonempty ground set $X$ such that $f^{oplus}(E = mathcal{P}(X-{emptyset}$ and $G$ is {it set-graceful} if it admits a set-graceful labeling. In this report we provide a complete characterization of set-sequential caterpillar of diameter four. We also a provide a new necessary condition for a graph to be set-sequential.
Hopkins, Brian
2004-01-01
The interconnected world of actors and movies is a familiar, rich example for graph theory. This paper gives the history of the "Kevin Bacon Game" and makes extensive use of a Web site to analyze the underlying graph. The main content is the classroom development of the weighted average to determine the best choice of "center" for the graph. The…
Defferrard, Michaël
2014-01-01
The project goal was to explore the applications of spectral graph theory to address the inpainting problem of large missing chunks. We used a non-local patch graph representation of the image and proposed a structure detector which leverages the graph representation and influences the fill-order of our exemplar-based algorithm. Our method achieved state-of-the-art performances.
Cooper, Colin; Frieze, Alan
The aim of this article is to discuss some of the notions and applications of random walks on finite graphs, especially as they apply to random graphs. In this section we give some basic definitions, in Section 2 we review applications of random walks in computer science, and in Section 3 we focus on walks in random graphs.
Submanifolds Weakly Associated with Graphs
Indian Academy of Sciences (India)
A Carriazo; L M Fernández; A Rodríguez-Hidalgo
2009-06-01
We establish an interesting link between differential geometry and graph theory by defining submanifolds weakly associated with graphs. We prove that, in a local sense, every submanifold satisfies such an association, and other general results. Finally, we study submanifolds associated with graphs either in low dimensions or belonging to some special families.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng
2015-12-01
Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as, documents, chemical molecular structures, and images. However, an issue in relation to current research on graphs is that they cannot adequately discover the topics hidden in graph-structured data which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-word" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs. PMID:25616091
Hamiltonian formulation of bond graphs
Golo, Goran; Schaft, van der Arjan; Breedveld, Peter C.; Maschke, Bernhard M.; Johansson, R.; Rantzer, A.
2003-01-01
This paper deals with the mathematical formulation of bond graphs. It is proven that the power continuous part of bond graphs, the junction structure, can be associated with a Dirac structure and that the equations describing a bond graph model correspond to a port Hamiltonian system. The conditions
Akram, Muhammad; 10.1016/j.camwa.2010.11.004
2012-01-01
We define the Cartesian product, composition, union and join on interval-valued fuzzy graphs and investigate some of their properties. We also introduce the notion of interval-valued fuzzy complete graphs and present some properties of self complementary and self weak complementary interval-valued fuzzy complete graphs.
Editing graphs for maximum effect
Energy Technology Data Exchange (ETDEWEB)
Murphy, P.W.; Rhiner, R.W.
1991-01-08
The paper contains over eighty rules for editing graphs, arranged under nine major headings in a logical sequence for editing all the graphs in a manuscript. It is excerpted from a monograph used at the Lawrence Livermore National Laboratory to train beginning technical editors in editing graphs; a corresponding Hypercard stack is also used in this training. 6 refs., 4 figs.
Mining and Indexing Graph Databases
Yuan, Dayu
2013-01-01
Graphs are widely used to model structures and relationships of objects in various scientific and commercial fields. Chemical molecules, proteins, malware system-call dependencies and three-dimensional mechanical parts are all modeled as graphs. In this dissertation, we propose to mine and index those graph data to enable fast and scalable search.…
Generalized Ramsey numbers for graphs
Zhang, Yanbo
2015-01-01
This thesis contains new contributions to Ramsey theory, in particular results that establish exact values of graph Ramsey numbers that were unknown to date. Given two graphs F and H, the Ramsey number R(F,H) is the smallest integer N such that, for any graph G of order N, either G contains F as a s
Asteroidal Quadruples in non Rooted Path Graphs
Directory of Open Access Journals (Sweden)
Gutierrez Marisa
2015-11-01
Full Text Available A directed path graph is the intersection graph of a family of directed subpaths of a directed tree. A rooted path graph is the intersection graph of a family of directed subpaths of a rooted tree. Rooted path graphs are directed path graphs. Several characterizations are known for directed path graphs: one by forbidden induced subgraphs and one by forbidden asteroids. It is an open problem to find such characterizations for rooted path graphs. For this purpose, we are studying in this paper directed path graphs that are non rooted path graphs. We prove that such graphs always contain an asteroidal quadruple.
New concepts of fuzzy planar graphs
Directory of Open Access Journals (Sweden)
Sovan Samanta
2014-01-01
Full Text Available Fuzzy planar graph is an important subclass of fuzzy graph. Fuzzy planar graphs and its several properties are presented. A very close association of fuzzy planar graph is fuzzy dual graph. This is also defined and several properties of it are studied. Isomorphism on fuzzy graphs are well defined in literature. Isomorphic relation between fuzzy planar graph and its dual graph are established.
A General Statistical Framework for Assessing Categorical Clustering in Free Recall.
Hubert, Lawrence J.; Levin, Joel R.
A graph-theoretic paradigm is used to generalize the common measures of categorical clustering in free recall based on the number of observed repetitions. Two graphs are defined: a graph G that characterizes the a priori structure of the item set defined by a researcher, and a graph R that characterizes a subject's protocol. Two indices of…
Pivovarov, G. B.; Trunov, S. E.
2005-01-01
SPIRES is the largest database of scientific papers in the subject field of high energy and nuclear physics. It contains information on the citation graph of more than half a million of papers (vertexes of the citation graph). We outline the EqRank algorithm designed to cluster vertexes of directed graphs, and present the results of EqRank application to the SPIRES citation graph. The hierarchical clustering of SPIRES yielded by EqRank is used to set up a web service, which is also outlined.
Lean Algebraic Multigrid (LAMG): Fast Graph Laplacian Linear Solver
Livne, Oren E
2011-01-01
Laplacian matrices of graphs arise in large-scale computational applications such as machine learning; spectral clustering of images, genetic data and web pages; transportation network flows; electrical resistor circuits; and elliptic partial differential equations discretized on unstructured grids with finite elements. A Lean Algebraic Multigrid (LAMG) solver of the linear system Ax=b is presented, where A is a graph Laplacian. LAMG's run time and storage are linear in the number of graph edges. LAMG consists of a setup phase, in which a sequence of increasingly-coarser Laplacian systems is constructed, and an iterative solve phase using multigrid cycles. General graphs pose algorithmic challenges not encountered in traditional applications of algebraic multigrid. LAMG combines a lean piecewise-constant interpolation, judicious node aggregation based on a new node proximity definition, and an energy correction of the coarse-level systems. This results in fast convergence and substantial overhead and memory s...
Kim, Jongkwang; Wilhelm, Thomas
2008-04-01
Many papers published in recent years show that real-world graphs G(n,m) ( n nodes, m edges) are more or less “complex” in the sense that different topological features deviate from random graphs. Here we narrow the definition of graph complexity and argue that a complex graph contains many different subgraphs. We present different measures that quantify this complexity, for instance C1e, the relative number of non-isomorphic one-edge-deleted subgraphs (i.e. DECK size). However, because these different subgraph measures are computationally demanding, we also study simpler complexity measures focussing on slightly different aspects of graph complexity. We consider heuristically defined “product measures”, the products of two quantities which are zero in the extreme cases of a path and clique, and “entropy measures” quantifying the diversity of different topological features. The previously defined network/graph complexity measures Medium Articulation and Offdiagonal complexity ( OdC) belong to these two classes. We study OdC measures in some detail and compare it with our new measures. For all measures, the most complex graph G has a medium number of edges, between the edge numbers of the minimum and the maximum connected graph n-1graph complexity measures are characterized with the help of different example graphs. For all measures the corresponding time complexity is given. Finally, we discuss the complexity of 33 real-world graphs of different biological, social and economic systems with the six computationally most simple measures (including OdC). The complexities of the real graphs are compared with average complexities of two different random graph versions: complete random graphs (just fixed n,m) and rewired graphs with fixed node degrees.
Modelling of JET diagnostics using Bayesian Graphical Models
Energy Technology Data Exchange (ETDEWEB)
Svensson, J. [IPP Greifswald, Greifswald (Germany); Ford, O. [Imperial College, London (United Kingdom); McDonald, D.; Hole, M.; Nessi, G. von; Meakins, A.; Brix, M.; Thomsen, H.; Werner, A.; Sirinelli, A.
2011-07-01
The mapping between physics parameters (such as densities, currents, flows, temperatures etc) defining the plasma 'state' under a given model and the raw observations of each plasma diagnostic will 1) depend on the particular physics model used, 2) is inherently probabilistic, from uncertainties on both observations and instrumental aspects of the mapping, such as calibrations, instrument functions etc. A flexible and principled way of modelling such interconnected probabilistic systems is through so called Bayesian graphical models. Being an amalgam between graph theory and probability theory, Bayesian graphical models can simulate the complex interconnections between physics models and diagnostic observations from multiple heterogeneous diagnostic systems, making it relatively easy to optimally combine the observations from multiple diagnostics for joint inference on parameters of the underlying physics model, which in itself can be represented as part of the graph. At JET about 10 diagnostic systems have to date been modelled in this way, and has lead to a number of new results, including: the reconstruction of the flux surface topology and q-profiles without any specific equilibrium assumption, using information from a number of different diagnostic systems; profile inversions taking into account the uncertainties in the flux surface positions and a substantial increase in accuracy of JET electron density and temperature profiles, including improved pedestal resolution, through the joint analysis of three diagnostic systems. It is believed that the Bayesian graph approach could potentially be utilised for very large sets of diagnostics, providing a generic data analysis framework for nuclear fusion experiments, that would be able to optimally utilize the information from multiple diagnostics simultaneously, and where the explicit graph representation of the connections to underlying physics models could be used for sophisticated model testing. This
Graph Theory and Its Application in Educational Research: A Review and Integration.
Tatsuoka, Maurice M.
1986-01-01
A nontechnical exposition of graph theory is presented, followed by survey of the literature on applications of graph theory in research in education and related disciplines. Applications include order-theoretic studies of the dimensionality of data sets, the investigation of hierarchical structures in various domains, and cluster analysis.…
Entropy and graph based modelling of document coherence using discourse entities
DEFF Research Database (Denmark)
Petersen, Casper; Lioma, Christina; Simonsen, Jakob Grue; Larsen, Birger
represents text as a graph of discourse entities, linked by different relations, such as their distance or adjacency in text. We use several graph topology metrics to approximate different aspects of the discourse flow that can indicate coherence, such as the average clustering or betweenness of discourse...
Unsupervised document clustering by weighted combination
González Pellicer, Edgar; Turmo Borras, Jorge
2006-01-01
This report proposes a novel unsupervised document clustering approach based on weighted combination of individual clusterings. Two non-weighted combination methods are adapted to work in a weighted fashion: a graph based method and a probability based one. The performance of the weighted approach is evaluated on real-world collections, and compared to that of individual clustering and non-weighted combination. The results of this evaluation confirm that graph based weighted combination consi...
Spectral fluctuations of quantum graphs
International Nuclear Information System (INIS)
We prove the Bohigas-Giannoni-Schmit conjecture in its most general form for completely connected simple graphs with incommensurate bond lengths. We show that for graphs that are classically mixing (i.e., graphs for which the spectrum of the classical Perron-Frobenius operator possesses a finite gap), the generating functions for all (P,Q) correlation functions for both closed and open graphs coincide (in the limit of infinite graph size) with the corresponding expressions of random-matrix theory, both for orthogonal and for unitary symmetry
Harary, Frank
2015-01-01
Presented in 1962-63 by experts at University College, London, these lectures offer a variety of perspectives on graph theory. Although the opening chapters form a coherent body of graph theoretic concepts, this volume is not a text on the subject but rather an introduction to the extensive literature of graph theory. The seminar's topics are geared toward advanced undergraduate students of mathematics.Lectures by this volume's editor, Frank Harary, include ""Some Theorems and Concepts of Graph Theory,"" ""Topological Concepts in Graph Theory,"" ""Graphical Reconstruction,"" and other introduc
Graph Homomorphisms for Quantum Players
Mancinska, Laura; Roberson, David
2014-01-01
A homomorphism from a graph X to a graph Y is an adjacency preserving mapping f:V(X) -> V(Y). We consider a nonlocal game in which Alice and Bob are trying to convince a verifier with certainty that a graph X admits a homomorphism to Y. This is a generalization of the well-studied graph coloring game. Via systematic study of quantum homomorphisms we prove new results for graph coloring. Most importantly, we show that the Lovász theta number of the complement lower bounds the...
Cops, robbers, and infinite graphs
Lehner, Florian
2014-01-01
Cops and robbers is a game between two players, where one tries to catch the other by moving along the edges of a graph. It is well known that on a finite graph the cop has a winning strategy if and only if the graph is constructible and that finiteness is necessary for this result. We propose the notion of weakly cop-win graphs, a winning criterion for infinite graphs which could lead to a generalisation. In fact, we generalise one half of the result, that is, we prove that every constructib...
Resolvability in Circulant Graphs
Institute of Scientific and Technical Information of China (English)
Muhammad SALMAN; Imran JAVAID; Muhammad Anwar CHAUDHRY
2012-01-01
A set W of the vertices of a connected graph G is called a resolving set for G if for every two distinct vertices u,v ∈ V(G) there is a vertex w ∈ W such that d(u,w) ≠ d(v,w).A resolving set of minimum cardinality is called a metric basis for G and the number of vertices in a metric basis is called the metric dimension of G,denoted by dim(G).For a vertex u of G and a subset S of V(G),the distance between u and S is the number mins∈s d(u,s).A k-partition H ={S1,S2,...,Sk} of V(G) is called a resolving partition if for every two distinct vertices u,v ∈ V(G) there is a set Si in Π such that d(u,Si) ≠ d(v,Si).The minimum k for which there is a resolving k-partition of V(G) is called the partition dimension of G,denoted by pd(G).The circulant graph is a graph with vertex set Zn,an additive group ofintegers modulo n,and two vertices labeled i and j adjacent if and only if i - j (mod n) ∈ C,where C C Zn has the property that C =-C and 0(∈) C.The circulant graph is denoted by Xn,△ where A =|C|.In this paper,we study the metric dimension of a family of circulant graphs Xn,3 with connection set C ={1,-n/2,n - 1} and prove that dim(Xn,3) is independent of choice of n by showing that 3 for all n =0 (mod 4),dim(X,n,3) ={ 4 for all n =2 (mod 4).We also study the partition dimension of a family of circulant graphs Xn,4 with connection set C ={±1,±2} and prove that pd(Xn,4) is independent of choice of n and show that pd(X5,4) =5 and 3 forall odd n≥9,pd(Xn,4) ={ 4 for all even n ≥ 6 and n =7.
A Selective Fuzzy Clustering Ensemble Algorithm
Kai Li; Peng Li
2013-01-01
To improve the performance of clustering ensemble method, a selective fuzzy clustering ensemble algorithm is proposed. It mainly includes selection of clustering ensemble members and combination of clustering results. In the process of member selection, measure method is defined to select the better clustering members. Then some selected clustering members are viewed as hyper-graph in order to select the more influential hyper-edges (or features) and to weight the selected features. For proce...
Methods for co-clustering: a review
Brault, Vincent; Lomet, Aurore
2015-01-01
Co-clustering aims to identify block patterns in a data table, from a joint clustering of rows and columns. This problem has been studied since 1965, with recent interests in various fields, ranging from graph analysis, machine learning, data mining and genomics. Several variants have been proposed with diverse names: bi-clustering, block clustering, cross-clustering, or simultaneous clustering. We propose here a review of these methods in order to describe, compare and discuss the different ...
Reproducibility of graph metrics in fMRI networks
Directory of Open Access Journals (Sweden)
Qawi K Telesford
2010-12-01
Full Text Available The reliability of graph metrics calculated in network analysis is essential to the interpretation of complex network organization. These graph metrics are used to deduce the small-world properties in networks. In this study, we investigated the test-retest reliability of graph metrics from functional magnetic resonance imaging (fMRI data collected for two runs in 45 healthy older adults. Graph metrics were calculated on data for both runs and compared using intraclass correlation coefficient (ICC statistics and Bland-Altman (BA plots. ICC scores describe the level of absolute agreement between two measurements and provide a measure of reproducibility. For mean graph metrics, ICC scores were high for clustering coefficient (ICC=0.86, global efficiency (ICC=0.83, path length (ICC=0.79, and local efficiency (ICC=0.75; the ICC score for degree was found to be low (ICC=0.29. ICC scores were also used to generate reproducibility maps in brain space to test voxel-wise reproducibility for unsmoothed and smoothed data. Reproducibility was uniform across the brain for global efficiency and path length, but was only high in network hubs for clustering coefficient, local efficiency and degree. BA plots were used to test the measurement repeatability of all graph metrics. All graph metrics fell within the limits for repeatability. Together, these results suggest that with exception of degree, mean graph metrics are reproducible and suitable for clinical studies. Further exploration is warranted to better understand reproducibility across the brain on a voxel-wise basis.
Directory of Open Access Journals (Sweden)
John J. Lattanzio
2010-01-01
Full Text Available Problem statement: The vertex double-critical conjecture that the only vertex double-critical graph is the complete graph has remained unresolved for over forty years. The edge analogue of this conjecture has been proved. Approach: It was observed that if the chromatic number decreases by two upon the removal of a 2-matching, then the 2-matching comprises four vertices which determine an induced subgraph isomorphic to the complete graph on four vertices. This observation was generalized to t-matchings. Results: In this note, it has been shown that the only edge double-critical graph is the complete graph. Conclusion/Recommendations: An alternate proof that the only edge double-critical graph is the complete graph has been obtained. Moreover, the result has been obtained independently.