WorldWideScience

Sample records for bayesian graph clustering

  1. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang

    2017-02-16

    Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  2. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke

    2017-01-01

    Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  3. CORECLUSTER: A Degeneracy Based Graph Clustering Framework

    OpenAIRE

    Giatsidis , Christos; Malliaros , Fragkiskos; Thilikos , Dimitrios M. ,; Vazirgiannis , Michalis

    2014-01-01

    International audience; Graph clustering or community detection constitutes an important task forinvestigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such asspectral methods, typically suffer from high time and space complexity. In thisarticle, we present \\textsc{CoreCluster}, an efficient graph clusteringframework based on the concept of graph degeneracy, that can be used along withany known graph clusteri...

  4. A cluster algorithm for graphs

    NARCIS (Netherlands)

    S. van Dongen

    2000-01-01

    textabstractA cluster algorithm for graphs called the emph{Markov Cluster algorithm (MCL~algorithm) is introduced. The algorithm provides basically an interface to an algebraic process defined on stochastic matrices, called the MCL~process. The graphs may be both weighted (with nonnegative weight)

  5. A new cluster algorithm for graphs

    NARCIS (Netherlands)

    S. van Dongen

    1998-01-01

    textabstractA new cluster algorithm for graphs called the emph{Markov Cluster algorithm ($MCL$ algorithm) is introduced. The graphs may be both weighted (with nonnegative weight) and directed. Let~$G$~be such a graph. The $MCL$ algorithm simulates flow in $G$ by first identifying $G$ in a

  6. A local search for a graph clustering problem

    Science.gov (United States)

    Navrotskaya, Anna; Il'ev, Victor

    2016-10-01

    In the clustering problems one has to partition a given set of objects (a data set) into some subsets (called clusters) taking into consideration only similarity of the objects. One of most visual formalizations of clustering is graph clustering, that is grouping the vertices of a graph into clusters taking into consideration the edge structure of the graph whose vertices are objects and edges represent similarities between the objects. In the graph k-clustering problem the number of clusters does not exceed k and the goal is to minimize the number of edges between clusters and the number of missing edges within clusters. This problem is NP-hard for any k ≥ 2. We propose a polynomial time (2k-1)-approximation algorithm for graph k-clustering. Then we apply a local search procedure to the feasible solution found by this algorithm and hold experimental research of obtained heuristics.

  7. Performance criteria for graph clustering and Markov cluster experiments

    NARCIS (Netherlands)

    S. van Dongen

    2000-01-01

    textabstractIn~[1] a cluster algorithm for graphs was introduced called the Markov cluster algorithm or MCL~algorithm. The algorithm is based on simulation of (stochastic) flow in graphs by means of alternation of two operators, expansion and inflation. The results in~[2] establish an intrinsic

  8. Bayesian Decision Theoretical Framework for Clustering

    Science.gov (United States)

    Chen, Mo

    2011-01-01

    In this thesis, we establish a novel probabilistic framework for the data clustering problem from the perspective of Bayesian decision theory. The Bayesian decision theory view justifies the important questions: what is a cluster and what a clustering algorithm should optimize. We prove that the spectral clustering (to be specific, the…

  9. Personalized PageRank Clustering: A graph clustering algorithm based on random walks

    Science.gov (United States)

    A. Tabrizi, Shayan; Shakery, Azadeh; Asadpour, Masoud; Abbasi, Maziar; Tavallaie, Mohammad Ali

    2013-11-01

    Graph clustering has been an essential part in many methods and thus its accuracy has a significant effect on many applications. In addition, exponential growth of real-world graphs such as social networks, biological networks and electrical circuits demands clustering algorithms with nearly-linear time and space complexity. In this paper we propose Personalized PageRank Clustering (PPC) that employs the inherent cluster exploratory property of random walks to reveal the clusters of a given graph. We combine random walks and modularity to precisely and efficiently reveal the clusters of a graph. PPC is a top-down algorithm so it can reveal inherent clusters of a graph more accurately than other nearly-linear approaches that are mainly bottom-up. It also gives a hierarchy of clusters that is useful in many applications. PPC has a linear time and space complexity and has been superior to most of the available clustering algorithms on many datasets. Furthermore, its top-down approach makes it a flexible solution for clustering problems with different requirements.

  10. Graph coarsening and clustering on the GPU

    NARCIS (Netherlands)

    Fagginger Auer, B.O.; Bisseling, R.H.

    2013-01-01

    Agglomerative clustering is an effective greedy way to quickly generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained sharedmemory parallel graph

  11. Graph-based clustering and data visualization algorithms

    CERN Document Server

    Vathy-Fogarassy, Ágnes

    2013-01-01

    This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on

  12. Image interpolation via graph-based Bayesian label propagation.

    Science.gov (United States)

    Xianming Liu; Debin Zhao; Jiantao Zhou; Wen Gao; Huifang Sun

    2014-03-01

    In this paper, we propose a novel image interpolation algorithm via graph-based Bayesian label propagation. The basic idea is to first create a graph with known and unknown pixels as vertices and with edge weights encoding the similarity between vertices, then the problem of interpolation converts to how to effectively propagate the label information from known points to unknown ones. This process can be posed as a Bayesian inference, in which we try to combine the principles of local adaptation and global consistency to obtain accurate and robust estimation. Specially, our algorithm first constructs a set of local interpolation models, which predict the intensity labels of all image samples, and a loss term will be minimized to keep the predicted labels of the available low-resolution (LR) samples sufficiently close to the original ones. Then, all of the losses evaluated in local neighborhoods are accumulated together to measure the global consistency on all samples. Moreover, a graph-Laplacian-based manifold regularization term is incorporated to penalize the global smoothness of intensity labels, such smoothing can alleviate the insufficient training of the local models and make them more robust. Finally, we construct a unified objective function to combine together the global loss of the locally linear regression, square error of prediction bias on the available LR samples, and the manifold regularization term. It can be solved with a closed-form solution as a convex optimization problem. Experimental results demonstrate that the proposed method achieves competitive performance with the state-of-the-art image interpolation algorithms.

  13. Statistical mechanics of semi-supervised clustering in sparse graphs

    International Nuclear Information System (INIS)

    Ver Steeg, Greg; Galstyan, Aram; Allahverdyan, Armen E

    2011-01-01

    We theoretically study semi-supervised clustering in sparse graphs in the presence of pair-wise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster and between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine the impact of pair-wise constraints on the clustering accuracy. Our results suggest that the addition of constraints does not provide automatic improvement over the unsupervised case. When the density of the constraints is sufficiently small, their only impact is to shift the detection threshold while preserving the criticality. Conversely, if the density of (hard) constraints is above the percolation threshold, the criticality is suppressed and the detection threshold disappears

  14. Co-clustering directed graphs to discover asymmetries and directional communities.

    Science.gov (United States)

    Rohe, Karl; Qin, Tai; Yu, Bin

    2016-10-21

    In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction.

  15. Finding the optimal Bayesian network given a constraint graph

    Directory of Open Access Journals (Sweden)

    Jacob M. Schreiber

    2017-07-01

    Full Text Available Despite recent algorithmic improvements, learning the optimal structure of a Bayesian network from data is typically infeasible past a few dozen variables. Fortunately, domain knowledge can frequently be exploited to achieve dramatic computational savings, and in many cases domain knowledge can even make structure learning tractable. Several methods have previously been described for representing this type of structural prior knowledge, including global orderings, super-structures, and constraint rules. While super-structures and constraint rules are flexible in terms of what prior knowledge they can encode, they achieve savings in memory and computational time simply by avoiding considering invalid graphs. We introduce the concept of a “constraint graph” as an intuitive method for incorporating rich prior knowledge into the structure learning task. We describe how this graph can be used to reduce the memory cost and computational time required to find the optimal graph subject to the encoded constraints, beyond merely eliminating invalid graphs. In particular, we show that a constraint graph can break the structure learning task into independent subproblems even in the presence of cyclic prior knowledge. These subproblems are well suited to being solved in parallel on a single machine or distributed across many machines without excessive communication cost.

  16. Multilayer Spectral Graph Clustering via Convex Layer Aggregation: Theory and Algorithms

    OpenAIRE

    Chen, Pin-Yu; Hero, Alfred O.

    2017-01-01

    Multilayer graphs are commonly used for representing different relations between entities and handling heterogeneous data processing tasks. Non-standard multilayer graph clustering methods are needed for assigning clusters to a common multilayer node set and for combining information from each layer. This paper presents a multilayer spectral graph clustering (SGC) framework that performs convex layer aggregation. Under a multilayer signal plus noise model, we provide a phase transition analys...

  17. Beyond Low-Rank Representations: Orthogonal clustering basis reconstruction with optimized graph structure for multi-view spectral clustering.

    Science.gov (United States)

    Wang, Yang; Wu, Lin

    2018-07-01

    Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for Multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than their single-view counterparts. In this paper we revisit it with a fundamentally different perspective by discovering LRR as essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal basis to characterize the cluster structures. Upon this finding, we propose our technique with the following: (1) We decompose LRR into latent clustered orthogonal representation via low-rank matrix factorization, to encode the more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning orthogonal clustered representation and optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude for multi-view, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority, especially over recent state-of-the-art LRR models. Copyright © 2018 Elsevier Ltd. All rights reserved.

  18. Non-parametric Bayesian graph models reveal community structure in resting state fMRI

    DEFF Research Database (Denmark)

    Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman

    2014-01-01

    Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background...

  19. Cluster Tails for Critical Power-Law Inhomogeneous Random Graphs

    Science.gov (United States)

    van der Hofstad, Remco; Kliem, Sandra; van Leeuwaarden, Johan S. H.

    2018-04-01

    Recently, the scaling limit of cluster sizes for critical inhomogeneous random graphs of rank-1 type having finite variance but infinite third moment degrees was obtained in Bhamidi et al. (Ann Probab 40:2299-2361, 2012). It was proved that when the degrees obey a power law with exponent τ \\in (3,4), the sequence of clusters ordered in decreasing size and multiplied through by n^{-(τ -2)/(τ -1)} converges as n→ ∞ to a sequence of decreasing non-degenerate random variables. Here, we study the tails of the limit of the rescaled largest cluster, i.e., the probability that the scaling limit of the largest cluster takes a large value u, as a function of u. This extends a related result of Pittel (J Combin Theory Ser B 82(2):237-269, 2001) for the Erdős-Rényi random graph to the setting of rank-1 inhomogeneous random graphs with infinite third moment degrees. We make use of delicate large deviations and weak convergence arguments.

  20. Locating sources within a dense sensor array using graph clustering

    Science.gov (United States)

    Gerstoft, P.; Riahi, N.

    2017-12-01

    We develop a model-free technique to identify weak sources within dense sensor arrays using graph clustering. No knowledge about the propagation medium is needed except that signal strengths decay to insignificant levels within a scale that is shorter than the aperture. We then reinterpret the spatial coherence matrix of a wave field as a matrix whose support is a connectivity matrix of a graph with sensors as vertices. In a dense network, well-separated sources induce clusters in this graph. The geographic spread of these clusters can serve to localize the sources. The support of the covariance matrix is estimated from limited-time data using a hypothesis test with a robust phase-only coherence test statistic combined with a physical distance criterion. The latter criterion ensures graph sparsity and thus prevents clusters from forming by chance. We verify the approach and quantify its reliability on a simulated dataset. The method is then applied to data from a dense 5200 element geophone array that blanketed of the city of Long Beach (CA). The analysis exposes a helicopter traversing the array and oil production facilities.

  1. Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

    Directory of Open Access Journals (Sweden)

    Theis Fabian J

    2010-10-01

    Full Text Available Abstract Background Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning

  2. A Clustering Graph Generator

    Energy Technology Data Exchange (ETDEWEB)

    Winlaw, Manda [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); De Sterck, Hans [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sanders, Geoffrey [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-10-26

    In very simple terms a network can be de ned as a collection of points joined together by lines. Thus, networks can be used to represent connections between entities in a wide variety of elds including engi- neering, science, medicine, and sociology. Many large real-world networks share a surprising number of properties, leading to a strong interest in model development research and techniques for building synthetic networks have been developed, that capture these similarities and replicate real-world graphs. Modeling these real-world networks serves two purposes. First, building models that mimic the patterns and prop- erties of real networks helps to understand the implications of these patterns and helps determine which patterns are important. If we develop a generative process to synthesize real networks we can also examine which growth processes are plausible and which are not. Secondly, high-quality, large-scale network data is often not available, because of economic, legal, technological, or other obstacles [7]. Thus, there are many instances where the systems of interest cannot be represented by a single exemplar network. As one example, consider the eld of cybersecurity, where systems require testing across diverse threat scenarios and validation across diverse network structures. In these cases, where there is no single exemplar network, the systems must instead be modeled as a collection of networks in which the variation among them may be just as important as their common features. By developing processes to build synthetic models, so-called graph generators, we can build synthetic networks that capture both the essential features of a system and realistic variability. Then we can use such synthetic graphs to perform tasks such as simulations, analysis, and decision making. We can also use synthetic graphs to performance test graph analysis algorithms, including clustering algorithms and anomaly detection algorithms.

  3. Chemical graph-theoretic cluster expansions

    International Nuclear Information System (INIS)

    Klein, D.J.

    1986-01-01

    A general computationally amenable chemico-graph-theoretic cluster expansion method is suggested as a paradigm for incorporation of chemical structure concepts in a systematic manner. The cluster expansion approach is presented in a formalism general enough to cover a variety of empirical, semiempirical, and even ab initio applications. Formally such approaches for the utilization of chemical structure-related concepts may be viewed as discrete analogues of Taylor series expansions. The efficacy of the chemical structure concepts then is simply bound up in the rate of convergence of the cluster expansions. In many empirical applications, e.g., boiling points, chromatographic separation coefficients, and biological activities, this rate of convergence has been observed to be quite rapid. More note will be made here of quantum chemical applications. Relations to questions concerning size extensivity of energies and size consistency of wave functions are addressed

  4. Learning Bayesian network structure: towards the essential graph by integer linear programming tools

    Czech Academy of Sciences Publication Activity Database

    Studený, Milan; Haws, D.

    2014-01-01

    Roč. 55, č. 4 (2014), s. 1043-1071 ISSN 0888-613X R&D Projects: GA ČR GA13-20012S Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * integer linear programming * characteristic imset * essential graph Subject RIV: BA - General Mathematics Impact factor: 2.451, year: 2014 http://library.utia.cas.cz/separaty/2014/MTR/studeny-0427002.pdf

  5. Generating Realistic Labelled, Weighted Random Graphs

    Directory of Open Access Journals (Sweden)

    Michael Charles Davis

    2015-12-01

    Full Text Available Generative algorithms for random graphs have yielded insights into the structure and evolution of real-world networks. Most networks exhibit a well-known set of properties, such as heavy-tailed degree distributions, clustering and community formation. Usually, random graph models consider only structural information, but many real-world networks also have labelled vertices and weighted edges. In this paper, we present a generative model for random graphs with discrete vertex labels and numeric edge weights. The weights are represented as a set of Beta Mixture Models (BMMs with an arbitrary number of mixtures, which are learned from real-world networks. We propose a Bayesian Variational Inference (VI approach, which yields an accurate estimation while keeping computation times tractable. We compare our approach to state-of-the-art random labelled graph generators and an earlier approach based on Gaussian Mixture Models (GMMs. Our results allow us to draw conclusions about the contribution of vertex labels and edge weights to graph structure.

  6. An effective trust-based recommendation method using a novel graph clustering algorithm

    Science.gov (United States)

    Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin

    2015-10-01

    Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.

  7. GraphCrunch 2: Software tool for network modeling, alignment and clustering.

    Science.gov (United States)

    Kuchaiev, Oleksii; Stevanović, Aleksandar; Hayes, Wayne; Pržulj, Nataša

    2011-01-19

    Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. Finally, GraphCruch 2 implements an

  8. GraphCrunch 2: Software tool for network modeling, alignment and clustering

    Directory of Open Access Journals (Sweden)

    Hayes Wayne

    2011-01-01

    Full Text Available Abstract Background Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. Results We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL" for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other

  9. Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering

    Directory of Open Access Journals (Sweden)

    Xin Tian

    2017-06-01

    Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the art approaches, the effectiveness of the proposed method could be validated in the experiments.

  10. Clustering cliques for graph-based summarization of the biomedical research literature

    DEFF Research Database (Denmark)

    Zhang, Han; Fiszman, Marcelo; Shin, Dongwook

    2013-01-01

    Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts).Results: Sem......Rep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm...

  11. Cluster consensus in discrete-time networks of multiagents with inter-cluster nonidentical inputs.

    Science.gov (United States)

    Han, Yujuan; Lu, Wenlian; Chen, Tianping

    2013-04-01

    In this paper, cluster consensus of multiagent systems is studied via inter-cluster nonidentical inputs. Here, we consider general graph topologies, which might be time-varying. The cluster consensus is defined by two aspects: intracluster synchronization, the state at which differences between each pair of agents in the same cluster converge to zero, and inter-cluster separation, the state at which agents in different clusters are separated. For intra-cluster synchronization, the concepts and theories of consensus, including the spanning trees, scramblingness, infinite stochastic matrix product, and Hajnal inequality, are extended. As a result, it is proved that if the graph has cluster spanning trees and all vertices self-linked, then the static linear system can realize intra-cluster synchronization. For the time-varying coupling cases, it is proved that if there exists T > 0 such that the union graph across any T-length time interval has cluster spanning trees and all graphs has all vertices self-linked, then the time-varying linear system can also realize intra-cluster synchronization. Under the assumption of common inter-cluster influence, a sort of inter-cluster nonidentical inputs are utilized to realize inter-cluster separation, such that each agent in the same cluster receives the same inputs and agents in different clusters have different inputs. In addition, the boundedness of the infinite sum of the inputs can guarantee the boundedness of the trajectory. As an application, we employ a modified non-Bayesian social learning model to illustrate the effectiveness of our results.

  12. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    Science.gov (United States)

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  13. Bayesian Graphical Models

    DEFF Research Database (Denmark)

    Jensen, Finn Verner; Nielsen, Thomas Dyhre

    2016-01-01

    Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...

  14. Bayesian analysis for exponential random graph models using the adaptive exchange sampler

    KAUST Repository

    Jin, Ick Hoon

    2013-01-01

    Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.

  15. Bond percolation on a class of correlated and clustered random graphs

    International Nuclear Information System (INIS)

    Allard, A; Hébert-Dufresne, L; Noël, P-A; Marceau, V; Dubé, L J

    2012-01-01

    We introduce a formalism for computing bond percolation properties of a class of correlated and clustered random graphs. This class of graphs is a generalization of the configuration model where nodes of different types are connected via different types of hyperedges, edges that can link more than two nodes. We argue that the multitype approach coupled with the use of clustered hyperedges can reproduce a wide spectrum of complex patterns, and thus enhances our capability to model real complex networks. As an illustration of this claim, we use our formalism to highlight unusual behaviours of the size and composition of the components (small and giant) in a synthetic, albeit realistic, social network. (paper)

  16. A cluster expansion approach to exponential random graph models

    International Nuclear Information System (INIS)

    Yin, Mei

    2012-01-01

    The exponential family of random graphs are among the most widely studied network models. We show that any exponential random graph model may alternatively be viewed as a lattice gas model with a finite Banach space norm. The system may then be treated using cluster expansion methods from statistical mechanics. In particular, we derive a convergent power series expansion for the limiting free energy in the case of small parameters. Since the free energy is the generating function for the expectations of other random variables, this characterizes the structure and behavior of the limiting network in this parameter region

  17. Bayesian versus frequentist statistical inference for investigating a one-off cancer cluster reported to a health department

    Directory of Open Access Journals (Sweden)

    Wills Rachael A

    2009-05-01

    Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.

  18. Modification of MSDR algorithm and ITS implementation on graph clustering

    Science.gov (United States)

    Prastiwi, D.; Sugeng, K. A.; Siswantining, T.

    2017-07-01

    Maximum Standard Deviation Reduction (MSDR) is a graph clustering algorithm to minimize the distance variation within a cluster. In this paper we propose a modified MSDR by replacing one technical step in MSDR which uses polynomial regression, with a new and simpler step. This leads to our new algorithm called Modified MSDR (MMSDR). We implement the new algorithm to separate a domestic flight network of an Indonesian airline into two large clusters. Further analysis allows us to discover a weak link in the network, which should be improved by adding more flights.

  19. Comparison of Bayesian clustering and edge detection methods for inferring boundaries in landscape genetics

    Science.gov (United States)

    Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.

    2011-01-01

    Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.

  20. Cluster tails for critical power-law inhomogeneous random graphs

    NARCIS (Netherlands)

    van der Hofstad, R.; Kliem, S.; van Leeuwaarden, J.S.H.

    2018-01-01

    Recently, the scaling limit of cluster sizes for critical inhomogeneous random graphs of rank-1 type having finite variance but infinite third moment degrees was obtained in Bhamidi et al. (Ann Probab 40:2299–2361, 2012). It was proved that when the degrees obey a power law with exponent τ∈ (3 , 4)

  1. Learning Based Approach for Optimal Clustering of Distributed Program's Call Flow Graph

    Science.gov (United States)

    Abofathi, Yousef; Zarei, Bager; Parsa, Saeed

    Optimal clustering of call flow graph for reaching maximum concurrency in execution of distributable components is one of the NP-Complete problems. Learning automatas (LAs) are search tools which are used for solving many NP-Complete problems. In this paper a learning based algorithm is proposed to optimal clustering of call flow graph and appropriate distributing of programs in network level. The algorithm uses learning feature of LAs to search in state space. It has been shown that the speed of reaching to solution increases remarkably using LA in search process, and it also prevents algorithm from being trapped in local minimums. Experimental results show the superiority of proposed algorithm over others.

  2. FUZZY CLUSTERING BASED BAYESIAN FRAMEWORK TO PREDICT MENTAL HEALTH PROBLEMS AMONG CHILDREN

    Directory of Open Access Journals (Sweden)

    M R Sumathi

    2017-04-01

    Full Text Available According to World Health Organization, 10-20% of children and adolescents all over the world are experiencing mental disorders. Correct diagnosis of mental disorders at an early stage improves the quality of life of children and avoids complicated problems. Various expert systems using artificial intelligence techniques have been developed for diagnosing mental disorders like Schizophrenia, Depression, Dementia, etc. This study focuses on predicting basic mental health problems of children, like Attention problem, Anxiety problem, Developmental delay, Attention Deficit Hyperactivity Disorder (ADHD, Pervasive Developmental Disorder(PDD, etc. using the machine learning techniques, Bayesian Networks and Fuzzy clustering. The focus of the article is on learning the Bayesian network structure using a novel Fuzzy Clustering Based Bayesian network structure learning framework. The performance of the proposed framework was compared with the other existing algorithms and the experimental results have shown that the proposed framework performs better than the earlier algorithms.

  3. Identification of Voxels Confounded by Venous Signals Using Resting-State fMRI Functional Connectivity Graph Clustering

    Directory of Open Access Journals (Sweden)

    Klaudius eKalcher

    2015-12-01

    Full Text Available Identifying venous voxels in fMRI datasets is important to increase the specificity of fMRI analyses to microvasculature in the vicinity of the neural processes triggering the BOLD response. This is, however, difficult to achieve in particular in typical studies where magnitude images of BOLD EPI are the only data available. In this study, voxelwise functional connectivity graphs were computed on minimally preprocessed low TR (333 ms multiband resting-state fMRI data, using both high positive and negative correlations to define edges between nodes (voxels. A high correlation threshold for binarization ensures that most edges in the resulting sparse graph reflect the high coherence of signals in medium to large veins. Graph clustering based on the optimization of modularity was then employed to identify clusters of coherent voxels in this graph, and all clusters of 50 or more voxels were then interpreted as corresponding to medium to large veins. Indeed, a comparison with SWI reveals that 75.6 ± 5.9% of voxels within these large clusters overlap with veins visible in the SWI image or lie outside the brain parenchyma. Some of the remainingdifferences between the two modalities can be explained by imperfect alignment or geometric distortions between the two images. Overall, the graph clustering based method for identifying venous voxels has a high specificity as well as the additional advantages of being computed in the same voxel grid as the fMRI dataset itself and not needingany additional data beyond what is usually acquired (and exported in standard fMRI experiments.

  4. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    International Nuclear Information System (INIS)

    Ellefsen, Karl J.; Smith, David B.

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.

  5. Low-Complexity Bayesian Estimation of Cluster-Sparse Channels

    KAUST Repository

    Ballal, Tarig

    2015-09-18

    This paper addresses the problem of channel impulse response estimation for cluster-sparse channels under the Bayesian estimation framework. We develop a novel low-complexity minimum mean squared error (MMSE) estimator by exploiting the sparsity of the received signal profile and the structure of the measurement matrix. It is shown that due to the banded Toeplitz/circulant structure of the measurement matrix, a channel impulse response, such as underwater acoustic channel impulse responses, can be partitioned into a number of orthogonal or approximately orthogonal clusters. The orthogonal clusters, the sparsity of the channel impulse response and the structure of the measurement matrix, all combined, result in a computationally superior realization of the MMSE channel estimator. The MMSE estimator calculations boil down to simpler in-cluster calculations that can be reused in different clusters. The reduction in computational complexity allows for a more accurate implementation of the MMSE estimator. The proposed approach is tested using synthetic Gaussian channels, as well as simulated underwater acoustic channels. Symbol-error-rate performance and computation time confirm the superiority of the proposed method compared to selected benchmark methods in systems with preamble-based training signals transmitted over clustersparse channels.

  6. Low-Complexity Bayesian Estimation of Cluster-Sparse Channels

    KAUST Repository

    Ballal, Tarig; Al-Naffouri, Tareq Y.; Ahmed, Syed

    2015-01-01

    This paper addresses the problem of channel impulse response estimation for cluster-sparse channels under the Bayesian estimation framework. We develop a novel low-complexity minimum mean squared error (MMSE) estimator by exploiting the sparsity of the received signal profile and the structure of the measurement matrix. It is shown that due to the banded Toeplitz/circulant structure of the measurement matrix, a channel impulse response, such as underwater acoustic channel impulse responses, can be partitioned into a number of orthogonal or approximately orthogonal clusters. The orthogonal clusters, the sparsity of the channel impulse response and the structure of the measurement matrix, all combined, result in a computationally superior realization of the MMSE channel estimator. The MMSE estimator calculations boil down to simpler in-cluster calculations that can be reused in different clusters. The reduction in computational complexity allows for a more accurate implementation of the MMSE estimator. The proposed approach is tested using synthetic Gaussian channels, as well as simulated underwater acoustic channels. Symbol-error-rate performance and computation time confirm the superiority of the proposed method compared to selected benchmark methods in systems with preamble-based training signals transmitted over clustersparse channels.

  7. On the Informativeness of Dominant and Co-Dominant Genetic Markers for Bayesian Supervised Clustering

    DEFF Research Database (Denmark)

    Guillot, Gilles; Carpentier-Skandalis, Alexandra

    2011-01-01

    We study the accuracy of a Bayesian supervised method used to cluster individuals into genetically homogeneous groups on the basis of dominant or codominant molecular markers. We provide a formula relating an error criterion to the number of loci used and the number of clusters. This formula...

  8. Bayesian networks and food security - An introduction

    NARCIS (Netherlands)

    Stein, A.

    2004-01-01

    This paper gives an introduction to Bayesian networks. Networks are defined and put into a Bayesian context. Directed acyclical graphs play a crucial role here. Two simple examples from food security are addressed. Possible uses of Bayesian networks for implementation and further use in decision

  9. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

    Science.gov (United States)

    Yau, Christopher; Holmes, Chris

    2011-07-01

    We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.

  10. Graph embedding with rich information through heterogeneous graph

    KAUST Repository

    Sun, Guolei

    2017-01-01

    Graph embedding, aiming to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical application including node classification, link prediction and clustering in social network analysis. Most

  11. GraphAlignment: Bayesian pairwise alignment of biological networks

    Directory of Open Access Journals (Sweden)

    Kolář Michal

    2012-11-01

    Full Text Available Abstract Background With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. Results We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical case complexity grows approximately as O(N2.6. On empirical bacterial protein-protein interaction networks (PIN and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. Conclusions The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.

  12. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.

    Science.gov (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan

    2017-11-10

    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Bayesian Nonparametric Clustering for Positive Definite Matrices.

    Science.gov (United States)

    Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos

    2016-05-01

    Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.

  14. GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.

    Science.gov (United States)

    Schulz, Tizian; Stoye, Jens; Doerr, Daniel

    2018-05-08

    Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particular suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.

  15. Determining open cluster membership. A Bayesian framework for quantitative member classification

    Science.gov (United States)

    Stott, Jonathan J.

    2018-01-01

    Aims: My goal is to develop a quantitative algorithm for assessing open cluster membership probabilities. The algorithm is designed to work with single-epoch observations. In its simplest form, only one set of program images and one set of reference images are required. Methods: The algorithm is based on a two-stage joint astrometric and photometric assessment of cluster membership probabilities. The probabilities were computed within a Bayesian framework using any available prior information. Where possible, the algorithm emphasizes simplicity over mathematical sophistication. Results: The algorithm was implemented and tested against three observational fields using published survey data. M 67 and NGC 654 were selected as cluster examples while a third, cluster-free, field was used for the final test data set. The algorithm shows good quantitative agreement with the existing surveys and has a false-positive rate significantly lower than the astrometric or photometric methods used individually.

  16. Bayes Academy - An Educational Game for Learning Bayesian Networks

    OpenAIRE

    Sotala, Kaj

    2015-01-01

    This thesis describes the development of 'Bayes Academy', an educational game which aims to teach an understanding of Bayesian networks. A Bayesian network is a directed acyclic graph describing a joint probability distribution function over n random variables, where each node in the graph represents a random variable. To find a way to turn this subject into an interesting game, this work draws on the theoretical background of meaningful play. Among other requirements, actions in the game...

  17. Practical graph mining with R

    CERN Document Server

    Hendrix, William; Jenkins, John; Padmanabhan, Kanchana; Chakraborty, Arpan

    2014-01-01

    Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs. Hands-On Application of Graph Data Mining Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks. De...

  18. A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.

    Science.gov (United States)

    Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G

    2018-01-01

    Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improve on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. AGGLOMERATIVE CLUSTERING OF SOUND RECORD SPEECH SEGMENTS BASED ON BAYESIAN INFORMATION CRITERION

    Directory of Open Access Journals (Sweden)

    O. Yu. Kydashev

    2013-01-01

    Full Text Available This paper presents the detailed description of agglomerative clustering system implementation for speech segments based on Bayesian information criterion. Numerical experiment results with different acoustic features, as well as the full and diagonal covariance matrices application are given. The error rate DER equal to 6.4% for audio records of radio «Svoboda» was achieved by means of designed system.

  20. Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.

    Science.gov (United States)

    Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A

    2018-01-30

    Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Constructing a graph of connections in clustering algorithm of complex objects

    Directory of Open Access Journals (Sweden)

    Татьяна Шатовская

    2015-05-01

    Full Text Available The article describes the results of modifying the algorithm Chameleon. Hierarchical multi-level algorithm consists of several phases: the construction of the count, coarsening, the separation and recovery. Each phase can be used various approaches and algorithms. The main aim of the work is to study the quality of the clustering of different sets of data using a set of algorithms combinations at different stages of the algorithm and improve the stage of construction by the optimization algorithm of k choice in the graph construction of k of nearest neighbors

  2. Identifying Key Drivers of Return Reversal with Dynamical Bayesian Factor Graph.

    Directory of Open Access Journals (Sweden)

    Shuai Zhao

    Full Text Available In the stock market, return reversal occurs when investors sell overbought stocks and buy oversold stocks, reversing the stocks' price trends. In this paper, we develop a new method to identify key drivers of return reversal by incorporating a comprehensive set of factors derived from different economic theories into one unified dynamical Bayesian factor graph. We then use the model to depict factor relationships and their dynamics, from which we make some interesting discoveries about the mechanism behind return reversals. Through extensive experiments on the US stock market, we conclude that among the various factors, the liquidity factors consistently emerge as key drivers of return reversal, which is in support of the theory of liquidity effect. Specifically, we find that stocks with high turnover rates or high Amihud illiquidity measures have a greater probability of experiencing return reversals. Apart from the consistent drivers, we find other drivers of return reversal that generally change from year to year, and they serve as important characteristics for evaluating the trends of stock returns. Besides, we also identify some seldom discussed yet enlightening inter-factor relationships, one of which shows that stocks in Finance and Insurance industry are more likely to have high Amihud illiquidity measures in comparison with those in other industries. These conclusions are robust for return reversals under different thresholds.

  3. Community detection by graph Voronoi diagrams

    Science.gov (United States)

    Deritei, Dávid; Lázár, Zsolt I.; Papp, István; Járai-Szabó, Ferenc; Sumi, Róbert; Varga, Levente; Ravasz Regan, Erzsébet; Ercsey-Ravasz, Mária

    2014-06-01

    Accurate and efficient community detection in networks is a key challenge for complex network theory and its applications. The problem is analogous to cluster analysis in data mining, a field rich in metric space-based methods. Common to these methods is a geometric, distance-based definition of clusters or communities. Here we propose a new geometric approach to graph community detection based on graph Voronoi diagrams. Our method serves as proof of principle that the definition of appropriate distance metrics on graphs can bring a rich set of metric space-based clustering methods to network science. We employ a simple edge metric that reflects the intra- or inter-community character of edges, and a graph density-based rule to identify seed nodes of Voronoi cells. Our algorithm outperforms most network community detection methods applicable to large networks on benchmark as well as real-world networks. In addition to offering a computationally efficient alternative for community detection, our method opens new avenues for adapting a wide range of data mining algorithms to complex networks from the class of centroid- and density-based clustering methods.

  4. Network evolution driven by dynamics applied to graph coloring

    International Nuclear Information System (INIS)

    Wu Jian-She; Li Li-Guang; Yu Xin; Jiao Li-Cheng; Wang Xiao-Hua

    2013-01-01

    An evolutionary network driven by dynamics is studied and applied to the graph coloring problem. From an initial structure, both the topology and the coupling weights evolve according to the dynamics. On the other hand, the dynamics of the network are determined by the topology and the coupling weights, so an interesting structure-dynamics co-evolutionary scheme appears. By providing two evolutionary strategies, a network described by the complement of a graph will evolve into several clusters of nodes according to their dynamics. The nodes in each cluster can be assigned the same color and nodes in different clusters assigned different colors. In this way, a co-evolution phenomenon is applied to the graph coloring problem. The proposed scheme is tested on several benchmark graphs for graph coloring

  5. A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method.

    Science.gov (United States)

    Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

    2007-11-27

    A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Our extensive experimental results show the approach shows 45% cluster quality improvement and 72% clustering reliability improvement, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides concise but rich text summary in key concepts and sentences. Our coherent biomedical literature clustering and summarization approach that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries.

  6. Graph embedding with rich information through heterogeneous graph

    KAUST Repository

    Sun, Guolei

    2017-11-12

    Graph embedding, aiming to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical application including node classification, link prediction and clustering in social network analysis. Most existing algorithms for graph embedding only rely on the topology information and fail to use the copious information in nodes as well as edges. As a result, their performance for many tasks may not be satisfactory. In this thesis, we proposed a novel and general framework for graph embedding with rich text information (GERI) through constructing a heterogeneous network, in which we integrate node and edge content information with graph topology. Specially, we designed a novel biased random walk to explore the constructed heterogeneous network with the notion of flexible neighborhood. Our sampling strategy can compromise between BFS and DFS local search on heterogeneous graph. To further improve our algorithm, we proposed semi-supervised GERI (SGERI), which learns graph embedding in an discriminative manner through heterogeneous network with label information. The efficacy of our method is demonstrated by extensive comparison experiments with 9 baselines over multi-label and multi-class classification on various datasets including Citeseer, Cora, DBLP and Wiki. It shows that GERI improves the Micro-F1 and Macro-F1 of node classification up to 10%, and SGERI improves GERI by 5% in Wiki.

  7. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

    Directory of Open Access Journals (Sweden)

    Ali Dashti

    Full Text Available This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU. The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.

  8. Adaptation of Chain Event Graphs for use with Case-Control Studies in Epidemiology.

    Science.gov (United States)

    Keeble, Claire; Thwaites, Peter Adam; Barber, Stuart; Law, Graham Richard; Baxter, Paul David

    2017-09-26

    Case-control studies are used in epidemiology to try to uncover the causes of diseases, but are a retrospective study design known to suffer from non-participation and recall bias, which may explain their decreased popularity in recent years. Traditional analyses report usually only the odds ratio for given exposures and the binary disease status. Chain event graphs are a graphical representation of a statistical model derived from event trees which have been developed in artificial intelligence and statistics, and only recently introduced to the epidemiology literature. They are a modern Bayesian technique which enable prior knowledge to be incorporated into the data analysis using the agglomerative hierarchical clustering algorithm, used to form a suitable chain event graph. Additionally, they can account for missing data and be used to explore missingness mechanisms. Here we adapt the chain event graph framework to suit scenarios often encountered in case-control studies, to strengthen this study design which is time and financially efficient. We demonstrate eight adaptations to the graphs, which consist of two suitable for full case-control study analysis, four which can be used in interim analyses to explore biases, and two which aim to improve the ease and accuracy of analyses. The adaptations are illustrated with complete, reproducible, fully-interpreted examples, including the event tree and chain event graph. Chain event graphs are used here for the first time to summarise non-participation, data collection techniques, data reliability, and disease severity in case-control studies. We demonstrate how these features of a case-control study can be incorporated into the analysis to provide further insight, which can help to identify potential biases and lead to more accurate study results.

  9. Approximation methods for efficient learning of Bayesian networks

    CERN Document Server

    Riggelsen, C

    2008-01-01

    This publication offers and investigates efficient Monte Carlo simulation methods in order to realize a Bayesian approach to approximate learning of Bayesian networks from both complete and incomplete data. For large amounts of incomplete data when Monte Carlo methods are inefficient, approximations are implemented, such that learning remains feasible, albeit non-Bayesian. The topics discussed are: basic concepts about probabilities, graph theory and conditional independence; Bayesian network learning from data; Monte Carlo simulation techniques; and, the concept of incomplete data. In order to provide a coherent treatment of matters, thereby helping the reader to gain a thorough understanding of the whole concept of learning Bayesian networks from (in)complete data, this publication combines in a clarifying way all the issues presented in the papers with previously unpublished work.

  10. The Beta Cell in Its Cluster: Stochastic Graphs of Beta Cell Connectivity in the Islets of Langerhans.

    Directory of Open Access Journals (Sweden)

    Deborah A Striegel

    2015-08-01

    Full Text Available Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.

  11. The Beta Cell in Its Cluster: Stochastic Graphs of Beta Cell Connectivity in the Islets of Langerhans.

    Science.gov (United States)

    Striegel, Deborah A; Hara, Manami; Periwal, Vipul

    2015-08-01

    Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.

  12. How Symmetric Are Real-World Graphs? A Large-Scale Study

    Directory of Open Access Journals (Sweden)

    Fabian Ball

    2018-01-01

    Full Text Available The analysis of symmetry is a main principle in natural sciences, especially physics. For network sciences, for example, in social sciences, computer science and data science, only a few small-scale studies of the symmetry of complex real-world graphs exist. Graph symmetry is a topic rooted in mathematics and is not yet well-received and applied in practice. This article underlines the importance of analyzing symmetry by showing the existence of symmetry in real-world graphs. An analysis of over 1500 graph datasets from the meta-repository networkrepository.com is carried out and a normalized version of the “network redundancy” measure is presented. It quantifies graph symmetry in terms of the number of orbits of the symmetry group from zero (no symmetries to one (completely symmetric, and improves the recognition of asymmetric graphs. Over 70% of the analyzed graphs contain symmetries (i.e., graph automorphisms, independent of size and modularity. Therefore, we conclude that real-world graphs are likely to contain symmetries. This contribution is the first larger-scale study of symmetry in graphs and it shows the necessity of handling symmetry in data analysis: The existence of symmetries in graphs is the cause of two problems in graph clustering we are aware of, namely, the existence of multiple equivalent solutions with the same value of the clustering criterion and, secondly, the inability of all standard partition-comparison measures of cluster analysis to identify automorphic partitions as equivalent.

  13. Generating hierarchial scale-free graphs from fractals

    Energy Technology Data Exchange (ETDEWEB)

    Komjathy, Julia, E-mail: komyju@math.bme.hu [Department of Stochastics, Institute of Mathematics, Technical University of Budapest, H-1529 P.O. Box 91 (Hungary); Simon, Karoly, E-mail: simonk@math.bme.hu [Department of Stochastics, Institute of Mathematics, Technical University of Budapest, H-1529 P.O. Box 91 (Hungary)

    2011-08-15

    Highlights: > We generate deterministic scale-free networks using graph-directed self similar IFS. > Our model exhibits similar clustering, power law decay properties to real networks. > The average length of shortest path and the diameter of the graph are determined. > Using this model, we generate random graphs with prescribed power law exponent. - Abstract: Motivated by the hierarchial network model of E. Ravasz, A.-L. Barabasi, and T. Vicsek, we introduce deterministic scale-free networks derived from a graph directed self-similar fractal {Lambda}. With rigorous mathematical results we verify that our model captures some of the most important features of many real networks: the scale-free and the high clustering properties. We also prove that the diameter is the logarithm of the size of the system. We point out a connection between the power law exponent of the degree distribution and some intrinsic geometric measure theoretical properties of the underlying fractal. Using our (deterministic) fractal {Lambda} we generate random graph sequence sharing similar properties.

  14. GraphAlignment: Bayesian pairwise alignment of biological networks

    Czech Academy of Sciences Publication Activity Database

    Kolář, Michal; Meier, J.; Mustonen, V.; Lässig, M.; Berg, J.

    2012-01-01

    Roč. 6, November 21 (2012) ISSN 1752-0509 Grant - others:Deutsche Forschungsgemeinschaft(DE) SFB 680; Deutsche Forschungsgemeinschaft(DE) SFB-TR12; Deutsche Forschungsgemeinschaft(DE) BE 2478/2-1 Institutional research plan: CEZ:AV0Z50520514 Keywords : Graph alignment * Biological networks * Parameter estimation * Bioconductor Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.982, year: 2012

  15. Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph

    KAUST Repository

    Alabdulmohsin, Ibrahim; Han, Yufei; Shen, Yun; Zhang, Xiangliang

    2016-01-01

    graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node

  16. Modelling of JET diagnostics using Bayesian Graphical Models

    Energy Technology Data Exchange (ETDEWEB)

    Svensson, J. [IPP Greifswald, Greifswald (Germany); Ford, O. [Imperial College, London (United Kingdom); McDonald, D.; Hole, M.; Nessi, G. von; Meakins, A.; Brix, M.; Thomsen, H.; Werner, A.; Sirinelli, A.

    2011-07-01

    The mapping between physics parameters (such as densities, currents, flows, temperatures etc) defining the plasma 'state' under a given model and the raw observations of each plasma diagnostic will 1) depend on the particular physics model used, 2) is inherently probabilistic, from uncertainties on both observations and instrumental aspects of the mapping, such as calibrations, instrument functions etc. A flexible and principled way of modelling such interconnected probabilistic systems is through so called Bayesian graphical models. Being an amalgam between graph theory and probability theory, Bayesian graphical models can simulate the complex interconnections between physics models and diagnostic observations from multiple heterogeneous diagnostic systems, making it relatively easy to optimally combine the observations from multiple diagnostics for joint inference on parameters of the underlying physics model, which in itself can be represented as part of the graph. At JET about 10 diagnostic systems have to date been modelled in this way, and has lead to a number of new results, including: the reconstruction of the flux surface topology and q-profiles without any specific equilibrium assumption, using information from a number of different diagnostic systems; profile inversions taking into account the uncertainties in the flux surface positions and a substantial increase in accuracy of JET electron density and temperature profiles, including improved pedestal resolution, through the joint analysis of three diagnostic systems. It is believed that the Bayesian graph approach could potentially be utilised for very large sets of diagnostics, providing a generic data analysis framework for nuclear fusion experiments, that would be able to optimally utilize the information from multiple diagnostics simultaneously, and where the explicit graph representation of the connections to underlying physics models could be used for sophisticated model testing. This

  17. Current trends in Bayesian methodology with applications

    CERN Document Server

    Upadhyay, Satyanshu K; Dey, Dipak K; Loganathan, Appaia

    2015-01-01

    Collecting Bayesian material scattered throughout the literature, Current Trends in Bayesian Methodology with Applications examines the latest methodological and applied aspects of Bayesian statistics. The book covers biostatistics, econometrics, reliability and risk analysis, spatial statistics, image analysis, shape analysis, Bayesian computation, clustering, uncertainty assessment, high-energy astrophysics, neural networking, fuzzy information, objective Bayesian methodologies, empirical Bayes methods, small area estimation, and many more topics.Each chapter is self-contained and focuses on

  18. Using consensus bayesian network to model the reactive oxygen species regulatory pathway.

    Directory of Open Access Journals (Sweden)

    Liangdong Hu

    Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.

  19. Bell inequalities for graph states

    International Nuclear Information System (INIS)

    Toth, G.; Hyllus, P.; Briegel, H.J.; Guehne, O.

    2005-01-01

    Full text: In the last years graph states have attracted an increasing interest in the field of quantum information theory. Graph states form a family of multi-qubit states which comprises many popular states such as the GHZ states and the cluster states. They also play an important role in applications. For instance, measurement based quantum computation uses graph states as resources. From a theoretical point of view, it is remarkable that graph states allow for a simple description in terms of stabilizing operators. In this contribution, we investigate the non-local properties of graph states. We derive a family of Bell inequalities which require three measurement settings for each party and are maximally violated by graph states. In turn, any graph state violates at least one of the inequalities. We show that for certain types of graph states the violation of these inequalities increases exponentially with the number of qubits. We also discuss connections to other entanglement properties such as the positively of the partial transpose or the geometric measure of entanglement. (author)

  20. Why so GLUMM? Detecting depression clusters through graphing lifestyle-environs using machine-learning methods (GLUMM).

    Science.gov (United States)

    Dipnall, J F; Pasco, J A; Berk, M; Williams, L J; Dodd, S; Jacka, F N; Meyer, D

    2017-01-01

    Key lifestyle-environ risk factors are operative for depression, but it is unclear how risk factors cluster. Machine-learning (ML) algorithms exist that learn, extract, identify and map underlying patterns to identify groupings of depressed individuals without constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through "Graphing lifestyle-environs using machine-learning methods" (GLUMM). Two ML algorithms were implemented: unsupervised Self-organised mapping (SOM) to create GLUMM clusters and a supervised boosted regression algorithm to describe clusters. Ninety-six "lifestyle-environ" variables were used from the National health and nutrition examination study (2009-2010). Multivariate logistic regression validated clusters and controlled for possible sociodemographic confounders. The SOM identified two GLUMM cluster solutions. These solutions contained one dominant depressed cluster (GLUMM5-1, GLUMM7-1). Equal proportions of members in each cluster rated as highly depressed (17%). Alcohol consumption and demographics validated clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; unhealthy eating; ≤2 years in their home; an old home; perceive themselves underweight; exposed to work fumes; experienced sex at ≤14 years; not perform moderate recreational activities. A positive relationship between GLUMM5-1 (OR: 7.50, Pdepression was found, with significant interactions with those married/living with partner (P=0.001). Using ML based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data to uncover better understandings into relationships between the complex mental health factors. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  1. High Dimensional Spectral Graph Theory and Non-backtracking Random Walks on Graphs

    Science.gov (United States)

    Kempton, Mark

    This thesis has two primary areas of focus. First we study connection graphs, which are weighted graphs in which each edge is associated with a d-dimensional rotation matrix for some fixed dimension d, in addition to a scalar weight. Second, we study non-backtracking random walks on graphs, which are random walks with the additional constraint that they cannot return to the immediately previous state at any given step. Our work in connection graphs is centered on the notion of consistency, that is, the product of rotations moving from one vertex to another is independent of the path taken, and a generalization called epsilon-consistency. We present higher dimensional versions of the combinatorial Laplacian matrix and normalized Laplacian matrix from spectral graph theory, and give results characterizing the consistency of a connection graph in terms of the spectra of these matrices. We generalize several tools from classical spectral graph theory, such as PageRank and effective resistance, to apply to connection graphs. We use these tools to give algorithms for sparsification, clustering, and noise reduction on connection graphs. In non-backtracking random walks, we address the question raised by Alon et. al. concerning how the mixing rate of a non-backtracking random walk to its stationary distribution compares to the mixing rate for an ordinary random walk. Alon et. al. address this question for regular graphs. We take a different approach, and use a generalization of Ihara's Theorem to give a new proof of Alon's result for regular graphs, and to extend the result to biregular graphs. Finally, we give a non-backtracking version of Polya's Random Walk Theorem for 2-dimensional grids.

  2. Dense graph limits under respondent-driven sampling

    Indian Academy of Sciences (India)

    Siva Athreya

    Types of network. Food. Political. Social. Linkedin. Protien. Professional. Source: IUPUI Network Sampling course. Page 3. Techniques to Analyse Networks. Network Characteristics: • Distribution: degree, clustering coefficient, hop-plot. ... Hypothesis Testing: H0: Graph is a linkedin network. H1: Graph is a facebook network ...

  3. A comparison between fault tree analysis and reliability graph with general gates

    International Nuclear Information System (INIS)

    Kim, Man Cheol; Seong, Poong Hyun; Jung, Woo Sik

    2004-01-01

    Currently, level-1 probabilistic safety assessment (PSA) is performed on the basis of event tree analysis and fault tree analysis. Kim and Seong developed a new method for system reliability analysis named reliability graph with general gates (RGGG). The RGGG is an extension of conventional reliability graph, and it utilizes the transformation of system structures to equivalent Bayesian networks for quantitative calculation. The RGGG is considered to be intuitive and easy-to-use while as powerful as fault tree analysis. As an example, Kim and Seong already showed that the Bayesian network model for digital plant protection system (DPPS), which is transformed from the RGGG model for DPPS, can be shown in 1 page, while the fault tree model for DPPS consists of 64 pages of fault trees. Kim and Seong also insisted that Bayesian network model for DPPS is more intuitive because the one-to-one matching between each node in the Bayesian network model and an actual component of DPPS is possible. In this paper, we are going to give a comparison between fault tree analysis and the RGGG method with two example systems. The two example systems are the recirculation of in Korean standard nuclear power plants (KSNP) and the fault tree model developed by Rauzy

  4. Bayesian networks improve causal environmental ...

    Science.gov (United States)

    Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on value

  5. Spectral clustering and biclustering learning large graphs and contingency tables

    CERN Document Server

    Bolla, Marianna

    2013-01-01

    Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, mult

  6. Bayesian pedigree inference with small numbers of single nucleotide polymorphisms via a factor-graph representation.

    Science.gov (United States)

    Anderson, Eric C; Ng, Thomas C

    2016-02-01

    We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.

  7. Clustering execution in a processing system to increase power savings

    Energy Technology Data Exchange (ETDEWEB)

    Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.; Vega, Augusto J.

    2018-04-03

    Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.

  8. Clustering execution in a processing system to increase power savings

    Science.gov (United States)

    Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.; Vega, Augusto J.

    2018-03-20

    Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.

  9. Advances in Bayesian Model Based Clustering Using Particle Learning

    Energy Technology Data Exchange (ETDEWEB)

    Merl, D M

    2009-11-19

    Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original

  10. Interactive Graph Layout of a Million Nodes

    Directory of Open Access Journals (Sweden)

    Peng Mi

    2016-12-01

    Full Text Available Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph topology. This algorithm can interactively layout graphs with millions of nodes, and support real-time interaction to explore alternative graph layouts. Users can directly manipulate the layout of vertices in a force-directed fashion. The complexity of traditional repulsive force computation is reduced by approximating calculations based on the hierarchical structure of multi-level clustered graphs. We evaluate the algorithm performance, and demonstrate human-in-the-loop layout in two sensemaking case studies. Moreover, we summarize lessons learned for designing interactive large graph layout algorithms on the GPU.

  11. Risk Assessment for Mobile Systems Through a Multilayered Hierarchical Bayesian Network.

    Science.gov (United States)

    Li, Shancang; Tryfonas, Theo; Russell, Gordon; Andriotis, Panagiotis

    2016-08-01

    Mobile systems are facing a number of application vulnerabilities that can be combined together and utilized to penetrate systems with devastating impact. When assessing the overall security of a mobile system, it is important to assess the security risks posed by each mobile applications (apps), thus gaining a stronger understanding of any vulnerabilities present. This paper aims at developing a three-layer framework that assesses the potential risks which apps introduce within the Android mobile systems. A Bayesian risk graphical model is proposed to evaluate risk propagation in a layered risk architecture. By integrating static analysis, dynamic analysis, and behavior analysis in a hierarchical framework, the risks and their propagation through each layer are well modeled by the Bayesian risk graph, which can quantitatively analyze risks faced to both apps and mobile systems. The proposed hierarchical Bayesian risk graph model offers a novel way to investigate the security risks in mobile environment and enables users and administrators to evaluate the potential risks. This strategy allows to strengthen both app security as well as the security of the entire system.

  12. GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics1

    Energy Technology Data Exchange (ETDEWEB)

    Simmhan, Yogesh; Kumbhare, Alok; Wickramaarachchi, Charith; Nagarkar, Soonil; Ravi, Santosh; Raghavendra, Cauligi; Prasanna, Viktor

    2014-08-25

    Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines the scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.

  13. Learning a Health Knowledge Graph from Electronic Medical Records.

    Science.gov (United States)

    Rotmensch, Maya; Halpern, Yoni; Tlimat, Abdulhakim; Horng, Steven; Sontag, David

    2017-07-20

    Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).

  14. Verification of Bayesian Clustering in Travel Behaviour Research – First Step to Macroanalysis of Travel Behaviour

    Science.gov (United States)

    Satra, P.; Carsky, J.

    2018-04-01

    Our research is looking at the travel behaviour from a macroscopic view, taking one municipality as a basic unit. The travel behaviour of one municipality as a whole is becoming one piece of a data in the research of travel behaviour of a larger area, perhaps a country. A data pre-processing is used to cluster the municipalities in groups, which show similarities in their travel behaviour. Such groups can be then researched for reasons of their prevailing pattern of travel behaviour without any distortion caused by municipalities with a different pattern. This paper deals with actual settings of the clustering process, which is based on Bayesian statistics, particularly the mixture model. An optimization of the settings parameters based on correlation of pointer model parameters and relative number of data in clusters is helpful, however not fully reliable method. Thus, method for graphic representation of clusters needs to be developed in order to check their quality. A training of the setting parameters in 2D has proven to be a beneficial method, because it allows visual control of the produced clusters. The clustering better be applied on separate groups of municipalities, where competition of only identical transport modes can be found.

  15. Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow.

    Science.gov (United States)

    Wongsuphasawat, Kanit; Smilkov, Daniel; Wexler, James; Wilson, Jimbo; Mane, Dandelion; Fritz, Doug; Krishnan, Dilip; Viegas, Fernanda B; Wattenberg, Martin

    2018-01-01

    We present a design study of the TensorFlow Graph Visualizer, part of the TensorFlow machine intelligence platform. This tool helps users understand complex machine learning architectures by visualizing their underlying dataflow graphs. The tool works by applying a series of graph transformations that enable standard layout techniques to produce a legible interactive diagram. To declutter the graph, we decouple non-critical nodes from the layout. To provide an overview, we build a clustered graph using the hierarchical structure annotated in the source code. To support exploration of nested structure on demand, we perform edge bundling to enable stable and responsive cluster expansion. Finally, we detect and highlight repeated structures to emphasize a model's modular composition. To demonstrate the utility of the visualizer, we describe example usage scenarios and report user feedback. Overall, users find the visualizer useful for understanding, debugging, and sharing the structures of their models.

  16. Network reconstruction via graph blending

    Science.gov (United States)

    Estrada, Rolando

    2016-05-01

    Graphs estimated from empirical data are often noisy and incomplete due to the difficulty of faithfully observing all the components (nodes and edges) of the true graph. This problem is particularly acute for large networks where the number of components may far exceed available surveillance capabilities. Errors in the observed graph can render subsequent analyses invalid, so it is vital to develop robust methods that can minimize these observational errors. Errors in the observed graph may include missing and spurious components, as well fused (multiple nodes are merged into one) and split (a single node is misinterpreted as many) nodes. Traditional graph reconstruction methods are only able to identify missing or spurious components (primarily edges, and to a lesser degree nodes), so we developed a novel graph blending framework that allows us to cast the full estimation problem as a simple edge addition/deletion problem. Armed with this framework, we systematically investigate the viability of various topological graph features, such as the degree distribution or the clustering coefficients, and existing graph reconstruction methods for tackling the full estimation problem. Our experimental results suggest that incorporating any topological feature as a source of information actually hinders reconstruction accuracy. We provide a theoretical analysis of this phenomenon and suggest several avenues for improving this estimation problem.

  17. A model of language inflection graphs

    Science.gov (United States)

    Fukś, Henryk; Farzad, Babak; Cao, Yi

    2014-01-01

    Inflection graphs are highly complex networks representing relationships between inflectional forms of words in human languages. For so-called synthetic languages, such as Latin or Polish, they have particularly interesting structure due to the abundance of inflectional forms. We construct the simplest form of inflection graphs, namely a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. We, then, study projection of this graph on the set of headwords. The projection decomposes into a large number of connected components, to be called word groups. Distribution of sizes of word group exhibits some remarkable properties, resembling cluster distribution in a lattice percolation near the critical point. We propose a simple model which produces graphs of this type, reproducing the desired component distribution and other topological features.

  18. Bayesian investigation of isochrone consistency using the old open cluster NGC 188

    Energy Technology Data Exchange (ETDEWEB)

    Hills, Shane; Courteau, Stéphane [Department of Physics, Engineering Physics and Astronomy, Queen’s University, Kingston, ON K7L 3N6 Canada (Canada); Von Hippel, Ted [Department of Physical Sciences, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114 (United States); Geller, Aaron M., E-mail: shane.hills@queensu.ca, E-mail: courteau@astro.queensu.ca, E-mail: ted.vonhippel@erau.edu, E-mail: a-geller@northwestern.edu [Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States)

    2015-03-01

    This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color–magnitude diagrams (CMDs) depending on the filters and models used. We examine the consistency and reliability of fitting three widely used stellar evolution models to 15 combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to this study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities that enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically favored three-band combinations of UBV and VRI can be meaningfully inconsistent with each other and with longer baseline data sets such as UBVRIJHK{sub S}. Differences among model sets can also be substantial. For instance, fitting Yi et al. (2001) and Dotter et al. (2008) models to UBVRIJHK{sub S} photometry for NGC 188 yields the following cluster parameters: age = (5.78 ± 0.03, 6.45 ± 0.04) Gyr, [Fe/H] = (+0.125 ± 0.003, −0.077 ± 0.003) dex, (m−M){sub V} = (11.441 ± 0.007, 11.525 ± 0.005) mag, and A{sub V} = (0.162 ± 0.003, 0.236 ± 0.003) mag, respectively. Within the formal fitting errors, these two fits are substantially and statistically different. Such differences among fits using different filters and models are a cautionary tale regarding our current ability to fit star cluster CMDs. Additional modeling of this kind, with more models and star clusters, and future Gaia parallaxes are critical for isolating and quantifying the most relevant uncertainties in stellar evolutionary models.

  19. graphkernels: R and Python packages for graph comparison.

    Science.gov (United States)

    Sugiyama, Mahito; Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten

    2018-02-01

    Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch. Supplementary data are available online at Bioinformatics. © The Author(s) 2017. Published by Oxford University Press.

  20. Reproducibility of graph metrics in fMRI networks

    Directory of Open Access Journals (Sweden)

    Qawi K Telesford

    2010-12-01

    Full Text Available The reliability of graph metrics calculated in network analysis is essential to the interpretation of complex network organization. These graph metrics are used to deduce the small-world properties in networks. In this study, we investigated the test-retest reliability of graph metrics from functional magnetic resonance imaging (fMRI data collected for two runs in 45 healthy older adults. Graph metrics were calculated on data for both runs and compared using intraclass correlation coefficient (ICC statistics and Bland-Altman (BA plots. ICC scores describe the level of absolute agreement between two measurements and provide a measure of reproducibility. For mean graph metrics, ICC scores were high for clustering coefficient (ICC=0.86, global efficiency (ICC=0.83, path length (ICC=0.79, and local efficiency (ICC=0.75; the ICC score for degree was found to be low (ICC=0.29. ICC scores were also used to generate reproducibility maps in brain space to test voxel-wise reproducibility for unsmoothed and smoothed data. Reproducibility was uniform across the brain for global efficiency and path length, but was only high in network hubs for clustering coefficient, local efficiency and degree. BA plots were used to test the measurement repeatability of all graph metrics. All graph metrics fell within the limits for repeatability. Together, these results suggest that with exception of degree, mean graph metrics are reproducible and suitable for clinical studies. Further exploration is warranted to better understand reproducibility across the brain on a voxel-wise basis.

  1. Coupling graph perturbation theory with scalable parallel algorithms for large-scale enumeration of maximal cliques in biological graphs

    International Nuclear Information System (INIS)

    Samatova, N F; Schmidt, M C; Hendrix, W; Breimyer, P; Thomas, K; Park, B-H

    2008-01-01

    Data-driven construction of predictive models for biological systems faces challenges from data intensity, uncertainty, and computational complexity. Data-driven model inference is often considered a combinatorial graph problem where an enumeration of all feasible models is sought. The data-intensive and the NP-hard nature of such problems, however, challenges existing methods to meet the required scale of data size and uncertainty, even on modern supercomputers. Maximal clique enumeration (MCE) in a graph derived from such biological data is often a rate-limiting step in detecting protein complexes in protein interaction data, finding clusters of co-expressed genes in microarray data, or identifying clusters of orthologous genes in protein sequence data. We report two key advances that address this challenge. We designed and implemented the first (to the best of our knowledge) parallel MCE algorithm that scales linearly on thousands of processors running MCE on real-world biological networks with thousands and hundreds of thousands of vertices. In addition, we proposed and developed the Graph Perturbation Theory (GPT) that establishes a foundation for efficiently solving the MCE problem in perturbed graphs, which model the uncertainty in the data. GPT formulates necessary and sufficient conditions for detecting the differences between the sets of maximal cliques in the original and perturbed graphs and reduces the enumeration time by more than 80% compared to complete recomputation

  2. Bayesian Methods for Radiation Detection and Dosimetry

    CERN Document Server

    Groer, Peter G

    2002-01-01

    We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed comp...

  3. Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications

    International Nuclear Information System (INIS)

    Mungan, Muhittin; Ramasco, José J

    2010-01-01

    Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have been recently proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes and an expectation maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search and identification of stabilizing nodes constitutes thus a novel technique to characterize the relevance of nodes in complex networks

  4. Image-Based Edge Bundles : Simplified Visualization of Large Graphs

    NARCIS (Netherlands)

    Telea, A.; Ersoy, O.

    2010-01-01

    We present a new approach aimed at understanding the structure of connections in edge-bundling layouts. We combine the advantages of edge bundles with a bundle-centric simplified visual representation of a graph's structure. For this, we first compute a hierarchical edge clustering of a given graph

  5. Row—column visibility graph approach to two-dimensional landscapes

    International Nuclear Information System (INIS)

    Xiao Qin; Pan Xue; Li Xin-Li; Stephen Mutua; Yang Hui-Jie; Jiang Yan; Wang Jian-Yong; Zhang Qing-Jun

    2014-01-01

    A new concept, called the row—column visibility graph, is proposed to map two-dimensional landscapes to complex networks. A cluster coverage is introduced to describe the extensive property of node clusters on a Euclidean lattice. Graphs mapped from fractals generated with the probability redistribution model behave scale-free. They have pattern-induced hierarchical organizations and comparatively much more extensive structures. The scale-free exponent has a negative correlation with the Hurst exponent, however, there is no deterministic relation between them. Graphs for fractals generated with the midpoint displacement model are exponential networks. When the Hurst exponent is large enough (e.g., H > 0.5), the degree distribution decays much more slowly, the average coverage becomes significant large, and the initially hierarchical structure at H < 0.5 is destroyed completely. Hence, the row—column visibility graph can be used to detect the pattern-related new characteristics of two-dimensional landscapes. (interdisciplinary physics and related areas of science and technology)

  6. Voting-based consensus clustering for combining multiple clusterings of chemical structures

    Directory of Open Access Journals (Sweden)

    Saeed Faisal

    2012-12-01

    Full Text Available Abstract Background Although many consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few consensus clustering methods have been applied for combining multiple clusterings of chemical structures. It is known that any individual clustering method will not always give the best results for all types of applications. So, in this paper, three voting and graph-based consensus clusterings were used for combining multiple clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Results The cumulative voting-based aggregation algorithm (CVAA, cluster-based similarity partitioning algorithm (CSPA and hyper-graph partitioning algorithm (HGPA were examined. The F-measure and Quality Partition Index method (QPI were used to evaluate the clusterings and the results were compared to the Ward’s clustering method. The MDL Drug Data Report (MDDR dataset was used for experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The performance of voting-based consensus clustering method outperformed the Ward’s method using F-measure and QPI method for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed the Ward’s method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice to generate the ensembles, which give the highest values for both criteria. Conclusions The results of the experiments show that consensus clustering methods can improve the effectiveness of chemical structures clusterings. The cumulative voting-based aggregation algorithm (CVAA was the method of choice among consensus clustering methods.

  7. Bayesian networks and boundedly rational expectations

    OpenAIRE

    Ran Spiegler

    2014-01-01

    I present a framework for analyzing decision makers with an imperfect understanding of their environment's correlation structure. The framework borrows the tool of "Bayesian networks", which is ubiquitous in statistics and artificial intelligence. In the model, a decision maker faces an objective multivariate probability distribution (his own action is one of the random variables). He is characterized by a directed acyclic graph over the set of random variables. His subjective belief filters ...

  8. Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

    KAUST Repository

    Pearce, Roger

    2013-05-01

    We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE.

  9. Belief propagation and loop series on planar graphs

    International Nuclear Information System (INIS)

    Chertkov, Michael; Teodorescu, Razvan; Chernyak, Vladimir Y

    2008-01-01

    We discuss a generic model of Bayesian inference with binary variables defined on edges of a planar graph. The Loop Calculus approach of Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R) [cond-mat/0601487]; 2006 J. Stat. Mech. P06009 [cond-mat/0603189]) is used to evaluate the resulting series expansion for the partition function. We show that, for planar graphs, truncating the series at single-connected loops reduces, via a map reminiscent of the Fisher transformation (Fisher 1961 Phys. Rev. 124 1664), to evaluating the partition function of the dimer-matching model on an auxiliary planar graph. Thus, the truncated series can be easily re-summed, using the Pfaffian formula of Kasteleyn (1961 Physics 27 1209). This allows us to identify a big class of computationally tractable planar models reducible to a dimer model via the Belief Propagation (gauge) transformation. The Pfaffian representation can also be extended to the full Loop Series, in which case the expansion becomes a sum of Pfaffian contributions, each associated with dimer matchings on an extension to a subgraph of the original graph. Algorithmic consequences of the Pfaffian representation, as well as relations to quantum and non-planar models, are discussed

  10. Exact WKB analysis and cluster algebras

    International Nuclear Information System (INIS)

    Iwaki, Kohei; Nakanishi, Tomoki

    2014-01-01

    We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)

  11. Two-Phase and Graph-Based Clustering Methods for Accurate and Efficient Segmentation of Large Mass Spectrometry Images.

    Science.gov (United States)

    Dexter, Alex; Race, Alan M; Steven, Rory T; Barnes, Jennifer R; Hulme, Heather; Goodwin, Richard J A; Styles, Iain B; Bunch, Josephine

    2017-11-07

    Clustering is widely used in MSI to segment anatomical features and differentiate tissue types, but existing approaches are both CPU and memory-intensive, limiting their application to small, single data sets. We propose a new approach that uses a graph-based algorithm with a two-phase sampling method that overcomes this limitation. We demonstrate the algorithm on a range of sample types and show that it can segment anatomical features that are not identified using commonly employed algorithms in MSI, and we validate our results on synthetic MSI data. We show that the algorithm is robust to fluctuations in data quality by successfully clustering data with a designed-in variance using data acquired with varying laser fluence. Finally, we show that this method is capable of generating accurate segmentations of large MSI data sets acquired on the newest generation of MSI instruments and evaluate these results by comparison with histopathology.

  12. Bayesian ARTMAP for regression.

    Science.gov (United States)

    Sasu, L M; Andonie, R

    2013-10-01

    Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Learning Local Components to Understand Large Bayesian Networks

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Xiang, Yanping; Cordero, Jorge

    2009-01-01

    (domain experts) to extract accurate information from a large Bayesian network due to dimensional difficulty. We define a formulation of local components and propose a clustering algorithm to learn such local components given complete data. The algorithm groups together most inter-relevant attributes......Bayesian networks are known for providing an intuitive and compact representation of probabilistic information and allowing the creation of models over a large and complex domain. Bayesian learning and reasoning are nontrivial for a large Bayesian network. In parallel, it is a tough job for users...... in a domain. We evaluate its performance on three benchmark Bayesian networks and provide results in support. We further show that the learned components may represent local knowledge more precisely in comparison to the full Bayesian networks when working with a small amount of data....

  14. Identifying the source of farmed escaped Atlantic salmon (Salmo salar): Bayesian clustering analysis increases accuracy of assignment

    DEFF Research Database (Denmark)

    Glover, Kevin A.; Hansen, Michael Møller; Skaala, Oystein

    2009-01-01

    44 cages located on 26 farms in the Hardangerfjord, western Norway. This fjord represents one of the major salmon farming areas in Norway, with a production of 57,000 t in 2007. Based upon genetic data from 17 microsatellite markers, significant but highly variable differentiation was observed among....... Accuracy of assignment varied greatly among the individual samples. For the Bayesian clustered data set consisting of five genetic groups, overall accuracy of self-assignment was 99%, demonstrating the effectiveness of this strategy to significantly increase accuracy of assignment, albeit at the expense...

  15. Characteristic imsets for learning Bayesian network structure

    Czech Academy of Sciences Publication Activity Database

    Hemmecke, R.; Lindner, S.; Studený, Milan

    2012-01-01

    Roč. 53, č. 9 (2012), s. 1336-1349 ISSN 0888-613X R&D Projects: GA MŠk(CZ) 1M0572; GA ČR GA201/08/0539 Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * essential graph * standard imset * characteristic imset * LP relaxation of a polytope Subject RIV: BA - General Mathematics Impact factor: 1.729, year: 2012 http://library.utia.cas.cz/separaty/2012/MTR/studeny-0382596.pdf

  16. Characterization of inclusion neighbourhood in terms of the essential graph

    Czech Academy of Sciences Publication Activity Database

    Studený, Milan

    2005-01-01

    Roč. 38, č. 3 (2005), s. 283-309 ISSN 0888-613X R&D Projects: GA ČR GA201/04/0393; GA AV ČR IAA1075104 Institutional research plan: CEZ:AV0Z10750506 Keywords : learning Bayesian networks * inclusion neighbourhood * essential graph Subject RIV: BA - General Mathematics Impact factor: 0.959, year: 2005 http://library.utia.cas.cz/separaty/historie/studeny-0411275.pdf

  17. Bayesian community detection

    DEFF Research Database (Denmark)

    Mørup, Morten; Schmidt, Mikkel N

    2012-01-01

    Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....

  18. A graph-Laplacian-based feature extraction algorithm for neural spike sorting.

    Science.gov (United States)

    Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos

    2009-01-01

    Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.

  19. Bayesian analysis for exponential random graph models using the adaptive exchange sampler

    KAUST Repository

    Jin, Ick Hoon; Liang, Faming; Yuan, Ying

    2013-01-01

    Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we

  20. Noise reduction in protein-protein interaction graphs by the implementation of a novel weighting scheme

    Directory of Open Access Journals (Sweden)

    Moschopoulos Charalampos

    2011-06-01

    Full Text Available Abstract Background Recent technological advances applied to biology such as yeast-two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of protein interaction networks. These interaction networks represent a rich, yet noisy, source of data that could be used to extract meaningful information, such as protein complexes. Several interaction network weighting schemes have been proposed so far in the literature in order to eliminate the noise inherent in interactome data. In this paper, we propose a novel weighting scheme and apply it to the S. cerevisiae interactome. Complex prediction rates are improved by up to 39%, depending on the clustering algorithm applied. Results We adopt a two step procedure. During the first step, by applying both novel and well established protein-protein interaction (PPI weighting methods, weights are introduced to the original interactome graph based on the confidence level that a given interaction is a true-positive one. The second step applies clustering using established algorithms in the field of graph theory, as well as two variations of Spectral clustering. The clustered interactome networks are also cross-validated against the confirmed protein complexes present in the MIPS database. Conclusions The results of our experimental work demonstrate that interactome graph weighting methods clearly improve the clustering results of several clustering algorithms. Moreover, our proposed weighting scheme outperforms other approaches of PPI graph weighting.

  1. A spatio-temporal nonparametric Bayesian variable selection model of fMRI data for clustering correlated time courses.

    Science.gov (United States)

    Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina

    2014-07-15

    In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Evaluation of clustering algorithms for protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    van Helden Jacques

    2006-11-01

    Full Text Available Abstract Background Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism. In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies. High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions. The interactome can be represented as a graph where nodes correspond with proteins and edges with pairwise interactions. In recent years clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL, Restricted Neighborhood Search Clustering (RNSC, Super Paramagnetic Clustering (SPC, and Molecular Complex Detection (MCODE. Results A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions. Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes. We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values. We also evaluated their robustness to alterations of the test graph. We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes. Conclusion This

  3. Regular graph construction for semi-supervised learning

    International Nuclear Information System (INIS)

    Vega-Oliveros, Didier A; Berton, Lilian; Eberle, Andre Mantini; Lopes, Alneu de Andrade; Zhao, Liang

    2014-01-01

    Semi-supervised learning (SSL) stands out for using a small amount of labeled points for data clustering and classification. In this scenario graph-based methods allow the analysis of local and global characteristics of the available data by identifying classes or groups regardless data distribution and representing submanifold in Euclidean space. Most of methods used in literature for SSL classification do not worry about graph construction. However, regular graphs can obtain better classification accuracy compared to traditional methods such as k-nearest neighbor (kNN), since kNN benefits the generation of hubs and it is not appropriate for high-dimensionality data. Nevertheless, methods commonly used for generating regular graphs have high computational cost. We tackle this problem introducing an alternative method for generation of regular graphs with better runtime performance compared to methods usually find in the area. Our technique is based on the preferential selection of vertices according some topological measures, like closeness, generating at the end of the process a regular graph. Experiments using the global and local consistency method for label propagation show that our method provides better or equal classification rate in comparison with kNN

  4. Application of dynamic uncertain causality graph in spacecraft fault diagnosis: Logic cycle

    Science.gov (United States)

    Yao, Quanying; Zhang, Qin; Liu, Peng; Yang, Ping; Zhu, Ma; Wang, Xiaochen

    2017-04-01

    Intelligent diagnosis system are applied to fault diagnosis in spacecraft. Dynamic Uncertain Causality Graph (DUCG) is a new probability graphic model with many advantages. In the knowledge expression of spacecraft fault diagnosis, feedback among variables is frequently encountered, which may cause directed cyclic graphs (DCGs). Probabilistic graphical models (PGMs) such as bayesian network (BN) have been widely applied in uncertain causality representation and probabilistic reasoning, but BN does not allow DCGs. In this paper, DUGG is applied to fault diagnosis in spacecraft: introducing the inference algorithm for the DUCG to deal with feedback. Now, DUCG has been tested in 16 typical faults with 100% diagnosis accuracy.

  5. Accurate phenotyping: Reconciling approaches through Bayesian model averaging.

    Directory of Open Access Journals (Sweden)

    Carla Chia-Ming Chen

    Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.

  6. Efficient Extraction of High Centrality Vertices in Distributed Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Kumbhare, Alok [Univ. of Southern California, Los Angeles, CA (United States); Frincu, Marc [Univ. of Southern California, Los Angeles, CA (United States); Raghavendra, Cauligi S. [Univ. of Southern California, Los Angeles, CA (United States); Prasanna, Viktor K. [Univ. of Southern California, Los Angeles, CA (United States)

    2014-09-09

    Betweenness centrality (BC) is an important measure for identifying high value or critical vertices in graphs, in variety of domains such as communication networks, road networks, and social graphs. However, calculating betweenness values is prohibitively expensive and, more often, domain experts are interested only in the vertices with the highest centrality values. In this paper, we first propose a partition-centric algorithm (MS-BC) to calculate BC for a large distributed graph that optimizes resource utilization and improves overall performance. Further, we extend the notion of approximate BC by pruning the graph and removing a subset of edges and vertices that contribute the least to the betweenness values of other vertices (MSL-BC), which further improves the runtime performance. We evaluate the proposed algorithms using a mix of real-world and synthetic graphs on an HPC cluster and analyze its strengths and weaknesses. The experimental results show an improvement in performance of upto 12x for large sparse graphs as compared to the state-of-the-art, and at the same time highlights the need for better partitioning methods to enable a balanced workload across partitions for unbalanced graphs such as small-world or power-law graphs.

  7. SpectralNET – an application for spectral graph analysis and visualization

    Directory of Open Access Journals (Sweden)

    Schreiber Stuart L

    2005-10-01

    Full Text Available Abstract Background Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics. Networks of genes, proteins, small molecules, or other objects of study can be represented as graphs of nodes (vertices and interactions (edges that can carry different weights. SpectralNET is a flexible application for analyzing and visualizing these biological and chemical networks. Results Available both as a standalone .NET executable and as an ASP.NET web application, SpectralNET was designed specifically with the analysis of graph-theoretic metrics in mind, a computational task not easily accessible using currently available applications. Users can choose either to upload a network for analysis using a variety of input formats, or to have SpectralNET generate an idealized random network for comparison to a real-world dataset. Whichever graph-generation method is used, SpectralNET displays detailed information about each connected component of the graph, including graphs of degree distribution, clustering coefficient by degree, and average distance by degree. In addition, extensive information about the selected vertex is shown, including degree, clustering coefficient, various distance metrics, and the corresponding components of the adjacency, Laplacian, and normalized Laplacian eigenvectors. SpectralNET also displays several graph visualizations, including a linear dimensionality reduction for uploaded datasets (Principal Components Analysis and a non-linear dimensionality reduction that provides an elegant view of global graph structure (Laplacian eigenvectors. Conclusion SpectralNET provides an easily accessible means of analyzing graph-theoretic metrics for data modeling and dimensionality reduction. SpectralNET is publicly available as both a .NET application and an ASP.NET web application from http://chembank.broad.harvard.edu/resources/. Source code is

  8. Bayesian Methods for Radiation Detection and Dosimetry

    International Nuclear Information System (INIS)

    Peter G. Groer

    2002-01-01

    We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed compartmental activities. From the estimated probability densities of the model parameters we were able to derive the densities for compartmental activities for a two compartment catenary model at different times. We also calculated the average activities and their standard deviation for a simple two compartment model

  9. Label-based routing for a family of scale-free, modular, planar and unclustered graphs

    International Nuclear Information System (INIS)

    Comellas, Francesc; Miralles, Alicia

    2011-01-01

    We give an optimal labeling and routing algorithm for a family of scale-free, modular and planar graphs with zero clustering. The relevant properties of this family match those of some networks associated with technological and biological systems with a low clustering, including some electronic circuits and protein networks. The existence of an efficient routing protocol for this graph model should help when designing communication algorithms in real networks and also in the understanding of their dynamic processes.

  10. Mizan: Optimizing Graph Mining in Large Parallel Systems

    KAUST Repository

    Kalnis, Panos

    2012-03-01

    Extracting information from graphs, from nding shortest paths to complex graph mining, is essential for many ap- plications. Due to the shear size of modern graphs (e.g., social networks), processing must be done on large paral- lel computing infrastructures (e.g., the cloud). Earlier ap- proaches relied on the MapReduce framework, which was proved inadequate for graph algorithms. More recently, the message passing model (e.g., Pregel) has emerged. Although the Pregel model has many advantages, it is agnostic to the graph properties and the architecture of the underlying com- puting infrastructure, leading to suboptimal performance. In this paper, we propose Mizan, a layer between the users\\' code and the computing infrastructure. Mizan considers the structure of the input graph and the architecture of the in- frastructure in order to: (i) decide whether it is bene cial to generate a near-optimal partitioning of the graph in a pre- processing step, and (ii) choose between typical point-to- point message passing and a novel approach that puts com- puting nodes in a virtual overlay ring. We deployed Mizan on a small local Linux cluster, on the cloud (256 virtual machines in Amazon EC2), and on an IBM Blue Gene/P supercomputer (1024 CPUs). We show that Mizan executes common algorithms on very large graphs 1-2 orders of mag- nitude faster than MapReduce-based implementations and up to one order of magnitude faster than implementations relying on Pregel-like hash-based graph partitioning.

  11. Probability on graphs random processes on graphs and lattices

    CERN Document Server

    Grimmett, Geoffrey

    2018-01-01

    This introduction to some of the principal models in the theory of disordered systems leads the reader through the basics, to the very edge of contemporary research, with the minimum of technical fuss. Topics covered include random walk, percolation, self-avoiding walk, interacting particle systems, uniform spanning tree, random graphs, as well as the Ising, Potts, and random-cluster models for ferromagnetism, and the Lorentz model for motion in a random medium. This new edition features accounts of major recent progress, including the exact value of the connective constant of the hexagonal lattice, and the critical point of the random-cluster model on the square lattice. The choice of topics is strongly motivated by modern applications, and focuses on areas that merit further research. Accessible to a wide audience of mathematicians and physicists, this book can be used as a graduate course text. Each chapter ends with a range of exercises.

  12. Graphic Symbol Recognition using Graph Based Signature and Bayesian Network Classifier

    OpenAIRE

    Luqman, Muhammad Muzzamil; Brouard, Thierry; Ramel, Jean-Yves

    2010-01-01

    We present a new approach for recognition of complex graphic symbols in technical documents. Graphic symbol recognition is a well known challenge in the field of document image analysis and is at heart of most graphic recognition systems. Our method uses structural approach for symbol representation and statistical classifier for symbol recognition. In our system we represent symbols by their graph based signatures: a graphic symbol is vectorized and is converted to an attributed relational g...

  13. A Dynamic Bayesian Network Model for the Production and Inventory Control

    Science.gov (United States)

    Shin, Ji-Sun; Takazaki, Noriyuki; Lee, Tae-Hong; Kim, Jin-Il; Lee, Hee-Hyol

    In general, the production quantities and delivered goods are changed randomly and then the total stock is also changed randomly. This paper deals with the production and inventory control using the Dynamic Bayesian Network. Bayesian Network is a probabilistic model which represents the qualitative dependence between two or more random variables by the graph structure, and indicates the quantitative relations between individual variables by the conditional probability. The probabilistic distribution of the total stock is calculated through the propagation of the probability on the network. Moreover, an adjusting rule of the production quantities to maintain the probability of a lower limit and a ceiling of the total stock to certain values is shown.

  14. Topological structure of dictionary graphs

    International Nuclear Information System (INIS)

    Fuks, Henryk; Krzeminski, Mark

    2009-01-01

    We investigate the topological structure of the subgraphs of dictionary graphs constructed from WordNet and Moby thesaurus data. In the process of learning a foreign language, the learner knows only a subset of all words of the language, corresponding to a subgraph of a dictionary graph. When this subgraph grows with time, its topological properties change. We introduce the notion of the pseudocore and argue that the growth of the vocabulary roughly follows decreasing pseudocore numbers-that is, one first learns words with a high pseudocore number followed by smaller pseudocores. We also propose an alternative strategy for vocabulary growth, involving decreasing core numbers as opposed to pseudocore numbers. We find that as the core or pseudocore grows in size, the clustering coefficient first decreases, then reaches a minimum and starts increasing again. The minimum occurs when the vocabulary reaches a size between 10 3 and 10 4 . A simple model exhibiting similar behavior is proposed. The model is based on a generalized geometric random graph. Possible implications for language learning are discussed.

  15. Clustering and Bayesian hierarchical modeling for the definition of informative prior distributions in hydrogeology

    Science.gov (United States)

    Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.

    2017-12-01

    In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.

  16. Narrative Collage of Image Collections by Scene Graph Recombination.

    Science.gov (United States)

    Fang, Fei; Yi, Miao; Feng, Hui; Hu, Shenghong; Xiao, Chunxia

    2017-10-04

    Narrative collage is an interesting image editing art to summarize the main theme or storyline behind an image collection. We present a novel method to generate narrative images with plausible semantic scene structures. To achieve this goal, we introduce a layer graph and a scene graph to represent relative depth order and semantic relationship between image objects, respectively. We firstly cluster the input image collection to select representative images, and then extract a group of semantic salient objects from each representative image. Both Layer graphs and scene graphs are constructed and combined according to our specific rules for reorganizing the extracted objects in every image. We design an energy model to appropriately locate every object on the final canvas. Experiment results show that our method can produce competitive narrative collage result and works well on a wide range of image collections.

  17. Using Graph and Vertex Entropy to Compare Empirical Graphs with Theoretical Graph Models

    Directory of Open Access Journals (Sweden)

    Tomasz Kajdanowicz

    2016-09-01

    Full Text Available Over the years, several theoretical graph generation models have been proposed. Among the most prominent are: the Erdős–Renyi random graph model, Watts–Strogatz small world model, Albert–Barabási preferential attachment model, Price citation model, and many more. Often, researchers working with real-world data are interested in understanding the generative phenomena underlying their empirical graphs. They want to know which of the theoretical graph generation models would most probably generate a particular empirical graph. In other words, they expect some similarity assessment between the empirical graph and graphs artificially created from theoretical graph generation models. Usually, in order to assess the similarity of two graphs, centrality measure distributions are compared. For a theoretical graph model this means comparing the empirical graph to a single realization of a theoretical graph model, where the realization is generated from the given model using an arbitrary set of parameters. The similarity between centrality measure distributions can be measured using standard statistical tests, e.g., the Kolmogorov–Smirnov test of distances between cumulative distributions. However, this approach is both error-prone and leads to incorrect conclusions, as we show in our experiments. Therefore, we propose a new method for graph comparison and type classification by comparing the entropies of centrality measure distributions (degree centrality, betweenness centrality, closeness centrality. We demonstrate that our approach can help assign the empirical graph to the most similar theoretical model using a simple unsupervised learning method.

  18. Graph configuration model based evaluation of the education-occupation match.

    Science.gov (United States)

    Gadar, Laszlo; Abonyi, Janos

    2018-01-01

    To study education-occupation matchings we developed a bipartite network model of education to work transition and a graph configuration model based metric. We studied the career paths of 15 thousand Hungarian students based on the integrated database of the National Tax Administration, the National Health Insurance Fund, and the higher education information system of the Hungarian Government. A brief analysis of gender pay gap and the spatial distribution of over-education is presented to demonstrate the background of the research and the resulted open dataset. We highlighted the hierarchical and clustered structure of the career paths based on the multi-resolution analysis of the graph modularity. The results of the cluster analysis can support policymakers to fine-tune the fragmented program structure of higher education.

  19. Eulerian graph embeddings and trails confined to lattice tubes

    International Nuclear Information System (INIS)

    Soteros, C E

    2006-01-01

    Embeddings of graphs in sublattices of the square and simple cubic lattice known as tubes (or prisms) are considered. For such sublattices, two combinatorial bounds are obtained which each relate the number of embeddings of all closed eulerian graphs with k branch points (vertices of degree greater than two) to the number of self-avoiding polygons. From these bounds it is proved that the entropic critical exponent for the number of embeddings of closed eulerian graphs with k branch points is equal to k, and the entropic critical exponent for the number of closed trails with k branch points is equal to k + 1. One of the required combinatorial bounds is obtained via Madras' 1999 lattice cluster pattern theorem, which yields a bound on the number of ways to convert a self-avoiding polygon into a closed eulerian graph embedding with k branch points. The other combinatorial bound is established by constructing a method for sequentially removing branch points from a closed eulerian graph embedding; this yields a bound on the number of ways to convert a closed eulerian graph embedding into a self-avoiding polygon

  20. PERSEUS-HUB: Interactive and Collective Exploration of Large-Scale Graphs

    Directory of Open Access Journals (Sweden)

    Di Jin

    2017-07-01

    Full Text Available Graphs emerge naturally in many domains, such as social science, neuroscience, transportation engineering, and more. In many cases, such graphs have millions or billions of nodes and edges, and their sizes increase daily at a fast pace. How can researchers from various domains explore large graphs interactively and efficiently to find out what is ‘important’? How can multiple researchers explore a new graph dataset collectively and “help” each other with their findings? In this article, we present Perseus-Hub, a large-scale graph mining tool that computes a set of graph properties in a distributed manner, performs ensemble, multi-view anomaly detection to highlight regions that are worth investigating, and provides users with uncluttered visualization and easy interaction with complex graph statistics. Perseus-Hub uses a Spark cluster to calculate various statistics of large-scale graphs efficiently, and aggregates the results in a summary on the master node to support interactive user exploration. In Perseus-Hub, the visualized distributions of graph statistics provide preliminary analysis to understand a graph. To perform a deeper analysis, users with little prior knowledge can leverage patterns (e.g., spikes in the power-law degree distribution marked by other users or experts. Moreover, Perseus-Hub guides users to regions of interest by highlighting anomalous nodes and helps users establish a more comprehensive understanding about the graph at hand. We demonstrate our system through the case study on real, large-scale networks.

  1. Towards port sustainability through probabilistic models: Bayesian networks

    Directory of Open Access Journals (Sweden)

    B. Molina

    2018-04-01

    Full Text Available It is necessary that a manager of an infrastructure knows relations between variables. Using Bayesian networks, variables can be classified, predicted and diagnosed, being able to estimate posterior probability of the unknown ones based on known ones. The proposed methodology has generated a database with port variables, which have been classified as economic, social, environmental and institutional, as addressed in of smart ports studies made in all Spanish Port System. Network has been developed using an acyclic directed graph, which have let us know relationships in terms of parents and sons. In probabilistic terms, it can be concluded from the constructed network that the most decisive variables for port sustainability are those that are part of the institutional dimension. It has been concluded that Bayesian networks allow modeling uncertainty probabilistically even when the number of variables is high as it occurs in port planning and exploitation.

  2. Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

    Science.gov (United States)

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magdison,…

  3. Cluster editing

    DEFF Research Database (Denmark)

    Böcker, S.; Baumbach, Jan

    2013-01-01

    . The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications......The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side...

  4. Bayesian feature weighting for unsupervised learning, with application to object recognition

    OpenAIRE

    Carbonetto , Peter; De Freitas , Nando; Gustafson , Paul; Thompson , Natalie

    2003-01-01

    International audience; We present a method for variable selection/weighting in an unsupervised learning context using Bayesian shrinkage. The basis for the model parameters and cluster assignments can be computed simultaneous using an efficient EM algorithm. Applying our Bayesian shrinkage model to a complex problem in object recognition (Duygulu, Barnard, de Freitas and Forsyth 2002), our experiments yied good results.

  5. Macroscopic Models of Clique Tree Growth for Bayesian Networks

    Data.gov (United States)

    National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...

  6. Graph analysis of cell clusters forming vascular networks

    Science.gov (United States)

    Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.

    2018-03-01

    This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequence of images of the vascular growth were thresholded, and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessels density, number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics such as: the degree distribution, average clustering coefficient, average short path length and assortativity, among others. This analysis allows to monitor how the largest connected cluster of the vascular network evolves in time and provides with quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.

  7. Spanning Tree Based Attribute Clustering

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Jorge, Cordero Hernandez

    2009-01-01

    Attribute clustering has been previously employed to detect statistical dependence between subsets of variables. We propose a novel attribute clustering algorithm motivated by research of complex networks, called the Star Discovery algorithm. The algorithm partitions and indirectly discards...... inconsistent edges from a maximum spanning tree by starting appropriate initial modes, therefore generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering...

  8. Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2016-10-26

    Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationship and the topology of the file distribution network. The integration offers a unique ability of structuring the end-to-end distribution relationship. However, it brings large heterogeneous graphs to analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source codes nor inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy. © 2016 Copyright held by the owner/author(s).

  9. A graph rewriting programming language for graph drawing

    OpenAIRE

    Rodgers, Peter

    1998-01-01

    This paper describes Grrr, a prototype visual graph drawing tool. Previously there were no visual languages for programming graph drawing algorithms despite the inherently visual nature of the process. The languages which gave a diagrammatic view of graphs were not computationally complete and so could not be used to implement complex graph drawing algorithms. Hence current graph drawing tools are all text based. Recent developments in graph rewriting systems have produced computationally com...

  10. Graph Regularized Auto-Encoders for Image Representation.

    Science.gov (United States)

    Yiyi Liao; Yue Wang; Yong Liu

    2017-06-01

    Image representation has been intensively explored in the domain of computer vision for its significant influence on the relative tasks such as image clustering and classification. It is valuable to learn a low-dimensional representation of an image which preserves its inherent information from the original image space. At the perspective of manifold learning, this is implemented with the local invariant idea to capture the intrinsic low-dimensional manifold embedded in the high-dimensional input space. Inspired by the recent successes of deep architectures, we propose a local invariant deep nonlinear mapping algorithm, called graph regularized auto-encoder (GAE). With the graph regularization, the proposed method preserves the local connectivity from the original image space to the representation space, while the stacked auto-encoders provide explicit encoding model for fast inference and powerful expressive capacity for complex modeling. Theoretical analysis shows that the graph regularizer penalizes the weighted Frobenius norm of the Jacobian matrix of the encoder mapping, where the weight matrix captures the local property in the input space. Furthermore, the underlying effects on the hidden representation space are revealed, providing insightful explanation to the advantage of the proposed method. Finally, the experimental results on both clustering and classification tasks demonstrate the effectiveness of our GAE as well as the correctness of the proposed theoretical analysis, and it also suggests that GAE is a superior solution to the current deep representation learning techniques comparing with variant auto-encoders and existing local invariant methods.

  11. Data clustering algorithms and applications

    CERN Document Server

    Aggarwal, Charu C

    2013-01-01

    Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains.The book focuses on three primary aspects of data clustering: Methods, describing key techniques commonly used for clustering, such as fea

  12. Graph Aggregation

    NARCIS (Netherlands)

    Endriss, U.; Grandi, U.

    Graph aggregation is the process of computing a single output graph that constitutes a good compromise between several input graphs, each provided by a different source. One needs to perform graph aggregation in a wide variety of situations, e.g., when applying a voting rule (graphs as preference

  13. Graph theoretical model of a sensorimotor connectome in zebrafish.

    Science.gov (United States)

    Stobb, Michael; Peterson, Joshua M; Mazzag, Borbala; Gahtan, Ethan

    2012-01-01

    Mapping the detailed connectivity patterns (connectomes) of neural circuits is a central goal of neuroscience. The best quantitative approach to analyzing connectome data is still unclear but graph theory has been used with success. We present a graph theoretical model of the posterior lateral line sensorimotor pathway in zebrafish. The model includes 2,616 neurons and 167,114 synaptic connections. Model neurons represent known cell types in zebrafish larvae, and connections were set stochastically following rules based on biological literature. Thus, our model is a uniquely detailed computational representation of a vertebrate connectome. The connectome has low overall connection density, with 2.45% of all possible connections, a value within the physiological range. We used graph theoretical tools to compare the zebrafish connectome graph to small-world, random and structured random graphs of the same size. For each type of graph, 100 randomly generated instantiations were considered. Degree distribution (the number of connections per neuron) varied more in the zebrafish graph than in same size graphs with less biological detail. There was high local clustering and a short average path length between nodes, implying a small-world structure similar to other neural connectomes and complex networks. The graph was found not to be scale-free, in agreement with some other neural connectomes. An experimental lesion was performed that targeted three model brain neurons, including the Mauthner neuron, known to control fast escape turns. The lesion decreased the number of short paths between sensory and motor neurons analogous to the behavioral effects of the same lesion in zebrafish. This model is expandable and can be used to organize and interpret a growing database of information on the zebrafish connectome.

  14. Mining chemical reactions using neighborhood behavior and condensed graphs of reactions approaches.

    Science.gov (United States)

    de Luca, Aurélie; Horvath, Dragos; Marcou, Gilles; Solov'ev, Vitaly; Varnek, Alexandre

    2012-09-24

    This work addresses the problem of similarity search and classification of chemical reactions using Neighborhood Behavior (NB) and Condensed Graphs of Reaction (CGR) approaches. The CGR formalism represents chemical reactions as a classical molecular graph with dynamic bonds, enabling descriptor calculations on this graph. Different types of the ISIDA fragment descriptors generated for CGRs in combination with two metrics--Tanimoto and Euclidean--were considered as chemical spaces, to serve for reaction dissimilarity scoring. The NB method has been used to select an optimal combination of descriptors which distinguish different types of chemical reactions in a database containing 8544 reactions of 9 classes. Relevance of NB analysis has been validated in generic (multiclass) similarity search and in clustering with Self-Organizing Maps (SOM). NB-compliant sets of descriptors were shown to display enhanced mapping propensities, allowing the construction of better Self-Organizing Maps and similarity searches (NB and classical similarity search criteria--AUC ROC--correlate at a level of 0.7). The analysis of the SOM clusters proved chemically meaningful CGR substructures representing specific reaction signatures.

  15. Phase transition in the parametric natural visibility graph.

    Science.gov (United States)

    Snarskii, A A; Bezsudnov, I V

    2016-10-01

    We investigate time series by mapping them to the complex networks using a parametric natural visibility graph (PNVG) algorithm that generates graphs depending on arbitrary continuous parameter-the angle of view. We study the behavior of the relative number of clusters in PNVG near the critical value of the angle of view. Artificial and experimental time series of different nature are used for numerical PNVG investigations to find critical exponents above and below the critical point as well as the exponent in the finite size scaling regime. Altogether, they allow us to find the critical exponent of the correlation length for PNVG. The set of calculated critical exponents satisfies the basic Widom relation. The PNVG is found to demonstrate scaling behavior. Our results reveal the similarity between the behavior of the relative number of clusters in PNVG and the order parameter in the second-order phase transitions theory. We show that the PNVG is another example of a system (in addition to magnetic, percolation, superconductivity, etc.) with observed second-order phase transition.

  16. Proxy Graph: Visual Quality Metrics of Big Graph Sampling.

    Science.gov (United States)

    Nguyen, Quan Hoang; Hong, Seok-Hee; Eades, Peter; Meidiana, Amyra

    2017-06-01

    Data sampling has been extensively studied for large scale graph mining. Many analyses and tasks become more efficient when performed on graph samples of much smaller size. The use of proxy objects is common in software engineering for analysis and interaction with heavy objects or systems. In this paper, we coin the term 'proxy graph' and empirically investigate how well a proxy graph visualization can represent a big graph. Our investigation focuses on proxy graphs obtained by sampling; this is one of the most common proxy approaches. Despite the plethora of data sampling studies, this is the first evaluation of sampling in the context of graph visualization. For an objective evaluation, we propose a new family of quality metrics for visual quality of proxy graphs. Our experiments cover popular sampling techniques. Our experimental results lead to guidelines for using sampling-based proxy graphs in visualization.

  17. A Multilevel Gamma-Clustering Layout Algorithm for Visualization of Biological Networks

    Science.gov (United States)

    Hruz, Tomas; Lucas, Christoph; Laule, Oliver; Zimmermann, Philip

    2013-01-01

    Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs. PMID:23864855

  18. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the realtionship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms-Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.

  19. Bayesian Estimation and Inference using Stochastic Hardware

    Directory of Open Access Journals (Sweden)

    Chetan Singh Thakur

    2016-03-01

    Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker, demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND, we show how inference can be performed in a Directed Acyclic Graph (DAG using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.

  20. Bayesian Estimation and Inference Using Stochastic Electronics.

    Science.gov (United States)

    Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André

    2016-01-01

    In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.

  1. Bipartite Graphs as Models of Population Structures in Evolutionary Multiplayer Games

    Science.gov (United States)

    Peña, Jorge; Rochat, Yannick

    2012-01-01

    By combining evolutionary game theory and graph theory, “games on graphs” study the evolutionary dynamics of frequency-dependent selection in population structures modeled as geographical or social networks. Networks are usually represented by means of unipartite graphs, and social interactions by two-person games such as the famous prisoner’s dilemma. Unipartite graphs have also been used for modeling interactions going beyond pairwise interactions. In this paper, we argue that bipartite graphs are a better alternative to unipartite graphs for describing population structures in evolutionary multiplayer games. To illustrate this point, we make use of bipartite graphs to investigate, by means of computer simulations, the evolution of cooperation under the conventional and the distributed N-person prisoner’s dilemma. We show that several implicit assumptions arising from the standard approach based on unipartite graphs (such as the definition of replacement neighborhoods, the intertwining of individual and group diversity, and the large overlap of interaction neighborhoods) can have a large impact on the resulting evolutionary dynamics. Our work provides a clear example of the importance of construction procedures in games on graphs, of the suitability of bigraphs and hypergraphs for computational modeling, and of the importance of concepts from social network analysis such as centrality, centralization and bipartite clustering for the understanding of dynamical processes occurring on networked population structures. PMID:22970237

  2. Graph theoretical model of a sensorimotor connectome in zebrafish.

    Directory of Open Access Journals (Sweden)

    Michael Stobb

    Full Text Available Mapping the detailed connectivity patterns (connectomes of neural circuits is a central goal of neuroscience. The best quantitative approach to analyzing connectome data is still unclear but graph theory has been used with success. We present a graph theoretical model of the posterior lateral line sensorimotor pathway in zebrafish. The model includes 2,616 neurons and 167,114 synaptic connections. Model neurons represent known cell types in zebrafish larvae, and connections were set stochastically following rules based on biological literature. Thus, our model is a uniquely detailed computational representation of a vertebrate connectome. The connectome has low overall connection density, with 2.45% of all possible connections, a value within the physiological range. We used graph theoretical tools to compare the zebrafish connectome graph to small-world, random and structured random graphs of the same size. For each type of graph, 100 randomly generated instantiations were considered. Degree distribution (the number of connections per neuron varied more in the zebrafish graph than in same size graphs with less biological detail. There was high local clustering and a short average path length between nodes, implying a small-world structure similar to other neural connectomes and complex networks. The graph was found not to be scale-free, in agreement with some other neural connectomes. An experimental lesion was performed that targeted three model brain neurons, including the Mauthner neuron, known to control fast escape turns. The lesion decreased the number of short paths between sensory and motor neurons analogous to the behavioral effects of the same lesion in zebrafish. This model is expandable and can be used to organize and interpret a growing database of information on the zebrafish connectome.

  3. Clustering by reordering of similarity and Laplacian matrices: Application to galaxy clusters

    Science.gov (United States)

    Mahmoud, E.; Shoukry, A.; Takey, A.

    2018-04-01

    Similarity metrics, kernels and similarity-based algorithms have gained much attention due to their increasing applications in information retrieval, data mining, pattern recognition and machine learning. Similarity Graphs are often adopted as the underlying representation of similarity matrices and are at the origin of known clustering algorithms such as spectral clustering. Similarity matrices offer the advantage of working in object-object (two-dimensional) space where visualization of clusters similarities is available instead of object-features (multi-dimensional) space. In this paper, sparse ɛ-similarity graphs are constructed and decomposed into strong components using appropriate methods such as Dulmage-Mendelsohn permutation (DMperm) and/or Reverse Cuthill-McKee (RCM) algorithms. The obtained strong components correspond to groups (clusters) in the input (feature) space. Parameter ɛi is estimated locally, at each data point i from a corresponding narrow range of the number of nearest neighbors. Although more advanced clustering techniques are available, our method has the advantages of simplicity, better complexity and direct visualization of the clusters similarities in a two-dimensional space. Also, no prior information about the number of clusters is needed. We conducted our experiments on two and three dimensional, low and high-sized synthetic datasets as well as on an astronomical real-dataset. The results are verified graphically and analyzed using gap statistics over a range of neighbors to verify the robustness of the algorithm and the stability of the results. Combining the proposed algorithm with gap statistics provides a promising tool for solving clustering problems. An astronomical application is conducted for confirming the existence of 45 galaxy clusters around the X-ray positions of galaxy clusters in the redshift range [0.1..0.8]. We re-estimate the photometric redshifts of the identified galaxy clusters and obtain acceptable values

  4. A Bayesian method for inferring transmission chains in a partially observed epidemic.

    Energy Technology Data Exchange (ETDEWEB)

    Marzouk, Youssef M.; Ray, Jaideep

    2008-10-01

    We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.

  5. Bayesian networks modeling for thermal error of numerical control machine tools

    Institute of Scientific and Technical Information of China (English)

    Xin-hua YAO; Jian-zhong FU; Zi-chen CHEN

    2008-01-01

    The interaction between the heat source location,its intensity,thermal expansion coefficient,the machine system configuration and the running environment creates complex thermal behavior of a machine tool,and also makes thermal error prediction difficult.To address this issue,a novel prediction method for machine tool thermal error based on Bayesian networks (BNs) was presented.The method described causal relationships of factors inducing thermal deformation by graph theory and estimated the thermal error by Bayesian statistical techniques.Due to the effective combination of domain knowledge and sampled data,the BN method could adapt to the change of running state of machine,and obtain satisfactory prediction accuracy.Ex-periments on spindle thermal deformation were conducted to evaluate the modeling performance.Experimental results indicate that the BN method performs far better than the least squares(LS)analysis in terms of modeling estimation accuracy.

  6. Chromatic graph theory

    CERN Document Server

    Chartrand, Gary; Rosen, Kenneth H

    2008-01-01

    Beginning with the origin of the four color problem in 1852, the field of graph colorings has developed into one of the most popular areas of graph theory. Introducing graph theory with a coloring theme, Chromatic Graph Theory explores connections between major topics in graph theory and graph colorings as well as emerging topics. This self-contained book first presents various fundamentals of graph theory that lie outside of graph colorings, including basic terminology and results, trees and connectivity, Eulerian and Hamiltonian graphs, matchings and factorizations, and graph embeddings. The remainder of the text deals exclusively with graph colorings. It covers vertex colorings and bounds for the chromatic number, vertex colorings of graphs embedded on surfaces, and a variety of restricted vertex colorings. The authors also describe edge colorings, monochromatic and rainbow edge colorings, complete vertex colorings, several distinguishing vertex and edge colorings, and many distance-related vertex coloring...

  7. Brain Graph Topology Changes Associated with Anti-Epileptic Drug Use

    Science.gov (United States)

    Levin, Harvey S.; Chiang, Sharon

    2015-01-01

    Abstract Neuroimaging studies of functional connectivity using graph theory have furthered our understanding of the network structure in temporal lobe epilepsy (TLE). Brain network effects of anti-epileptic drugs could influence such studies, but have not been systematically studied. Resting-state functional MRI was analyzed in 25 patients with TLE using graph theory analysis. Patients were divided into two groups based on anti-epileptic medication use: those taking carbamazepine/oxcarbazepine (CBZ/OXC) (n=9) and those not taking CBZ/OXC (n=16) as a part of their medication regimen. The following graph topology metrics were analyzed: global efficiency, betweenness centrality (BC), clustering coefficient, and small-world index. Multiple linear regression was used to examine the association of CBZ/OXC with graph topology. The two groups did not differ from each other based on epilepsy characteristics. Use of CBZ/OXC was associated with a lower BC. Longer epilepsy duration was also associated with a lower BC. These findings can inform graph theory-based studies in patients with TLE. The changes observed are discussed in relation to the anti-epileptic mechanism of action and adverse effects of CBZ/OXC. PMID:25492633

  8. Spatial cluster detection using dynamic programming

    Directory of Open Access Journals (Sweden)

    Sverchkov Yuriy

    2012-03-01

    Full Text Available Abstract Background The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. Methods We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of Influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. Results When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on-par with baseline methods in the task of Bayesian model averaging. Conclusions We conclude that the dynamic

  9. Analyzing the effect of introducing a kurtosis parameter in Gaussian Bayesian networks

    International Nuclear Information System (INIS)

    Main, P.; Navarro, H.

    2009-01-01

    Gaussian Bayesian networks are graphical models that represent the dependence structure of a multivariate normal random variable with a directed acyclic graph (DAG). In Gaussian Bayesian networks the output is usually the conditional distribution of some unknown variables of interest given a set of evidential nodes whose values are known. The problem of uncertainty about the assumption of normality is very common in applications. Thus a sensitivity analysis of the non-normality effect in our conclusions could be necessary. The aspect of non-normality to be considered is the tail behavior. In this line, the multivariate exponential power distribution is a family depending on a kurtosis parameter that goes from a leptokurtic to a platykurtic distribution with the normal as a mesokurtic distribution. Therefore a more general model can be considered using the multivariate exponential power distribution to describe the joint distribution of a Bayesian network, with a kurtosis parameter reflecting deviations from the normal distribution. The sensitivity of the conclusions to this perturbation is analyzed using the Kullback-Leibler divergence measure that provides an interesting formula to evaluate the effect

  10. Analyzing the effect of introducing a kurtosis parameter in Gaussian Bayesian networks

    Energy Technology Data Exchange (ETDEWEB)

    Main, P. [Dpto. Estadistica e I.O., Fac. Ciencias Matematicas, Univ. Complutense de Madrid, 28040 Madrid (Spain)], E-mail: pmain@mat.ucm.es; Navarro, H. [Dpto. de Estadistica, I.O. y Calc. Numerico, Fac. Ciencias, UNED, 28040 Madrid (Spain)

    2009-05-15

    Gaussian Bayesian networks are graphical models that represent the dependence structure of a multivariate normal random variable with a directed acyclic graph (DAG). In Gaussian Bayesian networks the output is usually the conditional distribution of some unknown variables of interest given a set of evidential nodes whose values are known. The problem of uncertainty about the assumption of normality is very common in applications. Thus a sensitivity analysis of the non-normality effect in our conclusions could be necessary. The aspect of non-normality to be considered is the tail behavior. In this line, the multivariate exponential power distribution is a family depending on a kurtosis parameter that goes from a leptokurtic to a platykurtic distribution with the normal as a mesokurtic distribution. Therefore a more general model can be considered using the multivariate exponential power distribution to describe the joint distribution of a Bayesian network, with a kurtosis parameter reflecting deviations from the normal distribution. The sensitivity of the conclusions to this perturbation is analyzed using the Kullback-Leibler divergence measure that provides an interesting formula to evaluate the effect.

  11. Dirichlet Process Parsimonious Mixtures for clustering

    OpenAIRE

    Chamroukhi, Faicel; Bartcus, Marius; Glotin, Hervé

    2015-01-01

    The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian prospective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixtur...

  12. Co-clustering Analysis of Weblogs Using Bipartite Spectral Projection Approach

    DEFF Research Database (Denmark)

    Xu, Guandong; Zong, Yu; Dolog, Peter

    2010-01-01

    Web clustering is an approach for aggregating Web objects into various groups according to underlying relationships among them. Finding co-clusters of Web objects is an interesting topic in the context of Web usage mining, which is able to capture the underlying user navigational interest...... and content preference simultaneously. In this paper we will present an algorithm using bipartite spectral clustering to co-cluster Web users and pages. The usage data of users visiting Web sites is modeled as a bipartite graph and the spectral clustering is then applied to the graph representation of usage...... data. The proposed approach is evaluated by experiments performed on real datasets, and the impact of using various clustering algorithms is also investigated. Experimental results have demonstrated the employed method can effectively reveal the subset aggregates of Web users and pages which...

  13. Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

    KAUST Repository

    Pearce, Roger; Gokhale, Maya; Amato, Nancy M.

    2013-01-01

    We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash

  14. An application of the graph theory which examines the metro networks

    Directory of Open Access Journals (Sweden)

    Svetla STOILOVA

    2015-06-01

    Full Text Available The graph theory gives a mathematical representation of transport networks and allows us to study their characteristics effectively. A research of the structure of metro system has been conducted in the study by using the graph theory. The study includes subway systems of 22 European capitals. New indicators have been defined in the research such as a degree of routing, a connectivity of the route, average length per link (which takes into account the number of routes, intensity of the route, density of the route. The new and the existing indicators have been used to analyze and classify the metro networks. The statistical method cluster analysis has been applied to classify the networks. Ten indicators have been used to carry out an analysis. The metro systems in European capitals have been classified in three clusters. The first cluster includes large metro systems, the second one includes small metro networks whereas the third cluster includes metro networks with only one line. The combination of both two methods has been used for the first time in this research. The methodology could be used to evaluate other existing metro networks as well as for preliminary analysis in the design of subway systems.

  15. MadGraph/MadEvent. The new web generation

    International Nuclear Information System (INIS)

    Alwall, J.

    2007-01-01

    The new web-based version of the automatized process and event generator MadGraph/MadEvent is now available. Recent developments are: New models, notably MSSM, 2HDM and a framework for addition of user-defined models, inclusive sample generation and on-line hadronization and detector simulation. Event generation can be done on-line on any of our clusters. (author)

  16. Bayesian Analysis of two Censored Shifted Gompertz Mixture Distributions using Informative and Noninformative Priors

    Directory of Open Access Journals (Sweden)

    Tabassum Naz Sindhu

    2017-03-01

    Full Text Available This study deals with Bayesian analysis of shifted Gompertz mixture model under type-I censored samples assuming both informative and noninformative priors. We have discussed the Bayesian estimation of parameters of shifted Gompertz mixture model under the uniform, and gamma priors assuming three loss functions. Further, some properties of the model with some graphs of the mixture density are discussed. These properties include Bayes estimators, posterior risks and reliability function under simulation scheme. Bayes estimates are obtained considering two cases: (a when the shape parameter is known and (b when all parameters are unknown. We analyzed some simulated sets in order to investigate the effect of prior belief, loss functions, and performance of the proposed set of estimators of the mixture model parameters.

  17. Big Data Clustering via Community Detection and Hyperbolic Network Embedding in IoT Applications.

    Science.gov (United States)

    Karyotis, Vasileios; Tsitseklis, Konstantinos; Sotiropoulos, Konstantinos; Papavassiliou, Symeon

    2018-04-15

    In this paper, we present a novel data clustering framework for big sensory data produced by IoT applications. Based on a network representation of the relations among multi-dimensional data, data clustering is mapped to node clustering over the produced data graphs. To address the potential very large scale of such datasets/graphs that test the limits of state-of-the-art approaches, we map the problem of data clustering to a community detection one over the corresponding data graphs. Specifically, we propose a novel computational approach for enhancing the traditional Girvan-Newman (GN) community detection algorithm via hyperbolic network embedding. The data dependency graph is embedded in the hyperbolic space via Rigel embedding, allowing more efficient computation of edge-betweenness centrality needed in the GN algorithm. This allows for more efficient clustering of the nodes of the data graph in terms of modularity, without sacrificing considerable accuracy. In order to study the operation of our approach with respect to enhancing GN community detection, we employ various representative types of artificial complex networks, such as scale-free, small-world and random geometric topologies, and frequently-employed benchmark datasets for demonstrating its efficacy in terms of data clustering via community detection. Furthermore, we provide a proof-of-concept evaluation by applying the proposed framework over multi-dimensional datasets obtained from an operational smart-city/building IoT infrastructure provided by the Federated Interoperable Semantic IoT/cloud Testbeds and Applications (FIESTA-IoT) testbed federation. It is shown that the proposed framework can be indeed used for community detection/data clustering and exploited in various other IoT applications, such as performing more energy-efficient smart-city/building sensing.

  18. Bayesian Estimation of the Kumaraswamy InverseWeibull Distribution

    Directory of Open Access Journals (Sweden)

    Felipe R.S. de Gusmao

    2017-05-01

    Full Text Available The Kumaraswamy InverseWeibull distribution has the ability to model failure rates that have unimodal shapes and are quite common in reliability and biological studies. The three-parameter Kumaraswamy InverseWeibull distribution with decreasing and unimodal failure rate is introduced. We provide a comprehensive treatment of the mathematical properties of the Kumaraswany Inverse Weibull distribution and derive expressions for its moment generating function and the ligrl/ig-th generalized moment. Some properties of the model with some graphs of density and hazard function are discussed. We also discuss a Bayesian approach for this distribution and an application was made for a real data set.

  19. Graph sampling

    OpenAIRE

    Zhang, L.-C.; Patone, M.

    2017-01-01

    We synthesise the existing theory of graph sampling. We propose a formal definition of sampling in finite graphs, and provide a classification of potential graph parameters. We develop a general approach of Horvitz–Thompson estimation to T-stage snowball sampling, and present various reformulations of some common network sampling methods in the literature in terms of the outlined graph sampling theory.

  20. Clustering User Behavior in Scientific Collections

    OpenAIRE

    Blixhavn, Øystein Hoel

    2014-01-01

    This master thesis looks at how clustering techniques can be appliedto a collection of scientific documents. Approximately one year of serverlogs from the CERN Document Server (CDS) are analyzed and preprocessed.Based on the findings of this analysis, and a review of thecurrent state of the art, three different clustering methods are selectedfor further work: Simple k-Means, Hierarchical Agglomerative Clustering(HAC) and Graph Partitioning. In addition, a custom, agglomerativeclustering algor...

  1. Multiple Kernel Learning for adaptive graph regularized nonnegative matrix factorization

    KAUST Repository

    Wang, Jim Jing-Yan; AbdulJabbar, Mustafa Abdulmajeed

    2012-01-01

    Nonnegative Matrix Factorization (NMF) has been continuously evolving in several areas like pattern recognition and information retrieval methods. It factorizes a matrix into a product of 2 low-rank non-negative matrices that will define parts-based, and linear representation of non-negative data. Recently, Graph regularized NMF (GrNMF) is proposed to find a compact representation, which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. In GNMF, an affinity graph is constructed from the original data space to encode the geometrical information. In this paper, we propose a novel idea which engages a Multiple Kernel Learning approach into refining the graph structure that reflects the factorization of the matrix and the new data space. The GrNMF is improved by utilizing the graph refined by the kernel learning, and then a novel kernel learning method is introduced under the GrNMF framework. Our approach shows encouraging results of the proposed algorithm in comparison to the state-of-the-art clustering algorithms like NMF, GrNMF, SVD etc.

  2. Algorithms for Planar Graphs and Graphs in Metric Spaces

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    structural properties that can be exploited. For instance, a road network or a wire layout on a microchip is typically (near-)planar and distances in the network are often defined w.r.t. the Euclidean or the rectilinear metric. Specialized algorithms that take advantage of such properties are often orders...... of magnitude faster than the corresponding algorithms for general graphs. The first and main part of this thesis focuses on the development of efficient planar graph algorithms. The most important contributions include a faster single-source shortest path algorithm, a distance oracle with subquadratic...... for geometric graphs and graphs embedded in metric spaces. Roughly speaking, the stretch factor is a real value expressing how well a (geo-)metric graph approximates the underlying complete graph w.r.t. distances. We give improved algorithms for computing the stretch factor of a given graph and for augmenting...

  3. Ant-based extraction of rules in simple decision systems over ontological graphs

    Directory of Open Access Journals (Sweden)

    Pancerz Krzysztof

    2015-06-01

    Full Text Available In the paper, the problem of extraction of complex decision rules in simple decision systems over ontological graphs is considered. The extracted rules are consistent with the dominance principle similar to that applied in the dominancebased rough set approach (DRSA. In our study, we propose to use a heuristic algorithm, utilizing the ant-based clustering approach, searching the semantic spaces of concepts presented by means of ontological graphs. Concepts included in the semantic spaces are values of attributes describing objects in simple decision systems

  4. Upper-Lower Bounds Candidate Sets Searching Algorithm for Bayesian Network Structure Learning

    Directory of Open Access Journals (Sweden)

    Guangyi Liu

    2014-01-01

    Full Text Available Bayesian network is an important theoretical model in artificial intelligence field and also a powerful tool for processing uncertainty issues. Considering the slow convergence speed of current Bayesian network structure learning algorithms, a fast hybrid learning method is proposed in this paper. We start with further analysis of information provided by low-order conditional independence testing, and then two methods are given for constructing graph model of network, which is theoretically proved to be upper and lower bounds of the structure space of target network, so that candidate sets are given as a result; after that a search and scoring algorithm is operated based on the candidate sets to find the final structure of the network. Simulation results show that the algorithm proposed in this paper is more efficient than similar algorithms with the same learning precision.

  5. Spatial cluster modelling

    CERN Document Server

    Lawson, Andrew B

    2002-01-01

    Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...

  6. A Random Walk Approach to Query Informative Constraints for Clustering.

    Science.gov (United States)

    Abin, Ahmad Ali

    2017-08-09

    This paper presents a random walk approach to the problem of querying informative constraints for clustering. The proposed method is based on the properties of the commute time, that is the expected time taken for a random walk to travel between two nodes and return, on the adjacency graph of data. Commute time has the nice property of that, the more short paths connect two given nodes in a graph, the more similar those nodes are. Since computing the commute time takes the Laplacian eigenspectrum into account, we use this property in a recursive fashion to query informative constraints for clustering. At each recursion, the proposed method constructs the adjacency graph of data and utilizes the spectral properties of the commute time matrix to bipartition the adjacency graph. Thereafter, the proposed method benefits from the commute times distance on graph to query informative constraints between partitions. This process iterates for each partition until the stop condition becomes true. Experiments on real-world data show the efficiency of the proposed method for constraints selection.

  7. Using graph theory to analyze the vulnerability of process plants in the context of cascading effects

    International Nuclear Information System (INIS)

    Khakzad, Nima; Reniers, Genserik

    2015-01-01

    Dealing with large quantities of flammable and explosive materials, usually at high-pressure high-temperature conditions, makes process plants very vulnerable to cascading effects compared with other infrastructures. The combination of the extremely low frequency of cascading effects and the high complexity and interdependencies of process plants makes risk assessment and vulnerability analysis of process plants very challenging in the context of such events. In the present study, cascading effects were represented as a directed graph; accordingly, the efficacy of a set of graph metrics and measurements was examined in both unit and plant-wide vulnerability analysis of process plants. We demonstrated that vertex-level closeness and betweenness can be used in the unit vulnerability analysis of process plants for the identification of critical units within a process plant. Furthermore, the graph-level closeness metric can be used in the plant-wide vulnerability analysis for the identification of the most vulnerable plant layout with respect to the escalation of cascading effects. Furthermore, the results from the application of the graph metrics have been verified using a Bayesian network methodology. - Highlights: • Graph metrics can effectively be employed to identify vulnerable units and layouts in process plants. • Units with larger vertex-level closeness result in more probable and severe cascading effects. • Units with larger vertex-level betweenness contribute more to the escalation of cascading effects. • Layouts with larger graph-level closeness are more vulnerable to the escalation of cascading effects

  8. Overlapping clusters for distributed computation.

    Energy Technology Data Exchange (ETDEWEB)

    Mirrokni, Vahab (Google Research, New York, NY); Andersen, Reid (Microsoft Corporation, Redmond, WA); Gleich, David F.

    2010-11-01

    Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al, Nature 2009; Mishra et al. WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability {rho}{infinity}; and (ii) the communication volume of a parallel PageRank solution (link-following {alpha} = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decreases.

  9. Degree Associated Edge Reconstruction Number of Graphs with Regular Pruned Graph

    Directory of Open Access Journals (Sweden)

    P. Anusha Devi

    2015-10-01

    Full Text Available An ecard of a graph $G$ is a subgraph formed by deleting an edge. A da-ecard specifies the degree of the deleted edge along with the ecard. The degree associated edge reconstruction number of a graph $G,~dern(G,$ is the minimum number of da-ecards that uniquely determines $G.$  The adversary degree associated edge reconstruction number of a graph $G, adern(G,$ is the minimum number $k$ such that every collection of $k$ da-ecards of $G$ uniquely determines $G.$ The maximal subgraph without end vertices of a graph $G$ which is not a tree is the pruned graph of $G.$ It is shown that $dern$ of complete multipartite graphs and some connected graphs with regular pruned graph is $1$ or $2.$ We also determine $dern$ and $adern$ of corona product of standard graphs.

  10. Exploring biological network structure with clustered random networks

    Directory of Open Access Journals (Sweden)

    Bansal Shweta

    2009-12-01

    Full Text Available Abstract Background Complex biological systems are often modeled as networks of interacting units. Networks of biochemical interactions among proteins, epidemiological contacts among hosts, and trophic interactions in ecosystems, to name a few, have provided useful insights into the dynamical processes that shape and traverse these systems. The degrees of nodes (numbers of interactions and the extent of clustering (the tendency for a set of three nodes to be interconnected are two of many well-studied network properties that can fundamentally shape a system. Disentangling the interdependent effects of the various network properties, however, can be difficult. Simple network models can help us quantify the structure of empirical networked systems and understand the impact of various topological properties on dynamics. Results Here we develop and implement a new Markov chain simulation algorithm to generate simple, connected random graphs that have a specified degree sequence and level of clustering, but are random in all other respects. The implementation of the algorithm (ClustRNet: Clustered Random Networks provides the generation of random graphs optimized according to a local or global, and relative or absolute measure of clustering. We compare our algorithm to other similar methods and show that ours more successfully produces desired network characteristics. Finding appropriate null models is crucial in bioinformatics research, and is often difficult, particularly for biological networks. As we demonstrate, the networks generated by ClustRNet can serve as random controls when investigating the impacts of complex network features beyond the byproduct of degree and clustering in empirical networks. Conclusion ClustRNet generates ensembles of graphs of specified edge structure and clustering. These graphs allow for systematic study of the impacts of connectivity and redundancies on network function and dynamics. This process is a key step in

  11. Big Data Clustering via Community Detection and Hyperbolic Network Embedding in IoT Applications

    Directory of Open Access Journals (Sweden)

    Vasileios Karyotis

    2018-04-01

    Full Text Available In this paper, we present a novel data clustering framework for big sensory data produced by IoT applications. Based on a network representation of the relations among multi-dimensional data, data clustering is mapped to node clustering over the produced data graphs. To address the potential very large scale of such datasets/graphs that test the limits of state-of-the-art approaches, we map the problem of data clustering to a community detection one over the corresponding data graphs. Specifically, we propose a novel computational approach for enhancing the traditional Girvan–Newman (GN community detection algorithm via hyperbolic network embedding. The data dependency graph is embedded in the hyperbolic space via Rigel embedding, allowing more efficient computation of edge-betweenness centrality needed in the GN algorithm. This allows for more efficient clustering of the nodes of the data graph in terms of modularity, without sacrificing considerable accuracy. In order to study the operation of our approach with respect to enhancing GN community detection, we employ various representative types of artificial complex networks, such as scale-free, small-world and random geometric topologies, and frequently-employed benchmark datasets for demonstrating its efficacy in terms of data clustering via community detection. Furthermore, we provide a proof-of-concept evaluation by applying the proposed framework over multi-dimensional datasets obtained from an operational smart-city/building IoT infrastructure provided by the Federated Interoperable Semantic IoT/cloud Testbeds and Applications (FIESTA-IoT testbed federation. It is shown that the proposed framework can be indeed used for community detection/data clustering and exploited in various other IoT applications, such as performing more energy-efficient smart-city/building sensing.

  12. Introduction to graph theory

    CERN Document Server

    Trudeau, Richard J

    1994-01-01

    Preface1. Pure Mathematics Introduction; Euclidean Geometry as Pure Mathematics; Games; Why Study Pure Mathematics?; What's Coming; Suggested Reading2. Graphs Introduction; Sets; Paradox; Graphs; Graph diagrams; Cautions; Common Graphs; Discovery; Complements and Subgraphs; Isomorphism; Recognizing Isomorphic Graphs; Semantics The Number of Graphs Having a Given nu; Exercises; Suggested Reading3. Planar Graphs Introduction; UG, K subscript 5, and the Jordan Curve Theorem; Are there More Nonplanar Graphs?; Expansions; Kuratowski's Theorem; Determining Whether a Graph is Planar or

  13. Graph Theory. 2. Vertex Descriptors and Graph Coloring

    Directory of Open Access Journals (Sweden)

    Lorentz JÄNTSCHI

    2002-12-01

    Full Text Available This original work presents the construction of a set of ten sequence matrices and their applications for ordering vertices in graphs. For every sequence matrix three ordering criteria are applied: lexicographic ordering, based on strings of numbers, corresponding to every vertex, extracted as rows from sequence matrices; ordering by the sum of path lengths from a given vertex; and ordering by the sum of paths, starting from a given vertex. We also examine a graph that has different orderings for the above criteria. We then proceed to demonstrate that every criterion induced its own partition of graph vertex. We propose the following theoretical result: both LAVS and LVDS criteria generate identical partitioning of vertices in any graph. Finally, a coloring of graph vertices according to introduced ordering criteria was proposed.

  14. On an edge partition and root graphs of some classes of line graphs

    Directory of Open Access Journals (Sweden)

    K Pravas

    2017-04-01

    Full Text Available The Gallai and the anti-Gallai graphs of a graph $G$ are complementary pairs of spanning subgraphs of the line graph of $G$. In this paper we find some structural relations between these graph classes by finding a partition of the edge set of the line graph of a graph $G$ into the edge sets of the Gallai and anti-Gallai graphs of $G$. Based on this, an optimal algorithm to find the root graph of a line graph is obtained. Moreover, root graphs of diameter-maximal, distance-hereditary, Ptolemaic and chordal graphs are also discussed.

  15. Graphs and matrices

    CERN Document Server

    Bapat, Ravindra B

    2014-01-01

    This new edition illustrates the power of linear algebra in the study of graphs. The emphasis on matrix techniques is greater than in other texts on algebraic graph theory. Important matrices associated with graphs (for example, incidence, adjacency and Laplacian matrices) are treated in detail. Presenting a useful overview of selected topics in algebraic graph theory, early chapters of the text focus on regular graphs, algebraic connectivity, the distance matrix of a tree, and its generalized version for arbitrary graphs, known as the resistance matrix. Coverage of later topics include Laplacian eigenvalues of threshold graphs, the positive definite completion problem and matrix games based on a graph. Such an extensive coverage of the subject area provides a welcome prompt for further exploration. The inclusion of exercises enables practical learning throughout the book. In the new edition, a new chapter is added on the line graph of a tree, while some results in Chapter 6 on Perron-Frobenius theory are reo...

  16. Focus-based filtering + clustering technique for power-law networks with small world phenomenon

    Science.gov (United States)

    Boutin, François; Thièvre, Jérôme; Hascoët, Mountaz

    2006-01-01

    Realistic interaction networks usually present two main properties: a power-law degree distribution and a small world behavior. Few nodes are linked to many nodes and adjacent nodes are likely to share common neighbors. Moreover, graph structure usually presents a dense core that is difficult to explore with classical filtering and clustering techniques. In this paper, we propose a new filtering technique accounting for a user-focus. This technique extracts a tree-like graph with also power-law degree distribution and small world behavior. Resulting structure is easily drawn with classical force-directed drawing algorithms. It is also quickly clustered and displayed into a multi-level silhouette tree (MuSi-Tree) from any user-focus. We built a new graph filtering + clustering + drawing API and report a case study.

  17. Graphs cospectral with a friendship graph or its complement

    Directory of Open Access Journals (Sweden)

    Alireza Abdollahi

    2013-12-01

    Full Text Available Let $n$ be any positive integer and let $F_n$ be the friendship (or Dutch windmill graph with $2n+1$ vertices and $3n$ edges. Here we study graphs with the same adjacency spectrum as the $F_n$. Two graphs are called cospectral if the eigenvalues multiset of their adjacency matrices are the same. Let $G$ be a graph cospectral with $F_n$. Here we prove that if $G$ has no cycle of length $4$ or $5$, then $Gcong F_n$. Moreover if $G$ is connected and planar then $Gcong F_n$.All but one of connected components of $G$ are isomorphic to $K_2$.The complement $overline{F_n}$ of the friendship graph is determined by its adjacency eigenvalues, that is, if $overline{F_n}$ is cospectral with a graph $H$, then $Hcong overline{F_n}$.

  18. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering.

    Science.gov (United States)

    Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina

    2015-03-01

    Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they phase certain disadvantages as they require parameter tuning, some of them cannot handle weighted PPI data and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. Using public available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. In specific, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein in exactly one cluster and the difficulties they face concerning the parameter tuning. This fact was experimentally validated and moreover, new

  19. Entanglement of the valence-bond-solid state on an arbitrary graph

    International Nuclear Information System (INIS)

    Xu Ying; Korepin, Vladimir E

    2008-01-01

    The Affleck-Kennedy-Lieb-Tasaki (AKLT) spin interacting model can be defined on an arbitrary graph. We explain the construction of the AKLT Hamiltonian. Given certain conditions, the ground state is unique and known as the valence-bond-solid (VBS) state. It can be used in measurement-based quantum computation as a resource state instead of the cluster state. We study the VBS ground state on an arbitrary connected graph. The graph is cut into two disconnected parts: the block and the environment. We study the entanglement between these two parts and prove that many eigenvalues of the density matrix of the block are zero. We describe a subspace of eigenvectors of the density matrix corresponding to non-zero eigenvalues. The subspace is the degenerate ground states of some Hamiltonian which we call the block Hamiltonian

  20. Topics in graph theory graphs and their Cartesian product

    CERN Document Server

    Imrich, Wilfried; Rall, Douglas F

    2008-01-01

    From specialists in the field, you will learn about interesting connections and recent developments in the field of graph theory by looking in particular at Cartesian products-arguably the most important of the four standard graph products. Many new results in this area appear for the first time in print in this book. Written in an accessible way, this book can be used for personal study in advanced applications of graph theory or for an advanced graph theory course.

  1. Study of Chromatic parameters of Line, Total, Middle graphs and Graph operators of Bipartite graph

    Science.gov (United States)

    Nagarathinam, R.; Parvathi, N.

    2018-04-01

    Chromatic parameters have been explored on the basis of graph coloring process in which a couple of adjacent nodes receives different colors. But the Grundy and b-coloring executes maximum colors under certain restrictions. In this paper, Chromatic, b-chromatic and Grundy number of some graph operators of bipartite graph has been investigat

  2. Dynamics of cluster structures in a financial market network

    Science.gov (United States)

    Kocheturov, Anton; Batsyn, Mikhail; Pardalos, Panos M.

    2014-11-01

    In the course of recent fifteen years the network analysis has become a powerful tool for studying financial markets. In this work we analyze stock markets of the USA and Sweden. We study cluster structures of a market network constructed from a correlation matrix of returns of the stocks traded in each of these markets. Such cluster structures are obtained by means of the P-Median Problem (PMP) whose objective is to maximize the total correlation between a set of stocks called medians of size p and other stocks. Every cluster structure is an undirected disconnected weighted graph in which every connected component (cluster) is a star, or a tree with one central node (called a median) and several leaf nodes connected with the median by weighted edges. Our main observation is that in non-crisis periods of time cluster structures change more chaotically, while during crises they show more stable behavior and fewer changes. Thus an increasing stability of a market graph cluster structure obtained via the PMP could be used as an indicator of a coming crisis.

  3. Impaired cerebral blood flow networks in temporal lobe epilepsy with hippocampal sclerosis: A graph theoretical approach.

    Science.gov (United States)

    Sone, Daichi; Matsuda, Hiroshi; Ota, Miho; Maikusa, Norihide; Kimura, Yukio; Sumida, Kaoru; Yokoyama, Kota; Imabayashi, Etsuko; Watanabe, Masako; Watanabe, Yutaka; Okazaki, Mitsutoshi; Sato, Noriko

    2016-09-01

    Graph theory is an emerging method to investigate brain networks. Altered cerebral blood flow (CBF) has frequently been reported in temporal lobe epilepsy (TLE), but graph theoretical findings of CBF are poorly understood. Here, we explored graph theoretical networks of CBF in TLE using arterial spin labeling imaging. We recruited patients with TLE and unilateral hippocampal sclerosis (HS) (19 patients with left TLE, and 21 with right TLE) and 20 gender- and age-matched healthy control subjects. We obtained all participants' CBF maps using pseudo-continuous arterial spin labeling and analyzed them using the Graph Analysis Toolbox (GAT) software program. As a result, compared to the controls, the patients with left TLE showed a significantly low clustering coefficient (p=0.024), local efficiency (p=0.001), global efficiency (p=0.010), and high transitivity (p=0.015), whereas the patients with right TLE showed significantly high assortativity (p=0.046) and transitivity (p=0.011). The group with right TLE also had high characteristic path length values (p=0.085), low global efficiency (p=0.078), and low resilience to targeted attack (p=0.101) at a trend level. Lower normalized clustering coefficient (p=0.081) in the left TLE and higher normalized characteristic path length (p=0.089) in the right TLE were found also at a trend level. Both the patients with left and right TLE showed significantly decreased clustering in similar areas, i.e., the cingulate gyri, precuneus, and occipital lobe. Our findings revealed differing left-right network metrics in which an inefficient CBF network in left TLE and vulnerability to irritation in right TLE are suggested. The left-right common finding of regional decreased clustering might reflect impaired default-mode networks in TLE. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Refining intra-protein contact prediction by graph analysis

    Directory of Open Access Journals (Sweden)

    Eyal Eran

    2007-05-01

    Full Text Available Abstract Background Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence. Results We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-contact prediction method by means of pair-to-pair substitution matrix (P2PConPred was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%. Conclusion Using a simple approach we increased the contact prediction accuracy of a basic method by 1.5 times. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.

  5. Handbook of graph grammars and computing by graph transformation

    CERN Document Server

    Engels, G; Kreowski, H J; Rozenberg, G

    1999-01-01

    Graph grammars originated in the late 60s, motivated by considerations about pattern recognition and compiler construction. Since then, the list of areas which have interacted with the development of graph grammars has grown quite impressively. Besides the aforementioned areas, it includes software specification and development, VLSI layout schemes, database design, modeling of concurrent systems, massively parallel computer architectures, logic programming, computer animation, developmental biology, music composition, visual languages, and many others.The area of graph grammars and graph tran

  6. Bayesian estimation of mixtures with dynamic transitions and known component parameters

    Czech Academy of Sciences Publication Activity Database

    Nagy, I.; Suzdaleva, Evgenia; Kárný, Miroslav

    2011-01-01

    Roč. 47, č. 4 (2011), s. 572-594 ISSN 0023-5954 R&D Projects: GA MŠk 1M0572; GA TA ČR TA01030123; GA ČR GA102/08/0567 Grant - others:Skoda Auto(CZ) ENS/2009/UTIA Institutional research plan: CEZ:AV0Z10750506 Keywords : mixture model * Bayesian estimation * approximation * clustering * classification Subject RIV: BC - Control Systems Theory Impact factor: 0.454, year: 2011 http://library.utia.cas.cz/separaty/2011/AS/nagy-bayesian estimation of mixtures with dynamic transitions and known component parameters.pdf

  7. Bayesian network learning for natural hazard assessments

    Science.gov (United States)

    Vogel, Kristin

    2016-04-01

    Even though quite different in occurrence and consequences, from a modelling perspective many natural hazards share similar properties and challenges. Their complex nature as well as lacking knowledge about their driving forces and potential effects make their analysis demanding. On top of the uncertainty about the modelling framework, inaccurate or incomplete event observations and the intrinsic randomness of the natural phenomenon add up to different interacting layers of uncertainty, which require a careful handling. Thus, for reliable natural hazard assessments it is crucial not only to capture and quantify involved uncertainties, but also to express and communicate uncertainties in an intuitive way. Decision-makers, who often find it difficult to deal with uncertainties, might otherwise return to familiar (mostly deterministic) proceedings. In the scope of the DFG research training group „NatRiskChange" we apply the probabilistic framework of Bayesian networks for diverse natural hazard and vulnerability studies. The great potential of Bayesian networks was already shown in previous natural hazard assessments. Treating each model component as random variable, Bayesian networks aim at capturing the joint distribution of all considered variables. Hence, each conditional distribution of interest (e.g. the effect of precautionary measures on damage reduction) can be inferred. The (in-)dependencies between the considered variables can be learned purely data driven or be given by experts. Even a combination of both is possible. By translating the (in-)dependences into a graph structure, Bayesian networks provide direct insights into the workings of the system and allow to learn about the underlying processes. Besides numerous studies on the topic, learning Bayesian networks from real-world data remains challenging. In previous studies, e.g. on earthquake induced ground motion and flood damage assessments, we tackled the problems arising with continuous variables

  8. Bayesian networks in educational assessment

    CERN Document Server

    Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M

    2015-01-01

    Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...

  9. Network Analysis Tools: from biological networks to clusters and pathways.

    Science.gov (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques

    2008-01-01

    Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.

  10. Cluster Optimization and Parallelization of Simulations with Dynamically Adaptive Grids

    KAUST Repository

    Schreiber, Martin; Weinzierl, Tobias; Bungartz, Hans-Joachim

    2013-01-01

    The present paper studies solvers for partial differential equations that work on dynamically adaptive grids stemming from spacetrees. Due to the underlying tree formalism, such grids efficiently can be decomposed into connected grid regions (clusters) on-the-fly. A graph on those clusters classified according to their grid invariancy, workload, multi-core affinity, and further meta data represents the inter-cluster communication. While stationary clusters already can be handled more efficiently than their dynamic counterparts, we propose to treat them as atomic grid entities and introduce a skip mechanism that allows the grid traversal to omit those regions completely. The communication graph ensures that the cluster data nevertheless are kept consistent, and several shared memory parallelization strategies are feasible. A hyperbolic benchmark that has to remesh selected mesh regions iteratively to preserve conforming tessellations acts as benchmark for the present work. We discuss runtime improvements resulting from the skip mechanism and the implications on shared memory performance and load balancing. © 2013 Springer-Verlag.

  11. Centrosymmetric Graphs And A Lower Bound For Graph Energy Of Fullerenes

    Directory of Open Access Journals (Sweden)

    Katona Gyula Y.

    2014-11-01

    Full Text Available The energy of a molecular graph G is defined as the summation of the absolute values of the eigenvalues of adjacency matrix of a graph G. In this paper, an infinite class of fullerene graphs with 10n vertices, n ≥ 2, is considered. By proving centrosymmetricity of the adjacency matrix of these fullerene graphs, a lower bound for its energy is given. Our method is general and can be extended to other class of fullerene graphs.

  12. Graphs and Homomorphisms

    CERN Document Server

    Hell, Pavol

    2004-01-01

    This is a book about graph homomorphisms. Graph theory is now an established discipline but the study of graph homomorphisms has only recently begun to gain wide acceptance and interest. The subject gives a useful perspective in areas such as graph reconstruction, products, fractional and circular colourings, and has applications in complexity theory, artificial intelligence, telecommunication, and, most recently, statistical physics.Based on the authors' lecture notes for graduate courses, this book can be used as a textbook for a second course in graph theory at 4th year or master's level an

  13. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....

  14. Bayesian inference for Hawkes processes

    DEFF Research Database (Denmark)

    Rasmussen, Jakob Gulddahl

    2013-01-01

    The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....

  15. Bayesian methods for hackers probabilistic programming and Bayesian inference

    CERN Document Server

    Davidson-Pilon, Cameron

    2016-01-01

    Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...

  16. Decomposing Oriented Graphs into Six Locally Irregular Oriented Graphs

    DEFF Research Database (Denmark)

    Bensmail, Julien; Renault, Gabriel

    2016-01-01

    An undirected graph G is locally irregular if every two of its adjacent vertices have distinct degrees. We say that G is decomposable into k locally irregular graphs if there exists a partition E1∪E2∪⋯∪Ek of the edge set E(G) such that each Ei induces a locally irregular graph. It was recently co...

  17. An advanced method for classifying atmospheric circulation types based on prototypes connectivity graph

    Science.gov (United States)

    Zagouras, Athanassios; Argiriou, Athanassios A.; Flocas, Helena A.; Economou, George; Fotopoulos, Spiros

    2012-11-01

    Classification of weather maps at various isobaric levels as a methodological tool is used in several problems related to meteorology, climatology, atmospheric pollution and to other fields for many years. Initially the classification was performed manually. The criteria used by the person performing the classification are features of isobars or isopleths of geopotential height, depending on the type of maps to be classified. Although manual classifications integrate the perceptual experience and other unquantifiable qualities of the meteorology specialists involved, these are typically subjective and time consuming. Furthermore, during the last years different approaches of automated methods for atmospheric circulation classification have been proposed, which present automated and so-called objective classifications. In this paper a new method of atmospheric circulation classification of isobaric maps is presented. The method is based on graph theory. It starts with an intelligent prototype selection using an over-partitioning mode of fuzzy c-means (FCM) algorithm, proceeds to a graph formulation for the entire dataset and produces the clusters based on the contemporary dominant sets clustering method. Graph theory is a novel mathematical approach, allowing a more efficient representation of spatially correlated data, compared to the classical Euclidian space representation approaches, used in conventional classification methods. The method has been applied to the classification of 850 hPa atmospheric circulation over the Eastern Mediterranean. The evaluation of the automated methods is performed by statistical indexes; results indicate that the classification is adequately comparable with other state-of-the-art automated map classification methods, for a variable number of clusters.

  18. Non-heuristic reduction of the graph in graph-cut optimization

    International Nuclear Information System (INIS)

    Malgouyres, François; Lermé, Nicolas

    2012-01-01

    During the last ten years, graph cuts had a growing impact in shape optimization. In particular, they are commonly used in applications of shape optimization such as image processing, computer vision and computer graphics. Their success is due to their ability to efficiently solve (apparently) difficult shape optimization problems which typically involve the perimeter of the shape. Nevertheless, solving problems with a large number of variables remains computationally expensive and requires a high memory usage since underlying graphs sometimes involve billion of nodes and even more edges. Several strategies have been proposed in the literature to improve graph-cuts in this regards. In this paper, we give a formal statement which expresses that a simple and local test performed on every node before its construction permits to avoid the construction of useless nodes for the graphs typically encountered in image processing and vision. A useless node is such that the value of the maximum flow in the graph does not change when removing the node from the graph. Such a test therefore permits to limit the construction of the graph to a band of useful nodes surrounding the final cut.

  19. Reproducibility of graph metrics of human brain functional networks.

    Science.gov (United States)

    Deuker, Lorena; Bullmore, Edward T; Smith, Marie; Christensen, Soren; Nathan, Pradeep J; Rockstroh, Brigitte; Bassett, Danielle S

    2009-10-01

    Graph theory provides many metrics of complex network organization that can be applied to analysis of brain networks derived from neuroimaging data. Here we investigated the test-retest reliability of graph metrics of functional networks derived from magnetoencephalography (MEG) data recorded in two sessions from 16 healthy volunteers who were studied at rest and during performance of the n-back working memory task in each session. For each subject's data at each session, we used a wavelet filter to estimate the mutual information (MI) between each pair of MEG sensors in each of the classical frequency intervals from gamma to low delta in the overall range 1-60 Hz. Undirected binary graphs were generated by thresholding the MI matrix and 8 global network metrics were estimated: the clustering coefficient, path length, small-worldness, efficiency, cost-efficiency, assortativity, hierarchy, and synchronizability. Reliability of each graph metric was assessed using the intraclass correlation (ICC). Good reliability was demonstrated for most metrics applied to the n-back data (mean ICC=0.62). Reliability was greater for metrics in lower frequency networks. Higher frequency gamma- and beta-band networks were less reliable at a global level but demonstrated high reliability of nodal metrics in frontal and parietal regions. Performance of the n-back task was associated with greater reliability than measurements on resting state data. Task practice was also associated with greater reliability. Collectively these results suggest that graph metrics are sufficiently reliable to be considered for future longitudinal studies of functional brain network changes.

  20. A CLASSIFIER SYSTEM USING SMOOTH GRAPH COLORING

    Directory of Open Access Journals (Sweden)

    JORGE FLORES CRUZ

    2017-01-01

    Full Text Available Unsupervised classifiers allow clustering methods with less or no human intervention. Therefore it is desirable to group the set of items with less data processing. This paper proposes an unsupervised classifier system using the model of soft graph coloring. This method was tested with some classic instances in the literature and the results obtained were compared with classifications made with human intervention, yielding as good or better results than supervised classifiers, sometimes providing alternative classifications that considers additional information that humans did not considered.

  1. The cylindrical K-function and Poisson line cluster point processes

    DEFF Research Database (Denmark)

    Møller, Jesper; Safavimanesh, Farzaneh; Rasmussen, Jakob G.

    Poisson line cluster point processes, is also introduced. Parameter estimation based on moment methods or Bayesian inference for this model is discussed when the underlying Poisson line process and the cluster memberships are treated as hidden processes. To illustrate the methodologies, we analyze two...

  2. Graphing trillions of triangles.

    Science.gov (United States)

    Burkhardt, Paul

    2017-07-01

    The increasing size of Big Data is often heralded but how data are transformed and represented is also profoundly important to knowledge discovery, and this is exemplified in Big Graph analytics. Much attention has been placed on the scale of the input graph but the product of a graph algorithm can be many times larger than the input. This is true for many graph problems, such as listing all triangles in a graph. Enabling scalable graph exploration for Big Graphs requires new approaches to algorithms, architectures, and visual analytics. A brief tutorial is given to aid the argument for thoughtful representation of data in the context of graph analysis. Then a new algebraic method to reduce the arithmetic operations in counting and listing triangles in graphs is introduced. Additionally, a scalable triangle listing algorithm in the MapReduce model will be presented followed by a description of the experiments with that algorithm that led to the current largest and fastest triangle listing benchmarks to date. Finally, a method for identifying triangles in new visual graph exploration technologies is proposed.

  3. Prediction of community prevalence of human onchocerciasis in the Amazonian onchocerciasis focus: Bayesian approach.

    Science.gov (United States)

    Carabin, Hélène; Escalona, Marisela; Marshall, Clare; Vivas-Martínez, Sarai; Botto, Carlos; Joseph, Lawrence; Basáñez, María-Gloria

    2003-01-01

    To develop a Bayesian hierarchical model for human onchocerciasis with which to explore the factors that influence prevalence of microfilariae in the Amazonian focus of onchocerciasis and predict the probability of any community being at least mesoendemic (>20% prevalence of microfilariae), and thus in need of priority ivermectin treatment. Models were developed with data from 732 individuals aged > or =15 years who lived in 29 Yanomami communities along four rivers of the south Venezuelan Orinoco basin. The models' abilities to predict prevalences of microfilariae in communities were compared. The deviance information criterion, Bayesian P-values, and residual values were used to select the best model with an approximate cross-validation procedure. A three-level model that acknowledged clustering of infection within communities performed best, with host age and sex included at the individual level, a river-dependent altitude effect at the community level, and additional clustering of communities along rivers. This model correctly classified 25/29 (86%) villages with respect to their need for priority ivermectin treatment. Bayesian methods are a flexible and useful approach for public health research and control planning. Our model acknowledges the clustering of infection within communities, allows investigation of links between individual- or community-specific characteristics and infection, incorporates additional uncertainty due to missing covariate data, and informs policy decisions by predicting the probability that a new community is at least mesoendemic.

  4. Adaptive Graph Convolutional Neural Networks

    OpenAIRE

    Li, Ruoyu; Wang, Sheng; Zhu, Feiyun; Huang, Junzhou

    2018-01-01

    Graph Convolutional Neural Networks (Graph CNNs) are generalizations of classical CNNs to handle graph data such as molecular data, point could and social networks. Current filters in graph CNNs are built for fixed and shared graph structure. However, for most real data, the graph structures varies in both size and connectivity. The paper proposes a generalized and flexible graph CNN taking data of arbitrary graph structure as input. In that way a task-driven adaptive graph is learned for eac...

  5. Comparing brain networks of different size and connectivity density using graph theory.

    Directory of Open Access Journals (Sweden)

    Bernadette C M van Wijk

    Full Text Available Graph theory is a valuable framework to study the organization of functional and anatomical connections in the brain. Its use for comparing network topologies, however, is not without difficulties. Graph measures may be influenced by the number of nodes (N and the average degree (k of the network. The explicit form of that influence depends on the type of network topology, which is usually unknown for experimental data. Direct comparisons of graph measures between empirical networks with different N and/or k can therefore yield spurious results. We list benefits and pitfalls of various approaches that intend to overcome these difficulties. We discuss the initial graph definition of unweighted graphs via fixed thresholds, average degrees or edge densities, and the use of weighted graphs. For instance, choosing a threshold to fix N and k does eliminate size and density effects but may lead to modifications of the network by enforcing (ignoring non-significant (significant connections. Opposed to fixing N and k, graph measures are often normalized via random surrogates but, in fact, this may even increase the sensitivity to differences in N and k for the commonly used clustering coefficient and small-world index. To avoid such a bias we tried to estimate the N,k-dependence for empirical networks, which can serve to correct for size effects, if successful. We also add a number of methods used in social sciences that build on statistics of local network structures including exponential random graph models and motif counting. We show that none of the here-investigated methods allows for a reliable and fully unbiased comparison, but some perform better than others.

  6. Asymptote Misconception on Graphing Functions: Does Graphing Software Resolve It?

    Directory of Open Access Journals (Sweden)

    Mehmet Fatih Öçal

    2017-01-01

    Full Text Available Graphing function is an important issue in mathematics education due to its use in various areas of mathematics and its potential roles for students to enhance learning mathematics. The use of some graphing software assists students’ learning during graphing functions. However, the display of graphs of functions that students sketched by hand may be relatively different when compared to the correct forms sketched using graphing software. The possible misleading effects of this situation brought a discussion of a misconception (asymptote misconception on graphing functions. The purpose of this study is two- fold. First of all, this study investigated whether using graphing software (GeoGebra in this case helps students to determine and resolve this misconception in calculus classrooms. Second, the reasons for this misconception are sought. The multiple case study was utilized in this study. University students in two calculus classrooms who received instructions with (35 students or without GeoGebra assisted instructions (32 students were compared according to whether they fell into this misconception on graphing basic functions (1/x, lnx, ex. In addition, students were interviewed to reveal the reasons behind this misconception. Data were analyzed by means of descriptive and content analysis methods. The findings indicated that those who received GeoGebra assisted instruction were better in resolving it. In addition, the reasons behind this misconception were found to be teacher-based, exam-based and some other factors.

  7. X-Graphs: Language and Algorithms for Heterogeneous Graph Streams

    Science.gov (United States)

    2017-09-01

    are widely used by academia and industry. 15. SUBJECT TERMS Data Analytics, Graph Analytics, High-Performance Computing 16. SECURITY CLASSIFICATION...form the core of the DeepDive Knowledge Construction System. 2 INTRODUCTION The goal of the X-Graphs project was to develop computational techniques...memory multicore machine. Ringo is based on Snap.py and SNAP, and uses Python . Ringo now allows the integration of Delite DSL Framework Graph

  8. On the sizes of expander graphs and minimum distances of graph codes

    DEFF Research Database (Denmark)

    Høholdt, Tom; Justesen, Jørn

    2014-01-01

    We give lower bounds for the minimum distances of graph codes based on expander graphs. The bounds depend only on the second eigenvalue of the graph and the parameters of the component codes. We also give an upper bound on the size of a degree regular graph with given second eigenvalue....

  9. Similarity Measure of Graphs

    Directory of Open Access Journals (Sweden)

    Amine Labriji

    2017-07-01

    Full Text Available The topic of identifying the similarity of graphs was considered as highly recommended research field in the Web semantic, artificial intelligence, the shape recognition and information research. One of the fundamental problems of graph databases is finding similar graphs to a graph query. Existing approaches dealing with this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not identified as being part of the similarity of two graphs in cases like two graphs without common concepts, the measure of similarity based on the union of two graphs, or the one based on the notion of maximum common sub-graph (SCM, or the distance of edition of graphs. This leads to an inadequate situation in the context of information research. To overcome this problem, we suggest a new measure of similarity between graphs, based on the similarity measure of Wu and Palmer. We have shown that this new measure satisfies the properties of a measure of similarities and we applied this new measure on examples. The results show that our measure provides a run time with a gain of time compared to existing approaches. In addition, we compared the relevance of the similarity values obtained, it appears that this new graphs measure is advantageous and  offers a contribution to solving the problem mentioned above.

  10. A Parallel Approach for Frequent Subgraph Mining in a Single Large Graph Using Spark

    Directory of Open Access Journals (Sweden)

    Fengcai Qiao

    2018-02-01

    Full Text Available Frequent subgraph mining (FSM plays an important role in graph mining, attracting a great deal of attention in many areas, such as bioinformatics, web data mining and social networks. In this paper, we propose SSiGraM (Spark based Single Graph Mining, a Spark based parallel frequent subgraph mining algorithm in a single large graph. Aiming to approach the two computational challenges of FSM, we conduct the subgraph extension and support evaluation parallel across all the distributed cluster worker nodes. In addition, we also employ a heuristic search strategy and three novel optimizations: load balancing, pre-search pruning and top-down pruning in the support evaluation process, which significantly improve the performance. Extensive experiments with four different real-world datasets demonstrate that the proposed algorithm outperforms the existing GraMi (Graph Mining algorithm by an order of magnitude for all datasets and can work with a lower support threshold.

  11. Cluster expansion for abstract polymer models New bounds from an old approach

    CERN Document Server

    Fernández, R

    2006-01-01

    We revisit the classical approach to cluster expansions, based on tree graphs, and establish a new convergence condition that improves those by Koteck\\'y-Preiss and Dobrushin, as we show in some examples. The strategy is to better exploit a well known tree-graph expression, due to Penrose.

  12. Spectra of Graphs

    NARCIS (Netherlands)

    Brouwer, A.E.; Haemers, W.H.

    2012-01-01

    This book gives an elementary treatment of the basic material about graph spectra, both for ordinary, and Laplace and Seidel spectra. The text progresses systematically, by covering standard topics before presenting some new material on trees, strongly regular graphs, two-graphs, association

  13. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  14. TRANSPORT AND LOGISTICS CLUSTER IN AN ECONOMIC SYSTEM OF A REGION

    Directory of Open Access Journals (Sweden)

    I.G. Menshenina

    2008-09-01

    Full Text Available The main types of clusters are described in the article. The function of a transport and logistics model is also described using the theory of graphs. The relationship of clusters is shown in the economic system of a region, and the main role of transport and logistics cluster is emphasized as a good condition for the effective functioning of other clusters in the region.

  15. Pattern graph rewrite systems

    Directory of Open Access Journals (Sweden)

    Aleks Kissinger

    2014-03-01

    Full Text Available String diagrams are a powerful tool for reasoning about physical processes, logic circuits, tensor networks, and many other compositional structures. Dixon, Duncan and Kissinger introduced string graphs, which are a combinatoric representations of string diagrams, amenable to automated reasoning about diagrammatic theories via graph rewrite systems. In this extended abstract, we show how the power of such rewrite systems can be greatly extended by introducing pattern graphs, which provide a means of expressing infinite families of rewrite rules where certain marked subgraphs, called !-boxes ("bang boxes", on both sides of a rule can be copied any number of times or removed. After reviewing the string graph formalism, we show how string graphs can be extended to pattern graphs and how pattern graphs and pattern rewrite rules can be instantiated to concrete string graphs and rewrite rules. We then provide examples demonstrating the expressive power of pattern graphs and how they can be applied to study interacting algebraic structures that are central to categorical quantum mechanics.

  16. Bayesian artificial intelligence

    CERN Document Server

    Korb, Kevin B

    2010-01-01

    Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente

  17. Unavailability analysis of a PWR safety system by a Bayesian network

    International Nuclear Information System (INIS)

    Estevao, Lilian B.; Melo, Paulo Fernando F. Frutuoso e; Rivero, Jose J.

    2013-01-01

    Bayesian networks (BN) are directed acyclic graphs that have dependencies between variables, which are represented by nodes. These dependencies are represented by lines connecting the nodes and can be directed or not. Thus, it is possible to model conditional probabilities and calculate them with the help of Bayes' Theorem. The objective of this paper is to present the modeling of the failure of a safety system of a typical second generation light water reactor plant, the Containment Heat Removal System (CHRS), whose function is to cool the water of containment reservoir being recirculated through the Containment Spray Recirculation System (CSRS). CSRS is automatically initiated after a loss of coolant accident (LOCA) and together with the CHRS cools the reservoir water. The choice of this system was due to the fact that its analysis by a fault tree is available in Appendix II of the Reactor Safety Study Report (WASH-1400), and therefore all the necessary technical information is also available, such as system diagrams, failure data input and the fault tree itself that was developed to study system failure. The reason for the use of a bayesian network in this context was to assess its ability to reproduce the results of fault tree analyses and also verify the feasibility of treating dependent events. Comparing the fault trees and bayesian networks, the results obtained for the system failure were very close. (author)

  18. A Latent Variable Clustering Method for Wireless Sensor Networks

    DEFF Research Database (Denmark)

    Vasilev, Vladislav; Iliev, Georgi; Poulkov, Vladimir

    2016-01-01

    In this paper we derive a clustering method based on the Hidden Conditional Random Field (HCRF) model in order to maximizes the performance of a wireless sensor. Our novel approach to clustering in this paper is in the application of an index invariant graph that we defined in a previous work and...

  19. On middle cube graphs

    Directory of Open Access Journals (Sweden)

    C. Dalfo

    2015-10-01

    Full Text Available We study a family of graphs related to the $n$-cube. The middle cube graph of parameter k is the subgraph of $Q_{2k-1}$ induced by the set of vertices whose binary representation has either $k-1$ or $k$ number of ones. The middle cube graphs can be obtained from the well-known odd graphs by doubling their vertex set. Here we study some of the properties of the middle cube graphs in the light of the theory of distance-regular graphs. In particular, we completely determine their spectra (eigenvalues and their multiplicities, and associated eigenvectors.

  20. Social Graph Community Differentiated by Node Features with Partly Missing Information

    Directory of Open Access Journals (Sweden)

    V. O. Chesnokov

    2015-01-01

    Full Text Available This paper proposes a new algorithm for community differentiation in social graphs, which uses information both on the graph structure and on the vertices. We consider user's ego-network i.e. his friends, with no himself, where each vertex has a set of features such as details on a workplace, institution, etc. The task is to determine missing or unspecified features of the vertices, based on their neighbors' features, and use these features to differentiate the communities in the social graph. Two vertices are believed to belong to the same community if they have a common feature. A hypothesis has been put forward that if most neighbors of a vertex have a common feature, there is a good probability that the vertex has this feature as well. The proposed algorithm is iterative and updates features of vertices, based on its neighbors, according to the hypothesis. Share of neighbors that form a majority is specified by the algorithm parameter. Complexity of single iteration depends linearly on the number of edges in the graph.To assess the quality of clustering three normalized metrics were used, namely: expected density, silhouette index, and Hubert's Gamma Statistic. The paper describes a method for test sampling of 2.000 graphs of the user's social network \\VKontakte". The API requests addressed \\VKontakte" and parsing HTML-pages of user's profiles and search results provided crawling. Information on user's group membership, secondary and higher education, and workplace was used as features. To store data the PostgreSQL DBMS was used, and the gexf format was used for data processing. For the test sample, metrics for several values of algorithm parameter were estimated: the value of index silhouettes was low (0.14-0.20, but within the normal range; the value of expected density was high, i.e. 1.17-1.52; the value of Hubert's gamma statistic was 0.94-0.95 that is close to the maximum. The number of vertices with no features was calculated before

  1. Graph visualization (Invited talk)

    NARCIS (Netherlands)

    Wijk, van J.J.; Kreveld, van M.J.; Speckmann, B.

    2012-01-01

    Black and white node link diagrams are the classic method to depict graphs, but these often fall short to give insight in large graphs or when attributes of nodes and edges play an important role. Graph visualization aims obtaining insight in such graphs using interactive graphical representations.

  2. Adventures in graph theory

    CERN Document Server

    Joyner, W David

    2017-01-01

    This textbook acts as a pathway to higher mathematics by seeking and illuminating the connections between graph theory and diverse fields of mathematics, such as calculus on manifolds, group theory, algebraic curves, Fourier analysis, cryptography and other areas of combinatorics. An overview of graph theory definitions and polynomial invariants for graphs prepares the reader for the subsequent dive into the applications of graph theory. To pique the reader’s interest in areas of possible exploration, recent results in mathematics appear throughout the book, accompanied with examples of related graphs, how they arise, and what their valuable uses are. The consequences of graph theory covered by the authors are complicated and far-reaching, so topics are always exhibited in a user-friendly manner with copious graphs, exercises, and Sage code for the computation of equations. Samples of the book’s source code can be found at github.com/springer-math/adventures-in-graph-theory. The text is geared towards ad...

  3. Identification of Functional Clusters in the Striatum Using Infinite Relational Modeling

    DEFF Research Database (Denmark)

    Andersen, Kasper Winther; Madsen, Kristoffer Hougaard; Siebner, Hartwig

    2011-01-01

    In this paper we investigate how the Infinite Relational Model can be used to infer functional groupings of the human striatum using resting state fMRI data from 30 healthy subjects. The Infinite Relational Model is a non-parametric Bayesian method for infering community structure in complex netw...... and non-links in the graphs as missing. We find that the model is performing well above chance for all subjects....

  4. Distance-regular graphs

    NARCIS (Netherlands)

    van Dam, Edwin R.; Koolen, Jack H.; Tanaka, Hajime

    2016-01-01

    This is a survey of distance-regular graphs. We present an introduction to distance-regular graphs for the reader who is unfamiliar with the subject, and then give an overview of some developments in the area of distance-regular graphs since the monograph 'BCN'[Brouwer, A.E., Cohen, A.M., Neumaier,

  5. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

    Science.gov (United States)

    Yan, Kang K; Zhao, Hongyu; Pang, Herbert

    2017-12-06

    High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in quest of improving human health. Well-studied algorithms mostly deal with single data source, and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far, however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms, graph-based that depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based that can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. The performance of graph-based algorithms has the advantage of being faster computationally. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

  6. On the strong metric dimension of generalized butterfly graph, starbarbell graph, and {C}_{m}\\odot {P}_{n} graph

    Science.gov (United States)

    Yunia Mayasari, Ratih; Atmojo Kusmayadi, Tri

    2018-04-01

    Let G be a connected graph with vertex set V(G) and edge set E(G). For every pair of vertices u,v\\in V(G), the interval I[u, v] between u and v to be the collection of all vertices that belong to some shortest u ‑ v path. A vertex s\\in V(G) strongly resolves two vertices u and v if u belongs to a shortest v ‑ s path or v belongs to a shortest u ‑ s path. A vertex set S of G is a strong resolving set of G if every two distinct vertices of G are strongly resolved by some vertex of S. The strong metric basis of G is a strong resolving set with minimal cardinality. The strong metric dimension sdim(G) of a graph G is defined as the cardinality of strong metric basis. In this paper we determine the strong metric dimension of a generalized butterfly graph, starbarbell graph, and {C}mȯ {P}n graph. We obtain the strong metric dimension of generalized butterfly graph is sdim(BFn ) = 2n ‑ 2. The strong metric dimension of starbarbell graph is sdim(S{B}{m1,{m}2,\\ldots,{m}n})={\\sum }i=1n({m}i-1)-1. The strong metric dimension of {C}mȯ {P}n graph are sdim({C}mȯ {P}n)=2m-1 for m > 3 and n = 2, and sdim({C}mȯ {P}n)=2m-2 for m > 3 and n > 2.

  7. Subgraph detection using graph signals

    KAUST Repository

    Chepuri, Sundeep Prabhakar

    2017-03-06

    In this paper we develop statistical detection theory for graph signals. In particular, given two graphs, namely, a background graph that represents an usual activity and an alternative graph that represents some unusual activity, we are interested in answering the following question: To which of the two graphs does the observed graph signal fit the best? To begin with, we assume both the graphs are known, and derive an optimal Neyman-Pearson detector. Next, we derive a suboptimal detector for the case when the alternative graph is not known. The developed theory is illustrated with numerical experiments.

  8. Subgraph detection using graph signals

    KAUST Repository

    Chepuri, Sundeep Prabhakar; Leus, Geert

    2017-01-01

    In this paper we develop statistical detection theory for graph signals. In particular, given two graphs, namely, a background graph that represents an usual activity and an alternative graph that represents some unusual activity, we are interested in answering the following question: To which of the two graphs does the observed graph signal fit the best? To begin with, we assume both the graphs are known, and derive an optimal Neyman-Pearson detector. Next, we derive a suboptimal detector for the case when the alternative graph is not known. The developed theory is illustrated with numerical experiments.

  9. Graph spectrum

    NARCIS (Netherlands)

    Brouwer, A.E.; Haemers, W.H.; Brouwer, A.E.; Haemers, W.H.

    2012-01-01

    This chapter presents some simple results on graph spectra.We assume the reader is familiar with elementary linear algebra and graph theory. Throughout, J will denote the all-1 matrix, and 1 is the all-1 vector.

  10. iBGP: A Bipartite Graph Propagation Approach for Mobile Advertising Fraud Detection

    Directory of Open Access Journals (Sweden)

    Jinlong Hu

    2017-01-01

    Full Text Available Online mobile advertising plays a vital financial role in supporting free mobile apps, but detecting malicious apps publishers who generate fraudulent actions on the advertisements hosted on their apps is difficult, since fraudulent traffic often mimics behaviors of legitimate users and evolves rapidly. In this paper, we propose a novel bipartite graph-based propagation approach, iBGP, for mobile apps advertising fraud detection in large advertising system. We exploit the characteristics of mobile advertising user’s behavior and identify two persistent patterns: power law distribution and pertinence and propose an automatic initial score learning algorithm to formulate both concepts to learn the initial scores of non-seed nodes. We propose a weighted graph propagation algorithm to propagate the scores of all nodes in the user-app bipartite graphs until convergence. To extend our approach for large-scale settings, we decompose the objective function of the initial score learning model into separate one-dimensional problems and parallelize the whole approach on an Apache Spark cluster. iBGP was applied on a large synthetic dataset and a large real-world mobile advertising dataset; experiment results demonstrate that iBGP significantly outperforms other popular graph-based propagation methods.

  11. Pragmatic Graph Rewriting Modifications

    OpenAIRE

    Rodgers, Peter; Vidal, Natalia

    1999-01-01

    We present new pragmatic constructs for easing programming in visual graph rewriting programming languages. The first is a modification to the rewriting process for nodes the host graph, where nodes specified as 'Once Only' in the LHS of a rewrite match at most once with a corresponding node in the host graph. This reduces the previously common use of tags to indicate the progress of matching in the graph. The second modification controls the application of LHS graphs, where those specified a...

  12. Simplicial complexes of graphs

    CERN Document Server

    Jonsson, Jakob

    2008-01-01

    A graph complex is a finite family of graphs closed under deletion of edges. Graph complexes show up naturally in many different areas of mathematics, including commutative algebra, geometry, and knot theory. Identifying each graph with its edge set, one may view a graph complex as a simplicial complex and hence interpret it as a geometric object. This volume examines topological properties of graph complexes, focusing on homotopy type and homology. Many of the proofs are based on Robin Forman's discrete version of Morse theory. As a byproduct, this volume also provides a loosely defined toolbox for attacking problems in topological combinatorics via discrete Morse theory. In terms of simplicity and power, arguably the most efficient tool is Forman's divide and conquer approach via decision trees; it is successfully applied to a large number of graph and digraph complexes.

  13. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.

  14. Manifold Adaptive Label Propagation for Face Clustering.

    Science.gov (United States)

    Pei, Xiaobing; Lyu, Zehua; Chen, Changqing; Chen, Chuanbo

    2015-08-01

    In this paper, a novel label propagation (LP) method is presented, called the manifold adaptive label propagation (MALP) method, which is to extend original LP by integrating sparse representation constraint into regularization framework of LP method. Similar to most LP, first of all, MALP also finds graph edges from given data and gives weights to the graph edges. Our goal is to find graph weights matrix adaptively. The key advantage of our approach is that MALP simultaneously finds graph weights matrix and predicts the label of unlabeled data. This paper also derives efficient algorithm to solve the proposed problem. Extensions of our MALP in kernel space and robust version are presented. The proposed method has been applied to the problem of semi-supervised face clustering using the well-known ORL, Yale, extended YaleB, and PIE datasets. Our experimental evaluations show the effectiveness of our method.

  15. Topic Model for Graph Mining.

    Science.gov (United States)

    Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng

    2015-12-01

    Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as, documents, chemical molecular structures, and images. However, an issue in relation to current research on graphs is that they cannot adequately discover the topics hidden in graph-structured data which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-word" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs.

  16. Modern graph theory

    CERN Document Server

    Bollobás, Béla

    1998-01-01

    The time has now come when graph theory should be part of the education of every serious student of mathematics and computer science, both for its own sake and to enhance the appreciation of mathematics as a whole. This book is an in-depth account of graph theory, written with such a student in mind; it reflects the current state of the subject and emphasizes connections with other branches of pure mathematics. The volume grew out of the author's earlier book, Graph Theory -- An Introductory Course, but its length is well over twice that of its predecessor, allowing it to reveal many exciting new developments in the subject. Recognizing that graph theory is one of several courses competing for the attention of a student, the book contains extensive descriptive passages designed to convey the flavor of the subject and to arouse interest. In addition to a modern treatment of the classical areas of graph theory such as coloring, matching, extremal theory, and algebraic graph theory, the book presents a detailed ...

  17. Graph Theory and Ion and Molecular Aggregation in Aqueous Solutions

    Science.gov (United States)

    Choi, Jun-Ho; Lee, Hochan; Choi, Hyung Ran; Cho, Minhaeng

    2018-04-01

    In molecular and cellular biology, dissolved ions and molecules have decisive effects on chemical and biological reactions, conformational stabilities, and functions of small to large biomolecules. Despite major efforts, the current state of understanding of the effects of specific ions, osmolytes, and bioprotecting sugars on the structure and dynamics of water H-bonding networks and proteins is not yet satisfactory. Recently, to gain deeper insight into this subject, we studied various aggregation processes of ions and molecules in high-concentration salt, osmolyte, and sugar solutions with time-resolved vibrational spectroscopy and molecular dynamics simulation methods. It turns out that ions (or solute molecules) have a strong propensity to self-assemble into large and polydisperse aggregates that affect both local and long-range water H-bonding structures. In particular, we have shown that graph-theoretical approaches can be used to elucidate morphological characteristics of large aggregates in various aqueous salt, osmolyte, and sugar solutions. When ion and molecular aggregates in such aqueous solutions are treated as graphs, a variety of graph-theoretical properties, such as graph spectrum, degree distribution, clustering coefficient, minimum path length, and graph entropy, can be directly calculated by considering an ensemble of configurations taken from molecular dynamics trajectories. Here we show percolating behavior exhibited by ion and molecular aggregates upon increase in solute concentration in high solute concentrations and discuss compelling evidence of the isomorphic relation between percolation transitions of ion and molecular aggregates and water H-bonding networks. We anticipate that the combination of graph theory and molecular dynamics simulation methods will be of exceptional use in achieving a deeper understanding of the fundamental physical chemistry of dissolution and in describing the interplay between the self-aggregation of solute

  18. Graph Theory and Ion and Molecular Aggregation in Aqueous Solutions.

    Science.gov (United States)

    Choi, Jun-Ho; Lee, Hochan; Choi, Hyung Ran; Cho, Minhaeng

    2018-04-20

    In molecular and cellular biology, dissolved ions and molecules have decisive effects on chemical and biological reactions, conformational stabilities, and functions of small to large biomolecules. Despite major efforts, the current state of understanding of the effects of specific ions, osmolytes, and bioprotecting sugars on the structure and dynamics of water H-bonding networks and proteins is not yet satisfactory. Recently, to gain deeper insight into this subject, we studied various aggregation processes of ions and molecules in high-concentration salt, osmolyte, and sugar solutions with time-resolved vibrational spectroscopy and molecular dynamics simulation methods. It turns out that ions (or solute molecules) have a strong propensity to self-assemble into large and polydisperse aggregates that affect both local and long-range water H-bonding structures. In particular, we have shown that graph-theoretical approaches can be used to elucidate morphological characteristics of large aggregates in various aqueous salt, osmolyte, and sugar solutions. When ion and molecular aggregates in such aqueous solutions are treated as graphs, a variety of graph-theoretical properties, such as graph spectrum, degree distribution, clustering coefficient, minimum path length, and graph entropy, can be directly calculated by considering an ensemble of configurations taken from molecular dynamics trajectories. Here we show percolating behavior exhibited by ion and molecular aggregates upon increase in solute concentration in high solute concentrations and discuss compelling evidence of the isomorphic relation between percolation transitions of ion and molecular aggregates and water H-bonding networks. We anticipate that the combination of graph theory and molecular dynamics simulation methods will be of exceptional use in achieving a deeper understanding of the fundamental physical chemistry of dissolution and in describing the interplay between the self-aggregation of solute

  19. A graph edit dictionary for correcting errors in roof topology graphs reconstructed from point clouds

    Science.gov (United States)

    Xiong, B.; Oude Elberink, S.; Vosselman, G.

    2014-07-01

    In the task of 3D building model reconstruction from point clouds we face the problem of recovering a roof topology graph in the presence of noise, small roof faces and low point densities. Errors in roof topology graphs will seriously affect the final modelling results. The aim of this research is to automatically correct these errors. We define the graph correction as a graph-to-graph problem, similar to the spelling correction problem (also called the string-to-string problem). The graph correction is more complex than string correction, as the graphs are 2D while strings are only 1D. We design a strategy based on a dictionary of graph edit operations to automatically identify and correct the errors in the input graph. For each type of error the graph edit dictionary stores a representative erroneous subgraph as well as the corrected version. As an erroneous roof topology graph may contain several errors, a heuristic search is applied to find the optimum sequence of graph edits to correct the errors one by one. The graph edit dictionary can be expanded to include entries needed to cope with errors that were previously not encountered. Experiments show that the dictionary with only fifteen entries already properly corrects one quarter of erroneous graphs in about 4500 buildings, and even half of the erroneous graphs in one test area, achieving as high as a 95% acceptance rate of the reconstructed models.

  20. Bayesian artificial intelligence

    CERN Document Server

    Korb, Kevin B

    2003-01-01

    As the power of Bayesian techniques has become more fully realized, the field of artificial intelligence has embraced Bayesian methodology and integrated it to the point where an introduction to Bayesian techniques is now a core course in many computer science programs. Unlike other books on the subject, Bayesian Artificial Intelligence keeps mathematical detail to a minimum and covers a broad range of topics. The authors integrate all of Bayesian net technology and learning Bayesian net technology and apply them both to knowledge engineering. They emphasize understanding and intuition but also provide the algorithms and technical background needed for applications. Software, exercises, and solutions are available on the authors' website.

  1. Introduction to quantum graphs

    CERN Document Server

    Berkolaiko, Gregory

    2012-01-01

    A "quantum graph" is a graph considered as a one-dimensional complex and equipped with a differential operator ("Hamiltonian"). Quantum graphs arise naturally as simplified models in mathematics, physics, chemistry, and engineering when one considers propagation of waves of various nature through a quasi-one-dimensional (e.g., "meso-" or "nano-scale") system that looks like a thin neighborhood of a graph. Works that currently would be classified as discussing quantum graphs have been appearing since at least the 1930s, and since then, quantum graphs techniques have been applied successfully in various areas of mathematical physics, mathematics in general and its applications. One can mention, for instance, dynamical systems theory, control theory, quantum chaos, Anderson localization, microelectronics, photonic crystals, physical chemistry, nano-sciences, superconductivity theory, etc. Quantum graphs present many non-trivial mathematical challenges, which makes them dear to a mathematician's heart. Work on qu...

  2. Graph Generator Survey

    Energy Technology Data Exchange (ETDEWEB)

    Lothian, Joshua [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Powers, Sarah S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sullivan, Blair D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Baker, Matthew B. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Schrock, Jonathan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Poole, Stephen W. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2013-10-01

    The benchmarking effort within the Extreme Scale Systems Center at Oak Ridge National Laboratory seeks to provide High Performance Computing benchmarks and test suites of interest to the DoD sponsor. The work described in this report is a part of the effort focusing on graph generation. A previously developed benchmark, SystemBurn, allowed the emulation of different application behavior profiles within a single framework. To complement this effort, similar capabilities are desired for graph-centric problems. This report examines existing synthetic graph generator implementations in preparation for further study on the properties of their generated synthetic graphs.

  3. Functions and graphs

    CERN Document Server

    Gelfand, I M; Shnol, E E

    1969-01-01

    The second in a series of systematic studies by a celebrated mathematician I. M. Gelfand and colleagues, this volume presents students with a well-illustrated sequence of problems and exercises designed to illuminate the properties of functions and graphs. Since readers do not have the benefit of a blackboard on which a teacher constructs a graph, the authors abandoned the customary use of diagrams in which only the final form of the graph appears; instead, the book's margins feature step-by-step diagrams for the complete construction of each graph. The first part of the book employs simple fu

  4. Loose Graph Simulations

    DEFF Research Database (Denmark)

    Mansutti, Alessio; Miculan, Marino; Peressotti, Marco

    2017-01-01

    We introduce loose graph simulations (LGS), a new notion about labelled graphs which subsumes in an intuitive and natural way subgraph isomorphism (SGI), regular language pattern matching (RLPM) and graph simulation (GS). Being a unification of all these notions, LGS allows us to express directly...... also problems which are “mixed” instances of previous ones, and hence which would not fit easily in any of them. After the definition and some examples, we show that the problem of finding loose graph simulations is NP-complete, we provide formal translation of SGI, RLPM, and GS into LGSs, and we give...

  5. Bayesian estimation of covariance matrices: Application to market risk management at EDF

    International Nuclear Information System (INIS)

    Jandrzejewski-Bouriga, M.

    2012-01-01

    In this thesis, we develop new methods of regularized covariance matrix estimation, under the Bayesian setting. The regularization methodology employed is first related to shrinkage. We investigate a new Bayesian modeling of covariance matrix, based on hierarchical inverse-Wishart distribution, and then derive different estimators under standard loss functions. Comparisons between shrunk and empirical estimators are performed in terms of frequentist performance under different losses. It allows us to highlight the critical importance of the definition of cost function and show the persistent effect of the shrinkage-type prior on inference. In a second time, we consider the problem of covariance matrix estimation in Gaussian graphical models. If the issue is well treated for the decomposable case, it is not the case if you also consider non-decomposable graphs. We then describe a Bayesian and operational methodology to carry out the estimation of covariance matrix of Gaussian graphical models, decomposable or not. This procedure is based on a new and objective method of graphical-model selection, combined with a constrained and regularized estimation of the covariance matrix of the model chosen. The procedures studied effectively manage missing data. These estimation techniques were applied to calculate the covariance matrices involved in the market risk management for portfolios of EDF (Electricity of France), in particular for problems of calculating Value-at-Risk or in Asset Liability Management. (author)

  6. A Single Session of rTMS Enhances Small-Worldness in Writer’s Cramp: Evidence from Simultaneous EEG-fMRI Multi-Modal Brain Graph

    Directory of Open Access Journals (Sweden)

    Rose D. Bharath

    2017-09-01

    Full Text Available Background and Purpose: Repetitive transcranial magnetic stimulation (rTMS induces widespread changes in brain connectivity. As the network topology differences induced by a single session of rTMS are less known we undertook this study to ascertain whether the network alterations had a small-world morphology using multi-modal graph theory analysis of simultaneous EEG-fMRI.Method: Simultaneous EEG-fMRI was acquired in duplicate before (R1 and after (R2 a single session of rTMS in 14 patients with Writer’s Cramp (WC. Whole brain neuronal and hemodynamic network connectivity were explored using the graph theory measures and clustering coefficient, path length and small-world index were calculated for EEG and resting state fMRI (rsfMRI. Multi-modal graph theory analysis was used to evaluate the correlation of EEG and fMRI clustering coefficients.Result: A single session of rTMS was found to increase the clustering coefficient and small-worldness significantly in both EEG and fMRI (p < 0.05. Multi-modal graph theory analysis revealed significant modulations in the fronto-parietal regions immediately after rTMS. The rsfMRI revealed additional modulations in several deep brain regions including cerebellum, insula and medial frontal lobe.Conclusion: Multi-modal graph theory analysis of simultaneous EEG-fMRI can supplement motor physiology methods in understanding the neurobiology of rTMS in vivo. Coinciding evidence from EEG and rsfMRI reports small-world morphology for the acute phase network hyper-connectivity indicating changes ensuing low-frequency rTMS is probably not “noise”.

  7. Autoregressive Moving Average Graph Filtering

    OpenAIRE

    Isufi, Elvin; Loukas, Andreas; Simonetto, Andrea; Leus, Geert

    2016-01-01

    One of the cornerstones of the field of signal processing on graphs are graph filters, direct analogues of classical filters, but intended for signals defined on graphs. This work brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which (i) are able to approximate any desired graph frequency response, and (ii) give exact solutions for tasks such as graph signal denoising and interpolation. The design phi...

  8. Vertex labeling and routing in self-similar outerplanar unclustered graphs modeling complex networks

    International Nuclear Information System (INIS)

    Comellas, Francesc; Miralles, Alicia

    2009-01-01

    This paper introduces a labeling and optimal routing algorithm for a family of modular, self-similar, small-world graphs with clustering zero. Many properties of this family are comparable to those of networks associated with technological and biological systems with low clustering, such as the power grid, some electronic circuits and protein networks. For these systems, the existence of models with an efficient routing protocol is of interest to design practical communication algorithms in relation to dynamical processes (including synchronization) and also to understand the underlying mechanisms that have shaped their particular structure.

  9. On cyclic orthogonal double covers of circulant graphs by special infinite graphs

    Directory of Open Access Journals (Sweden)

    R. El-Shanawany

    2017-12-01

    Full Text Available In this article, a technique to construct cyclic orthogonal double covers (CODCs of regular circulant graphs by certain infinite graph classes such as complete bipartite and tripartite graphs and disjoint union of butterfly and K1,2n−10 is introduced.

  10. A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.

    Science.gov (United States)

    Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei

    2012-01-01

    DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA and methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation expressions as an infinite number of beta mixture distribution. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we proposed a Gibbs sampling "no-gaps" solution for computing the posterior distributions, hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiform (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.

  11. Fitting Latent Cluster Models for Networks with latentnet

    Directory of Open Access Journals (Sweden)

    Pavel N. Krivitsky

    2007-12-01

    Full Text Available latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002 suggested an approach to modeling networks based on positing the existence of an latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007.The package implements Bayesian inference for the models based on an Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering. It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.

  12. Quantum walks on quotient graphs

    International Nuclear Information System (INIS)

    Krovi, Hari; Brun, Todd A.

    2007-01-01

    A discrete-time quantum walk on a graph Γ is the repeated application of a unitary evolution operator to a Hilbert space corresponding to the graph. If this unitary evolution operator has an associated group of symmetries, then for certain initial states the walk will be confined to a subspace of the original Hilbert space. Symmetries of the original graph, given by its automorphism group, can be inherited by the evolution operator. We show that a quantum walk confined to the subspace corresponding to this symmetry group can be seen as a different quantum walk on a smaller quotient graph. We give an explicit construction of the quotient graph for any subgroup H of the automorphism group and illustrate it with examples. The automorphisms of the quotient graph which are inherited from the original graph are the original automorphism group modulo the subgroup H used to construct it. The quotient graph is constructed by removing the symmetries of the subgroup H from the original graph. We then analyze the behavior of hitting times on quotient graphs. Hitting time is the average time it takes a walk to reach a given final vertex from a given initial vertex. It has been shown in earlier work [Phys. Rev. A 74, 042334 (2006)] that the hitting time for certain initial states of a quantum walks can be infinite, in contrast to classical random walks. We give a condition which determines whether the quotient graph has infinite hitting times given that they exist in the original graph. We apply this condition for the examples discussed and determine which quotient graphs have infinite hitting times. All known examples of quantum walks with hitting times which are short compared to classical random walks correspond to systems with quotient graphs much smaller than the original graph; we conjecture that the existence of a small quotient graph with finite hitting times is necessary for a walk to exhibit a quantum speedup

  13. Fundamentals of algebraic graph transformation

    CERN Document Server

    Ehrig, Hartmut; Prange, Ulrike; Taentzer, Gabriele

    2006-01-01

    Graphs are widely used to represent structural information in the form of objects and connections between them. Graph transformation is the rule-based manipulation of graphs, an increasingly important concept in computer science and related fields. This is the first textbook treatment of the algebraic approach to graph transformation, based on algebraic structures and category theory. Part I is an introduction to the classical case of graph and typed graph transformation. In Part II basic and advanced results are first shown for an abstract form of replacement systems, so-called adhesive high-level replacement systems based on category theory, and are then instantiated to several forms of graph and Petri net transformation systems. Part III develops typed attributed graph transformation, a technique of key relevance in the modeling of visual languages and in model transformation. Part IV contains a practical case study on model transformation and a presentation of the AGG (attributed graph grammar) tool envir...

  14. Genetic structure and admixture between Bayash Roma from northwestern Croatia and general Croatian population: evidence from Bayesian clustering analysis.

    Science.gov (United States)

    Novokmet, Natalija; Galov, Ana; Marjanović, Damir; Škaro, Vedrana; Projić, Petar; Lauc, Gordan; Primorac, Dragan; Rudan, Pavao

    2015-01-01

    The European Roma represent a transnational mosaic of minority population groups with different migration histories and contrasting experiences in their interactions with majority populations across the European continent. Although historical genetic contributions of European lineages to the Roma pool were investigated before, the extent of contemporary genetic admixture between Bayash Roma and non-Romani majority population remains elusive. The aim of this study was to assess the genetic structure of the Bayash Roma population from northwestern Croatia and the general Croatian population and to investigate the extent of admixture between them. A set of genetic data from two original studies (100 Bayash Roma from northwestern Croatia and 195 individuals from the general Croatian population) was analyzed by Bayesian clustering implemented in STRUCTURE software. By re-analyzing published data we intended to focus for the first time on genetic differentiation and structure and in doing so we clearly pointed to the importance of considering social phenomena in understanding genetic structuring. Our results demonstrated that two population clusters best explain the genetic structure, which is consistent with social exclusion of Roma and the demographic history of Bayash Roma who have settled in NW Croatia only about 150 years ago and mostly applied rules of endogamy. The presence of admixture was revealed, while the percentage of non-Croatian individuals in general Croatian population was approximately twofold higher than the percentage of non-Romani individuals in Roma population corroborating the presence of ethnomimicry in Roma.

  15. Graph-based modelling in engineering

    CERN Document Server

    Rysiński, Jacek

    2017-01-01

    This book presents versatile, modern and creative applications of graph theory in mechanical engineering, robotics and computer networks. Topics related to mechanical engineering include e.g. machine and mechanism science, mechatronics, robotics, gearing and transmissions, design theory and production processes. The graphs treated are simple graphs, weighted and mixed graphs, bond graphs, Petri nets, logical trees etc. The authors represent several countries in Europe and America, and their contributions show how different, elegant, useful and fruitful the utilization of graphs in modelling of engineering systems can be. .

  16. Potential energy landscapes of elemental and heterogeneous chalcogen clusters

    International Nuclear Information System (INIS)

    Mauro, John C.; Loucks, Roger J.; Balakrishnan, Jitendra; Varshneya, Arun K.

    2006-01-01

    We describe the potential energy landscapes of elemental S 8 , Se 8 , and Te 8 clusters using disconnectivity graphs. Inherent structures include both ring and chain configurations, with rings especially dominant in Se 8 . We also map the potential energy landscapes of heterogeneous Se n (S,Te) 8-n clusters, which offer insights into the structure of heterogeneous chalcogen glasses

  17. Representing Degree Distributions, Clustering, and Homophily in Social Networks With Latent Cluster Random Effects Models.

    Science.gov (United States)

    Krivitsky, Pavel N; Handcock, Mark S; Raftery, Adrian E; Hoff, Peter D

    2009-07-01

    Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does.

  18. Bayesian Mediation Analysis

    OpenAIRE

    Yuan, Ying; MacKinnon, David P.

    2009-01-01

    This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...

  19. Semantic Mining based on graph theory and ontologies. Case Study: Cell Signaling Pathways

    Directory of Open Access Journals (Sweden)

    Carlos R. Rangel

    2016-08-01

    Full Text Available In this paper we use concepts from graph theory and cellular biology represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper is in the signaling pathway research area, it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality frequently used on social networks analysis. Our approach consists into two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we will call them communities; the second uses ontologies for the semantic enrichment of the cellular communities. The measures used from the graph theory allow us to determine the set of cells that are close (for example, in a disease, and the main cells in each community. We analyze our approach in two cases: TGF-ß and the Alzheimer Disease.

  20. Introduction to Bayesian statistics

    CERN Document Server

    Bolstad, William M

    2017-01-01

    There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...

  1. BootGraph: probabilistic fiber tractography using bootstrap algorithms and graph theory.

    Science.gov (United States)

    Vorburger, Robert S; Reischauer, Carolin; Boesiger, Peter

    2013-02-01

    Bootstrap methods have recently been introduced to diffusion-weighted magnetic resonance imaging to estimate the measurement uncertainty of ensuing diffusion parameters directly from the acquired data without the necessity to assume a noise model. These methods have been previously combined with deterministic streamline tractography algorithms to allow for the assessment of connection probabilities in the human brain. Thereby, the local noise induced disturbance in the diffusion data is accumulated additively due to the incremental progression of streamline tractography algorithms. Graph based approaches have been proposed to overcome this drawback of streamline techniques. For this reason, the bootstrap method is in the present work incorporated into a graph setup to derive a new probabilistic fiber tractography method, called BootGraph. The acquired data set is thereby converted into a weighted, undirected graph by defining a vertex in each voxel and edges between adjacent vertices. By means of the cone of uncertainty, which is derived using the wild bootstrap, a weight is thereafter assigned to each edge. Two path finding algorithms are subsequently applied to derive connection probabilities. While the first algorithm is based on the shortest path approach, the second algorithm takes all existing paths between two vertices into consideration. Tracking results are compared to an established algorithm based on the bootstrap method in combination with streamline fiber tractography and to another graph based algorithm. The BootGraph shows a very good performance in crossing situations with respect to false negatives and permits incorporating additional constraints, such as a curvature threshold. By inheriting the advantages of the bootstrap method and graph theory, the BootGraph method provides a computationally efficient and flexible probabilistic tractography setup to compute connection probability maps and virtual fiber pathways without the drawbacks of

  2. A bayesian approach to inferring the genetic population structure of sugarcane accessions from INTA (Argentina

    Directory of Open Access Journals (Sweden)

    Mariana Inés Pocovi

    2015-06-01

    Full Text Available Understanding the population structure and genetic diversity in sugarcane (Saccharum officinarum L. accessions from INTA germplasm bank (Argentina will be of great importance for germplasm collection and breeding improvement as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean were applied to this purpose. Sixty three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR and 136 Amplified Fragment Length Polymorphism (AFLP loci. Given the low probability values found with AFLP for individual assignment (4.7%, microsatellites seemed to perform better (54% for STRUCTURE analysis that revealed the germplasm to exist in five optimum groups with partly corresponding to their origin. However clusters shown high degree of admixture, F ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated sugarcane in groups that did not agree with those identified by STRUCTURE. The clustering including all genotypes neither showed resemblance to populations find by STRUCTURE, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than classical PCoA and hierarchical clustering method.

  3. Joint Bayesian variable and graph selection for regression models with network-structured predictors

    Science.gov (United States)

    Peterson, C. B.; Stingo, F. C.; Vannucci, M.

    2015-01-01

    In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications since it allows the identification of pathways of functionally related genes or proteins which impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings, and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival. PMID:26514925

  4. Probing dark energy via galaxy cluster outskirts

    Science.gov (United States)

    Morandi, Andrea; Sun, Ming

    2016-04-01

    We present a Bayesian approach to combine Planck data and the X-ray physical properties of the intracluster medium in the virialization region of a sample of 320 galaxy clusters (0.056 definition of cluster boundary radius is more tenable, namely based on a fixed overdensity with respect to the critical density of the Universe. This novel cosmological test has the capacity to provide a generational leap forward in our understanding of the equation of state of dark energy.

  5. Equipackable graphs

    DEFF Research Database (Denmark)

    Vestergaard, Preben Dahl; Hartnell, Bert L.

    2006-01-01

    There are many results dealing with the problem of decomposing a fixed graph into isomorphic subgraphs. There has also been work on characterizing graphs with the property that one can delete the edges of a number of edge disjoint copies of the subgraph and, regardless of how that is done, the gr...

  6. Khovanov homology of graph-links

    Energy Technology Data Exchange (ETDEWEB)

    Nikonov, Igor M [M. V. Lomonosov Moscow State University, Faculty of Mechanics and Mathematics, Moscow (Russian Federation)

    2012-08-31

    Graph-links arise as the intersection graphs of turning chord diagrams of links. Speaking informally, graph-links provide a combinatorial description of links up to mutations. Many link invariants can be reformulated in the language of graph-links. Khovanov homology, a well-known and useful knot invariant, is defined for graph-links in this paper (in the case of the ground field of characteristic two). Bibliography: 14 titles.

  7. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  8. Generalized connectivity of graphs

    CERN Document Server

    Li, Xueliang

    2016-01-01

    Noteworthy results, proof techniques, open problems and conjectures in generalized (edge-) connectivity are discussed in this book. Both theoretical and practical analyses for generalized (edge-) connectivity of graphs are provided. Topics covered in this book include: generalized (edge-) connectivity of graph classes, algorithms, computational complexity, sharp bounds, Nordhaus-Gaddum-type results, maximum generalized local connectivity, extremal problems, random graphs, multigraphs, relations with the Steiner tree packing problem and generalizations of connectivity. This book enables graduate students to understand and master a segment of graph theory and combinatorial optimization. Researchers in graph theory, combinatorics, combinatorial optimization, probability, computer science, discrete algorithms, complexity analysis, network design, and the information transferring models will find this book useful in their studies.

  9. On two energy-like invariants of line graphs and related graph operations

    Directory of Open Access Journals (Sweden)

    Xiaodan Chen

    2016-02-01

    Full Text Available Abstract For a simple graph G of order n, let μ 1 ≥ μ 2 ≥ ⋯ ≥ μ n = 0 $\\mu_{1}\\geq\\mu_{2}\\geq\\cdots\\geq\\mu_{n}=0$ be its Laplacian eigenvalues, and let q 1 ≥ q 2 ≥ ⋯ ≥ q n ≥ 0 $q_{1}\\geq q_{2}\\geq\\cdots\\geq q_{n}\\geq0$ be its signless Laplacian eigenvalues. The Laplacian-energy-like invariant and incidence energy of G are defined as, respectively, LEL ( G = ∑ i = 1 n − 1 μ i and IE ( G = ∑ i = 1 n q i . $$\\mathit{LEL}(G=\\sum_{i=1}^{n-1}\\sqrt{ \\mu_{i}} \\quad\\mbox{and}\\quad \\mathit {IE}(G=\\sum_{i=1}^{n} \\sqrt{q_{i}}. $$ In this paper, we present some new upper and lower bounds on LEL and IE of line graph, subdivision graph, para-line graph and total graph of a regular graph, some of which improve previously known results. The main tools we use here are the Cauchy-Schwarz inequality and the Ozeki inequality.

  10. Fuzzy Rules for Ant Based Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Amira Hamdi

    2016-01-01

    Full Text Available This paper provides a new intelligent technique for semisupervised data clustering problem that combines the Ant System (AS algorithm with the fuzzy c-means (FCM clustering algorithm. Our proposed approach, called F-ASClass algorithm, is a distributed algorithm inspired by foraging behavior observed in ant colonyT. The ability of ants to find the shortest path forms the basis of our proposed approach. In the first step, several colonies of cooperating entities, called artificial ants, are used to find shortest paths in a complete graph that we called graph-data. The number of colonies used in F-ASClass is equal to the number of clusters in dataset. Hence, the partition matrix of dataset founded by artificial ants is given in the second step, to the fuzzy c-means technique in order to assign unclassified objects generated in the first step. The proposed approach is tested on artificial and real datasets, and its performance is compared with those of K-means, K-medoid, and FCM algorithms. Experimental section shows that F-ASClass performs better according to the error rate classification, accuracy, and separation index.

  11. Price Competition on Graphs

    OpenAIRE

    Adriaan R. Soetevent

    2010-01-01

    This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. I propose an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. One feature of graph models of price competition is that spatial discontinuities in firm-level demand may occur. I show that the existence result of D'Aspremont et al. (1979) does not extend to simple star graphs. I conjecture that this non-existence result holds...

  12. Price Competition on Graphs

    OpenAIRE

    Pim Heijnen; Adriaan Soetevent

    2014-01-01

    This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. We derive an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. These graph models of price competition may lead to spatial discontinuities in firm-level demand. We show that the existence result of D'Aspremont et al. (1979) does not extend to simple star graphs and conjecture that this non-existence result holds more general...

  13. Skew-adjacency matrices of graphs

    NARCIS (Netherlands)

    Cavers, M.; Cioaba, S.M.; Fallat, S.; Gregory, D.A.; Haemers, W.H.; Kirkland, S.J.; McDonald, J.J.; Tsatsomeros, M.

    2012-01-01

    The spectra of the skew-adjacency matrices of a graph are considered as a possible way to distinguish adjacency cospectral graphs. This leads to the following topics: graphs whose skew-adjacency matrices are all cospectral; relations between the matchings polynomial of a graph and the characteristic

  14. Graph theory

    CERN Document Server

    Diestel, Reinhard

    2017-01-01

    This standard textbook of modern graph theory, now in its fifth edition, combines the authority of a classic with the engaging freshness of style that is the hallmark of active mathematics. It covers the core material of the subject with concise yet reliably complete proofs, while offering glimpses of more advanced methods in each field by one or two deeper results, again with proofs given in full detail. The book can be used as a reliable text for an introductory course, as a graduate text, and for self-study. From the reviews: “This outstanding book cannot be substituted with any other book on the present textbook market. It has every chance of becoming the standard textbook for graph theory.”Acta Scientiarum Mathematiciarum “Deep, clear, wonderful. This is a serious book about the heart of graph theory. It has depth and integrity. ”Persi Diaconis & Ron Graham, SIAM Review “The book has received a very enthusiastic reception, which it amply deserves. A masterly elucidation of modern graph theo...

  15. Acyclicity in edge-colored graphs

    DEFF Research Database (Denmark)

    Gutin, Gregory; Jones, Mark; Sheng, Bin

    2017-01-01

    A walk W in edge-colored graphs is called properly colored (PC) if every pair of consecutive edges in W is of different color. We introduce and study five types of PC acyclicity in edge-colored graphs such that graphs of PC acyclicity of type i is a proper superset of graphs of acyclicity of type i......+1, i=1,2,3,4. The first three types are equivalent to the absence of PC cycles, PC closed trails, and PC closed walks, respectively. While graphs of types 1, 2 and 3 can be recognized in polynomial time, the problem of recognizing graphs of type 4 is, somewhat surprisingly, NP-hard even for 2-edge-colored...... graphs (i.e., when only two colors are used). The same problem with respect to type 5 is polynomial-time solvable for all edge-colored graphs. Using the five types, we investigate the border between intractability and tractability for the problems of finding the maximum number of internally vertex...

  16. Graph Sampling for Covariance Estimation

    KAUST Repository

    Chepuri, Sundeep Prabhakar

    2017-04-25

    In this paper the focus is on subsampling as well as reconstructing the second-order statistics of signals residing on nodes of arbitrary undirected graphs. Second-order stationary graph signals may be obtained by graph filtering zero-mean white noise and they admit a well-defined power spectrum whose shape is determined by the frequency response of the graph filter. Estimating the graph power spectrum forms an important component of stationary graph signal processing and related inference tasks such as Wiener prediction or inpainting on graphs. The central result of this paper is that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the second-order statistics of the graph signal from the subsampled observations, and more importantly, without any spectral priors. To this end, both a nonparametric approach as well as parametric approaches including moving average and autoregressive models for the graph power spectrum are considered. The results specialize for undirected circulant graphs in that the graph nodes leading to the best compression rates are given by the so-called minimal sparse rulers. A near-optimal greedy algorithm is developed to design the subsampling scheme for the non-parametric and the moving average models, whereas a particular subsampling scheme that allows linear estimation for the autoregressive model is proposed. Numerical experiments on synthetic as well as real datasets related to climatology and processing handwritten digits are provided to demonstrate the developed theory.

  17. GeoSciGraph: An Ontological Framework for EarthCube Semantic Infrastructure

    Science.gov (United States)

    Gupta, A.; Schachne, A.; Condit, C.; Valentine, D.; Richard, S.; Zaslavsky, I.

    2015-12-01

    The CINERGI (Community Inventory of EarthCube Resources for Geosciences Interoperability) project compiles an inventory of a wide variety of earth science resources including documents, catalogs, vocabularies, data models, data services, process models, information repositories, domain-specific ontologies etc. developed by research groups and data practitioners. We have developed a multidisciplinary semantic framework called GeoSciGraph semantic ingration of earth science resources. An integrated ontology is constructed with Basic Formal Ontology (BFO) as its upper ontology and currently ingests multiple component ontologies including the SWEET ontology, GeoSciML's lithology ontology, Tematres controlled vocabulary server, GeoNames, GCMD vocabularies on equipment, platforms and institutions, software ontology, CUAHSI hydrology vocabulary, the environmental ontology (ENVO) and several more. These ontologies are connected through bridging axioms; GeoSciGraph identifies lexically close terms and creates equivalence class or subclass relationships between them after human verification. GeoSciGraph allows a community to create community-specific customizations of the integrated ontology. GeoSciGraph uses the Neo4J,a graph database that can hold several billion concepts and relationships. GeoSciGraph provides a number of REST services that can be called by other software modules like the CINERGI information augmentation pipeline. 1) Vocabulary services are used to find exact and approximate terms, term categories (community-provided clusters of terms e.g., measurement-related terms or environmental material related terms), synonyms, term definitions and annotations. 2) Lexical services are used for text parsing to find entities, which can then be included into the ontology by a domain expert. 3) Graph services provide the ability to perform traversal centric operations e.g., finding paths and neighborhoods which can be used to perform ontological operations like

  18. Price competition on graphs

    NARCIS (Netherlands)

    Soetevent, A.R.

    2010-01-01

    This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. I propose an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. One feature of graph models of price competition is that spatial

  19. Graphing the order of the sexes: constructing, recalling, interpreting, and putting the self in gender difference graphs.

    Science.gov (United States)

    Hegarty, Peter; Lemieux, Anthony F; McQueen, Grant

    2010-03-01

    Graphs seem to connote facts more than words or tables do. Consequently, they seem unlikely places to spot implicit sexism at work. Yet, in 6 studies (N = 741), women and men constructed (Study 1) and recalled (Study 2) gender difference graphs with men's data first, and graphed powerful groups (Study 3) and individuals (Study 4) ahead of weaker ones. Participants who interpreted graph order as evidence of author "bias" inferred that the author graphed his or her own gender group first (Study 5). Women's, but not men's, preferences to graph men first were mitigated when participants graphed a difference between themselves and an opposite-sex friend prior to graphing gender differences (Study 6). Graph production and comprehension are affected by beliefs and suppositions about the groups represented in graphs to a greater degree than cognitive models of graph comprehension or realist models of scientific thinking have yet acknowledged.

  20. Collective Rationality in Graph Aggregation

    NARCIS (Netherlands)

    Endriss, U.; Grandi, U.; Schaub, T.; Friedrich, G.; O'Sullivan, B.

    2014-01-01

    Suppose a number of agents each provide us with a directed graph over a common set of vertices. Graph aggregation is the problem of computing a single “collective” graph that best represents the information inherent in this profile of individual graphs. We consider this aggregation problem from the

  1. Graph Theory. 1. Fragmentation of Structural Graphs

    Directory of Open Access Journals (Sweden)

    Lorentz JÄNTSCHI

    2002-12-01

    Full Text Available The investigation of structural graphs has many fields of applications in engineering, especially in applied sciences like as applied chemistry and physics, computer sciences and automation, electronics and telecommunication. The main subject of the paper is to express fragmentation criteria in graph using a new method of investigation: terminal paths. Using terminal paths are defined most of the fragmentation criteria that are in use in molecular topology, but the fields of applications are more generally than that, as I mentioned before. Graphical examples of fragmentation are given for every fragmentation criteria. Note that all fragmentation is made with a computer program that implements a routine for every criterion.[1] A web routine for tracing all terminal paths in graph can be found at the address: http://vl.academicdirect.ro/molecular_topology/tpaths/ [1] M. V. Diudea, I. Gutman, L. Jäntschi, Molecular Topology, Nova Science, Commack, New York, 2001, 2002.

  2. Associating clinical archetypes through UMLS Metathesaurus term clusters.

    Science.gov (United States)

    Lezcano, Leonardo; Sánchez-Alonso, Salvador; Sicilia, Miguel-Angel

    2012-06-01

    Clinical archetypes are modular definitions of clinical data, expressed using standard or open constraint-based data models as the CEN EN13606 and openEHR. There is an increasing archetype specification activity that raises the need for techniques to associate archetypes to support better management and user navigation in archetype repositories. This paper reports on a computational technique to generate tentative archetype associations by mapping them through term clusters obtained from the UMLS Metathesaurus. The terms are used to build a bipartite graph model and graph connectivity measures can be used for deriving associations.

  3. Introductory graph theory

    CERN Document Server

    Chartrand, Gary

    1984-01-01

    Graph theory is used today in the physical sciences, social sciences, computer science, and other areas. Introductory Graph Theory presents a nontechnical introduction to this exciting field in a clear, lively, and informative style. Author Gary Chartrand covers the important elementary topics of graph theory and its applications. In addition, he presents a large variety of proofs designed to strengthen mathematical techniques and offers challenging opportunities to have fun with mathematics. Ten major topics - profusely illustrated - include: Mathematical Models, Elementary Concepts of Grap

  4. Using Zipf-Mandelbrot law and graph theory to evaluate animal welfare

    Science.gov (United States)

    de Oliveira, Caprice G. L.; Miranda, José G. V.; Japyassú, Hilton F.; El-Hani, Charbel N.

    2018-02-01

    This work deals with the construction and testing of metrics of welfare based on behavioral complexity, using assumptions derived from Zipf-Mandelbrot law and graph theory. To test these metrics we compared yellow-breasted capuchins (Sapajus xanthosternos) (Wied-Neuwied, 1826) (PRIMATES CEBIDAE) found in two institutions, subjected to different captive conditions: a Zoobotanical Garden (hereafter, ZOO; n = 14), in good welfare condition, and a Wildlife Rescue Center (hereafter, WRC; n = 8), in poor welfare condition. In the Zipf-Mandelbrot-based analysis, the power law exponent was calculated using behavior frequency values versus behavior rank value. These values allow us to evaluate variations in individual behavioral complexity. For each individual we also constructed a graph using the sequence of behavioral units displayed in each recording (average recording time per individual: 4 h 26 min in the ZOO, 4 h 30 min in the WRC). Then, we calculated the values of the main graph attributes, which allowed us to analyze the complexity of the connectivity of the behaviors displayed in the individuals' behavioral sequences. We found significant differences between the two groups for the slope values in the Zipf-Mandelbrot analysis. The slope values for the ZOO individuals approached -1, with graphs representing a power law, while the values for the WRC individuals diverged from -1, differing from a power law pattern. Likewise, we found significant differences for the graph attributes average degree, weighted average degree, and clustering coefficient when comparing the ZOO and WRC individual graphs. However, no significant difference was found for the attributes modularity and average path length. Both analyses were effective in detecting differences between the patterns of behavioral complexity in the two groups. The slope values for the ZOO individuals indicated a higher behavioral complexity when compared to the WRC individuals. Similarly, graph construction and the

  5. Aligning Biomolecular Networks Using Modular Graph Kernels

    Science.gov (United States)

    Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant

    Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offer a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.

  6. Creating more effective graphs

    CERN Document Server

    Robbins, Naomi B

    2012-01-01

    A succinct and highly readable guide to creating effective graphs The right graph can be a powerful tool for communicating information, improving a presentation, or conveying your point in print. If your professional endeavors call for you to present data graphically, here's a book that can help you do it more effectively. Creating More Effective Graphs gives you the basic knowledge and techniques required to choose and create appropriate graphs for a broad range of applications. Using real-world examples everyone can relate to, the author draws on her years of experience in gr

  7. Graph Compression by BFS

    Directory of Open Access Journals (Sweden)

    Alberto Apostolico

    2009-08-01

    Full Text Available The Web Graph is a large-scale graph that does not fit in main memory, so that lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval for the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so that it may be applied to more general graphs. Tests on some datasets of use achieve space savings of about 10% over existing methods.

  8. Graph theoretical analysis of EEG functional connectivity during music perception.

    Science.gov (United States)

    Wu, Junjie; Zhang, Junsong; Liu, Chu; Liu, Dongwei; Ding, Xiaojun; Zhou, Changle

    2012-11-05

    The present study evaluated the effect of music on large-scale structure of functional brain networks using graph theoretical concepts. While most studies on music perception used Western music as an acoustic stimulus, Guqin music, representative of Eastern music, was selected for this experiment to increase our knowledge of music perception. Electroencephalography (EEG) was recorded from non-musician volunteers in three conditions: Guqin music, noise and silence backgrounds. Phase coherence was calculated in the alpha band and between all pairs of EEG channels to construct correlation matrices. Each resulting matrix was converted into a weighted graph using a threshold, and two network measures: the clustering coefficient and characteristic path length were calculated. Music perception was found to display a higher level mean phase coherence. Over the whole range of thresholds, the clustering coefficient was larger while listening to music, whereas the path length was smaller. Networks in music background still had a shorter characteristic path length even after the correction for differences in mean synchronization level among background conditions. This topological change indicated a more optimal structure under music perception. Thus, prominent small-world properties are confirmed in functional brain networks. Furthermore, music perception shows an increase of functional connectivity and an enhancement of small-world network organizations. Copyright © 2012 Elsevier B.V. All rights reserved.

  9. Graphing Inequalities, Connecting Meaning

    Science.gov (United States)

    Switzer, J. Matt

    2014-01-01

    Students often have difficulty with graphing inequalities (see Filloy, Rojano, and Rubio 2002; Drijvers 2002), and J. Matt Switzer's students were no exception. Although students can produce graphs for simple inequalities, they often struggle when the format of the inequality is unfamiliar. Even when producing a correct graph of an…

  10. Fuzzy Graph Language Recognizability

    OpenAIRE

    Kalampakas , Antonios; Spartalis , Stefanos; Iliadis , Lazaros

    2012-01-01

    Part 5: Fuzzy Logic; International audience; Fuzzy graph language recognizability is introduced along the lines of the established theory of syntactic graph language recognizability by virtue of the algebraic structure of magmoids. The main closure properties of the corresponding class are investigated and several interesting examples of fuzzy graph languages are examined.

  11. Quantitative graph theory mathematical foundations and applications

    CERN Document Server

    Dehmer, Matthias

    2014-01-01

    The first book devoted exclusively to quantitative graph theory, Quantitative Graph Theory: Mathematical Foundations and Applications presents and demonstrates existing and novel methods for analyzing graphs quantitatively. Incorporating interdisciplinary knowledge from graph theory, information theory, measurement theory, and statistical techniques, this book covers a wide range of quantitative-graph theoretical concepts and methods, including those pertaining to real and random graphs such as:Comparative approaches (graph similarity or distance)Graph measures to characterize graphs quantitat

  12. Dynamic Representations of Sparse Graphs

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Fagerberg, Rolf

    1999-01-01

    We present a linear space data structure for maintaining graphs with bounded arboricity—a large class of sparse graphs containing e.g. planar graphs and graphs of bounded treewidth—under edge insertions, edge deletions, and adjacency queries. The data structure supports adjacency queries in worst...... case O(c) time, and edge insertions and edge deletions in amortized O(1) and O(c+log n) time, respectively, where n is the number of nodes in the graph, and c is the bound on the arboricity....

  13. Spectral fluctuations of quantum graphs

    International Nuclear Information System (INIS)

    Pluhař, Z.; Weidenmüller, H. A.

    2014-01-01

    We prove the Bohigas-Giannoni-Schmit conjecture in its most general form for completely connected simple graphs with incommensurate bond lengths. We show that for graphs that are classically mixing (i.e., graphs for which the spectrum of the classical Perron-Frobenius operator possesses a finite gap), the generating functions for all (P,Q) correlation functions for both closed and open graphs coincide (in the limit of infinite graph size) with the corresponding expressions of random-matrix theory, both for orthogonal and for unitary symmetry

  14. Multiple graph regularized protein domain ranking.

    Science.gov (United States)

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2012-11-19

    Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.

  15. a Super Voxel-Based Riemannian Graph for Multi Scale Segmentation of LIDAR Point Clouds

    Science.gov (United States)

    Li, Minglei

    2018-04-01

    Automatically segmenting LiDAR points into respective independent partitions has become a topic of great importance in photogrammetry, remote sensing and computer vision. In this paper, we cast the problem of point cloud segmentation as a graph optimization problem by constructing a Riemannian graph. The scale space of the observed scene is explored by an octree-based over-segmentation with different depths. The over-segmentation produces many super voxels which restrict the structure of the scene and will be used as nodes of the graph. The Kruskal coordinates are used to compute edge weights that are proportional to the geodesic distance between nodes. Then we compute the edge-weight matrix in which the elements reflect the sectional curvatures associated with the geodesic paths between super voxel nodes on the scene surface. The final segmentation results are generated by clustering similar super voxels and cutting off the weak edges in the graph. The performance of this method was evaluated on LiDAR point clouds for both indoor and outdoor scenes. Additionally, extensive comparisons to state of the art techniques show that our algorithm outperforms on many metrics.

  16. Acyclicity in edge-colored graphs

    DEFF Research Database (Denmark)

    Gutin, Gregory; Jones, Mark; Sheng, Bin

    2017-01-01

    A walk W in edge-colored graphs is called properly colored (PC) if every pair of consecutive edges in W is of different color. We introduce and study five types of PC acyclicity in edge-colored graphs such that graphs of PC acyclicity of type i is a proper superset of graphs of acyclicity of type...

  17. Entropy and Graph Based Modelling of Document Coherence using Discourse Entities

    DEFF Research Database (Denmark)

    Petersen, Casper; Lioma, Christina; Simonsen, Jakob Grue

    2015-01-01

    We present two novel models of document coherence and their application to information retrieval (IR). Both models approximate document coherence using discourse entities, e.g. the subject or object of a sentence. Our first model views text as a Markov process generating sequences of discourse...... entities (entity n-grams); we use the entropy of these entity n-grams to approximate the rate at which new information appears in text, reasoning that as more new words appear, the topic increasingly drifts and text coherence decreases. Our second model extends the work of Guinaudeau & Strube [28......] that represents text as a graph of discourse entities, linked by different relations, such as their distance or adjacency in text. We use several graph topology metrics to approximate different aspects of the discourse flow that can indicate coherence, such as the average clustering or betweenness of discourse...

  18. ON BIPOLAR SINGLE VALUED NEUTROSOPHIC GRAPHS

    OpenAIRE

    Said Broumi; Mohamed Talea; Assia Bakali; Florentin Smarandache

    2016-01-01

    In this article, we combine the concept of bipolar neutrosophic set and graph theory. We introduce the notions of bipolar single valued neutrosophic graphs, strong bipolar single valued neutrosophic graphs, complete bipolar single valued neutrosophic graphs, regular bipolar single valued neutrosophic graphs and investigate some of their related properties.

  19. A seminar on graph theory

    CERN Document Server

    Harary, Frank

    2015-01-01

    Presented in 1962-63 by experts at University College, London, these lectures offer a variety of perspectives on graph theory. Although the opening chapters form a coherent body of graph theoretic concepts, this volume is not a text on the subject but rather an introduction to the extensive literature of graph theory. The seminar's topics are geared toward advanced undergraduate students of mathematics.Lectures by this volume's editor, Frank Harary, include ""Some Theorems and Concepts of Graph Theory,"" ""Topological Concepts in Graph Theory,"" ""Graphical Reconstruction,"" and other introduc

  20. Uniform Single Valued Neutrosophic Graphs

    Directory of Open Access Journals (Sweden)

    S. Broumi

    2017-09-01

    Full Text Available In this paper, we propose a new concept named the uniform single valued neutrosophic graph. An illustrative example and some properties are examined. Next, we develop an algorithmic approach for computing the complement of the single valued neutrosophic graph. A numerical example is demonstrated for computing the complement of single valued neutrosophic graphs and uniform single valued neutrosophic graph.

  1. Extremal graph theory

    CERN Document Server

    Bollobas, Bela

    2004-01-01

    The ever-expanding field of extremal graph theory encompasses a diverse array of problem-solving methods, including applications to economics, computer science, and optimization theory. This volume, based on a series of lectures delivered to graduate students at the University of Cambridge, presents a concise yet comprehensive treatment of extremal graph theory.Unlike most graph theory treatises, this text features complete proofs for almost all of its results. Further insights into theory are provided by the numerous exercises of varying degrees of difficulty that accompany each chapter. A

  2. Multiple graph regularized protein domain ranking

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-11-19

    Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods.Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods.Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.

  3. Multiple graph regularized protein domain ranking

    KAUST Repository

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2012-01-01

    Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods.Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods.Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.

  4. Multiple graph regularized protein domain ranking

    Directory of Open Access Journals (Sweden)

    Wang Jim

    2012-11-01

    Full Text Available Abstract Background Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.

  5. Canonical Labelling of Site Graphs

    Directory of Open Access Journals (Sweden)

    Nicolas Oury

    2013-06-01

    Full Text Available We investigate algorithms for canonical labelling of site graphs, i.e. graphs in which edges bind vertices on sites with locally unique names. We first show that the problem of canonical labelling of site graphs reduces to the problem of canonical labelling of graphs with edge colourings. We then present two canonical labelling algorithms based on edge enumeration, and a third based on an extension of Hopcroft's partition refinement algorithm. All run in quadratic worst case time individually. However, one of the edge enumeration algorithms runs in sub-quadratic time for graphs with "many" automorphisms, and the partition refinement algorithm runs in sub-quadratic time for graphs with "few" bisimulation equivalences. This suite of algorithms was chosen based on the expectation that graphs fall in one of those two categories. If that is the case, a combined algorithm runs in sub-quadratic worst case time. Whether this expectation is reasonable remains an interesting open problem.

  6. Bayesian biostatistics

    CERN Document Server

    Lesaffre, Emmanuel

    2012-01-01

    The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd

  7. Bayesian data analysis for newcomers.

    Science.gov (United States)

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.

  8. Domination criticality in product graphs

    Directory of Open Access Journals (Sweden)

    M.R. Chithra

    2015-07-01

    Full Text Available A connected dominating set is an important notion and has many applications in routing and management of networks. Graph products have turned out to be a good model of interconnection networks. This motivated us to study the Cartesian product of graphs G with connected domination number, γc(G=2,3 and characterize such graphs. Also, we characterize the k−γ-vertex (edge critical graphs and k−γc-vertex (edge critical graphs for k=2,3 where γ denotes the domination number of G. We also discuss the vertex criticality in grids.

  9. Graph Creation, Visualisation and Transformation

    Directory of Open Access Journals (Sweden)

    Maribel Fernández

    2010-03-01

    Full Text Available We describe a tool to create, edit, visualise and compute with interaction nets - a form of graph rewriting systems. The editor, called GraphPaper, allows users to create and edit graphs and their transformation rules using an intuitive user interface. The editor uses the functionalities of the TULIP system, which gives us access to a wealth of visualisation algorithms. Interaction nets are not only a formalism for the specification of graphs, but also a rewrite-based computation model. We discuss graph rewriting strategies and a language to express them in order to perform strategic interaction net rewriting.

  10. A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows.

    Science.gov (United States)

    Blattner, Timothy; Keyrouz, Walid; Bhattacharyya, Shuvra S; Halem, Milton; Brady, Mary

    2017-12-01

    Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) improves programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for such systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both of the HTGS-based implementations show good performance. In image stitching the HTGS implementation achieves similar performance to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k size matrices, respectively.

  11. How to practise Bayesian statistics outside the Bayesian church: What philosophy for Bayesian statistical modelling?

    NARCIS (Netherlands)

    Borsboom, D.; Haig, B.D.

    2013-01-01

    Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science

  12. Graph Colouring Algorithms

    DEFF Research Database (Denmark)

    Husfeldt, Thore

    2015-01-01

    This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available...

  13. Menopause and big data: Word Adjacency Graph modeling of menopause-related ChaCha data.

    Science.gov (United States)

    Carpenter, Janet S; Groves, Doyle; Chen, Chen X; Otte, Julie L; Miller, Wendy R

    2017-07-01

    To detect and visualize salient queries about menopause using Big Data from ChaCha. We used Word Adjacency Graph (WAG) modeling to detect clusters and visualize the range of menopause-related topics and their mutual proximity. The subset of relevant queries was fully modeled. We split each query into token words (ie, meaningful words and phrases) and removed stopwords (ie, not meaningful functional words). The remaining words were considered in sequence to build summary tables of words and two and three-word phrases. Phrases occurring at least 10 times were used to build a network graph model that was iteratively refined by observing and removing clusters of unrelated content. We identified two menopause-related subsets of queries by searching for questions containing menopause and menopause-related terms (eg, climacteric, hot flashes, night sweats, hormone replacement). The first contained 263,363 queries from individuals aged 13 and older and the second contained 5,892 queries from women aged 40 to 62 years. In the first set, we identified 12 topic clusters: 6 relevant to menopause and 6 less relevant. In the second set, we identified 15 topic clusters: 11 relevant to menopause and 4 less relevant. Queries about hormones were pervasive within both WAG models. Many of the queries reflected low literacy levels and/or feelings of embarrassment. We modeled menopause-related queries posed by ChaCha users between 2009 and 2012. ChaCha data may be used on its own or in combination with other Big Data sources to identify patient-driven educational needs and create patient-centered interventions.

  14. The fascinating world of graph theory

    CERN Document Server

    Benjamin, Arthur; Zhang, Ping

    2015-01-01

    Graph theory goes back several centuries and revolves around the study of graphs-mathematical structures showing relations between objects. With applications in biology, computer science, transportation science, and other areas, graph theory encompasses some of the most beautiful formulas in mathematics-and some of its most famous problems. The Fascinating World of Graph Theory explores the questions and puzzles that have been studied, and often solved, through graph theory. This book looks at graph theory's development and the vibrant individuals responsible for the field's growth. Introducin

  15. Guidelines for a graph-theoretic implementation of structural equation modeling

    Science.gov (United States)

    Grace, James B.; Schoolmaster, Donald R.; Guntenspergen, Glenn R.; Little, Amanda M.; Mitchell, Brian R.; Miller, Kathryn M.; Schweiger, E. William

    2012-01-01

    Structural equation modeling (SEM) is increasingly being chosen by researchers as a framework for gaining scientific insights from the quantitative analyses of data. New ideas and methods emerging from the study of causality, influences from the field of graphical modeling, and advances in statistics are expanding the rigor, capability, and even purpose of SEM. Guidelines for implementing the expanded capabilities of SEM are currently lacking. In this paper we describe new developments in SEM that we believe constitute a third-generation of the methodology. Most characteristic of this new approach is the generalization of the structural equation model as a causal graph. In this generalization, analyses are based on graph theoretic principles rather than analyses of matrices. Also, new devices such as metamodels and causal diagrams, as well as an increased emphasis on queries and probabilistic reasoning, are now included. Estimation under a graph theory framework permits the use of Bayesian or likelihood methods. The guidelines presented start from a declaration of the goals of the analysis. We then discuss how theory frames the modeling process, requirements for causal interpretation, model specification choices, selection of estimation method, model evaluation options, and use of queries, both to summarize retrospective results and for prospective analyses. The illustrative example presented involves monitoring data from wetlands on Mount Desert Island, home of Acadia National Park. Our presentation walks through the decision process involved in developing and evaluating models, as well as drawing inferences from the resulting prediction equations. In addition to evaluating hypotheses about the connections between human activities and biotic responses, we illustrate how the structural equation (SE) model can be queried to understand how interventions might take advantage of an environmental threshold to limit Typha invasions. The guidelines presented provide for

  16. Classical dynamics on graphs

    International Nuclear Information System (INIS)

    Barra, F.; Gaspard, P.

    2001-01-01

    We consider the classical evolution of a particle on a graph by using a time-continuous Frobenius-Perron operator that generalizes previous propositions. In this way, the relaxation rates as well as the chaotic properties can be defined for the time-continuous classical dynamics on graphs. These properties are given as the zeros of some periodic-orbit zeta functions. We consider in detail the case of infinite periodic graphs where the particle undergoes a diffusion process. The infinite spatial extension is taken into account by Fourier transforms that decompose the observables and probability densities into sectors corresponding to different values of the wave number. The hydrodynamic modes of diffusion are studied by an eigenvalue problem of a Frobenius-Perron operator corresponding to a given sector. The diffusion coefficient is obtained from the hydrodynamic modes of diffusion and has the Green-Kubo form. Moreover, we study finite but large open graphs that converge to the infinite periodic graph when their size goes to infinity. The lifetime of the particle on the open graph is shown to correspond to the lifetime of a system that undergoes a diffusion process before it escapes

  17. Groupies in multitype random graphs

    OpenAIRE

    Shang, Yilun

    2016-01-01

    A groupie in a graph is a vertex whose degree is not less than the average degree of its neighbors. Under some mild conditions, we show that the proportion of groupies is very close to 1/2 in multitype random graphs (such as stochastic block models), which include Erd?s-R?nyi random graphs, random bipartite, and multipartite graphs as special examples. Numerical examples are provided to illustrate the theoretical results.

  18. Bayesian Probability Theory

    Science.gov (United States)

    von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo

    2014-06-01

    Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.

  19. Quantum information processing with graph states

    International Nuclear Information System (INIS)

    Schlingemann, Dirk-Michael

    2005-04-01

    Graph states are multiparticle states which are associated with graphs. Each vertex of the graph corresponds to a single system or particle. The links describe quantum correlations (entanglement) between pairs of connected particles. Graph states were initiated independently by two research groups: On the one hand, graph states were introduced by Briegel and Raussendorf as a resource for a new model of one-way quantum computing, where algorithms are implemented by a sequence of measurements at single particles. On the other hand, graph states were developed by the author of this thesis and ReinhardWerner in Braunschweig, as a tool to build quantum error correcting codes, called graph codes. The connection between the two approaches was fully realized in close cooperation of both research groups. This habilitation thesis provides a survey of the theory of graph codes, focussing mainly, but not exclusively on the author's own research work. We present the theoretical and mathematical background for the analysis of graph codes. The concept of one-way quantum computing for general graph states is discussed. We explicitly show how to realize the encoding and decoding device of a graph code on a one-way quantum computer. This kind of implementation is to be seen as a mathematical description of a quantum memory device. In addition to that, we investigate interaction processes, which enable the creation of graph states on very large systems. Particular graph states can be created, for instance, by an Ising type interaction between next neighbor particles which sits at the points of an infinitely extended cubic lattice. Based on the theory of quantum cellular automata, we give a constructive characterization of general interactions which create a translationally invariant graph state. (orig.)

  20. Listing All Maximal Cliques in Sparse Graphs in Near-optimal Time

    Science.gov (United States)

    2011-01-01

    523 10 Arabisopsis thaliana 1745 3098 71 12 Drosophila melanogaster 7282 24894 176 12 Homo Sapiens 9527 31182 308 12 Schizosaccharomyces pombe 2031...clusters of actors [6,14,28,40] and may be used as features in exponential random graph models for statistical analysis of social networks [17,19,20,44,49...29. R. Horaud and T. Skordas. Stereo correspondence through feature grouping and maximal cliques. IEEE Trans. Patt. An. Mach. Int. 11(11):1168–1180

  1. Fibonacci number of the tadpole graph

    Directory of Open Access Journals (Sweden)

    Joe DeMaio

    2014-10-01

    Full Text Available In 1982, Prodinger and Tichy defined the Fibonacci number of a graph G to be the number of independent sets of the graph G. They did so since the Fibonacci number of the path graph Pn is the Fibonacci number F(n+2 and the Fibonacci number of the cycle graph Cn is the Lucas number Ln. The tadpole graph Tn,k is the graph created by concatenating Cn and Pk with an edge from any vertex of Cn to a pendant of Pk for integers n=3 and k=0. This paper establishes formulae and identities for the Fibonacci number of the tadpole graph via algebraic and combinatorial methods.

  2. Model-based Clustering of Categorical Time Series with Multinomial Logit Classification

    Science.gov (United States)

    Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea

    2010-09-01

    A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule by using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using an Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.

  3. On characterizing terrain visibility graphs

    Directory of Open Access Journals (Sweden)

    William Evans

    2015-06-01

    Full Text Available A terrain is an $x$-monotone polygonal line in the $xy$-plane. Two vertices of a terrain are mutually visible if and only if there is no terrain vertex on or above the open line segment connecting them. A graph whose vertices represent terrain vertices and whose edges represent mutually visible pairs of terrain vertices is called a terrain visibility graph. We would like to find properties that are both necessary and sufficient for a graph to be a terrain visibility graph; that is, we would like to characterize terrain visibility graphs.Abello et al. [Discrete and Computational Geometry, 14(3:331--358, 1995] showed that all terrain visibility graphs are “persistent”. They showed that the visibility information of a terrain point set implies some ordering requirements on the slopes of the lines connecting pairs of points in any realization, and as a step towards showing sufficiency, they proved that for any persistent graph $M$ there is a total order on the slopes of the (pseudo lines in a generalized configuration of points whose visibility graph is $M$.We give a much simpler proof of this result by establishing an orientation to every triple of vertices, reflecting some slope ordering requirements that are consistent with $M$ being the visibility graph, and prove that these requirements form a partial order. We give a faster algorithm to construct a total order on the slopes. Our approach attempts to clarify the implications of the graph theoretic properties on the ordering of the slopes, and may be interpreted as defining properties on an underlying oriented matroid that we show is a restricted type of $3$-signotope.

  4. Interaction graphs

    DEFF Research Database (Denmark)

    Seiller, Thomas

    2016-01-01

    Interaction graphs were introduced as a general, uniform, construction of dynamic models of linear logic, encompassing all Geometry of Interaction (GoI) constructions introduced so far. This series of work was inspired from Girard's hyperfinite GoI, and develops a quantitative approach that should...... be understood as a dynamic version of weighted relational models. Until now, the interaction graphs framework has been shown to deal with exponentials for the constrained system ELL (Elementary Linear Logic) while keeping its quantitative aspect. Adapting older constructions by Girard, one can clearly define...... "full" exponentials, but at the cost of these quantitative features. We show here that allowing interpretations of proofs to use continuous (yet finite in a measure-theoretic sense) sets of states, as opposed to earlier Interaction Graphs constructions were these sets of states were discrete (and finite...

  5. Groupies in multitype random graphs.

    Science.gov (United States)

    Shang, Yilun

    2016-01-01

    A groupie in a graph is a vertex whose degree is not less than the average degree of its neighbors. Under some mild conditions, we show that the proportion of groupies is very close to 1/2 in multitype random graphs (such as stochastic block models), which include Erdős-Rényi random graphs, random bipartite, and multipartite graphs as special examples. Numerical examples are provided to illustrate the theoretical results.

  6. The Harary index of a graph

    CERN Document Server

    Xu, Kexiang; Trinajstić, Nenad

    2015-01-01

    This is the first book to focus on the topological index, the Harary index, of a graph, including its mathematical properties, chemical applications and some related and attractive open problems. This book is dedicated to Professor Frank Harary (1921—2005), the grandmaster of graph theory and its applications. It has be written by experts in the field of graph theory and its applications. For a connected graph G, as an important distance-based topological index, the Harary index H(G) is defined as the sum of the reciprocals of the distance between any two unordered vertices of the graph G. In this book, the authors report on the newest results on the Harary index of a graph. These results mainly concern external graphs with respect to the Harary index; the relations to other topological indices; its properties and applications to pure graph theory and chemical graph theory; and two significant variants, i.e., additively and multiplicatively weighted Harary indices. In the last chapter, we present a number o...

  7. A Modal-Logic Based Graph Abstraction

    NARCIS (Netherlands)

    Bauer, J.; Boneva, I.B.; Kurban, M.E.; Rensink, Arend; Ehrig, H; Heckel, R.; Rozenberg, G.; Taentzer, G.

    2008-01-01

    Infinite or very large state spaces often prohibit the successful verification of graph transformation systems. Abstract graph transformation is an approach that tackles this problem by abstracting graphs to abstract graphs of bounded size and by lifting application of productions to abstract

  8. SU-F-R-44: Modeling Lung SBRT Tumor Response Using Bayesian Network Averaging

    International Nuclear Information System (INIS)

    Diamant, A; Ybarra, N; Seuntjens, J; El Naqa, I

    2016-01-01

    Purpose: The prediction of tumor control after a patient receives lung SBRT (stereotactic body radiation therapy) has proven to be challenging, due to the complex interactions between an individual’s biology and dose-volume metrics. Many of these variables have predictive power when combined, a feature that we exploit using a graph modeling approach based on Bayesian networks. This provides a probabilistic framework that allows for accurate and visually intuitive predictive modeling. The aim of this study is to uncover possible interactions between an individual patient’s characteristics and generate a robust model capable of predicting said patient’s treatment outcome. Methods: We investigated a cohort of 32 prospective patients from multiple institutions whom had received curative SBRT to the lung. The number of patients exhibiting tumor failure was observed to be 7 (event rate of 22%). The serum concentration of 5 biomarkers previously associated with NSCLC (non-small cell lung cancer) was measured pre-treatment. A total of 21 variables were analyzed including: dose-volume metrics with BED (biologically effective dose) correction and clinical variables. A Markov Chain Monte Carlo technique estimated the posterior probability distribution of the potential graphical structures. The probability of tumor failure was then estimated by averaging the top 100 graphs and applying Baye’s rule. Results: The optimal Bayesian model generated throughout this study incorporated the PTV volume, the serum concentration of the biomarker EGFR (epidermal growth factor receptor) and prescription BED. This predictive model recorded an area under the receiver operating characteristic curve of 0.94(1), providing better performance compared to competing methods in other literature. Conclusion: The use of biomarkers in conjunction with dose-volume metrics allows for the generation of a robust predictive model. The preliminary results of this report demonstrate that it is possible

  9. SU-F-R-44: Modeling Lung SBRT Tumor Response Using Bayesian Network Averaging

    Energy Technology Data Exchange (ETDEWEB)

    Diamant, A; Ybarra, N; Seuntjens, J [McGill University, Montreal, Quebec (Canada); El Naqa, I [University of Michigan, Ann Arbor, MI (United States)

    2016-06-15

    Purpose: The prediction of tumor control after a patient receives lung SBRT (stereotactic body radiation therapy) has proven to be challenging, due to the complex interactions between an individual’s biology and dose-volume metrics. Many of these variables have predictive power when combined, a feature that we exploit using a graph modeling approach based on Bayesian networks. This provides a probabilistic framework that allows for accurate and visually intuitive predictive modeling. The aim of this study is to uncover possible interactions between an individual patient’s characteristics and generate a robust model capable of predicting said patient’s treatment outcome. Methods: We investigated a cohort of 32 prospective patients from multiple institutions whom had received curative SBRT to the lung. The number of patients exhibiting tumor failure was observed to be 7 (event rate of 22%). The serum concentration of 5 biomarkers previously associated with NSCLC (non-small cell lung cancer) was measured pre-treatment. A total of 21 variables were analyzed including: dose-volume metrics with BED (biologically effective dose) correction and clinical variables. A Markov Chain Monte Carlo technique estimated the posterior probability distribution of the potential graphical structures. The probability of tumor failure was then estimated by averaging the top 100 graphs and applying Baye’s rule. Results: The optimal Bayesian model generated throughout this study incorporated the PTV volume, the serum concentration of the biomarker EGFR (epidermal growth factor receptor) and prescription BED. This predictive model recorded an area under the receiver operating characteristic curve of 0.94(1), providing better performance compared to competing methods in other literature. Conclusion: The use of biomarkers in conjunction with dose-volume metrics allows for the generation of a robust predictive model. The preliminary results of this report demonstrate that it is possible

  10. Graph mining for next generation sequencing: leveraging the assembly graph for biological insights.

    Science.gov (United States)

    Warnke-Sommer, Julia; Ali, Hesham

    2016-05-06

    The assembly of Next Generation Sequencing (NGS) reads remains a challenging task. This is especially true for the assembly of metagenomics data that originate from environmental samples potentially containing hundreds to thousands of unique species. The principle objective of current assembly tools is to assemble NGS reads into contiguous stretches of sequence called contigs while maximizing for both accuracy and contig length. The end goal of this process is to produce longer contigs with the major focus being on assembly only. Sequence read assembly is an aggregative process, during which read overlap relationship information is lost as reads are merged into longer sequences or contigs. The assembly graph is information rich and capable of capturing the genomic architecture of an input read data set. We have developed a novel hybrid graph in which nodes represent sequence regions at different levels of granularity. This model, utilized in the assembly and analysis pipeline Focus, presents a concise yet feature rich view of a given input data set, allowing for the extraction of biologically relevant graph structures for graph mining purposes. Focus was used to create hybrid graphs to model metagenomics data sets obtained from the gut microbiomes of five individuals with Crohn's disease and eight healthy individuals. Repetitive and mobile genetic elements are found to be associated with hybrid graph structure. Using graph mining techniques, a comparative study of the Crohn's disease and healthy data sets was conducted with focus on antibiotics resistance genes associated with transposase genes. Results demonstrated significant differences in the phylogenetic distribution of categories of antibiotics resistance genes in the healthy and diseased patients. Focus was also evaluated as a pure assembly tool and produced excellent results when compared against the Meta-velvet, Omega, and UD-IDBA assemblers. Mining the hybrid graph can reveal biological phenomena captured

  11. Distance-transitive graphs

    NARCIS (Netherlands)

    Cohen, A.M.; Beineke, L.W.; Wilson, R.J.; Cameron, P.J.

    2004-01-01

    In this chapter we investigate the classification of distance-transitive graphs: these are graphs whose automorphism groups are transitive on each of the sets of pairs of vertices at distance i, for i = 0, 1,.... We provide an introduction into the field. By use of the classification of finite

  12. TrajGraph: A Graph-Based Visual Analytics Approach to Studying Urban Network Centralities Using Taxi Trajectory Data.

    Science.gov (United States)

    Huang, Xiaoke; Zhao, Ye; Yang, Jing; Zhang, Chong; Ma, Chao; Ye, Xinyue

    2016-01-01

    We propose TrajGraph, a new visual analytics method, for studying urban mobility patterns by integrating graph modeling and visual analysis with taxi trajectory data. A special graph is created to store and manifest real traffic information recorded by taxi trajectories over city streets. It conveys urban transportation dynamics which can be discovered by applying graph analysis algorithms. To support interactive, multiscale visual analytics, a graph partitioning algorithm is applied to create region-level graphs which have smaller size than the original street-level graph. Graph centralities, including Pagerank and betweenness, are computed to characterize the time-varying importance of different urban regions. The centralities are visualized by three coordinated views including a node-link graph view, a map view and a temporal information view. Users can interactively examine the importance of streets to discover and assess city traffic patterns. We have implemented a fully working prototype of this approach and evaluated it using massive taxi trajectories of Shenzhen, China. TrajGraph's capability in revealing the importance of city streets was evaluated by comparing the calculated centralities with the subjective evaluations from a group of drivers in Shenzhen. Feedback from a domain expert was collected. The effectiveness of the visual interface was evaluated through a formal user study. We also present several examples and a case study to demonstrate the usefulness of TrajGraph in urban transportation analysis.

  13. Temporal Representation in Semantic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Levandoski, J J; Abdulla, G M

    2007-08-07

    A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.

  14. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  15. The complexity of the matching-cut problem for planar graphs and other graph classes

    NARCIS (Netherlands)

    Bonsma, P.S.

    2009-01-01

    The Matching-Cut problem is the problem to decide whether a graph has an edge cut that is also a matching. Previously this problem was studied under the name of the Decomposable Graph Recognition problem, and proved to be -complete when restricted to graphs with maximum degree four. In this paper it

  16. PRIVATE GRAPHS – ACCESS RIGHTS ON GRAPHS FOR SEAMLESS NAVIGATION

    Directory of Open Access Journals (Sweden)

    W. Dorner

    2016-06-01

    Full Text Available After the success of GNSS (Global Navigational Satellite Systems and navigation services for public streets, indoor seems to be the next big development in navigational services, relying on RTLS – Real Time Locating Services (e.g. WIFI and allowing seamless navigation. In contrast to navigation and routing services on public streets, seamless navigation will cause an additional challenge: how to make routing data accessible to defined users or restrict access rights for defined areas or only to parts of the graph to a defined user group? The paper will present case studies and data from literature, where seamless and especially indoor navigation solutions are presented (hospitals, industrial complexes, building sites, but the problem of restricted access rights was only touched from a real world, but not a technical perspective. The analysis of case studies will show, that the objective of navigation and the different target groups for navigation solutions will demand well defined access rights and require solutions, how to make only parts of a graph to a user or application available to solve a navigational task. The paper will therefore introduce the concept of private graphs, which is defined as a graph for navigational purposes covering the street, road or floor network of an area behind a public street and suggest different approaches how to make graph data for navigational purposes available considering access rights and data protection, privacy and security issues as well.

  17. Quantification of three-dimensional cell-mediated collagen remodeling using graph theory.

    Science.gov (United States)

    Bilgin, Cemal Cagatay; Lund, Amanda W; Can, Ali; Plopper, George E; Yener, Bülent

    2010-09-30

    Cell cooperation is a critical event during tissue development. We present the first precise metrics to quantify the interaction between mesenchymal stem cells (MSCs) and extra cellular matrix (ECM). In particular, we describe cooperative collagen alignment process with respect to the spatio-temporal organization and function of mesenchymal stem cells in three dimensions. We defined two precise metrics: Collagen Alignment Index and Cell Dissatisfaction Level, for quantitatively tracking type I collagen and fibrillogenesis remodeling by mesenchymal stem cells over time. Computation of these metrics was based on graph theory and vector calculus. The cells and their three dimensional type I collagen microenvironment were modeled by three dimensional cell-graphs and collagen fiber organization was calculated from gradient vectors. With the enhancement of mesenchymal stem cell differentiation, acceleration through different phases was quantitatively demonstrated. The phases were clustered in a statistically significant manner based on collagen organization, with late phases of remodeling by untreated cells clustering strongly with early phases of remodeling by differentiating cells. The experiments were repeated three times to conclude that the metrics could successfully identify critical phases of collagen remodeling that were dependent upon cooperativity within the cell population. Definition of early metrics that are able to predict long-term functionality by linking engineered tissue structure to function is an important step toward optimizing biomaterials for the purposes of regenerative medicine.

  18. Quantification of three-dimensional cell-mediated collagen remodeling using graph theory.

    Directory of Open Access Journals (Sweden)

    Cemal Cagatay Bilgin

    2010-09-01

    Full Text Available Cell cooperation is a critical event during tissue development. We present the first precise metrics to quantify the interaction between mesenchymal stem cells (MSCs and extra cellular matrix (ECM. In particular, we describe cooperative collagen alignment process with respect to the spatio-temporal organization and function of mesenchymal stem cells in three dimensions.We defined two precise metrics: Collagen Alignment Index and Cell Dissatisfaction Level, for quantitatively tracking type I collagen and fibrillogenesis remodeling by mesenchymal stem cells over time. Computation of these metrics was based on graph theory and vector calculus. The cells and their three dimensional type I collagen microenvironment were modeled by three dimensional cell-graphs and collagen fiber organization was calculated from gradient vectors. With the enhancement of mesenchymal stem cell differentiation, acceleration through different phases was quantitatively demonstrated. The phases were clustered in a statistically significant manner based on collagen organization, with late phases of remodeling by untreated cells clustering strongly with early phases of remodeling by differentiating cells. The experiments were repeated three times to conclude that the metrics could successfully identify critical phases of collagen remodeling that were dependent upon cooperativity within the cell population.Definition of early metrics that are able to predict long-term functionality by linking engineered tissue structure to function is an important step toward optimizing biomaterials for the purposes of regenerative medicine.

  19. Integer Flows and Circuit Covers of Graphs and Signed Graphs

    Science.gov (United States)

    Cheng, Jian

    The work in Chapter 2 is motivated by Tutte and Jaeger's pioneering work on converting modulo flows into integer-valued flows for ordinary graphs. For a signed graphs (G, sigma), we first prove that for each k ∈ {2, 3}, if (G, sigma) is (k - 1)-edge-connected and contains an even number of negative edges when k = 2, then every modulo k-flow of (G, sigma) can be converted into an integer-valued ( k + 1)-ow with a larger or the same support. We also prove that if (G, sigma) is odd-(2p+1)-edge-connected, then (G, sigma) admits a modulo circular (2 + 1/ p)-flows if and only if it admits an integer-valued circular (2 + 1/p)-flows, which improves all previous result by Xu and Zhang (DM2005), Schubert and Steffen (EJC2015), and Zhu (JCTB2015). Shortest circuit cover conjecture is one of the major open problems in graph theory. It states that every bridgeless graph G contains a set of circuits F such that each edge is contained in at least one member of F and the length of F is at most 7/5∥E(G)∥. This concept was recently generalized to signed graphs by Macajova et al. (JGT2015). In Chapter 3, we improve their upper bound from 11∥E( G)∥ to 14/3 ∥E(G)∥, and if G is 2-edgeconnected and has even negativeness, then it can be further reduced to 11/3 ∥E(G)∥. Tutte's 3-flow conjecture has been studied by many graph theorists in the last several decades. As a new approach to this conjecture, DeVos and Thomassen considered the vectors as ow values and found that there is a close relation between vector S1-flows and integer 3-NZFs. Motivated by their observation, in Chapter 4, we prove that if a graph G admits a vector S1-flow with rank at most two, then G admits an integer 3-NZF. The concept of even factors is highly related to the famous Four Color Theorem. We conclude this dissertation in Chapter 5 with an improvement of a recent result by Chen and Fan (JCTB2016) on the upperbound of even factors. We show that if a graph G contains an even factor, then it

  20. Port-Hamiltonian Systems on Open Graphs

    NARCIS (Netherlands)

    Schaft, A.J. van der; Maschke, B.M.

    2010-01-01

    In this talk we discuss how to define in an intrinsic manner port-Hamiltonian dynamics on open graphs. Open graphs are graphs where some of the vertices are boundary vertices (terminals), which allow interconnection with other systems. We show that a directed graph carries two natural Dirac

  1. Towards a theory of geometric graphs

    CERN Document Server

    Pach, Janos

    2004-01-01

    The early development of graph theory was heavily motivated and influenced by topological and geometric themes, such as the Konigsberg Bridge Problem, Euler's Polyhedral Formula, or Kuratowski's characterization of planar graphs. In 1936, when Denes Konig published his classical Theory of Finite and Infinite Graphs, the first book ever written on the subject, he stressed this connection by adding the subtitle Combinatorial Topology of Systems of Segments. He wanted to emphasize that the subject of his investigations was very concrete: planar figures consisting of points connected by straight-line segments. However, in the second half of the twentieth century, graph theoretical research took an interesting turn. In the most popular and most rapidly growing areas (the theory of random graphs, Ramsey theory, extremal graph theory, algebraic graph theory, etc.), graphs were considered as abstract binary relations rather than geometric objects. Many of the powerful techniques developed in these fields have been su...

  2. A Class of Manifold Regularized Multiplicative Update Algorithms for Image Clustering.

    Science.gov (United States)

    Yang, Shangming; Yi, Zhang; He, Xiaofei; Li, Xuelong

    2015-12-01

    Multiplicative update algorithms are important tools for information retrieval, image processing, and pattern recognition. However, when the graph regularization is added to the cost function, different classes of sample data may be mapped to the same subspace, which leads to the increase of data clustering error rate. In this paper, an improved nonnegative matrix factorization (NMF) cost function is introduced. Based on the cost function, a class of novel graph regularized NMF algorithms is developed, which results in a class of extended multiplicative update algorithms with manifold structure regularization. Analysis shows that in the learning, the proposed algorithms can efficiently minimize the rank of the data representation matrix. Theoretical results presented in this paper are confirmed by simulations. For different initializations and data sets, variation curves of cost functions and decomposition data are presented to show the convergence features of the proposed update rules. Basis images, reconstructed images, and clustering results are utilized to present the efficiency of the new algorithms. Last, the clustering accuracies of different algorithms are also investigated, which shows that the proposed algorithms can achieve state-of-the-art performance in applications of image clustering.

  3. Generation of Individual Whole-Brain Atlases With Resting-State fMRI Data Using Simultaneous Graph Computation and Parcellation.

    Science.gov (United States)

    Wang, J; Hao, Z; Wang, H

    2018-01-01

    The human brain can be characterized as functional networks. Therefore, it is important to subdivide the brain appropriately in order to construct reliable networks. Resting-state functional connectivity-based parcellation is a commonly used technique to fulfill this goal. Here we propose a novel individual subject-level parcellation approach based on whole-brain resting-state functional magnetic resonance imaging (fMRI) data. We first used a supervoxel method known as simple linear iterative clustering directly on resting-state fMRI time series to generate supervoxels, and then combined similar supervoxels to generate clusters using a clustering method known as graph-without-cut (GWC). The GWC approach incorporates spatial information and multiple features of the supervoxels by energy minimization, simultaneously yielding an optimal graph and brain parcellation. Meanwhile, it theoretically guarantees that the actual cluster number is exactly equal to the initialized cluster number. By comparing the results of the GWC approach and those of the random GWC approach, we demonstrated that GWC does not rely heavily on spatial structures, thus avoiding the challenges encountered in some previous whole-brain parcellation approaches. In addition, by comparing the GWC approach to two competing approaches, we showed that GWC achieved better parcellation performances in terms of different evaluation metrics. The proposed approach can be used to generate individualized brain atlases for applications related to cognition, development, aging, disease, personalized medicine, etc. The major source codes of this study have been made publicly available at https://github.com/yuzhounh/GWC.

  4. Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction

    Science.gov (United States)

    Trivedi, Shubhendu; Pardos, Zachary A.; Sarkozy, Gabor N.; Heffernan, Neil T.

    2012-01-01

    Learning a more distributed representation of the input feature space is a powerful method to boost the performance of a given predictor. Often this is accomplished by partitioning the data into homogeneous groups by clustering so that separate models could be trained on each cluster. Intuitively each such predictor is a better representative of…

  5. Bayesian methods for data analysis

    CERN Document Server

    Carlin, Bradley P.

    2009-01-01

    Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors

  6. Quantum walk on a chimera graph

    Science.gov (United States)

    Xu, Shu; Sun, Xiangxiang; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum; Sanders, Barry C.

    2018-05-01

    We analyse a continuous-time quantum walk on a chimera graph, which is a graph of choice for designing quantum annealers, and we discover beautiful quantum walk features such as localization that starkly distinguishes classical from quantum behaviour. Motivated by technological thrusts, we study continuous-time quantum walk on enhanced variants of the chimera graph and on diminished chimera graph with a random removal of vertices. We explain the quantum walk by constructing a generating set for a suitable subgroup of graph isomorphisms and corresponding symmetry operators that commute with the quantum walk Hamiltonian; the Hamiltonian and these symmetry operators provide a complete set of labels for the spectrum and the stationary states. Our quantum walk characterization of the chimera graph and its variants yields valuable insights into graphs used for designing quantum-annealers.

  7. Software for Graph Analysis and Visualization

    Directory of Open Access Journals (Sweden)

    M. I. Kolomeychenko

    2014-01-01

    Full Text Available This paper describes the software for graph storage, analysis and visualization. The article presents a comparative analysis of existing software for analysis and visualization of graphs, describes the overall architecture of application and basic principles of construction and operation of the main modules. Furthermore, a description of the developed graph storage oriented to storage and processing of large-scale graphs is presented. The developed algorithm for finding communities and implemented algorithms of autolayouts of graphs are the main functionality of the product. The main advantage of the developed software is high speed processing of large size networks (up to millions of nodes and links. Moreover, the proposed graph storage architecture is unique and has no analogues. The developed approaches and algorithms are optimized for operating with big graphs and have high productivity.

  8. Mal-Netminer: Malware Classification Approach Based on Social Network Analysis of System Call Graph

    Directory of Open Access Journals (Sweden)

    Jae-wook Jang

    2015-01-01

    Full Text Available As the security landscape evolves over time, where thousands of species of malicious codes are seen every day, antivirus vendors strive to detect and classify malware families for efficient and effective responses against malware campaigns. To enrich this effort and by capitalizing on ideas from the social network analysis domain, we build a tool that can help classify malware families using features driven from the graph structure of their system calls. To achieve that, we first construct a system call graph that consists of system calls found in the execution of the individual malware families. To explore distinguishing features of various malware species, we study social network properties as applied to the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We utilize features driven from those properties to build a classifier for malware families. Our experimental results show that “influence-based” graph metrics such as the degree centrality are effective for classifying malware, whereas the general structural metrics of malware are less effective for classifying malware. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class with accuracy greater than 96%.

  9. What Would a Graph Look Like in this Layout? A Machine Learning Approach to Large Graph Visualization.

    Science.gov (United States)

    Kwon, Oh-Hyun; Crnovrsanin, Tarik; Ma, Kwan-Liu

    2018-01-01

    Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a "good" layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.

  10. Text-Filled Stacked Area Graphs

    DEFF Research Database (Denmark)

    Kraus, Martin

    2011-01-01

    -filled stacked area graphs; i.e., graphs that feature stacked areas that are filled with small-typed text. Since these graphs allow for computing the text layout automatically, it is possible to include large amounts of textual detail with very little effort. We discuss the most important challenges and some...... solutions for the design of text-filled stacked area graphs with the help of an exemplary visualization of the genres, publication years, and titles of a database of several thousand PC games....

  11. Counting and confusion: Bayesian rate estimation with multiple populations

    Science.gov (United States)

    Farr, Will M.; Gair, Jonathan R.; Mandel, Ilya; Cutler, Curt

    2015-01-01

    We show how to obtain a Bayesian estimate of the rates or numbers of signal and background events from a set of events when the shapes of the signal and background distributions are known, can be estimated, or approximated; our method works well even if the foreground and background event distributions overlap significantly and the nature of any individual event cannot be determined with any certainty. We give examples of determining the rates of gravitational-wave events in the presence of background triggers from a template bank when noise parameters are known and/or can be fit from the trigger data. We also give an example of determining globular-cluster shape, location, and density from an observation of a stellar field that contains a nonuniform background density of stars superimposed on the cluster stars.

  12. On Graph Rewriting, Reduction and Evaluation

    DEFF Research Database (Denmark)

    Zerny, Ian

    2010-01-01

    We inter-derive two prototypical styles of graph reduction: reduction machines à la Turner and graph rewriting systems à la Barendregt et al. To this end, we adapt Danvy et al.'s mechanical program derivations from the world of terms to the world of graphs. We also outline how to inter-derive a t......We inter-derive two prototypical styles of graph reduction: reduction machines à la Turner and graph rewriting systems à la Barendregt et al. To this end, we adapt Danvy et al.'s mechanical program derivations from the world of terms to the world of graphs. We also outline how to inter...

  13. On the Equivalence of Nonnegative Matrix Factorization and K-means- Spectral Clustering

    Energy Technology Data Exchange (ETDEWEB)

    Ding, Chris; He, Xiaofeng; Simon, Horst D.; Jin, Rong

    2005-12-04

    We provide a systematic analysis of nonnegative matrix factorization (NMF) relating to data clustering. We generalize the usual X = FG{sup T} decomposition to the symmetric W = HH{sup T} and W = HSH{sup T} decompositions. We show that (1) W = HH{sup T} is equivalent to Kernel K-means clustering and the Laplacian-based spectral clustering. (2) X = FG{sup T} is equivalent to simultaneous clustering of rows and columns of a bipartite graph. We emphasizes the importance of orthogonality in NMF and soft clustering nature of NMF. These results are verified with experiments on face images and newsgroups.

  14. Graph transformation tool contest 2008

    NARCIS (Netherlands)

    Rensink, Arend; van Gorp, Pieter

    This special section is the outcome of the graph transformation tool contest organised during the Graph-Based Tools (GraBaTs) 2008 workshop, which took place as a satellite event of the International Conference on Graph Transformation (ICGT) 2008. The contest involved two parts: three “off-line case

  15. On dominator colorings in graphs

    Indian Academy of Sciences (India)

    colors required for a dominator coloring of G is called the dominator .... Theorem 1.3 shows that the complete graph Kn is the only connected graph of order n ... Conversely, if a graph G satisfies condition (i) or (ii), it is easy to see that χd(G) =.

  16. Graphs with branchwidth at most three

    NARCIS (Netherlands)

    Bodlaender, H.L.; Thilikos, D.M.

    1997-01-01

    In this paper we investigate both the structure of graphs with branchwidth at most three, as well as algorithms to recognise such graphs. We show that a graph has branchwidth at most three, if and only if it has treewidth at most three and does not contain the three-dimensional binary cube graph

  17. Planar graphs theory and algorithms

    CERN Document Server

    Nishizeki, T

    1988-01-01

    Collected in this volume are most of the important theorems and algorithms currently known for planar graphs, together with constructive proofs for the theorems. Many of the algorithms are written in Pidgin PASCAL, and are the best-known ones; the complexities are linear or 0(nlogn). The first two chapters provide the foundations of graph theoretic notions and algorithmic techniques. The remaining chapters discuss the topics of planarity testing, embedding, drawing, vertex- or edge-coloring, maximum independence set, subgraph listing, planar separator theorem, Hamiltonian cycles, and single- or multicommodity flows. Suitable for a course on algorithms, graph theory, or planar graphs, the volume will also be useful for computer scientists and graph theorists at the research level. An extensive reference section is included.

  18. Graph Quasicontinuous Functions and Densely Continuous Forms

    Directory of Open Access Journals (Sweden)

    Lubica Hola

    2017-07-01

    Full Text Available Let $X, Y$ be topological spaces. A function $f: X \\to Y$ is said to be graph quasicontinuous if there is a quasicontinuous function $g: X \\to Y$ with the graph of $g$ contained in the closure of the graph of $f$. There is a close relation between the notions of graph quasicontinuous functions and minimal usco maps as well as the notions of graph quasicontinuous functions and densely continuous forms. Every function with values in a compact Hausdorff space is graph quasicontinuous; more generally every locally compact function is graph quasicontinuous.

  19. Coloring sums of extensions of certain graphs

    Directory of Open Access Journals (Sweden)

    Johan Kok

    2017-12-01

    Full Text Available We recall that the minimum number of colors that allow a proper coloring of graph $G$ is called the chromatic number of $G$ and denoted $\\chi(G$. Motivated by the introduction of the concept of the $b$-chromatic sum of a graph the concept of $\\chi'$-chromatic sum and $\\chi^+$-chromatic sum are introduced in this paper. The extended graph $G^x$ of a graph $G$ was recently introduced for certain regular graphs. This paper furthers the concepts of $\\chi'$-chromatic sum and $\\chi^+$-chromatic sum to extended paths and cycles. Bipartite graphs also receive some attention. The paper concludes with patterned structured graphs. These last said graphs are typically found in chemical and biological structures.

  20. Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Voy, Brynn H [ORNL; Scharff, Jon [University of Tennessee, Knoxville (UTK); Perkins, Andy [University of Tennessee, Knoxville (UTK); Saxton, Arnold [University of Tennessee, Knoxville (UTK); Borate, Bhavesh [University of Tennessee, Knoxville (UTK); Chesler, Elissa J [ORNL; Branstetter, Lisa R [ORNL; Langston, Michael A [University of Tennessee, Knoxville (UTK)

    2006-01-01

    Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., ''guilt-by-association''). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of clique to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.

  1. Extracting gene networks for low-dose radiation using graph theoretical algorithms.

    Directory of Open Access Journals (Sweden)

    Brynn H Voy

    2006-07-01

    Full Text Available Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., "guilt-by-association". We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of clique to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.

  2. Bayesian benefits with JASP

    NARCIS (Netherlands)

    Marsman, M.; Wagenmakers, E.-J.

    2017-01-01

    We illustrate the Bayesian approach to data analysis using the newly developed statistical software program JASP. With JASP, researchers are able to take advantage of the benefits that the Bayesian framework has to offer in terms of parameter estimation and hypothesis testing. The Bayesian

  3. Bayesian modeling using WinBUGS

    CERN Document Server

    Ntzoufras, Ioannis

    2009-01-01

    A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...

  4. On a conjecture concerning helly circle graphs

    Directory of Open Access Journals (Sweden)

    Durán Guillermo

    2003-01-01

    Full Text Available We say that G is an e-circle graph if there is a bijection between its vertices and straight lines on the cartesian plane such that two vertices are adjacent in G if and only if the corresponding lines intersect inside the circle of radius one. This definition suggests a method for deciding whether a given graph G is an e-circle graph, by constructing a convenient system S of equations and inequations which represents the structure of G, in such a way that G is an e-circle graph if and only if S has a solution. In fact, e-circle graphs are exactly the circle graphs (intersection graphs of chords in a circle, and thus this method provides an analytic way for recognizing circle graphs. A graph G is a Helly circle graph if G is a circle graph and there exists a model of G by chords such that every three pairwise intersecting chords intersect at the same point. A conjecture by Durán (2000 states that G is a Helly circle graph if and only if G is a circle graph and contains no induced diamonds (a diamond is a graph formed by four vertices and five edges. Many unsuccessful efforts - mainly based on combinatorial and geometrical approaches - have been done in order to validate this conjecture. In this work, we utilize the ideas behind the definition of e-circle graphs and restate this conjecture in terms of an equivalence between two systems of equations and inequations, providing a new, analytic tool to deal with it.

  5. Evaluating Spatial Variability in Sediment and Phosphorus Concentration-Discharge Relationships Using Bayesian Inference and Self-Organizing Maps

    Science.gov (United States)

    Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.

    2017-12-01

    Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge—which has been used previously as the default breakpoint in segmented regression models to discern differences between pre and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at prethreshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.

  6. Optimization Problems on Threshold Graphs

    Directory of Open Access Journals (Sweden)

    Elena Nechita

    2010-06-01

    Full Text Available During the last three decades, different types of decompositions have been processed in the field of graph theory. Among these we mention: decompositions based on the additivity of some characteristics of the graph, decompositions where the adjacency law between the subsets of the partition is known, decompositions where the subgraph induced by every subset of the partition must have predeterminate properties, as well as combinations of such decompositions. In this paper we characterize threshold graphs using the weakly decomposition, determine: density and stability number, Wiener index and Wiener polynomial for threshold graphs.

  7. Hierarchy of modular graph identities

    Energy Technology Data Exchange (ETDEWEB)

    D’Hoker, Eric; Kaidi, Justin [Mani L. Bhaumik Institute for Theoretical Physics, Department of Physics and Astronomy,University of California,Los Angeles, CA 90095 (United States)

    2016-11-09

    The low energy expansion of Type II superstring amplitudes at genus one is organized in terms of modular graph functions associated with Feynman graphs of a conformal scalar field on the torus. In earlier work, surprising identities between two-loop graphs at all weights, and between higher-loop graphs of weights four and five were constructed. In the present paper, these results are generalized in two complementary directions. First, all identities at weight six and all dihedral identities at weight seven are obtained and proven. Whenever the Laurent polynomial at the cusp is available, the form of these identities confirms the pattern by which the vanishing of the Laurent polynomial governs the full modular identity. Second, the family of modular graph functions is extended to include all graphs with derivative couplings and worldsheet fermions. These extended families of modular graph functions are shown to obey a hierarchy of inhomogeneous Laplace eigenvalue equations. The eigenvalues are calculated analytically for the simplest infinite sub-families and obtained by Maple for successively more complicated sub-families. The spectrum is shown to consist solely of eigenvalues s(s−1) for positive integers s bounded by the weight, with multiplicities which exhibit rich representation-theoretic patterns.

  8. Hierarchy of modular graph identities

    International Nuclear Information System (INIS)

    D’Hoker, Eric; Kaidi, Justin

    2016-01-01

    The low energy expansion of Type II superstring amplitudes at genus one is organized in terms of modular graph functions associated with Feynman graphs of a conformal scalar field on the torus. In earlier work, surprising identities between two-loop graphs at all weights, and between higher-loop graphs of weights four and five were constructed. In the present paper, these results are generalized in two complementary directions. First, all identities at weight six and all dihedral identities at weight seven are obtained and proven. Whenever the Laurent polynomial at the cusp is available, the form of these identities confirms the pattern by which the vanishing of the Laurent polynomial governs the full modular identity. Second, the family of modular graph functions is extended to include all graphs with derivative couplings and worldsheet fermions. These extended families of modular graph functions are shown to obey a hierarchy of inhomogeneous Laplace eigenvalue equations. The eigenvalues are calculated analytically for the simplest infinite sub-families and obtained by Maple for successively more complicated sub-families. The spectrum is shown to consist solely of eigenvalues s(s−1) for positive integers s bounded by the weight, with multiplicities which exhibit rich representation-theoretic patterns.

  9. Well-covered graphs and factors

    DEFF Research Database (Denmark)

    Randerath, Bert; Vestergaard, Preben D.

    2006-01-01

    A maximum independent set of vertices in a graph is a set of pairwise nonadjacent vertices of largest cardinality α. Plummer defined a graph to be well-covered, if every independent set is contained in a maximum independent set of G. Every well-covered graph G without isolated vertices has a perf...

  10. On the centrality of some graphs

    Directory of Open Access Journals (Sweden)

    Vecdi Aytac

    2017-10-01

    Full Text Available A central issue in the analysis of complex networks is the assessment of their stability and vulnerability. A variety of measures have been proposed in the literature to quantify the stability of networks and a number of graph-theoretic parameters have been used to derive formulas for calculating network reliability. Different measures for graph vulnerability have been introduced so far to study different aspects of the graph behavior after removal of vertices or links such as connectivity, toughness, scattering number, binding number, residual closeness and integrity. In this paper, we consider betweenness centrality of a graph. Betweenness centrality of a vertex of a graph is portion of the shortest paths all pairs of vertices passing through a given vertex. In this paper, we obtain exact values for betweenness centrality for some wheel related graphs namely gear, helm, sunflower and friendship graphs.

  11. Robustifying Bayesian nonparametric mixtures for count data.

    Science.gov (United States)

    Canale, Antonio; Prünster, Igor

    2017-03-01

    Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.

  12. Enabling Graph Appliance for Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Singh, Rina [ORNL; Graves, Jeffrey A [ORNL; Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Shankar, Mallikarjun [ORNL

    2015-01-01

    In recent years, there has been a huge growth in the amount of genomic data available as reads generated from various genome sequencers. The number of reads generated can be huge, ranging from hundreds to billions of nucleotide, each varying in size. Assembling such large amounts of data is one of the challenging computational problems for both biomedical and data scientists. Most of the genome assemblers developed have used de Bruijn graph techniques. A de Bruijn graph represents a collection of read sequences by billions of vertices and edges, which require large amounts of memory and computational power to store and process. This is the major drawback to de Bruijn graph assembly. Massively parallel, multi-threaded, shared memory systems can be leveraged to overcome some of these issues. The objective of our research is to investigate the feasibility and scalability issues of de Bruijn graph assembly on Cray s Urika-GD system; Urika-GD is a high performance graph appliance with a large shared memory and massively multithreaded custom processor designed for executing SPARQL queries over large-scale RDF data sets. However, to the best of our knowledge, there is no research on representing a de Bruijn graph as an RDF graph or finding Eulerian paths in RDF graphs using SPARQL for potential genome discovery. In this paper, we address the issues involved in representing a de Bruin graphs as RDF graphs and propose an iterative querying approach for finding Eulerian paths in large RDF graphs. We evaluate the performance of our implementation on real world ebola genome datasets and illustrate how genome assembly can be accomplished with Urika-GD using iterative SPARQL queries.

  13. Detection of protein complex from protein-protein interaction network using Markov clustering

    International Nuclear Information System (INIS)

    Ochieng, P J; Kusuma, W A; Haryanto, T

    2017-01-01

    Detection of complexes, or groups of functionally related proteins, is an important challenge while analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on Markov clustering algorithm to identify protein complex within highly interconnected protein-protein interaction networks. Protein-protein interaction network was first constructed to develop geometrical network, the network was then partitioned using Markov clustering to detect protein complexes. The interest of the proposed method was illustrated by its application to Human Proteins associated to type II diabetes mellitus. Flow simulation of MCL algorithm was initially performed and topological properties of the resultant network were analysed for detection of the protein complex. The results indicated the proposed method successfully detect an overall of 34 complexes with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed MCL out perform AP, MCODE and SCPS algorithms with high clustering coefficient (0.751) network density and modularity index (0.630). This demonstrated MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks. (paper)

  14. On 4-critical t-perfect graphs

    OpenAIRE

    Benchetrit, Yohann

    2016-01-01

    It is an open question whether the chromatic number of $t$-perfect graphs is bounded by a constant. The largest known value for this parameter is 4, and the only example of a 4-critical $t$-perfect graph, due to Laurent and Seymour, is the complement of the line graph of the prism $\\Pi$ (a graph is 4-critical if it has chromatic number 4 and all its proper induced subgraphs are 3-colorable). In this paper, we show a new example of a 4-critical $t$-perfect graph: the complement of the line gra...

  15. Local adjacency metric dimension of sun graph and stacked book graph

    Science.gov (United States)

    Yulisda Badri, Alifiah; Darmaji

    2018-03-01

    A graph is a mathematical system consisting of a non-empty set of nodes and a set of empty sides. One of the topics to be studied in graph theory is the metric dimension. Application in the metric dimension is the navigation robot system on a path. Robot moves from one vertex to another vertex in the field by minimizing the errors that occur in translating the instructions (code) obtained from the vertices of that location. To move the robot must give different instructions (code). In order for the robot to move efficiently, the robot must be fast to translate the code of the nodes of the location it passes. so that the location vertex has a minimum distance. However, if the robot must move with the vertex location on a very large field, so the robot can not detect because the distance is too far.[6] In this case, the robot can determine its position by utilizing location vertices based on adjacency. The problem is to find the minimum cardinality of the required location vertex, and where to put, so that the robot can determine its location. The solution to this problem is the dimension of adjacency metric and adjacency metric bases. Rodrguez-Velzquez and Fernau combine the adjacency metric dimensions with local metric dimensions, thus becoming the local adjacency metric dimension. In the local adjacency metric dimension each vertex in the graph may have the same adjacency representation as the terms of the vertices. To obtain the local metric dimension of values in the graph of the Sun and the stacked book graph is used the construction method by considering the representation of each adjacent vertex of the graph.

  16. Mechatronic modeling and simulation using bond graphs

    CERN Document Server

    Das, Shuvra

    2009-01-01

    Introduction to Mechatronics and System ModelingWhat Is Mechatronics?What Is a System and Why Model Systems?Mathematical Modeling Techniques Used in PracticeSoftwareBond Graphs: What Are They?Engineering SystemsPortsGeneralized VariablesBond GraphsBasic Components in SystemsA Brief Note about Bond Graph Power DirectionsSummary of Bond Direction RulesDrawing Bond Graphs for Simple Systems: Electrical and MechanicalSimplification Rules for Junction StructureDrawing Bond Graphs for Electrical SystemsDrawing Bond Graphs for Mechanical SystemsCausalityDrawing Bond Graphs for Hydraulic and Electronic Components and SystemsSome Basic Properties and Concepts for FluidsBond Graph Model of Hydraulic SystemsElectronic SystemsDeriving System Equations from Bond GraphsSystem VariablesDeriving System EquationsTackling Differential CausalityAlgebraic LoopsSolution of Model Equations and Their InterpretationZeroth Order SystemsFirst Order SystemsSecond Order SystemTransfer Functions and Frequency ResponsesNumerical Solution ...

  17. Intraplate seismicity in Canada: a graph theoretic approach to data analysis and interpretation

    Directory of Open Access Journals (Sweden)

    K. Vasudevan

    2010-10-01

    Full Text Available Intraplate seismicity occurs in central and northern Canada, but the underlying origin and dynamics remain poorly understood. Here, we apply a graph theoretic approach to characterize the statistical structure of spatiotemporal clustering exhibited by intraplate seismicity, a direct consequence of the underlying nonlinear dynamics. Using a recently proposed definition of "recurrences" based on record breaking processes (Davidsen et al., 2006, 2008, we have constructed directed graphs using catalogue data for three selected regions (Region 1: 45°−48° N/74°−80° W; Region 2: 51°−55° N/77°−83° W; and Region 3: 56°−70° N/65°−95° W, with attributes drawn from the location, origin time and the magnitude of the events. Based on comparisons with a null model derived from Poisson distribution or Monte Carlo shuffling of the catalogue data, our results provide strong evidence in support of spatiotemporal correlations of seismicity in all three regions considered. Similar evidence for spatiotemporal clustering has been documented using seismicity catalogues for southern California, suggesting possible similarities in underlying earthquake dynamics of both regions despite huge differences in the variability of seismic activity.

  18. Capabilities of R Package mixAK for Clustering Based on Multivariate Continuous and Discrete Longitudinal Data

    Directory of Open Access Journals (Sweden)

    Arnošt Komárek

    2014-09-01

    Full Text Available R package mixAK originally implemented routines primarily for Bayesian estimation of finite normal mixture models for possibly interval-censored data. The functionality of the package was considerably enhanced by implementing methods for Bayesian estimation of mixtures of multivariate generalized linear mixed models proposed in Komrek and Komrkov (2013. Among other things, this allows for a cluster analysis (classification based on multivariate continuous and discrete longitudinal data that arise whenever multiple outcomes of a different nature are recorded in a longitudinal study. This package also allows for a data-driven selection of a number of clusters as methods for selecting a number of mixture components were implemented. A model and clustering methodology for multivariate continuous and discrete longitudinal data is overviewed. Further, a step-by-step cluster analysis based jointly on three longitudinal variables of different types (continuous, count, dichotomous is given, which provides a user manual for using the package for similar problems.

  19. Bayesian optimization for materials science

    CERN Document Server

    Packwood, Daniel

    2017-01-01

    This book provides a short and concise introduction to Bayesian optimization specifically for experimental and computational materials scientists. After explaining the basic idea behind Bayesian optimization and some applications to materials science in Chapter 1, the mathematical theory of Bayesian optimization is outlined in Chapter 2. Finally, Chapter 3 discusses an application of Bayesian optimization to a complicated structure optimization problem in computational surface science. Bayesian optimization is a promising global optimization technique that originates in the field of machine learning and is starting to gain attention in materials science. For the purpose of materials design, Bayesian optimization can be used to predict new materials with novel properties without extensive screening of candidate materials. For the purpose of computational materials science, Bayesian optimization can be incorporated into first-principles calculations to perform efficient, global structure optimizations. While re...

  20. Expert interpretation of bar and line graphs: The role of graphicacy in reducing the effect of graph format.

    Directory of Open Access Journals (Sweden)

    David ePeebles

    2015-10-01

    Full Text Available The distinction between informational and computational equivalence of representations, first articulated by Larkin and Simon (1987 has been a fundamental principle in the analysis of diagrammatic reasoning which has been supported empirically on numerous occasions. We present an experiment that investigates this principle in relation to the performance of expert graph users of 2 x 2 'interaction' bar and line graphs. The study sought to determine whether expert interpretation is affected by graph format in the same way that novice interpretations are. The findings revealed that, unlike novices - and contrary to the assumptions of several graph comprehension models - experts' performance was the same for both graph formats, with their interpretation of bar graphs being no worse than that for line graphs. We discuss the implications of the study for guidelines for presenting such data and for models of expert graph comprehension.

  1. Proving termination of graph transformation systems using weighted type graphs over semirings

    NARCIS (Netherlands)

    Bruggink, H.J.S.; König, B.; Nolte, D.; Zantema, H.; Parisi-Presicce, F.; Westfechtel, B.

    2015-01-01

    We introduce techniques for proving uniform termination of graph transformation systems, based on matrix interpretations for string rewriting. We generalize this technique by adapting it to graph rewriting instead of string rewriting and by generalizing to ordered semirings. In this way we obtain a

  2. DIMENSI METRIK GRAPH LOBSTER Ln (q;r

    Directory of Open Access Journals (Sweden)

    PANDE GDE DONY GUMILAR

    2013-05-01

    Full Text Available The metric dimension of connected graph G is the cardinality of minimum resolving set in graph G. In this research, we study how to find the metric dimension of lobster graph Ln (q;r. Lobster graph Ln (q;r is a regular lobster graph with vertices backbone on the main path, every backbone vertex is connected to q hand vertices and every hand vertex is connected to r finger vertices, with n, q, r element of N. We obtain the metric dimension of lobster graph L2 (1;1 is 1, the metric dimension of lobster graph L2 (1;1 for n > 2 is 2.

  3. Summary: beyond fault trees to fault graphs

    International Nuclear Information System (INIS)

    Alesso, H.P.; Prassinos, P.; Smith, C.F.

    1984-09-01

    Fault Graphs are the natural evolutionary step over a traditional fault-tree model. A Fault Graph is a failure-oriented directed graph with logic connectives that allows cycles. We intentionally construct the Fault Graph to trace the piping and instrumentation drawing (P and ID) of the system, but with logical AND and OR conditions added. Then we evaluate the Fault Graph with computer codes based on graph-theoretic methods. Fault Graph computer codes are based on graph concepts, such as path set (a set of nodes traveled on a path from one node to another) and reachability (the complete set of all possible paths between any two nodes). These codes are used to find the cut-sets (any minimal set of component failures that will fail the system) and to evaluate the system reliability

  4. Eulerian Graphs and Related Topics

    CERN Document Server

    Fleischner, Herbert

    1990-01-01

    The two volumes comprising Part 1 of this work embrace the theme of Eulerian trails and covering walks. They should appeal both to researchers and students, as they contain enough material for an undergraduate or graduate graph theory course which emphasizes Eulerian graphs, and thus can be read by any mathematician not yet familiar with graph theory. But they are also of interest to researchers in graph theory because they contain many recent results, some of which are only partial solutions to more general problems. A number of conjectures have been included as well. Various problems (such a

  5. Understanding Computational Bayesian Statistics

    CERN Document Server

    Bolstad, William M

    2011-01-01

    A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic

  6. Bayesian statistics an introduction

    CERN Document Server

    Lee, Peter M

    2012-01-01

    Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel

  7. The One Universal Graph — a free and open graph database

    Science.gov (United States)

    Ng, Liang S.; Champion, Corbin

    2016-02-01

    Recent developments in graph database mostly are huge projects involving big organizations, big operations and big capital, as the name Big Data attests. We proposed the concept of One Universal Graph (OUG) which states that all observable and known objects and concepts (physical, conceptual or digitally represented) can be connected with only one single graph; furthermore the OUG can be implemented with a very simple text file format with free software, capable of being executed on Android or smaller devices. As such the One Universal Graph Data Exchange (GOUDEX) modules can potentially be installed on hundreds of millions of Android devices and Intel compatible computers shipped annually. Coupled with its open nature and ability to connect to existing leading search engines and databases currently in operation, GOUDEX has the potential to become the largest and a better interface for users and programmers to interact with the data on the Internet. With a Web User Interface for users to use and program in native Linux environment, Free Crowdware implemented in GOUDEX can help inexperienced users learn programming with better organized documentation for free software, and is able to manage programmer's contribution down to a single line of code or a single variable in software projects. It can become the first practically realizable “Internet brain” on which a global artificial intelligence system can be implemented. Being practically free and open, One Universal Graph can have significant applications in robotics, artificial intelligence as well as social networks.

  8. DNA microarray data and contextual analysis of correlation graphs

    Directory of Open Access Journals (Sweden)

    Hingamp Pascal

    2003-04-01

    Full Text Available Abstract Background DNA microarrays are used to produce large sets of expression measurements from which specific biological information is sought. Their analysis requires efficient and reliable algorithms for dimensional reduction, classification and annotation. Results We study networks of co-expressed genes obtained from DNA microarray experiments. The mathematical concept of curvature on graphs is used to group genes or samples into clusters to which relevant gene or sample annotations are automatically assigned. Application to publicly available yeast and human lymphoma data demonstrates the reliability of the method in spite of its simplicity, especially with respect to the small number of parameters involved. Conclusions We provide a method for automatically determining relevant gene clusters among the many genes monitored with microarrays. The automatic annotations and the graphical interface improve the readability of the data. A C++ implementation, called Trixy, is available from http://tagc.univ-mrs.fr/bioinformatics/trixy.html.

  9. Degree-based graph construction

    International Nuclear Information System (INIS)

    Kim, Hyunju; Toroczkai, Zoltan; Erdos, Peter L; Miklos, Istvan; Szekely, Laszlo A

    2009-01-01

    Degree-based graph construction is a ubiquitous problem in network modelling (Newman et al 2006 The Structure and Dynamics of Networks (Princeton Studies in Complexity) (Princeton, NJ: Princeton University Press), Boccaletti et al 2006 Phys. Rep. 424 175), ranging from social sciences to chemical compounds and biochemical reaction networks in the cell. This problem includes existence, enumeration, exhaustive construction and sampling questions with aspects that are still open today. Here we give necessary and sufficient conditions for a sequence of nonnegative integers to be realized as a simple graph's degree sequence, such that a given (but otherwise arbitrary) set of connections from an arbitrarily given node is avoided. We then use this result to present a swap-free algorithm that builds all simple graphs realizing a given degree sequence. In a wider context, we show that our result provides a greedy construction method to build all the f-factor subgraphs (Tutte 1952 Can. J. Math. 4 314) embedded within K n setmn S k , where K n is the complete graph and S k is a star graph centred on one of the nodes. (fast track communication)

  10. Downhill Domination in Graphs

    Directory of Open Access Journals (Sweden)

    Haynes Teresa W.

    2014-08-01

    Full Text Available A path π = (v1, v2, . . . , vk+1 in a graph G = (V,E is a downhill path if for every i, 1 ≤ i ≤ k, deg(vi ≥ deg(vi+1, where deg(vi denotes the degree of vertex vi ∈ V. The downhill domination number equals the minimum cardinality of a set S ⊆ V having the property that every vertex v ∈ V lies on a downhill path originating from some vertex in S. We investigate downhill domination numbers of graphs and give upper bounds. In particular, we show that the downhill domination number of a graph is at most half its order, and that the downhill domination number of a tree is at most one third its order. We characterize the graphs obtaining each of these bounds

  11. Subsampling for graph power spectrum estimation

    KAUST Repository

    Chepuri, Sundeep Prabhakar; Leus, Geert

    2016-01-01

    In this paper we focus on subsampling stationary random signals that reside on the vertices of undirected graphs. Second-order stationary graph signals are obtained by filtering white noise and they admit a well-defined power spectrum. Estimating the graph power spectrum forms a central component of stationary graph signal processing and related inference tasks. We show that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the power spectrum of the graph signal from the subsampled observations, without any spectral priors. In addition, a near-optimal greedy algorithm is developed to design the subsampling scheme.

  12. Proving relations between modular graph functions

    International Nuclear Information System (INIS)

    Basu, Anirban

    2016-01-01

    We consider modular graph functions that arise in the low energy expansion of the four graviton amplitude in type II string theory. The vertices of these graphs are the positions of insertions of vertex operators on the toroidal worldsheet, while the links are the scalar Green functions connecting the vertices. Graphs with four and five links satisfy several non-trivial relations, which have been proved recently. We prove these relations by using elementary properties of Green functions and the details of the graphs. We also prove a relation between modular graph functions with six links. (paper)

  13. Subsampling for graph power spectrum estimation

    KAUST Repository

    Chepuri, Sundeep Prabhakar

    2016-10-06

    In this paper we focus on subsampling stationary random signals that reside on the vertices of undirected graphs. Second-order stationary graph signals are obtained by filtering white noise and they admit a well-defined power spectrum. Estimating the graph power spectrum forms a central component of stationary graph signal processing and related inference tasks. We show that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the power spectrum of the graph signal from the subsampled observations, without any spectral priors. In addition, a near-optimal greedy algorithm is developed to design the subsampling scheme.

  14. Semantic graphs and associative memories

    Science.gov (United States)

    Pomi, Andrés; Mizraji, Eduardo

    2004-12-01

    Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.

  15. RNA graph partitioning for the discovery of RNA modularity: a novel application of graph partition algorithm to biology.

    Directory of Open Access Journals (Sweden)

    Namhee Kim

    Full Text Available Graph representations have been widely used to analyze and design various economic, social, military, political, and biological networks. In systems biology, networks of cells and organs are useful for understanding disease and medical treatments and, in structural biology, structures of molecules can be described, including RNA structures. In our RNA-As-Graphs (RAG framework, we represent RNA structures as tree graphs by translating unpaired regions into vertices and helices into edges. Here we explore the modularity of RNA structures by applying graph partitioning known in graph theory to divide an RNA graph into subgraphs. To our knowledge, this is the first application of graph partitioning to biology, and the results suggest a systematic approach for modular design in general. The graph partitioning algorithms utilize mathematical properties of the Laplacian eigenvector (µ2 corresponding to the second eigenvalues (λ2 associated with the topology matrix defining the graph: λ2 describes the overall topology, and the sum of µ2's components is zero. The three types of algorithms, termed median, sign, and gap cuts, divide a graph by determining nodes of cut by median, zero, and largest gap of µ2's components, respectively. We apply these algorithms to 45 graphs corresponding to all solved RNA structures up through 11 vertices (∼ 220 nucleotides. While we observe that the median cut divides a graph into two similar-sized subgraphs, the sign and gap cuts partition a graph into two topologically-distinct subgraphs. We find that the gap cut produces the best biologically-relevant partitioning for RNA because it divides RNAs at less stable connections while maintaining junctions intact. The iterative gap cuts suggest basic modules and assembly protocols to design large RNA structures. Our graph substructuring thus suggests a systematic approach to explore the modularity of biological networks. In our applications to RNA structures, subgraphs

  16. A generalization of total graphs

    Indian Academy of Sciences (India)

    M Afkhami

    2018-04-12

    Apr 12, 2018 ... product of any lower triangular matrix with the transpose of any element of U belongs to U. The ... total graph of R, which is denoted by T( (R)), is a simple graph with all elements of R as vertices, and ...... [9] Badawi A, On dot-product graph of a commutative ring, Communications in Algebra 43 (2015). 43–50.

  17. Bayesian networks with examples in R

    CERN Document Server

    Scutari, Marco

    2014-01-01

    Introduction. The Discrete Case: Multinomial Bayesian Networks. The Continuous Case: Gaussian Bayesian Networks. More Complex Cases. Theory and Algorithms for Bayesian Networks. Real-World Applications of Bayesian Networks. Appendices. Bibliography.

  18. Generation of Individual Whole-Brain Atlases With Resting-State fMRI Data Using Simultaneous Graph Computation and Parcellation

    Directory of Open Access Journals (Sweden)

    J. Wang

    2018-05-01

    Full Text Available The human brain can be characterized as functional networks. Therefore, it is important to subdivide the brain appropriately in order to construct reliable networks. Resting-state functional connectivity-based parcellation is a commonly used technique to fulfill this goal. Here we propose a novel individual subject-level parcellation approach based on whole-brain resting-state functional magnetic resonance imaging (fMRI data. We first used a supervoxel method known as simple linear iterative clustering directly on resting-state fMRI time series to generate supervoxels, and then combined similar supervoxels to generate clusters using a clustering method known as graph-without-cut (GWC. The GWC approach incorporates spatial information and multiple features of the supervoxels by energy minimization, simultaneously yielding an optimal graph and brain parcellation. Meanwhile, it theoretically guarantees that the actual cluster number is exactly equal to the initialized cluster number. By comparing the results of the GWC approach and those of the random GWC approach, we demonstrated that GWC does not rely heavily on spatial structures, thus avoiding the challenges encountered in some previous whole-brain parcellation approaches. In addition, by comparing the GWC approach to two competing approaches, we showed that GWC achieved better parcellation performances in terms of different evaluation metrics. The proposed approach can be used to generate individualized brain atlases for applications related to cognition, development, aging, disease, personalized medicine, etc. The major source codes of this study have been made publicly available at https://github.com/yuzhounh/GWC.

  19. Decomposing a planar graph into an independent set and a 3-degenerate graph

    DEFF Research Database (Denmark)

    Thomassen, Carsten

    2001-01-01

    We prove the conjecture made by O. V. Borodin in 1976 that the vertex set of every planar graph can be decomposed into an independent set and a set inducing a 3-degenerate graph. (C) 2001 Academic Press....

  20. Commuting graphs of matrix algebras

    International Nuclear Information System (INIS)

    Akbari, S.; Bidkhori, H.; Mohammadian, A.

    2006-08-01

    The commuting graph of a ring R, denoted by Γ(R), is a graph whose vertices are all non- central elements of R and two distinct vertices x and y are adjacent if and only if xy = yx. The commuting graph of a group G, denoted by Γ(G), is similarly defined. In this paper we investigate some graph theoretic properties of Γ(M n (F)), where F is a field and n ≥ 2. Also we study the commuting graphs of some classical groups such as GL n (F) and SL n (F). We show that Γ(M n (F)) is a connected graph if and only if every field extension of F of degree n contains a proper intermediate field. We prove that apart from finitely many fields, a similar result is true for Γ(GL n (F)) and Γ(SL n (F)). Also we show that for two fields E and F and integers m, n ≥> 2, if Γ(M m (E)) ≅ Γ(M n (F)), then m = n and vertical bar E vertical bar = vertical bar F vertical bar. (author)

  1. Graph-theoretical concepts and physicochemical data

    Directory of Open Access Journals (Sweden)

    Lionello Pogliani

    2003-02-01

    Full Text Available Graph theoretical concepts have been used to model the molecular polarizabilities of fifty-four organic derivatives, and the induced dipole moment of a set of fifty-seven organic compounds divided into three subsets. The starting point of these modeling strategies is the hydrogen-suppressed chemical graph and pseudograph of a molecule, which works very well for second row atoms. From these types of graphs a set of graph-theoretical basis indices, the molecular connectivity indices, can be derived and used to model properties and activities of molecules. With the aid of the molecular connectivity basis indices it is then possible to build higher-order descriptors. The problem of 'graph' encoding the contribution of the inner-core electrons of heteroatoms can here be solved with the aid of odd complete graphs, Kp-(p-odd. The use of these graph tools allow to draw an optimal modeling of the molecular polarizabilities and a satisfactory modeling of the induced dipole moment of a wide set of organic derivatives.

  2. Graphs whose complement and square are isomorphic

    DEFF Research Database (Denmark)

    Pedersen, Anders Sune

    2014-01-01

    We study square-complementary graphs, that is, graphs whose complement and square are isomorphic. We prove several necessary conditions for a graph to be square-complementary, describe ways of building new square-complementary graphs from existing ones, construct infinite families of square-compl...

  3. Graphs & digraphs

    CERN Document Server

    Chartrand, Gary; Zhang, Ping

    2010-01-01

    Gary Chartrand has influenced the world of Graph Theory for almost half a century. He has supervised more than a score of Ph.D. dissertations and written several books on the subject. The most widely known of these texts, Graphs and Digraphs, … has much to recommend it, with clear exposition, and numerous challenging examples [that] make it an ideal textbook for the advanced undergraduate or beginning graduate course. The authors have updated their notation to reflect the current practice in this still-growing area of study. By the authors' estimation, the 5th edition is approximately 50% longer than the 4th edition. … the legendary Frank Harary, author of the second graph theory text ever produced, is one of the figures profiled. His book was the standard in the discipline for several decades. Chartrand, Lesniak and Zhang have produced a worthy successor.-John T. Saccoman, MAA Reviews, June 2012 (This book is in the MAA's basic library list.)As with the earlier editions, the current text emphasizes clear...

  4. Packing Degenerate Graphs Greedily

    Czech Academy of Sciences Publication Activity Database

    Allen, P.; Böttcher, J.; Hladký, J.; Piguet, Diana

    2017-01-01

    Roč. 61, August (2017), s. 45-51 ISSN 1571-0653 R&D Projects: GA ČR GJ16-07822Y Institutional support: RVO:67985807 Keywords : tree packing conjecture * graph packing * graph processes Subject RIV: BA - General Mathematics OBOR OECD: Pure mathematics

  5. Continuous-time quantum walks on star graphs

    International Nuclear Information System (INIS)

    Salimi, S.

    2009-01-01

    In this paper, we investigate continuous-time quantum walk on star graphs. It is shown that quantum central limit theorem for a continuous-time quantum walk on star graphs for N-fold star power graph, which are invariant under the quantum component of adjacency matrix, converges to continuous-time quantum walk on K 2 graphs (complete graph with two vertices) and the probability of observing walk tends to the uniform distribution.

  6. Tailored Random Graph Ensembles

    International Nuclear Information System (INIS)

    Roberts, E S; Annibale, A; Coolen, A C C

    2013-01-01

    Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.

  7. The Smallest Valid Extension-Based Efficient, Rare Graph Pattern Mining, Considering Length-Decreasing Support Constraints and Symmetry Characteristics of Graphs

    Directory of Open Access Journals (Sweden)

    Unil Yun

    2016-05-01

    Full Text Available Frequent graph mining has been proposed to find interesting patterns (i.e., frequent sub-graphs from databases composed of graph transaction data, which can effectively express complex and large data in the real world. In addition, various applications for graph mining have been suggested. Traditional graph pattern mining methods use a single minimum support threshold factor in order to check whether or not mined patterns are interesting. However, it is not a sufficient factor that can consider valuable characteristics of graphs such as graph sizes and features of graph elements. That is, previous methods cannot consider such important characteristics in their mining operations since they only use a fixed minimum support threshold in the mining process. For this reason, in this paper, we propose a novel graph mining algorithm that can consider various multiple, minimum support constraints according to the types of graph elements and changeable minimum support conditions, depending on lengths of graph patterns. In addition, the proposed algorithm performs in mining operations more efficiently because it can minimize duplicated operations and computational overheads by considering symmetry features of graphs. Experimental results provided in this paper demonstrate that the proposed algorithm outperforms previous mining approaches in terms of pattern generation, runtime and memory usage.

  8. Applying graphs and complex networks to football metric interpretation.

    Science.gov (United States)

    Arriaza-Ardiles, E; Martín-González, J M; Zuniga, M D; Sánchez-Flores, J; de Saa, Y; García-Manso, J M

    2018-02-01

    This work presents a methodology for analysing the interactions between players in a football team, from the point of view of graph theory and complex networks. We model the complex network of passing interactions between players of a same team in 32 official matches of the Liga de Fútbol Profesional (Spain), using a passing/reception graph. This methodology allows us to understand the play structure of the team, by analysing the offensive phases of game-play. We utilise two different strategies for characterising the contribution of the players to the team: the clustering coefficient, and centrality metrics (closeness and betweenness). We show the application of this methodology by analyzing the performance of a professional Spanish team according to these metrics and the distribution of passing/reception in the field. Keeping in mind the dynamic nature of collective sports, in the future we will incorporate metrics which allows us to analyse the performance of the team also according to the circumstances of game-play and to different contextual variables such as, the utilisation of the field space, the time, and the ball, according to specific tactical situations. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Graph state generation with noisy mirror-inverting spin chains

    International Nuclear Information System (INIS)

    Clark, Stephen R; Klein, Alexander; Bruderer, Martin; Jaksch, Dieter

    2007-01-01

    We investigate the influence of noise on a graph state generation scheme which exploits a mirror inverting spin chain. Within this scheme the spin chain is used repeatedly as an entanglement bus (EB) to create multi-partite entanglement. The noise model we consider comprises of each spin of this EB being exposed to independent local noise which degrades the capabilities of the EB. Here we concentrate on quantifying its performance as a single-qubit channel and as a mediator of a two-qubit entangling gate, since these are basic operations necessary for graph state generation using the EB. In particular, for the single-qubit case we numerically calculate the average channel fidelity and whether the channel becomes entanglement breaking, i.e. expunges any entanglement the transferred qubit may have with other external qubits. We find that neither local decay nor dephasing noise cause entanglement breaking. This is in contrast to local thermal and depolarizing noise where we determine a critical length and critical noise coupling, respectively, at which entanglement breaking occurs. The critical noise coupling for local depolarizing noise is found to exhibit a power-law dependence on the chain length. For two-qubits we similarly compute the average gate fidelity and whether the ability for this gate to create entanglement is maintained. The concatenation of these noisy gates for the construction of a five-qubit linear cluster state and a Greenberger-Horne-Zeilinger state indicates that the level of noise that can be tolerated for graph state generation is tightly constrained

  10. Relating zeta functions of discrete and quantum graphs

    Science.gov (United States)

    Harrison, Jonathan; Weyand, Tracy

    2018-02-01

    We write the spectral zeta function of the Laplace operator on an equilateral metric graph in terms of the spectral zeta function of the normalized Laplace operator on the corresponding discrete graph. To do this, we apply a relation between the spectrum of the Laplacian on a discrete graph and that of the Laplacian on an equilateral metric graph. As a by-product, we determine how the multiplicity of eigenvalues of the quantum graph, that are also in the spectrum of the graph with Dirichlet conditions at the vertices, depends on the graph geometry. Finally we apply the result to calculate the vacuum energy and spectral determinant of a complete bipartite graph and compare our results with those for a star graph, a graph in which all vertices are connected to a central vertex by a single edge.

  11. A graph-theory framework for evaluating landscape connectivity and conservation planning.

    Science.gov (United States)

    Minor, Emily S; Urban, Dean L

    2008-04-01

    Connectivity of habitat patches is thought to be important for movement of genes, individuals, populations, and species over multiple temporal and spatial scales. We used graph theory to characterize multiple aspects of landscape connectivity in a habitat network in the North Carolina Piedmont (U.S.A). We compared this landscape with simulated networks with known topology, resistance to disturbance, and rate of movement. We introduced graph measures such as compartmentalization and clustering, which can be used to identify locations on the landscape that may be especially resilient to human development or areas that may be most suitable for conservation. Our analyses indicated that for songbirds the Piedmont habitat network was well connected. Furthermore, the habitat network had commonalities with planar networks, which exhibit slow movement, and scale-free networks, which are resistant to random disturbances. These results suggest that connectivity in the habitat network was high enough to prevent the negative consequences of isolation but not so high as to allow rapid spread of disease. Our graph-theory framework provided insight into regional and emergent global network properties in an intuitive and visual way and allowed us to make inferences about rates and paths of species movements and vulnerability to disturbance. This approach can be applied easily to assessing habitat connectivity in any fragmented or patchy landscape.

  12. The One Universal Graph — a free and open graph database

    International Nuclear Information System (INIS)

    Ng, Liang S.; Champion, Corbin

    2016-01-01

    Recent developments in graph database mostly are huge projects involving big organizations, big operations and big capital, as the name Big Data attests. We proposed the concept of One Universal Graph (OUG) which states that all observable and known objects and concepts (physical, conceptual or digitally represented) can be connected with only one single graph; furthermore the OUG can be implemented with a very simple text file format with free software, capable of being executed on Android or smaller devices. As such the One Universal Graph Data Exchange (GOUDEX) modules can potentially be installed on hundreds of millions of Android devices and Intel compatible computers shipped annually. Coupled with its open nature and ability to connect to existing leading search engines and databases currently in operation, GOUDEX has the potential to become the largest and a better interface for users and programmers to interact with the data on the Internet. With a Web User Interface for users to use and program in native Linux environment, Free Crowdware implemented in GOUDEX can help inexperienced users learn programming with better organized documentation for free software, and is able to manage programmer's contribution down to a single line of code or a single variable in software projects. It can become the first practically realizable “Internet brain” on which a global artificial intelligence system can be implemented. Being practically free and open, One Universal Graph can have significant applications in robotics, artificial intelligence as well as social networks. (paper)

  13. Recognition of fractal graphs

    NARCIS (Netherlands)

    Perepelitsa, VA; Sergienko, [No Value; Kochkarov, AM

    1999-01-01

    Definitions of prefractal and fractal graphs are introduced, and they are used to formulate mathematical models in different fields of knowledge. The topicality of fractal-graph recognition from the point of view, of fundamental improvement in the efficiency of the solution of algorithmic problems

  14. Interactive Graph Layout of a Million Nodes

    OpenAIRE

    Peng Mi; Maoyuan Sun; Moeti Masiane; Yong Cao; Chris North

    2016-01-01

    Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph to...

  15. RATGRAPH: Computer Graphing of Rational Functions.

    Science.gov (United States)

    Minch, Bradley A.

    1987-01-01

    Presents an easy-to-use Applesoft BASIC program that graphs rational functions and any asymptotes that the functions might have. Discusses the nature of rational functions, graphing them manually, employing a computer to graph rational functions, and describes how the program works. (TW)

  16. Graph algorithms in the titan toolkit.

    Energy Technology Data Exchange (ETDEWEB)

    McLendon, William Clarence, III; Wylie, Brian Neil

    2009-10-01

    Graph algorithms are a key component in a wide variety of intelligence analysis activities. The Graph-Based Informatics for Non-Proliferation and Counter-Terrorism project addresses the critical need of making these graph algorithms accessible to Sandia analysts in a manner that is both intuitive and effective. Specifically we describe the design and implementation of an open source toolkit for doing graph analysis, informatics, and visualization that provides Sandia with novel analysis capability for non-proliferation and counter-terrorism.

  17. A Graph Calculus for Predicate Logic

    Directory of Open Access Journals (Sweden)

    Paulo A. S. Veloso

    2013-03-01

    Full Text Available We introduce a refutation graph calculus for classical first-order predicate logic, which is an extension of previous ones for binary relations. One reduces logical consequence to establishing that a constructed graph has empty extension, i. e. it represents bottom. Our calculus establishes that a graph has empty extension by converting it to a normal form, which is expanded to other graphs until we can recognize conflicting situations (equivalent to a formula and its negation.

  18. Deep Learning with Dynamic Computation Graphs

    OpenAIRE

    Looks, Moshe; Herreshoff, Marcello; Hutchins, DeLesley; Norvig, Peter

    2017-01-01

    Neural networks that compute over graph structures are a natural fit for problems in a variety of domains, including natural language (parse trees) and cheminformatics (molecular graphs). However, since the computation graph has a different shape and size for every input, such networks do not directly support batched training or inference. They are also difficult to implement in popular deep learning libraries, which are based on static data-flow graphs. We introduce a technique called dynami...

  19. Constructs for Programming with Graph Rewrites

    OpenAIRE

    Rodgers, Peter

    2000-01-01

    Graph rewriting is becoming increasingly popular as a method for programming with graph based data structures. We present several modifications to a basic serial graph rewriting paradigm and discuss how they improve coding programs in the Grrr graph rewriting programming language. The constructs we present are once only nodes, attractor nodes and single match rewrites. We illustrate the operation of the constructs by example. The advantages of adding these new rewrite modifiers is to reduce t...

  20. Quantum chaos on discrete graphs

    International Nuclear Information System (INIS)

    Smilansky, Uzy

    2007-01-01

    Adapting a method developed for the study of quantum chaos on quantum (metric) graphs (Kottos and Smilansky 1997 Phys. Rev. Lett. 79 4794, Kottos and Smilansky 1999 Ann. Phys., NY 274 76), spectral ζ functions and trace formulae for discrete Laplacians on graphs are derived. This is achieved by expressing the spectral secular equation in terms of the periodic orbits of the graph and obtaining functions which belong to the class of ζ functions proposed originally by Ihara (1966 J. Mat. Soc. Japan 18 219) and expanded by subsequent authors (Stark and Terras 1996 Adv. Math. 121 124, Kotani and Sunada 2000 J. Math. Sci. Univ. Tokyo 7 7). Finally, a model of 'classical dynamics' on the discrete graph is proposed. It is analogous to the corresponding classical dynamics derived for quantum graphs (Kottos and Smilansky 1997 Phys. Rev. Lett. 79 4794, Kottos and Smilansky 1999 Ann. Phys., NY 274 76). (fast track communication)

  1. RJSplot: Interactive Graphs with R.

    Science.gov (United States)

    Barrios, David; Prieto, Carlos

    2018-03-01

    Data visualization techniques provide new methods for the generation of interactive graphs. These graphs allow a better exploration and interpretation of data but their creation requires advanced knowledge of graphical libraries. Recent packages have enabled the integration of interactive graphs in R. However, R provides limited graphical packages that allow the generation of interactive graphs for computational biology applications. The present project has joined the analytical power of R with the interactive graphical features of JavaScript in a new R package (RJSplot). It enables the easy generation of interactive graphs in R, provides new visualization capabilities, and contributes to the advance of computational biology analytical methods. At present, 16 interactive graphics are available in RJSplot, such as the genome viewer, Manhattan plots, 3D plots, heatmaps, dendrograms, networks, and so on. The RJSplot package is freely available online at http://rjsplot.net. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Bipartite separability and nonlocal quantum operations on graphs

    Science.gov (United States)

    Dutta, Supriyo; Adhikari, Bibhas; Banerjee, Subhashish; Srikanth, R.

    2016-07-01

    In this paper we consider the separability problem for bipartite quantum states arising from graphs. Earlier it was proved that the degree criterion is the graph-theoretic counterpart of the familiar positive partial transpose criterion for separability, although there are entangled states with positive partial transpose for which the degree criterion fails. Here we introduce the concept of partially symmetric graphs and degree symmetric graphs by using the well-known concept of partial transposition of a graph and degree criteria, respectively. Thus, we provide classes of bipartite separable states of dimension m ×n arising from partially symmetric graphs. We identify partially asymmetric graphs that lack the property of partial symmetry. We develop a combinatorial procedure to create a partially asymmetric graph from a given partially symmetric graph. We show that this combinatorial operation can act as an entanglement generator for mixed states arising from partially symmetric graphs.

  3. Algorithms and Data Structures for Graphs

    DEFF Research Database (Denmark)

    Rotenberg, Eva

    are planar graphs, which are those that can be drawn on a piece of paper without any pair of edges crossing. For planar graphs where each edge can only be traversed in one direction, a fundamental question is whether there is a route from vertex A to vertex B in the graph. We show how such a graph can...... of the form: "Is there an edge such that all paths between A and B go via that edge?" and which can quickly be updated when edges are inserted or deleted. We further show how to represent a planar graph such that we can quickly update our representation when an edge is deleted, and such that questions...

  4. On the nullity number of graphs

    Directory of Open Access Journals (Sweden)

    Mustapha Aouchiche

    2017-10-01

    Full Text Available The paper discusses bounds on the nullity number of graphs. It is proved in [B. Cheng and B. Liu, On the nullity of graphs. Electron. J. Linear Algebra 16 (2007 60--67] that $\\eta \\le n - D$, where $\\eta$, n and D denote the nullity number, the order and the diameter of a connected graph, respectively. We first give a necessary condition on the extremal graphs corresponding to that bound, and then we strengthen the bound itself using the maximum clique number. In addition, we prove bounds on the nullity using the number of pendant neighbors in a graph. One of those bounds is an improvement of a known bound involving the domination number.

  5. Electricity Purchase Optimization Decision Based on Data Mining and Bayesian Game

    Directory of Open Access Journals (Sweden)

    Yajing Gao

    2018-04-01

    Full Text Available The openness of the electricity retail market results in the power retailers facing fierce competition in the market. This article aims to analyze the electricity purchase optimization decision-making of each power retailer with the background of the big data era. First, in order to guide the power retailer to make a purchase of electricity, this paper considers the users’ historical electricity consumption data and a comprehensive consideration of multiple factors, then uses the wavelet neural network (WNN model based on “meteorological similarity day (MSD” to forecast the user load demand. Second, in order to guide the quotation of the power retailer, this paper considers the multiple factors affecting the electricity price to cluster the sample set, and establishes a Genetic algorithm- back propagation (GA-BP neural network model based on fuzzy clustering (FC to predict the short-term market clearing price (MCP. Thirdly, based on Sealed-bid Auction (SA in game theory, a Bayesian Game Model (BGM of the power retailer’s bidding strategy is constructed, and the optimal bidding strategy is obtained by obtaining the Bayesian Nash Equilibrium (BNE under different probability distributions. Finally, a practical example is proposed to prove that the model and method can provide an effective reference for the decision-making optimization of the sales company.

  6. Graph Transforming Java Data

    NARCIS (Netherlands)

    de Mol, M.J.; Rensink, Arend; Hunt, James J.

    This paper introduces an approach for adding graph transformation-based functionality to existing JAVA programs. The approach relies on a set of annotations to identify the intended graph structure, as well as on user methods to manipulate that structure, within the user’s own JAVA class

  7. Graph processing platforms at scale: practices and experiences

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Seung-Hwan [ORNL; Lee, Sangkeun (Matt) [ORNL; Brown, Tyler C [ORNL; Sukumar, Sreenivas R [ORNL; Ganesh, Gautam [ORNL

    2015-01-01

    Graph analysis unveils hidden associations of data in many phenomena and artifacts, such as road network, social networks, genomic information, and scientific collaboration. Unfortunately, a wide diversity in the characteristics of graphs and graph operations make it challenging to find a right combination of tools and implementation of algorithms to discover desired knowledge from the target data set. This study presents an extensive empirical study of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmarked each platform using three popular graph operations, degree distribution, connected components, and PageRank over a variety of real-world graphs. Our experiments show that each graph processing platform shows different strength, depending the type of graph operations. While Urika performs the best in non-iterative operations like degree distribution, GraphX outputforms iterative operations like connected components and PageRank. In addition, we discuss challenges to optimize the performance of each platform over large scale real world graphs.

  8. Generalized hypercube graph $\\Q_n(S$, graph products and self-orthogonal codes

    Directory of Open Access Journals (Sweden)

    Pani Seneviratne

    2016-01-01

    Full Text Available A generalized hypercube graph $\\Q_n(S$ has $\\F_{2}^{n}=\\{0,1\\}^n$ as the vertex set and two vertices being adjacent whenever their mutual Hamming distance belongs to $S$, where $n \\ge 1$ and $S\\subseteq \\{1,2,\\ldots, n\\}$. The graph $\\Q_n(\\{1\\}$ is the $n$-cube, usually denoted by $\\Q_n$.We study graph boolean products $G_1 = \\Q_n(S\\times \\Q_1, G_2 = \\Q_{n}(S\\wedge \\Q_1$, $G_3 = \\Q_{n}(S[\\Q_1]$ and show that binary codes from neighborhood designs of $G_1, G_2$ and $G_3$ are self-orthogonal for all choices of $n$ and $S$. More over, we show that the class of codes $C_1$ are self-dual. Further we find subgroups of the automorphism group of these graphs and use these subgroups to obtain PD-sets for permutation decoding. As an example we find a full error-correcting PD set for the binary $[32, 16, 8]$ extremal self-dual code.

  9. Survey of Approaches to Generate Realistic Synthetic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Seung-Hwan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Powers, Sarah S [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shankar, Mallikarjun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Imam, Neena [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-10-01

    A graph is a flexible data structure that can represent relationships between entities. As with other data analysis tasks, the use of realistic graphs is critical to obtaining valid research results. Unfortunately, using the actual ("real-world") graphs for research and new algorithm development is difficult due to the presence of sensitive information in the data or due to the scale of data. This results in practitioners developing algorithms and systems that employ synthetic graphs instead of real-world graphs. Generating realistic synthetic graphs that provide reliable statistical confidence to algorithmic analysis and system evaluation involves addressing technical hurdles in a broad set of areas. This report surveys the state of the art in approaches to generate realistic graphs that are derived from fitted graph models on real-world graphs.

  10. Constructing Dense Graphs with Unique Hamiltonian Cycles

    Science.gov (United States)

    Lynch, Mark A. M.

    2012-01-01

    It is not difficult to construct dense graphs containing Hamiltonian cycles, but it is difficult to generate dense graphs that are guaranteed to contain a unique Hamiltonian cycle. This article presents an algorithm for generating arbitrarily large simple graphs containing "unique" Hamiltonian cycles. These graphs can be turned into dense graphs…

  11. Chromatic polynomials of random graphs

    International Nuclear Information System (INIS)

    Van Bussel, Frank; Fliegner, Denny; Timme, Marc; Ehrlich, Christoph; Stolzenberg, Sebastian

    2010-01-01

    Chromatic polynomials and related graph invariants are central objects in both graph theory and statistical physics. Computational difficulties, however, have so far restricted studies of such polynomials to graphs that were either very small, very sparse or highly structured. Recent algorithmic advances (Timme et al 2009 New J. Phys. 11 023001) now make it possible to compute chromatic polynomials for moderately sized graphs of arbitrary structure and number of edges. Here we present chromatic polynomials of ensembles of random graphs with up to 30 vertices, over the entire range of edge density. We specifically focus on the locations of the zeros of the polynomial in the complex plane. The results indicate that the chromatic zeros of random graphs have a very consistent layout. In particular, the crossing point, the point at which the chromatic zeros with non-zero imaginary part approach the real axis, scales linearly with the average degree over most of the density range. While the scaling laws obtained are purely empirical, if they continue to hold in general there are significant implications: the crossing points of chromatic zeros in the thermodynamic limit separate systems with zero ground state entropy from systems with positive ground state entropy, the latter an exception to the third law of thermodynamics.

  12. On some labelings of triangular snake and central graph of triangular snake graph

    Science.gov (United States)

    Agasthi, P.; Parvathi, N.

    2018-04-01

    A Triangular snake Tn is obtained from a path u 1 u 2 … u n by joining ui and u i+1 to a new vertex wi for 1≤i≤n‑1. A Central graph of Triangular snake C(T n ) is obtained by subdividing each edge of Tn exactly once and joining all the non adjacent vertices of Tn . In this paper the ways to construct square sum, square difference, Root Mean square, strongly Multiplicative, Even Mean and Odd Mean labeling for Triangular Snake and Central graph of Triangular Snake graphs are reported.

  13. Orientations of infinite graphs with prescribed edge-connectivity

    DEFF Research Database (Denmark)

    Thomassen, Carsten

    2016-01-01

    We prove a decomposition result for locally finite graphs which can be used to extend results on edge-connectivity from finite to infinite graphs. It implies that every 4k-edge-connected graph G contains an immersion of some finite 2k-edge-connected Eulerian graph containing any prescribed vertex...... set (while planar graphs show that G need not containa subdivision of a simple finite graph of large edge-connectivity). Also, every 8k-edge connected infinite graph has a k-arc-connected orientation, as conjectured in 1989....

  14. Genus Ranges of 4-Regular Rigid Vertex Graphs.

    Science.gov (United States)

    Buck, Dorothy; Dolzhenko, Egor; Jonoska, Nataša; Saito, Masahico; Valencia, Karin

    2015-01-01

    A rigid vertex of a graph is one that has a prescribed cyclic order of its incident edges. We study orientable genus ranges of 4-regular rigid vertex graphs. The (orientable) genus range is a set of genera values over all orientable surfaces into which a graph is embedded cellularly, and the embeddings of rigid vertex graphs are required to preserve the prescribed cyclic order of incident edges at every vertex. The genus ranges of 4-regular rigid vertex graphs are sets of consecutive integers, and we address two questions: which intervals of integers appear as genus ranges of such graphs, and what types of graphs realize a given genus range. For graphs with 2 n vertices ( n > 1), we prove that all intervals [ a, b ] for all a genus ranges. For graphs with 2 n - 1 vertices ( n ≥ 1), we prove that all intervals [ a, b ] for all a genus ranges. We also provide constructions of graphs that realize these ranges.

  15. DENBRAN: A basic program for a significance test for multivariate normality of clusters from branching patterns in dendrograms

    Science.gov (United States)

    Sneath, P. H. A.

    A BASIC program is presented for significance tests to determine whether a dendrogram is derived from clustering of points that belong to a single multivariate normal distribution. The significance tests are based on statistics of the Kolmogorov—Smirnov type, obtained by comparing the observed cumulative graph of branch levels with a graph for the hypothesis of multivariate normality. The program also permits testing whether the dendrogram could be from a cluster of lower dimensionality due to character correlations. The program makes provision for three similarity coefficients, (1) Euclidean distances, (2) squared Euclidean distances, and (3) Simple Matching Coefficients, and for five cluster methods (1) WPGMA, (2) UPGMA, (3) Single Linkage (or Minimum Spanning Trees), (4) Complete Linkage, and (5) Ward's Increase in Sums of Squares. The program is entitled DENBRAN.

  16. Chemical Graph Transformation with Stereo-Information

    DEFF Research Database (Denmark)

    Andersen, Jakob Lykke; Flamm, Christoph; Merkle, Daniel

    2017-01-01

    Double Pushout graph transformation naturally facilitates the modelling of chemical reactions: labelled undirected graphs model molecules and direct derivations model chemical reactions. However, the most straightforward modelling approach ignores the relative placement of atoms and their neighbo......Double Pushout graph transformation naturally facilitates the modelling of chemical reactions: labelled undirected graphs model molecules and direct derivations model chemical reactions. However, the most straightforward modelling approach ignores the relative placement of atoms...... and their neighbours in space. Stereoisomers of chemical compounds thus cannot be distinguished, even though their chemical activity may differ substantially. In this contribution we propose an extended chemical graph transformation system with attributes that encode information about local geometry. The modelling...... of graph transformation, but we here propose a framework that also allows for partially specified stereoinformation. While there are several stereochemical configurations to be considered, we focus here on the tetrahedral molecular shape, and suggest general principles for how to treat all other chemically...

  17. Large-Scale Graph Processing Using Apache Giraph

    KAUST Repository

    Sakr, Sherif

    2017-01-07

    This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms.

  18. Large-Scale Graph Processing Using Apache Giraph

    KAUST Repository

    Sakr, Sherif; Orakzai, Faisal Moeen; Abdelaziz, Ibrahim; Khayyat, Zuhair

    2017-01-01

    This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms.

  19. Graph Algorithm Animation with Grrr

    OpenAIRE

    Rodgers, Peter; Vidal, Natalia

    2000-01-01

    We discuss geometric positioning, highlighting of visited nodes and user defined highlighting that form the algorithm animation facilities in the Grrr graph rewriting programming language. The main purpose of animation was initially for the debugging and profiling of Grrr code, but recently it has been extended for the purpose of teaching algorithms to undergraduate students. The animation is restricted to graph based algorithms such as graph drawing, list manipulation or more traditional gra...

  20. Bayesian Mediation Analysis

    Science.gov (United States)

    Yuan, Ying; MacKinnon, David P.

    2009-01-01

    In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…

  1. Statistical analysis using the Bayesian nonparametric method for irradiation embrittlement of reactor pressure vessels

    Energy Technology Data Exchange (ETDEWEB)

    Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp

    2016-10-15

    In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.

  2. My Bar Graph Tells a Story

    Science.gov (United States)

    McMillen, Sue; McMillen, Beth

    2010-01-01

    Connecting stories to qualitative coordinate graphs has been suggested as an effective instructional strategy. Even students who are able to "create" bar graphs may struggle to correctly "interpret" them. Giving children opportunities to work with qualitative graphs can help them develop the skills to interpret, describe, and compare information…

  3. An intersection graph of straight lines

    DEFF Research Database (Denmark)

    Thomassen, Carsten

    2002-01-01

    G. Ehrlich, S. Even, and R.E. Tarjan conjectured that the graph obtained from a complete 3 partite graph K4,4,4 by deleting the edges of four disjoint triangles is not the intersection graph of straight line segments in the plane. We show that it is....

  4. Trajectories entropy in dynamical graphs with memory

    Directory of Open Access Journals (Sweden)

    Francesco eCaravelli

    2016-04-01

    Full Text Available In this paper we investigate the application of non-local graph entropy to evolving and dynamical graphs. The measure is based upon the notion of Markov diffusion on a graph, and relies on the entropy applied to trajectories originating at a specific node. In particular, we study the model of reinforcement-decay graph dynamics, which leads to scale free graphs. We find that the node entropy characterizes the structure of the network in the two parameter phase-space describing the dynamical evolution of the weighted graph. We then apply an adapted version of the entropy measure to purely memristive circuits. We provide evidence that meanwhile in the case of DC voltage the entropy based on the forward probability is enough to characterize the graph properties, in the case of AC voltage generators one needs to consider both forward and backward based transition probabilities. We provide also evidence that the entropy highlights the self-organizing properties of memristive circuits, which re-organizes itself to satisfy the symmetries of the underlying graph.

  5. On revealing graph cycles via boundary measurements

    International Nuclear Information System (INIS)

    Belishev, M I; Wada, N

    2009-01-01

    This paper deals with boundary value inverse problems on a metric graph, the structure of the graph being assumed unknown. The question under consideration is how to detect from the dynamical and/or spectral inverse data whether the graph contains cycles (is not a tree). For any graph Ω, the dynamical as well as spectral boundary inverse data determine the so-called wave diameter d w : H -1 (Ω) → R defined on functionals supported in the graph. The known fact is that if Ω is a tree then d w ≥ 0 holds and, in this case, the inverse data determine Ω up to isometry. A graph Ω is said to be coordinate if the functions {dist Ω (., γ)} γin∂Ω constitute a coordinate system on Ω. For such graphs, we propose a procedure, which reveals the presence/absence of cycles. The hypothesis is that Ω contains cycles if and only if d w takes negative values. We do not justify this hypothesis in the general case but reduce it to a certain special class of graphs (suns)

  6. OPEX: Optimized Eccentricity Computation in Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Henderson, Keith [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2011-11-14

    Real-world graphs have many properties of interest, but often these properties are expensive to compute. We focus on eccentricity, radius and diameter in this work. These properties are useful measures of the global connectivity patterns in a graph. Unfortunately, computing eccentricity for all nodes is O(n2) for a graph with n nodes. We present OPEX, a novel combination of optimizations which improves computation time of these properties by orders of magnitude in real-world experiments on graphs of many different sizes. We run OPEX on graphs with up to millions of links. OPEX gives either exact results or bounded approximations, unlike its competitors which give probabilistic approximations or sacrifice node-level information (eccentricity) to compute graphlevel information (diameter).

  7. Pixels to Graphs by Associative Embedding

    KAUST Repository

    Newell, Alejandro

    2017-06-22

    Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and report a Recall@50 of 9.7% compared to the prior state-of-the-art at 3.4%, a nearly threefold improvement on the challenging task of scene graph generation.

  8. DPpackage: Bayesian Semi- and Nonparametric Modeling in R

    Directory of Open Access Journals (Sweden)

    Alejandro Jara

    2011-04-01

    Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.

  9. Bayesian semiparametric regression models to characterize molecular evolution

    Directory of Open Access Journals (Sweden)

    Datta Saheli

    2012-10-01

    Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.

  10. Discriminative clustering on manifold for adaptive transductive classification.

    Science.gov (United States)

    Zhang, Zhao; Jia, Lei; Zhang, Min; Li, Bing; Zhang, Li; Li, Fanzhang

    2017-10-01

    In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. A faithful functor among algebras and graphs

    OpenAIRE

    Falcón Ganfornina, Óscar Jesús; Falcón Ganfornina, Raúl Manuel; Núñez Valdés, Juan; Pacheco Martínez, Ana María; Villar Liñán, María Trinidad; Vigo Aguiar, Jesús (Coordinador)

    2016-01-01

    The problem of identifying a functor between the categories of algebras and graphs is currently open. Based on a known algorithm that identifies isomorphisms of Latin squares with isomorphism of vertex-colored graphs, we describe here a pair of graphs that enable us to find a faithful functor between finite-dimensional algebras over finite fields and these graphs.

  12. Cellular Automata on Graphs: Topological Properties of ER Graphs Evolved towards Low-Entropy Dynamics

    Directory of Open Access Journals (Sweden)

    Marc-Thorsten Hütt

    2012-06-01

    Full Text Available Cellular automata (CA are a remarkably  efficient tool for exploring general properties of complex systems and spatiotemporal patterns arising from local rules. Totalistic cellular automata,  where the update  rules depend  only on the density of neighboring states, are at the same time a versatile  tool for exploring  dynamical  processes on graphs. Here we briefly review our previous results on cellular automata on graphs, emphasizing some systematic relationships between network architecture and dynamics identified in this way. We then extend the investigation  towards graphs obtained in a simulated-evolution procedure, starting from Erdő s–Rényi (ER graphs and selecting for low entropies of the CA dynamics. Our key result is a strong association of low Shannon entropies with a broadening of the graph’s degree distribution.

  13. The color space of a graph

    DEFF Research Database (Denmark)

    Jensen, T.R.; Thomassen, Carsten

    2000-01-01

    If k is a prime power, and G is a graph with n vertices, then a k-coloring of G may be considered as a vector in GF(k)(n). We prove that the subspace of GF(3)(n) spanned by all 3-colorings of a planar triangle-free graph with n vertices has dimension n. In particular, any such graph has at least n...... - 1 nonequivalent 3-colorings, and the addition of any edge or any vertex of degree 3 results in a 3-colorable graph. (C) 2000 John Wiley & Sons, Inc....

  14. Box graphs and resolutions I

    Directory of Open Access Journals (Sweden)

    Andreas P. Braun

    2016-04-01

    Full Text Available Box graphs succinctly and comprehensively characterize singular fibers of elliptic fibrations in codimension two and three, as well as flop transitions connecting these, in terms of representation theoretic data. We develop a framework that provides a systematic map between a box graph and a crepant algebraic resolution of the singular elliptic fibration, thus allowing an explicit construction of the fibers from a singular Weierstrass or Tate model. The key tool is what we call a fiber face diagram, which shows the relevant information of a (partial toric triangulation and allows the inclusion of more general algebraic blowups. We shown that each such diagram defines a sequence of weighted algebraic blowups, thus providing a realization of the fiber defined by the box graph in terms of an explicit resolution. We show this correspondence explicitly for the case of SU(5 by providing a map between box graphs and fiber faces, and thereby a sequence of algebraic resolutions of the Tate model, which realizes each of the box graphs.

  15. Forbidden Structures for Planar Perfect Consecutively Colourable Graphs

    Directory of Open Access Journals (Sweden)

    Borowiecka-Olszewska Marta

    2017-05-01

    Full Text Available A consecutive colouring of a graph is a proper edge colouring with posi- tive integers in which the colours of edges incident with each vertex form an interval of integers. The idea of this colouring was introduced in 1987 by Asratian and Kamalian under the name of interval colouring. Sevast- janov showed that the corresponding decision problem is NP-complete even restricted to the class of bipartite graphs. We focus our attention on the class of consecutively colourable graphs whose all induced subgraphs are consecutively colourable, too. We call elements of this class perfect consecutively colourable to emphasise the conceptual similarity to perfect graphs. Obviously, the class of perfect consecutively colourable graphs is induced hereditary, so it can be characterized by the family of induced forbidden graphs. In this work we give a necessary and sufficient conditions that must be satisfied by the generalized Sevastjanov rosette to be an induced forbid- den graph for the class of perfect consecutively colourable graphs. Along the way, we show the exact values of the deficiency of all generalized Sevastjanov rosettes, which improves the earlier known estimating result. It should be mentioned that the deficiency of a graph measures its closeness to the class of consecutively colourable graphs. We motivate the investigation of graphs considered here by showing their connection to the class of planar perfect consecutively colourable graphs.

  16. 47 CFR 80.761 - Conversion graphs.

    Science.gov (United States)

    2010-10-01

    ... MARITIME SERVICES Standards for Computing Public Coast Station VHF Coverage § 80.761 Conversion graphs. The following graphs must be employed where conversion from one to the other of the indicated types of units is... 47 Telecommunication 5 2010-10-01 2010-10-01 false Conversion graphs. 80.761 Section 80.761...

  17. A Bayesian Approach to Real-Time Earthquake Phase Association

    Science.gov (United States)

    Benz, H.; Johnson, C. E.; Earle, P. S.; Patton, J. M.

    2014-12-01

    Real-time location of seismic events requires a robust and extremely efficient means of associating and identifying seismic phases with hypothetical sources. An association algorithm converts a series of phase arrival times into a catalog of earthquake hypocenters. The classical approach based on time-space stacking of the locus of possible hypocenters for each phase arrival using the principal of acoustic reciprocity has been in use now for many years. One of the most significant problems that has emerged over time with this approach is related to the extreme variations in seismic station density throughout the global seismic network. To address this problem we have developed a novel, Bayesian association algorithm, which looks at the association problem as a dynamically evolving complex system of "many to many relationships". While the end result must be an array of one to many relations (one earthquake, many phases), during the association process the situation is quite different. Both the evolving possible hypocenters and the relationships between phases and all nascent hypocenters is many to many (many earthquakes, many phases). The computational framework we are using to address this is a responsive, NoSQL graph database where the earthquake-phase associations are represented as intersecting Bayesian Learning Networks. The approach directly addresses the network inhomogeneity issue while at the same time allowing the inclusion of other kinds of data (e.g., seismic beams, station noise characteristics, priors on estimated location of the seismic source) by representing the locus of intersecting hypothetical loci for a given datum as joint probability density functions.

  18. Building Scalable Knowledge Graphs for Earth Science

    Science.gov (United States)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.

  19. On the diameter of dot-critical graphs

    Directory of Open Access Journals (Sweden)

    Doost Ali Mojdeh

    2009-01-01

    Full Text Available A graph G is \\(k\\-dot-critical (totaly \\(k\\-dot-critical if \\(G\\ is dot-critical (totaly dot-critical and the domination number is \\(k\\. In the paper [T. Burtona, D. P. Sumner, Domination dot-critical graphs, Discrete Math, 306 (2006, 11-18] the following question is posed: What are the best bounds for the diameter of a \\(k\\-dot-critical graph and a totally \\(k\\-dot-critical graph \\(G\\ with no critical vertices for \\(k \\geq 4\\? We find the best bound for the diameter of a \\(k\\-dot-critical graph, where \\(k \\in\\{4,5,6\\}\\ and we give a family of \\(k\\-dot-critical graphs (with no critical vertices with sharp diameter \\(2k-3\\ for even \\(k \\geq 4\\.

  20. Fluvial reservoir characterization using topological descriptors based on spectral analysis of graphs

    Science.gov (United States)

    Viseur, Sophie; Chiaberge, Christophe; Rhomer, Jérémy; Audigane, Pascal

    2015-04-01

    computed for each reservoir rock geobody and studied through a graph spectral analysis. To achieve this, the skeleton is converted into a graph structure. The spectral analysis applied on this graph structure allows a distance to be defined between pairs of graphs. Therefore, this distance is used as support for clustering analysis to gather models that share the same reservoir rock topology. To show the ability of the defined distances to discriminate different types of reservoir connectivity, a synthetic data set of fluvial models with different geological settings was generated and studied using the proposed approach. The results of the clustering analysis are shown and discussed.

  1. Graph anomalies in cyber communications

    Energy Technology Data Exchange (ETDEWEB)

    Vander Wiel, Scott A [Los Alamos National Laboratory; Storlie, Curtis B [Los Alamos National Laboratory; Sandine, Gary [Los Alamos National Laboratory; Hagberg, Aric A [Los Alamos National Laboratory; Fisk, Michael [Los Alamos National Laboratory

    2011-01-11

    Enterprises monitor cyber traffic for viruses, intruders and stolen information. Detection methods look for known signatures of malicious traffic or search for anomalies with respect to a nominal reference model. Traditional anomaly detection focuses on aggregate traffic at central nodes or on user-level monitoring. More recently, however, traffic is being viewed more holistically as a dynamic communication graph. Attention to the graph nature of the traffic has expanded the types of anomalies that are being sought. We give an overview of several cyber data streams collected at Los Alamos National Laboratory and discuss current work in modeling the graph dynamics of traffic over the network. We consider global properties and local properties within the communication graph. A method for monitoring relative entropy on multiple correlated properties is discussed in detail.

  2. Bayesian Inference on Gravitational Waves

    Directory of Open Access Journals (Sweden)

    Asad Ali

    2015-12-01

    Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an  overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.

  3. Graphs, groups and surfaces

    CERN Document Server

    White, AT

    1985-01-01

    The field of topological graph theory has expanded greatly in the ten years since the first edition of this book appeared. The original nine chapters of this classic work have therefore been revised and updated. Six new chapters have been added, dealing with: voltage graphs, non-orientable imbeddings, block designs associated with graph imbeddings, hypergraph imbeddings, map automorphism groups and change ringing.Thirty-two new problems have been added to this new edition, so that there are now 181 in all; 22 of these have been designated as ``difficult'''' and 9 as ``unsolved''''. Three of the four unsolved problems from the first edition have been solved in the ten years between editions; they are now marked as ``difficult''''.

  4. Subdominant pseudoultrametric on graphs

    Energy Technology Data Exchange (ETDEWEB)

    Dovgoshei, A A; Petrov, E A [Institute of Applied Mathematics and Mechanics, National Academy of Sciences of Ukraine, Donetsk (Ukraine)

    2013-08-31

    Let (G,w) be a weighted graph. We find necessary and sufficient conditions under which the weight w:E(G)→R{sup +} can be extended to a pseudoultrametric on V(G), and establish a criterion for the uniqueness of such an extension. We demonstrate that (G,w) is a complete k-partite graph, for k≥2, if and only if for any weight that can be extended to a pseudoultrametric, among all such extensions one can find the least pseudoultrametric consistent with w. We give a structural characterization of graphs for which the subdominant pseudoultrametric is an ultrametric for any strictly positive weight that can be extended to a pseudoultrametric. Bibliography: 14 titles.

  5. Equitable Colorings Of Corona Multiproducts Of Graphs

    Directory of Open Access Journals (Sweden)

    Furmánczyk Hanna

    2017-11-01

    Full Text Available A graph is equitably k-colorable if its vertices can be partitioned into k independent sets in such a way that the numbers of vertices in any two sets differ by at most one. The smallest k for which such a coloring exists is known as the equitable chromatic number of G and denoted by =(G. It is known that the problem of computation of =(G is NP-hard in general and remains so for corona graphs. In this paper we consider the same model of coloring in the case of corona multiproducts of graphs. In particular, we obtain some results regarding the equitable chromatic number for the l-corona product G ◦l H, where G is an equitably 3- or 4-colorable graph and H is an r-partite graph, a cycle or a complete graph. Our proofs are mostly constructive in that they lead to polynomial algorithms for equitable coloring of such graph products provided that there is given an equitable coloring of G. Moreover, we confirm the Equitable Coloring Conjecture for corona products of such graphs. This paper extends the results from [H. Furmánczyk, K. Kaliraj, M. Kubale and V.J. Vivin, Equitable coloring of corona products of graphs, Adv. Appl. Discrete Math. 11 (2013 103–120].

  6. Clinical Outcome Prediction in Aneurysmal Subarachnoid Hemorrhage Using Bayesian Neural Networks with Fuzzy Logic Inferences

    Directory of Open Access Journals (Sweden)

    Benjamin W. Y. Lo

    2013-01-01

    Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH. Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients. Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs. Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.

  7. Sphere and dot product representations of graphs

    NARCIS (Netherlands)

    R.J. Kang (Ross); T. Müller (Tobias)

    2012-01-01

    textabstractA graph $G$ is a $k$-sphere graph if there are $k$-dimensional real vectors $v_1,\\dots,v_n$ such that $ij\\in E(G)$ if and only if the distance between $v_i$ and $v_j$ is at most $1$. A graph $G$ is a $k$-dot product graph if there are $k$-dimensional real vectors $v_1,\\dots,v_n$ such

  8. Graph Query Portal

    OpenAIRE

    Dayal, Amit; Brock, David

    2018-01-01

    Prashant Chandrasekar, a lead developer for the Social Interactome project, has tasked the team with creating a graph representation of the data collected from the social networks involved in that project. The data is currently stored in a MySQL database. The client requested that the graph database be Cayley, but after a literature review, Neo4j was chosen. The reasons for this shift will be explained in the design section. Secondarily, the team was tasked with coming up with three scena...

  9. An Association-Oriented Partitioning Approach for Streaming Graph Query

    Directory of Open Access Journals (Sweden)

    Yun Hao

    2017-01-01

    Full Text Available The volumes of real-world graphs like knowledge graph are increasing rapidly, which makes streaming graph processing a hot research area. Processing graphs in streaming setting poses significant challenges from different perspectives, among which graph partitioning method plays a key role. Regarding graph query, a well-designed partitioning method is essential for achieving better performance. Existing offline graph partitioning methods often require full knowledge of the graph, which is not possible during streaming graph processing. In order to handle this problem, we propose an association-oriented streaming graph partitioning method named Assc. This approach first computes the rank values of vertices with a hybrid approximate PageRank algorithm. After splitting these vertices with an adapted variant affinity propagation algorithm, the process order on vertices in the sliding window can be determined. Finally, according to the level of these vertices and their association, the partition where the vertices should be distributed is decided. We compare its performance with a set of streaming graph partition methods and METIS, a widely adopted offline approach. The results show that our solution can partition graphs with hundreds of millions of vertices in streaming setting on a large collection of graph datasets and our approach outperforms other graph partitioning methods.

  10. A comparative study of applying Mason’s Rule in the case of flow-graphs and bond-graphs

    Directory of Open Access Journals (Sweden)

    Adriana Grava

    2009-05-01

    Full Text Available The paper presents two methods to analyzethe electric circuits using the flow-graphs and thebond-graphs studying the differences between thesemethods.As it can be noticed, the two methods are totallydifferent; their common point being Mason’s rule thatis applied in both cases but it is applied differently foreach type of graphs.

  11. An algebraic approach to graph codes

    DEFF Research Database (Denmark)

    Pinero, Fernando

    This thesis consists of six chapters. The first chapter, contains a short introduction to coding theory in which we explain the coding theory concepts we use. In the second chapter, we present the required theory for evaluation codes and also give an example of some fundamental codes in coding...... theory as evaluation codes. Chapter three consists of the introduction to graph based codes, such as Tanner codes and graph codes. In Chapter four, we compute the dimension of some graph based codes with a result combining graph based codes and subfield subcodes. Moreover, some codes in chapter four...

  12. Replica methods for loopy sparse random graphs

    International Nuclear Information System (INIS)

    Coolen, ACC

    2016-01-01

    I report on the development of a novel statistical mechanical formalism for the analysis of random graphs with many short loops, and processes on such graphs. The graphs are defined via maximum entropy ensembles, in which both the degrees (via hard constraints) and the adjacency matrix spectrum (via a soft constraint) are prescribed. The sum over graphs can be done analytically, using a replica formalism with complex replica dimensions. All known results for tree-like graphs are recovered in a suitable limit. For loopy graphs, the emerging theory has an appealing and intuitive structure, suggests how message passing algorithms should be adapted, and what is the structure of theories describing spin systems on loopy architectures. However, the formalism is still largely untested, and may require further adjustment and refinement. (paper)

  13. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable

  14. Coloring and The Lonely Graph

    OpenAIRE

    Rabern, Landon

    2007-01-01

    We improve upper bounds on the chromatic number proven independently in \\cite{reedNote} and \\cite{ingo}. Our main lemma gives a sufficient condition for two paths in graph to be completely joined. Using this, we prove that if a graph has an optimal coloring with more than $\\frac{\\omega}{2}$ singleton color classes, then it satisfies $\\chi \\leq \\frac{\\omega + \\Delta + 1}{2}$. It follows that a graph satisfying $n - \\Delta < \\alpha + \\frac{\\omega - 1}{2}$ must also satisfy $\\chi \\leq \\frac{\\ome...

  15. Graph Mining Meets the Semantic Web

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Lim, Seung-Hwan [ORNL

    2015-01-01

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on 6 real world data sets and show graph mining algorithms (that have a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.

  16. Bayesian analysis in plant pathology.

    Science.gov (United States)

    Mila, A L; Carriquiry, A L

    2004-09-01

    ABSTRACT Bayesian methods are currently much discussed and applied in several disciplines from molecular biology to engineering. Bayesian inference is the process of fitting a probability model to a set of data and summarizing the results via probability distributions on the parameters of the model and unobserved quantities such as predictions for new observations. In this paper, after a short introduction of Bayesian inference, we present the basic features of Bayesian methodology using examples from sequencing genomic fragments and analyzing microarray gene-expressing levels, reconstructing disease maps, and designing experiments.

  17. Incremental Frequent Subgraph Mining on Large Evolving Graphs

    KAUST Repository

    Abdelhamid, Ehab

    2017-08-22

    Frequent subgraph mining is a core graph operation used in many domains, such as graph data management and knowledge exploration, bioinformatics and security. Most existing techniques target static graphs. However, modern applications, such as social networks, utilize large evolving graphs. Mining these graphs using existing techniques is infeasible, due to the high computational cost. In this paper, we propose IncGM+, a fast incremental approach for continuous frequent subgraph mining problem on a single large evolving graph. We adapt the notion of “fringe” to the graph context, that is the set of subgraphs on the border between frequent and infrequent subgraphs. IncGM+ maintains fringe subgraphs and exploits them to prune the search space. To boost the efficiency, we propose an efficient index structure to maintain selected embeddings with minimal memory overhead. These embeddings are utilized to avoid redundant expensive subgraph isomorphism operations. Moreover, the proposed system supports batch updates. Using large real-world graphs, we experimentally verify that IncGM+ outperforms existing methods by up to three orders of magnitude, scales to much larger graphs and consumes less memory.

  18. Smooth Bundling of Large Streaming and Sequence Graphs

    NARCIS (Netherlands)

    Hurter, C.; Ersoy, O.; Telea, A.

    2013-01-01

    Dynamic graphs are increasingly pervasive in modern information systems. However, understanding how a graph changes in time is difficult. We present here two techniques for simplified visualization of dynamic graphs using edge bundles. The first technique uses a recent image-based graph bundling

  19. The groupies of random multipartite graphs

    OpenAIRE

    Portmann, Marius; Wang, Hongyun

    2012-01-01

    If a vertex $v$ in a graph $G$ has degree larger than the average of the degrees of its neighbors, we call it a groupie in $G$. In the current work, we study the behavior of groupie in random multipartite graphs with the link probability between sets of nodes fixed. Our results extend the previous ones on random (bipartite) graphs.

  20. On the structure of Bayesian network for Indonesian text document paraphrase identification

    Science.gov (United States)

    Prayogo, Ario Harry; Syahrul Mubarok, Mohamad; Adiwijaya

    2018-03-01

    Paraphrase identification is an important process within natural language processing. The idea is to automatically recognize phrases that have different forms but contain same meanings. For examples if we input query “causing fire hazard”, then the computer has to recognize this query that this query has same meaning as “the cause of fire hazard. Paraphrasing is an activity that reveals the meaning of an expression, writing, or speech using different words or forms, especially to achieve greater clarity. In this research we will focus on classifying two Indonesian sentences whether it is a paraphrase to each other or not. There are four steps in this research, first is preprocessing, second is feature extraction, third is classifier building, and the last is performance evaluation. Preprocessing consists of tokenization, non-alphanumerical removal, and stemming. After preprocessing we will conduct feature extraction in order to build new features from given dataset. There are two kinds of features in the research, syntactic features and semantic features. Syntactic features consist of normalized levenshtein distance feature, term-frequency based cosine similarity feature, and LCS (Longest Common Subsequence) feature. Semantic features consist of Wu and Palmer feature and Shortest Path Feature. We use Bayesian Networks as the method of training the classifier. Parameter estimation that we use is called MAP (Maximum A Posteriori). For structure learning of Bayesian Networks DAG (Directed Acyclic Graph), we use BDeu (Bayesian Dirichlet equivalent uniform) scoring function and for finding DAG with the best BDeu score, we use K2 algorithm. In evaluation step we perform cross-validation. The average result that we get from testing the classifier as follows: Precision 75.2%, Recall 76.5%, F1-Measure 75.8% and Accuracy 75.6%.