Dynamic exponents for Potts model cluster algorithms
Coddington, Paul D.; Baillie, Clive F.
We have studied the Swendsen-Wang and Wolff cluster update algorithms for the Ising model in 2, 3 and 4 dimensions. The data indicate simple relations between the specific heat and the Wolff autocorrelations, and between the magnetization and the Swendsen-Wang autocorrelations. This implies that the dynamic critical exponents are related to the static exponents of the Ising model. We also investigate the possibility of similar relationships for the Q-state Potts model.
Critical dynamics of cluster algorithms in the dilute Ising model
Hennecke, M.; Heyken, U.
1993-08-01
Autocorrelation times for thermodynamic quantities at $T_c$ are calculated from Monte Carlo simulations of the site-diluted simple cubic Ising model, using the Swendsen-Wang and Wolff cluster algorithms. Our results show that for these algorithms the autocorrelation times decrease when reducing the concentration of magnetic sites from 100% down to 40%. This is of crucial importance when estimating static properties of the model, since the variances of these estimators increase with autocorrelation time. The dynamical critical exponents are calculated for both algorithms, observing pronounced finite-size effects in the energy autocorrelation data for the algorithm of Wolff. We conclude that, when applied to the dilute Ising model, cluster algorithms become even more effective than local algorithms, for which increasing autocorrelation times are expected.
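The link between autocorrelation time and estimator variance mentioned above can be illustrated numerically: for N correlated samples with variance σ², the variance of the sample mean is approximately 2·τ_int·σ²/N. The following is a minimal sketch, not the authors' procedure. The function names, the fixed summation window, and the AR(1) test series are all assumptions made for illustration; the abstract does not say how the sums were truncated.

```python
import random


def integrated_autocorrelation_time(x, window):
    """Estimate tau_int = 1/2 + sum_{t=1..window} rho(t) for a time series x.

    `window` is a hypothetical fixed cutoff; it must be smaller than len(x).
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0.0:
        return 0.5  # constant series: fall back to the uncorrelated value
    tau = 0.5
    for t in range(1, window + 1):
        # Autocovariance at lag t, normalized by the number of pairs.
        cov = sum((x[i] - mean) * (x[i + t] - mean) for i in range(n - t)) / (n - t)
        tau += cov / var
    return tau


def ar1_series(n, phi, seed=0):
    """Synthetic AR(1) series x_k = phi * x_{k-1} + noise, used only as test data."""
    rng = random.Random(seed)
    series, value = [], 0.0
    for _ in range(n):
        value = phi * value + rng.gauss(0.0, 1.0)
        series.append(value)
    return series
```

For an AR(1) process the exact value is τ_int = 1/2 + φ/(1 − φ), so a series generated with `ar1_series(20000, 0.5)` should give an estimate close to 1.5, while uncorrelated noise (φ = 0) should give a value close to 0.5.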
A dynamic fuzzy clustering method based on genetic algorithm
Institute of Scientific and Technical Information of China (English)
ZHENG Yan; ZHOU Chunguang; LIANG Yanchun; GUO Dongwei
2003-01-01
A dynamic fuzzy clustering method is presented based on the genetic algorithm. By calculating the fuzzy dissimilarity between samples the essential associations among samples are modeled factually. The fuzzy dissimilarity between two samples is mapped into their Euclidean distance, that is, the high dimensional samples are mapped into the two-dimensional plane. The mapping is optimized globally by the genetic algorithm, which adjusts the coordinates of each sample, and thus the Euclidean distance, to approximate to the fuzzy dissimilarity between samples gradually. A key advantage of the proposed method is that the clustering is independent of the space distribution of input samples, which improves the flexibility and visualization. This method possesses characteristics of a faster convergence rate and more exact clustering than some typical clustering algorithms. Simulated experiments show the feasibility and availability of the proposed method.
Functional clustering algorithm for the analysis of dynamic network data
Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.
2009-05-01
We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.
DYNAMIC REQUEST DISPATCHING ALGORITHM FOR WEB SERVER CLUSTER
Institute of Scientific and Technical Information of China (English)
Yang Zhenjiang; Zhang Deyun; Sun Qindong; Sun Qing
2006-01-01
Distributed architectures support increased load on popular web sites by dispatching client requests transparently among multiple servers in a cluster. After analyzing the Packet Single-Rewriting technique and the client-address hashing algorithm of ONE-IP, which can preserve application sessions, we propose an improved request dispatching algorithm that is simple, effective, and supports dynamic load balancing. In this algorithm, the dispatcher decides which server node will process a request by applying a hash function to the client IP address and comparing the result with each server's assigned identifier subset; it adjusts the size of each subset according to the performance and current load of the corresponding server, so as to use the resources of all servers effectively. Simulation shows that the improved algorithm performs better than the original one.
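The dispatching idea described above (hash the client IP, then look up which server owns the resulting identifier) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's algorithm: the identifier space size `NUM_BUCKETS`, the use of SHA-256, the contiguous-range layout, and the `weights` input (standing in for measured performance and load) are all hypothetical.

```python
import hashlib
from bisect import bisect_right

NUM_BUCKETS = 1024  # hypothetical size of the identifier space


def bucket_of(client_ip):
    """Map a client IP string to a stable identifier in [0, NUM_BUCKETS)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def assign_ranges(weights):
    """Split the identifier space into contiguous ranges proportional to weights.

    `weights` maps server name -> positive capacity score (an assumption;
    the paper derives subset sizes from performance and current load).
    Returns (exclusive upper bounds, server names) in matching order.
    """
    total = sum(weights.values())
    bounds, servers, acc = [], [], 0.0
    for server, weight in weights.items():
        acc += weight
        bounds.append(round(NUM_BUCKETS * acc / total))
        servers.append(server)
    bounds[-1] = NUM_BUCKETS  # guard against rounding at the top end
    return bounds, servers


def dispatch(client_ip, bounds, servers):
    """Return the server whose identifier range contains the client's bucket."""
    return servers[bisect_right(bounds, bucket_of(client_ip))]
```

Because the hash depends only on the client address, repeated requests from one client reach the same server as long as the ranges are unchanged. Calling `assign_ranges` again with new weights resizes the subsets, which moves some clients to a different server; how the paper handles sessions during such a change is not stated in the abstract.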
Dynamic and static properties of the invaded cluster algorithm
Moriarty, K.; Machta, J.; Chayes, L. Y.
1999-02-01
Simulations of the two-dimensional Ising and three-state Potts models at their critical points are performed using the invaded cluster (IC) algorithm. It is argued that observables measured on a sublattice of size $l$ should exhibit a crossover to Swendsen-Wang (SW) behavior for $l$ sufficiently less than the lattice size $L$, and a scaling form is proposed to describe the crossover phenomenon. It is found that the energy autocorrelation time $\tau_{\varepsilon}(l,L)$ for an $l \times l$ sublattice attains a maximum in the crossover region, and a dynamic exponent $z_{\mathrm{IC}}$ for the IC algorithm is defined according to $\tau_{\varepsilon,\max} \sim L^{z_{\mathrm{IC}}}$. Simulation results for the three-state model yield $z_{\mathrm{IC}} = 0.346 \pm 0.002$, which is smaller than values of the dynamic exponent found for the SW and Wolff algorithms and also less than the Li-Sokal bound. The results are less conclusive for the Ising model, but it appears that $z_{\mathrm{IC}}$ … Wolff algorithms.
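A dynamic exponent defined through a power law of the form τ ~ L^z is usually extracted by a straight-line fit in log-log coordinates. The sketch below shows that step with an ordinary least-squares fit. The function name is an assumption, and any numbers fed to it in examples are synthetic rather than taken from the paper, which may also have used a different fitting procedure (for example weighted fits or corrections to scaling).

```python
import math


def fit_dynamic_exponent(sizes, taus):
    """Least-squares fit of log(tau) = log(A) + z * log(L).

    Returns (z, A). Inputs are positive lattice sizes and
    autocorrelation times of equal length (at least two distinct sizes).
    """
    xs = [math.log(size) for size in sizes]
    ys = [math.log(tau) for tau in taus]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    amplitude = math.exp(mean_y - z * mean_x)
    return z, amplitude
```

Given exact power-law input such as `taus = [2.0 * L ** 0.35 for L in sizes]`, the fit recovers the exponent and amplitude up to floating-point error. With real data, a logarithmic growth of τ can be hard to distinguish from a small exponent over a limited range of L.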
Parallel Genetic Algorithms with Dynamic Topology using Cluster Computing
Directory of Open Access Journals (Sweden)
ADAR, N.
2016-08-01
A parallel genetic algorithm (PGA) conducts a distributed meta-heuristic search by employing genetic algorithms on more than one subpopulation simultaneously. PGAs migrate a number of individuals between subpopulations over generations. The layout that facilitates the interactions of the subpopulations is called the topology. Static migration topologies have been widely incorporated into PGAs. In this article, a PGA with a dynamic migration topology (D-PGA) is proposed. D-PGA generates a new migration topology in every epoch based on the average fitness values of the subpopulations. The D-PGA has been tested against ring and fully connected migration topologies in a Beowulf Cluster. The D-PGA has outperformed the ring migration topology with comparable communication cost and has provided competitive or better results than a fully connected migration topology with significantly lower communication cost. PGA convergence behaviors have been analyzed in terms of the diversities within and between subpopulations. Conventional diversity can be considered as the diversity within a subpopulation. A new concept of permeability has been introduced to measure the diversity between subpopulations. It is shown that the success of the proposed D-PGA can be attributed to maintaining a high level of permeability while preserving diversity within subpopulations.
A Novel Dynamic Clustering Algorithm Based on Immune Network and Tabu Search
Institute of Scientific and Technical Information of China (English)
ZHONG Jiang; WU Zhongfu; WU Kaigui; YANG Qiang
2005-01-01
It is usually difficult to specify a reasonable number of partitions in a data set before clustering. Traditional clustering algorithms, such as k-means and its variations, cannot solve this problem. This paper proposes a novel dynamic clustering algorithm based on the artificial immune network and tabu search (DCBIT), which optimizes the number and the location of the clusters at the same time. The algorithm has two phases: it first runs an immune network algorithm to find a clustering feasible solution (CFS), then employs tabu search to obtain the optimal number of clusters and the cluster centers from the CFS. The probability of acquiring the CFS through the immune network algorithm is also discussed. Experimental results show that the new algorithm has satisfactory convergence probability and convergence speed.
Clustering dynamic textures with the hierarchical EM algorithm for modeling video
Mumtaz, Adeel; Coviello, Emanuele; Lanckriet, Gert R G; Chan, Antoni B
2013-07-01
Dynamic texture (DT) is a probabilistic generative model, defined over space and time, that represents a video as the output of a linear dynamical system (LDS). The DT model has been applied to a wide variety of computer vision problems, such as motion segmentation, motion classification, and video registration. In this paper, we derive a new algorithm for clustering DT models that is based on the hierarchical EM algorithm. The proposed clustering algorithm is capable of both clustering DTs and learning novel DT cluster centers that are representative of the cluster members in a manner that is consistent with the underlying generative probabilistic model of the DT. We also derive an efficient recursive algorithm for sensitivity analysis of the discrete-time Kalman smoothing filter, which is used as the basis for computing expectations in the E-step of the HEM algorithm. Finally, we demonstrate the efficacy of the clustering algorithm on several applications in motion analysis, including hierarchical motion clustering, semantic motion annotation, and learning bag-of-systems (BoS) codebooks for dynamic texture recognition.
Dynamic Head Cluster Election Algorithm for Clustered Ad-Hoc Networks
Directory of Open Access Journals (Sweden)
Arwa Zabian
2008-01-01
In distributed systems, clustering consists of dividing the geographical area covered by a set of nodes into small zones. In mobile networks, the clustering varies because nodes can move at any time in any direction, which causes the network to partition or nodes to join. Several centralized or global clustering algorithms have been proposed that ensure no node becomes isolated and no cluster becomes overloaded. A particular node, called the head cluster or leader, is elected to organize the distribution of nodes into clusters. We propose a distributed clustering and leader election mechanism for ad hoc mobile networks in which the leader is a mobile node. Our results show that, when the leader is mobile, the time needed to elect a new leader is smaller than the time needed for a significant topological change in the network to occur.
Empirical relations between static and dynamic exponents for Ising model cluster algorithms
Coddington, Paul D.; Baillie, Clive F.
1992-02-01
We have measured the autocorrelations for the Swendsen-Wang and the Wolff cluster update algorithms for the Ising model in two, three, and four dimensions. The data for the Wolff algorithm suggest that the autocorrelations are linearly related to the specific heat, in which case the dynamic critical exponent is $z_{\mathrm{int},E}^{\mathrm{W}} = \alpha/\nu$. For the Swendsen-Wang algorithm, scaling the autocorrelations by the average maximum cluster size gives either a constant or a logarithm, which implies that $z_{\mathrm{int},E}^{\mathrm{SW}} = \beta/\nu$ for the Ising model.
Empirical relations between static and dynamic exponents for Ising model cluster algorithms
Energy Technology Data Exchange (ETDEWEB)
Coddington, P.D. (Department of Physics, Syracuse University, Syracuse, New York 13244 (United States)); Baillie, C.F. (Department of Physics, University of Colorado, Boulder, Colorado 80309 (United States))
1992-02-17
We have measured the autocorrelations for the Swendsen-Wang and the Wolff cluster update algorithms for the Ising model in two, three, and four dimensions. The data for the Wolff algorithm suggest that the autocorrelations are linearly related to the specific heat, in which case the dynamic critical exponent is $z_{\mathrm{int},E}^{\mathrm{W}} = \alpha/\nu$. For the Swendsen-Wang algorithm, scaling the autocorrelations by the average maximum cluster size gives either a constant or a logarithm, which implies that $z_{\mathrm{int},E}^{\mathrm{SW}} = \beta/\nu$ for the Ising model.
Pluchino, Alessandro; Latora, Vito
2008-01-01
We have recently introduced an efficient method for the detection and identification of modules in complex networks, based on the de-synchronization properties (dynamical clustering) of phase oscillators. In this paper we apply the dynamical clustering technique to the identification of communities of marine organisms living in the Chesapeake Bay food web. We show that our algorithm is able to perform a very reliable classification of the real communities existing in this ecosystem by using different kinds of dynamical oscillators. We also compare our results with those of other methods for the detection of community structures in complex networks.
DYNAMIC REQUEST DISPATCHING ALGORITHM FOR WEB SERVER CLUSTER
Institute of Scientific and Technical Information of China (English)
(no author listed)
2006-01-01
The overall increase in traffic on the WWW causes a disproportionate increase in client requests to popular web sites. Site administrators constantly face the requirement to improve server capacity. A web server cluster is a popular solution. It uses a group of independent servers that are managed as a single system for higher availability, easier manageability and greater scalability. Many web sites have adopted this solution. Request dispatching [1-2] is one of the core technologies used by parallel web server clusters...
Multi-Parameter Signal Sorting Algorithm Based on Dynamic Distance Clustering
Institute of Scientific and Technical Information of China (English)
Ai-Ling He; De-Guo Zeng; Jun Wang; Bin Tang
2009-01-01
A multi-parameter signal sorting algorithm for interleaved radar pulses in a dense emitter environment is presented. The algorithm includes two parts, pulse classification and pulse repetition interval (PRI) analysis. Firstly, we propose dynamic distance clustering (DDC) for classification. In the clustering algorithm, the multi-dimensional features of radar pulses are used for reliable classification. The similarity threshold estimation method in DDC is derived, which contributes to the efficiency of the algorithm. However, DDC is computationally expensive when there are many signal pulses. Then, in order to sort radar signals in real time, the improved DDC (IDDC) algorithm is proposed. Finally, PRI analysis is adopted to complete the process of sorting. The simulation experiments and hardware implementations show that both algorithms are effective.
Lelu, Alain; Cuxac, Pascal
2008-01-01
We address here two major challenges presented by dynamic data mining: 1) the stability challenge: we have implemented a rigorous incremental density-based clustering algorithm, independent of any initial conditions and of the ordering of the data-vector stream; 2) the cognitive challenge: we have implemented a stringent selection process of association rules between clusters at time t-1 and time t for directly generating the main conclusions about the dynamics of a data stream. We illustrate these points with an application to a scientific information database covering two years and 2600 documents.
Institute of Scientific and Technical Information of China (English)
Mohammed A.M. Ibrahim; Lu Xinda; M. SaifMokbel
2005-01-01
The rapid growth of interconnected high-performance workstations has produced a new computing paradigm called cluster-of-workstations computing. In these systems, load imbalance is a serious impediment to achieving good performance. The main concern of this paper is the implementation of a dynamic load balancing algorithm, asynchronous Round Robin (ARR), for balancing the workload of a parallel depth-first tree search on a Cluster of Heterogeneous Workstations (COW). Many algorithms in artificial intelligence and other areas of computer science are based on depth-first search in implicitly defined trees. These algorithms require a load-balancing scheme that can evenly distribute parts of an irregularly shaped tree over the workstations with minimal interprocessor communication and without prior knowledge of the tree's shape. The ARR algorithm communicates between processors only when necessary, and it runs under MPI (Message Passing Interface), which allows parallel execution on a heterogeneous cluster of SUN workstations. The program is written in C and executed under the UNIX operating system (Solaris version).
Pluchino, A.; Rapisarda, A.; Latora, V.
2008-10-01
We have recently introduced [Phys. Rev. E 75, 045102(R) (2007); AIP Conference Proceedings 965, 2007, p. 323] an efficient method for the detection and identification of modules in complex networks, based on the de-synchronization properties (dynamical clustering) of phase oscillators. In this paper we apply the dynamical clustering technique to the identification of communities of marine organisms living in the Chesapeake Bay food web. We show that our algorithm is able to perform a very reliable classification of the real communities existing in this ecosystem by using different kinds of dynamical oscillators. We also compare our results with those of other methods for the detection of community structures in complex networks.
Incremental Density-Based Link Clustering Algorithm for Community Detection in Dynamic Networks
Directory of Open Access Journals (Sweden)
Fanrong Meng
2016-01-01
Community detection in complex networks has become a research hotspot in recent years. However, most of the existing community detection algorithms are designed for the static networks; namely, the connections between the nodes are invariable. In this paper, we propose an incremental density-based link clustering algorithm for community detection in dynamic networks, iDBLINK. This algorithm is an extended version of DBLINK which is proposed in our previous work. It can update the local link community structure in the current moment through the change of similarity between the edges at the adjacent moments, which includes the creation, growth, merging, deletion, contraction, and division of link communities. Extensive experimental results demonstrate that iDBLINK not only has a great time efficiency, but also maintains a high quality community detection performance when the network topology is changing.
Partitional clustering algorithms
2015-01-01
This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...
Parallel Wolff Cluster Algorithms
Bae, S.; Ko, S. H.; Coddington, P. D.
The Wolff single-cluster algorithm is the most efficient method known for Monte Carlo simulation of many spin models. Due to the irregular size, shape and position of the Wolff clusters, this method does not easily lend itself to efficient parallel implementation, so that simulations using this method have thus far been confined to workstations and vector machines. Here we present two parallel implementations of this algorithm, and show that one gives fairly good performance on a MIMD parallel computer.
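To make the point about irregular cluster size and shape concrete, the following is a minimal serial sketch of one Wolff single-cluster update for the two-dimensional Ising model. It is not one of the parallel implementations from the paper. The lattice representation (a list of lists of ±1), the coupling J = 1, periodic boundaries, and the function name are assumptions chosen for illustration.

```python
import math
import random


def wolff_update(spins, beta, rng):
    """Perform one Wolff single-cluster update in place; return the cluster size.

    spins: L x L list of lists with entries +1 or -1 (periodic boundaries, J = 1).
    beta:  inverse temperature.
    rng:   a random.Random instance.
    """
    size_l = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond activation probability
    i, j = rng.randrange(size_l), rng.randrange(size_l)
    seed_spin = spins[i][j]
    # Flip sites as they join the cluster; a flipped site no longer matches
    # seed_spin, so it can never be added twice.
    spins[i][j] = -seed_spin
    stack = [(i, j)]
    cluster_size = 1
    while stack:
        x, y = stack.pop()
        neighbors = [
            ((x + 1) % size_l, y),
            ((x - 1) % size_l, y),
            (x, (y + 1) % size_l),
            (x, (y - 1) % size_l),
        ]
        for nx, ny in neighbors:
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                cluster_size += 1
    return cluster_size
```

The cluster grows from a random seed site through a data-dependent sequence of neighbor visits, so the amount of work per update and the region of the lattice it touches vary from one update to the next. This is the property that makes a straightforward domain decomposition inefficient.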
Cluster Synchronization Algorithms
Xia, Weiguo; Cao, Ming
2010-01-01
This paper presents two approaches to achieving cluster synchronization in dynamical multi-agent systems. In contrast to the widely studied synchronization behavior, where all the coupled agents converge to the same value asymptotically, in the cluster synchronization problem studied in this paper,
Energy Technology Data Exchange (ETDEWEB)
Yin, Jiandong; Yang, Jiawen; Guo, Qiyong [Shengjing Hospital of China Medical University, Department of Radiology, Shenyang (China)
2015-05-01
Arterial input function (AIF) plays an important role in the quantification of cerebral hemodynamics. The purpose of this study was to select the best reproducible clustering method for AIF detection by comparing three algorithms reported previously in terms of detection accuracy and computational complexity. First, three reproducible clustering methods, normalized cut (Ncut), hierarchy (HIER), and fast affine propagation (FastAP), were applied independently to simulated data which contained the true AIF. Next, a clinical verification was performed where 42 subjects participated in dynamic susceptibility contrast MRI (DSC-MRI) scanning. The manual AIF and AIFs based on the different algorithms were obtained. The performance of each algorithm was evaluated based on shape parameters of the estimated AIFs and the true or manual AIF. Moreover, the execution time of each algorithm was recorded to determine the algorithm that operated more rapidly in clinical practice. In terms of the detection accuracy, Ncut and HIER method produced similar AIF detection results, which were closer to the expected AIF and more accurate than those obtained using FastAP method; in terms of the computational efficiency, the Ncut method required the shortest execution time. Ncut clustering appears promising because it facilitates the automatic and robust determination of AIF with high accuracy and efficiency.
Liu, Song; Zhu, Lizhe; Sheong, Fu Kit; Wang, Wei; Huang, Xuhui
2017-01-30
We present an efficient density-based adaptive-resolution clustering method APLoD for analyzing large-scale molecular dynamics (MD) trajectories. APLoD performs the k-nearest-neighbors search to estimate the density of MD conformations in a local fashion, which can group MD conformations in the same high-density region into a cluster. APLoD greatly improves the popular density peaks algorithm by reducing the running time and the memory usage by 2-3 orders of magnitude for systems ranging from alanine dipeptide to a 370-residue Maltose-binding protein. In addition, we demonstrate that APLoD can produce clusters with various sizes that are adaptive to the underlying density (i.e., larger clusters at low-density regions, while smaller clusters at high-density regions), which is a clear advantage over other popular clustering algorithms including k-centers and k-medoids. We anticipate that APLoD can be widely applied to split ultra-large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc.
Recovery Rate of Clustering Algorithms
Li, Fajie; Klette, Reinhard; Wada, T; Huang, F; Lin, S
2009-01-01
This article provides a simple and general way for defining the recovery rate of clustering algorithms using a given family of old clusters for evaluating the performance of the algorithm when calculating a family of new clusters. Under the assumption of dealing with simulated data (i.e., known old
Application of a New Fuzzy Clustering Algorithm in Intrusion Detection
Institute of Scientific and Technical Information of China (English)
(no author listed)
2008-01-01
This paper presents a new Section Set Adaptive FCM algorithm. The algorithm addresses the shortcomings of local optimality, uncertain classification, and the need to fix the number of clusters in advance. It improves on the architecture of the FCM algorithm and enhances the analysis for effective clustering. During clustering, it can adjust the number of clusters dynamically. Finally, it uses the section set method to decrease classification time. Experiments show that the algorithm can improve the dependability of clustering and the correctness of classification.
Data clustering algorithms and applications
Aggarwal, Charu C
2013-01-01
Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains. The book focuses on three primary aspects of data clustering: Methods, describing key techniques commonly used for clustering, such as fea
Nonomura, Yoshihiko
2014-11-01
Nonequilibrium relaxation behaviors in the Ising model on a square lattice based on the Wolff algorithm are totally different from those based on local-update algorithms. In particular, the critical relaxation is described by the stretched-exponential decay. We propose a novel scaling procedure to connect nonequilibrium and equilibrium behaviors continuously, and find that the stretched-exponential scaling region in the Wolff algorithm is as wide as the power-law scaling region in local-update algorithms. We also find that relaxation to the spontaneous magnetization in the ordered phase is characterized by the exponential decay, not the stretched-exponential decay based on local-update algorithms.
Kernel Generalized Noise Clustering Algorithm
Institute of Scientific and Technical Information of China (English)
WU Xiao-hong; ZHOU Jian-jiang
2007-01-01
To deal with the nonlinear separable problem, the generalized noise clustering (GNC) algorithm is extended to a kernel generalized noise clustering (KGNC) model. Different from the fuzzy c-means (FCM) model and the GNC model which are based on Euclidean distance, the presented model is based on kernel-induced distance by using kernel method. By kernel method the input data are nonlinearly and implicitly mapped into a high-dimensional feature space, where the nonlinear pattern appears linear and the GNC algorithm is performed. It is unnecessary to calculate in high-dimensional feature space because the kernel function can do it just in input space. The effectiveness of the proposed algorithm is verified by experiments on three data sets. It is concluded that the KGNC algorithm has better clustering accuracy than FCM and GNC in clustering data sets containing noisy data.
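The statement that the kernel function avoids any calculation in the feature space rests on the identity ||φ(x) − φ(v)||² = K(x,x) − 2K(x,v) + K(v,v). The sketch below evaluates that kernel-induced squared distance directly from input-space vectors. The Gaussian (RBF) kernel, its width `sigma`, and the function names are assumptions for illustration; the abstract does not say which kernel the authors used.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, kernel=rbf_kernel):
    """Squared feature-space distance ||phi(x) - phi(v)||^2.

    Computed only from kernel evaluations in input space:
    K(x, x) - 2 K(x, v) + K(v, v).
    """
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)
```

For the Gaussian kernel K(x,x) = 1, so the expression reduces to 2(1 − K(x,v)), which is bounded above by 2. Bounded distances of this kind are one commonly cited reason kernel-based clustering objectives are less sensitive to outliers than Euclidean ones.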
Intuitionistic fuzzy hierarchical clustering algorithms
Institute of Scientific and Technical Information of China (English)
Xu Zeshui
2009-01-01
Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful to describe vagueness and uncertainty. However, it seems that little attention has been focused on the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, the normalized Hamming distance, the weighted Hamming distance, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally the algorithm and its extended form are applied to the classifications of building materials and enterprises respectively.
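As an illustration of one of the distance measures listed above, the sketch below computes a normalized Hamming distance between two IFSs defined over the same finite universe. It uses one common definition that also includes the hesitancy degree π = 1 − μ − ν; whether the paper uses exactly this variant is an assumption, as are the function names and the representation of an IFS as a list of (μ, ν) pairs.

```python
def hesitancy(mu, nu):
    """Hesitancy degree pi = 1 - mu - nu of a single IFS element."""
    return 1.0 - mu - nu


def normalized_hamming(a, b):
    """Normalized Hamming distance between two IFSs.

    a, b: equal-length lists of (mu, nu) pairs with mu + nu <= 1.
    d = 1/(2n) * sum(|mu_a - mu_b| + |nu_a - nu_b| + |pi_a - pi_b|)
    """
    if len(a) != len(b) or not a:
        raise ValueError("IFSs must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        total += abs(mu_a - mu_b)
        total += abs(nu_a - nu_b)
        total += abs(hesitancy(mu_a, nu_a) - hesitancy(mu_b, nu_b))
    return total / (2.0 * len(a))
```

Under this definition the distance lies in [0, 1]: it is 0 for identical sets and 1 when every element has full membership in one set and full nonmembership in the other. A hierarchical procedure would repeatedly merge the pair of clusters with the smallest such distance.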
Single-cluster dynamics for the random-cluster model
Deng, Youjin; Qian, Xiaofeng; Blöte, Henk W. J.
2009-09-01
We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the $q$-state Potts model to noninteger values $q > 1$. Its results for static quantities are in a satisfactory agreement with those of the existing Swendsen-Wang-Chayes-Machta (SWCM) algorithm, which involves a full-cluster decomposition of random-cluster configurations. We explore the critical dynamics of this algorithm for several two-dimensional Potts and random-cluster models. For integer $q$, the single-cluster algorithm can be reduced to the Wolff algorithm, for which case we find that the autocorrelation functions decay almost purely exponentially, with dynamic exponents $z_{\exp} = 0.07(1)$, $0.521(7)$, and $1.007(9)$ for $q = 2$, 3, and 4, respectively. For noninteger $q$, the dynamical behavior of the single-cluster algorithm appears to be very dissimilar to that of the SWCM algorithm. For large critical systems, the autocorrelation function displays a range of power-law behavior as a function of time. The dynamic exponents are relatively large. We provide an explanation for this peculiar dynamic behavior.
Extended Fuzzy Clustering Algorithms
U. Kaymak (Uzay); M. Setnes
2000-01-01
Fuzzy clustering is a widely applied method for obtaining fuzzy models from data. It has been applied successfully in various fields including finance and marketing. Despite the successful applications, there are a number of issues that must be dealt with in practical applications of fuz
Symplectic algebraic dynamics algorithm
Institute of Scientific and Technical Information of China (English)
2007-01-01
Based on the algebraic dynamics solution of ordinary differential equations and integration of , the symplectic algebraic dynamics algorithm sn is designed, which preserves the local symplectic geometric structure of a Hamiltonian system and possesses the same precision of the naive algebraic dynamics algorithm n. Computer experiments for the 4th order algorithms are made for five test models and the numerical results are compared with the conventional symplectic geometric algorithm, indicating that sn has higher precision, the algorithm-induced phase shift of the conventional symplectic geometric algorithm can be reduced, and the dynamical fidelity can be improved by one order of magnitude.
Research of Web Documents Clustering Based on Dynamic Concept
Institute of Scientific and Technical Information of China (English)
WANG Yun-hua; CHEN Shi-hong
2004-01-01
Conceptual clustering is mainly used to compensate for deficient and incomplete domain knowledge. Based on conceptual clustering technology and aiming at the institutional framework and characteristics of Web theme information, this paper proposes and implements a dynamic conceptual clustering algorithm and a merging algorithm for Web documents, and also analyses the superior performance of the clustering algorithm in efficiency and clustering accuracy.
Genetic Algorithms for Auto-Clustering in KDD
Institute of Scientific and Technical Information of China (English)
(no author listed)
2000-01-01
In solving the clustering problem in the context of knowledge discovery in databases (KDD), the traditional methods, for example the K-means algorithm and its variants, usually require the users to provide the number of clusters in advance based on prior information. Unfortunately, the number of clusters is in general unknown to the users, who usually lack such prior information. Therefore, the clustering calculation becomes tedious trial-and-error work, and the result is often not globally optimal, especially when the number of clusters is large. In this paper, a new dynamic clustering method based on genetic algorithms (GA) is proposed and applied to auto-clustering of data entities in large databases. The algorithm can automatically cluster the data according to their similarities and find the exact number of clusters. Experimental results indicate that the method achieves global optimization through its dynamic clustering logic.
PROPOSED A HETEROGENEOUS CLUSTERING ALGORITHM TO IMPROVE QOS IN WSN
Directory of Open Access Journals (Sweden)
Mehran Mokhtari
2016-07-01
This article presents a LEACH-extended hierarchical 3-level clustered heterogeneous and dynamic algorithm (LEH3LA). By planning the auction-selected cluster head and an alternative cluster head node, the suggested protocol addresses processing delay, member selection, cost and energy consumption, the number of messages sent and received inside clusters, and cluster head selection in large sensor networks. The algorithm uses a hierarchical heterogeneous network (3 levels), collective intelligence, and intra-cluster interaction for communications. It also addresses the problems of sending data in multi-BS mobile networks, expanding inter-cluster networks, overlapping clusters, the genesis of orphan nodes, dynamically changing cluster boundaries, and the use of backbone networks and cloud sensors. A sleep/wake scheduling algorithm or TDMA schedule for the alternative cluster head node provides redundancy and fault tolerance. Local processing in cluster head and alternative cluster head nodes, together with intra-cluster and inter-cluster communication such as multi-hop, increases processing speed and the speed of sending data within and between clusters. Network overhead decreases and load balancing among cluster heads improves. Because cluster head nodes encapsulate data, energy consumption decreases while sending data. Improving quality of service (QoS) in CBRP, LEACH, and 802.15.4, and decreasing energy consumption in sensors, cluster heads, and alternative cluster head nodes, increases the lifetime of sensor networks.
Parallel algorithms and cluster computing
Hoffmann, Karl Heinz
2007-01-01
This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.
Determination of atomic cluster structure with cluster fusion algorithm
DEFF Research Database (Denmark)
Obolensky, Oleg I.; Solov'yov, Ilia; Solov'yov, Andrey V.
2005-01-01
We report an efficient scheme of global optimization, called cluster fusion algorithm, which has proved its reliability and high efficiency in determination of the structure of various atomic clusters.
Particle identification using clustering algorithms
Wirth, R; Löher, B; Savran, D; Silva, J; Pol, H Álvarez; Gil, D Cortina; Pietras, B; Bloch, T; Kröll, T; Nácher, E; Perea, Á; Tengblad, O; Bendel, M; Dierigl, M; Gernhäuser, R; Bleis, T Le; Winkel, M
2013-01-01
A method that uses fuzzy clustering algorithms to achieve particle identification based on pulse shape analysis is presented. The fuzzy c-means clustering algorithm is used to compute mean (principal) pulse shapes induced by different particle species in an automatic and unsupervised fashion from a mixed set of data. A discrimination amplitude is proposed using these principal pulse shapes to identify the originating particle species of a detector pulse. Since this method does not make any assumptions about the specific features of the pulse shapes, it is very generic and suitable for multiple types of detectors. The method is applied to discriminate between photon- and proton-induced signals in CsI(Tl) scintillator detectors and the results are compared to the well-known integration method.
Parallel FFT Algorithm on Computer Clusters
Institute of Scientific and Technical Information of China (English)
None
2005-01-01
DFT is widely applied in signal processing and other fields. Most current fast computation methods are either based on parallel computers connected by particular systems such as butterfly networks and hypercubes, or based on the assumption of instant transmission, conflict-free communication, complete connection of parallel processors, and an unlimited number of usable processors. However, the communication delay in the information transmission system cannot be ignored. This paper works on the following aspects: instant transmission, dispatching of tasks, and the path of information through the communication link in computer cluster systems; and the layout of the dynamic FFT algorithm under different structures of computer clusters.
An Improved Weighted Clustering Algorithm in MANET
Institute of Scientific and Technical Information of China (English)
WANG Jin; XU Li; ZHENG Bao-yu
2004-01-01
The original clustering algorithms in Mobile Ad hoc Networks (MANET) are first analyzed in this paper. Based on this analysis, an Improved Weighted Clustering Algorithm (IWCA) is proposed. Then, the principle and steps of our algorithm are explained in detail, and a comparison is made between the original algorithms and our improved method in terms of average cluster number, topology stability, clusterhead load balance and network lifetime. The experimental results show that our improved algorithm has the best performance on average.
Kernel method-based fuzzy clustering algorithm
Institute of Scientific and Technical Information of China (English)
Wu Zhongdong; Gao Xinbo; Xie Weixin; Yu Jianping
2005-01-01
The fuzzy C-means clustering algorithm (FCM) is extended to the fuzzy kernel C-means clustering algorithm (FKCM) to effectively perform cluster analysis on diverse structures, such as non-hyperspherical data, data with noise, data with a mixture of heterogeneous cluster prototypes, asymmetric data, etc. Based on the Mercer kernel, the FKCM clustering algorithm is derived from the FCM algorithm combined with the kernel method. The results of experiments with synthetic and real data show that the FKCM clustering algorithm is universal and, in contrast to the FCM algorithm, can effectively analyze datasets with varied structures in an unsupervised manner. It can be expected that kernel-based clustering is one of the important research directions of fuzzy clustering analysis.
A new cluster algorithm for graphs
Dongen, S. van
1998-01-01
A new cluster algorithm for graphs called the Markov Cluster algorithm (MCL algorithm) is introduced. The graphs may be both weighted (with nonnegative weight) and directed. Let G be such a graph. The MCL algorithm simulates flow in G by first identifying G in a canonical way with
Dynamic Layered Clustering Routing Algorithm in Underwater Sensor Networks
Institute of Scientific and Technical Information of China (English)
洪昌建; 吴伟杰; 唐平鹏
2015-01-01
To deal with the limitation that flat routing can hardly adapt to large-scale Underwater Sensor Networks (USN), a new clustering routing algorithm, Dynamic Layered Clustering Routing (DLCR), is proposed, which is better suited to larger-scale networks. The algorithm divides the network into several layers from top to bottom and selects the nodes within each layer that have more remaining energy and a shorter distance to the sink as cluster head nodes, thus reducing the communication energy consumption of the cluster heads. To avoid the same nodes being elected as cluster head nodes continuously, a dynamic layering mechanism is proposed in which the network is re-divided into layers in each round of data gathering. Experiments show that DLCR not only has better stability, but also reduces energy consumption and prolongs the lifetime of the whole network.
Introduction to cluster dynamics
Reinhard, Paul-Gerhard
2008-01-01
Clusters as mesoscopic particles represent an intermediate state of matter between single atoms and solid material. The tendency to miniaturise technical objects requires knowledge about systems which contain only a "small" number of atoms or molecules. This is all the more true for dynamical aspects, particularly in relation to the quick development of laser technology and femtosecond spectroscopy. Here, for the first time, is a highly qualitative introduction to cluster physics. With its emphasis on cluster dynamics, this will be vital to everyone involved in this interdisciplinary subje
Frequent Pattern Mining Algorithms for Data Clustering
DEFF Research Database (Denmark)
Zimek, Arthur; Assent, Ira; Vreeken, Jilles
2014-01-01
Discovering clusters in subspaces, or subspace clustering and related clustering paradigms, is a research field where we find many frequent pattern mining related influences. In fact, as the first algorithms for subspace clustering were based on frequent pattern mining algorithms, it is fair to say that frequent pattern mining was at the cradle of subspace clustering, yet it quickly developed into an independent research field. In this chapter, we discuss how frequent pattern mining algorithms have been extended and generalized towards the discovery of local clusters in high-dimensional data. In particular, we discuss several example algorithms for subspace clustering or projected clustering as well as point out recent research questions and open topics in this area relevant to researchers in either clustering or pattern mining.
Introduction to Cluster Monte Carlo Algorithms
Luijten, E.
This chapter provides an introduction to cluster Monte Carlo algorithms for classical statistical-mechanical systems. A brief review of the conventional Metropolis algorithm is given, followed by a detailed discussion of the lattice cluster algorithm developed by Swendsen and Wang and the single-cluster variant introduced by Wolff. For continuum systems, the geometric cluster algorithm of Dress and Krauth is described. It is shown how their geometric approach can be generalized to incorporate particle interactions beyond hardcore repulsions, thus forging a connection between the lattice and continuum approaches. Several illustrative examples are discussed.
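As an illustration of the single-cluster variant mentioned in this abstract, the following is a minimal sketch of one Wolff update for the 2D Ising model. It is not taken from the chapter: the square periodic lattice, coupling J = 1, the flat-list spin layout, and the function name `wolff_update` are all assumptions made for the example.

```python
import math
import random


def wolff_update(spins, L, beta, rng):
    """One Wolff single-cluster update on an L x L periodic Ising lattice.

    spins is a flat list of +1/-1 values (site index = x + y * L) and is
    modified in place. Assumes coupling J = 1 and L >= 3.
    Returns the number of spins in the flipped cluster.
    """
    # Probability of adding an aligned neighbour to the cluster.
    p_add = 1.0 - math.exp(-2.0 * beta)

    seed = rng.randrange(L * L)
    seed_spin = spins[seed]
    # Flip on visit: a flipped site no longer matches seed_spin,
    # so it cannot be added to the cluster a second time.
    spins[seed] = -seed_spin
    stack = [seed]
    size = 1

    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        neighbours = (
            ((x + 1) % L) + y * L,
            ((x - 1) % L) + y * L,
            x + ((y + 1) % L) * L,
            x + ((y - 1) % L) * L,
        )
        for nb in neighbours:
            if spins[nb] == seed_spin and rng.random() < p_add:
                spins[nb] = -seed_spin
                stack.append(nb)
                size += 1
    return size
```

Passing an explicit `random.Random` instance keeps runs reproducible. At very low temperature (large `beta`) the cluster spans the whole lattice, while at `beta = 0` only the seed spin flips.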
A Novel Research on Rough Clustering Algorithm
Directory of Open Access Journals (Sweden)
Tao Qu
2014-01-01
This study addresses the problem that traditional clustering algorithms are influenced by the spatial distribution of the data; a novel clustering algorithm combined with rough set theory is applied to conventional clustering. The proposed rough clustering algorithm takes the condition attributes and decision attributes displayed in the information table as the consistency principle, and uses the data supercube and information entropy to perform attribute reduction and discretization. On this basis, by applying the additivity principle of assembled feature vectors, clustering of the data can be achieved with only one scan of the information table. Experiments reveal that the proposed algorithm is efficient and feasible.
Hesitant fuzzy agglomerative hierarchical clustering algorithms
Zhang, Xiaolu; Xu, Zeshui
2015-02-01
Recently, hesitant fuzzy sets (HFSs) have been studied by many researchers as a powerful tool to describe and deal with uncertain data, but relatively few studies focus on the clustering analysis of HFSs. In this paper, we propose a novel hesitant fuzzy agglomerative hierarchical clustering algorithm for HFSs. The algorithm considers each of the given HFSs as a unique cluster in the first stage, and then compares each pair of the HFSs by utilising the weighted Hamming distance or the weighted Euclidean distance. The two clusters with the smallest distance are merged. The procedure is then repeated until the desired number of clusters is achieved. Moreover, we extend the algorithm to cluster interval-valued hesitant fuzzy sets, and finally illustrate the effectiveness of our clustering algorithms by experimental results.
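The merge loop described in this abstract can be sketched roughly as follows. This is an illustrative sketch rather than the authors' implementation: it assumes each HFS is given as one list of membership values per attribute, that corresponding lists already have equal length, and that clusters are compared by average linkage (the abstract does not say how merged clusters are represented). The names `hfs_distance` and `agglomerate` are hypothetical.

```python
def hfs_distance(a, b, weights):
    """Weighted Hamming distance between two hesitant fuzzy sets.

    a, b: one list of membership values per attribute; corresponding
    lists are assumed to have the same length. weights: one per attribute.
    """
    total = 0.0
    for ha, hb, w in zip(a, b, weights):
        ha, hb = sorted(ha), sorted(hb)  # compare values in rank order
        total += w * sum(abs(x - y) for x, y in zip(ha, hb)) / len(ha)
    return total


def agglomerate(hfss, weights, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns clusters as lists of indices into hfss.
    """
    clusters = [[i] for i in range(len(hfss))]  # each HFS starts alone

    def linkage(c1, c2):
        # Average pairwise distance (an assumption, not from the abstract).
        pair_sum = sum(hfs_distance(hfss[i], hfss[j], weights)
                       for i in c1 for j in c2)
        return pair_sum / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

Swapping the absolute difference for a squared difference (with a square root on the result) would give a Euclidean-style variant of the same loop.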
A functional clustering algorithm for the analysis of neural relationships
Feldt, S; Hetrick, V L; Berke, J D; Zochowski, M
2008-01-01
We formulate a novel technique for the detection of functional clusters in neural data. In contrast to prior network clustering algorithms, our procedure progressively combines spike trains and derives the optimal clustering cutoff in a simple and intuitive manner. To demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. We observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.
Constructing Product Ontologies with an Improved Conceptual Clustering Algorithm
Institute of Scientific and Technical Information of China (English)
曹大军; 徐良贤
2002-01-01
In a distributed eMarketplace, recommended product ontologies are required for trading between buyers and sellers. Conceptual clustering can be employed to build dynamic recommended product ontologies. Traditional methods of conceptual clustering (e.g. COBWEB or Cluster/2) do not take heterogeneous attributes of a concept into account. Moreover, the result of these methods is clusters rather than recommended concepts. A center recommendation clustering algorithm is provided. According to the values of heterogeneous attributes, recommended product names can be selected from the clusters produced by this algorithm. The algorithm can also create hierarchical relations between product names. The definitions of product names given by all participants are collected in a distributed eMarketplace. Recommended product ontologies are built. These ontologies include relations and definitions of product names, which come from different participants in the distributed eMarketplace. Finally, a case is given to illustrate this method. The result shows that this method is feasible.
Intuitionistic Fuzzy Possibilistic C Means Clustering Algorithms
Directory of Open Access Journals (Sweden)
Arindam Chaudhuri
2015-01-01
Intuitionistic fuzzy sets (IFSs) provide a mathematical framework based on fuzzy sets to describe vagueness in data. They find interesting and promising applications in different domains. Here, we develop an intuitionistic fuzzy possibilistic C means (IFPCM) algorithm to cluster IFSs by hybridizing concepts of FPCM, IFSs, and distance measures. IFPCM resolves inherent problems encountered with information regarding membership values of objects to each cluster by generalizing membership and nonmembership with hesitancy degree. The algorithm is extended for clustering interval valued intuitionistic fuzzy sets (IVIFSs), leading to interval valued intuitionistic fuzzy possibilistic C means (IVIFPCM). The clustering algorithm has membership and nonmembership degrees as intervals. Information regarding membership and typicality degrees of samples to all clusters is given by the algorithm. The experiments are performed on both real and simulated datasets. The algorithm generates valuable information and produces overlapped clusters with different membership degrees. It takes into account inherent uncertainty in information captured by IFSs. Some advantages of the algorithms are simplicity, flexibility, and low computational complexity. The algorithm is evaluated through cluster validity measures. The clustering accuracy of the algorithm is investigated on classification datasets with labeled patterns. The algorithm maintains appreciable performance compared to other methods in terms of pureness ratio.
Relevance of Dynamic Clustering to Biological Networks
Kaneko, K
1993-01-01
Networks of nonlinear dynamical elements often show clustering of synchronization by chaotic instability. Relevance of the clustering to ecological, immune, neural, and cellular networks is discussed, with the emphasis on partially ordered states with chaotic itinerancy. First, clustering with bit structures in a hypercubic lattice is studied. Spontaneous formation and destruction of relevant bits are found, which give self-organizing and chaotic genetic algorithms. When spontaneous changes of effective couplings are introduced, chaotic itinerancy of clusterings is widely seen through a feedback mechanism, which supports dynamic stability allowing for complexity and diversity, known as homeochaos. Second, synaptic dynamics of couplings is studied in relation to neural dynamics. The clustering structure is formed with a balance between external inputs and internal dynamics. Last, an extension allowing for the growth of the number of elements is given, in connection with cell differentiation. Effecti...
Algorithm for Spatial Clustering with Obstacles
El-Sharkawi, Mohamed E
2009-01-01
In this paper, we propose an efficient clustering technique to solve the problem of clustering in the presence of obstacles. The proposed algorithm divides the spatial area into rectangular cells. Each cell is associated with statistical information that enables us to label the cell as dense or non-dense. We also label each cell as obstructed (i.e. intersects any obstacle) or non-obstructed. Then the algorithm finds the regions (clusters) of connected, dense, non-obstructed cells. Finally, the algorithm finds a center for each such region and returns those centers as centers of the relatively dense regions (clusters) in the spatial area.
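The cell-labelling procedure outlined in this abstract can be sketched as below. The sketch makes several assumptions that are not in the abstract: obstacles are axis-aligned rectangles, a cell is "dense" when it holds at least `min_points` points, cells are connected through shared edges (4-connectivity), and each region's center is the centroid of its points. The function name `grid_cluster` is hypothetical.

```python
from collections import deque


def grid_cluster(points, obstacles, cell_size, min_points):
    """Grid-based clustering that ignores cells touching an obstacle.

    points: iterable of (x, y).
    obstacles: list of rectangles (xmin, ymin, xmax, ymax).
    Returns one (x, y) center per connected region of usable cells.
    """
    # Bin every point into its grid cell.
    cells = {}
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append((x, y))

    def obstructed(key):
        # A cell is obstructed if its square overlaps any obstacle rectangle.
        cx0, cy0 = key[0] * cell_size, key[1] * cell_size
        cx1, cy1 = cx0 + cell_size, cy0 + cell_size
        return any(ox0 < cx1 and cx0 < ox1 and oy0 < cy1 and cy0 < oy1
                   for ox0, oy0, ox1, oy1 in obstacles)

    # Keep only dense, non-obstructed cells.
    usable = {k for k, pts in cells.items()
              if len(pts) >= min_points and not obstructed(k)}

    centers = []
    seen = set()
    for start in usable:
        if start in seen:
            continue
        # Breadth-first search over edge-adjacent usable cells.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in usable and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        n = len(members)
        centers.append((sum(p[0] for p in members) / n,
                        sum(p[1] for p in members) / n))
    return centers
```

With no obstacles, a row of adjacent dense cells forms a single region; placing an obstacle across the middle cell splits it into two regions with separate centers.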
A new fusion algorithm for fuzzy clustering
Directory of Open Access Journals (Sweden)
Ivan Vidović
2014-12-01
In this paper, we have considered the merging problem of two ellipsoidal clusters in order to construct a new fusion algorithm for fuzzy clustering. We have proposed a criterion for merging two ellipsoidal clusters ∏_1, ∏_2 with associated main Mahalanobis circles E_j(c_j, σ_j), where c_j is the centroid and σ_j^2 is the Mahalanobis variance of cluster ∏_j. Based on the well-known Davies-Bouldin index, we have constructed a new fusion algorithm. The criterion has been tested on several data sets, and the performance of the fusion algorithm has been demonstrated on an illustrative example.
Novel Cluster Validity Index for FCM Algorithm
Institute of Scientific and Technical Information of China (English)
Jian Yu; Cui-Xia Li
2006-01-01
How to determine an appropriate number of clusters is very important when implementing a specific clustering algorithm, like c-means, fuzzy c-means (FCM). In the literature, most cluster validity indices are originated from partition or geometrical property of the data set. In this paper, the authors developed a novel cluster validity index for FCM, based on the optimality test of FCM. Unlike the previous cluster validity indices, this novel cluster validity index is inherent in FCM itself. Comparison experiments show that the stability index can be used as cluster validity index for the fuzzy c-means.
An object-oriented cluster search algorithm
Energy Technology Data Exchange (ETDEWEB)
Silin, Dmitry; Patzek, Tad
2003-01-24
In this work we describe two object-oriented cluster search algorithms, which can be applied to a network of an arbitrary structure. The first algorithm calculates all connected clusters, whereas the second one finds a path with the minimal number of connections. We estimate the complexity of the algorithms and infer that the number of operations grows linearly with the size of the network.
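The two tasks named in this abstract, finding all connected clusters and finding a path with the fewest connections, can both be handled with breadth-first search, which is linear in the number of nodes plus links. The sketch below uses that standard technique; it is an assumption that it resembles the report's algorithms, and the class `Network` and its method names are hypothetical.

```python
from collections import deque


class Network:
    """Undirected network of arbitrary structure, stored as adjacency sets."""

    def __init__(self):
        self.adj = {}

    def add_link(self, a, b):
        self.adj.setdefault(a, set()).add(b)
        self.adj.setdefault(b, set()).add(a)

    def clusters(self):
        """Return every connected cluster as a list of nodes."""
        seen = set()
        result = []
        for start in self.adj:
            if start in seen:
                continue
            seen.add(start)
            queue = deque([start])
            cluster = []
            while queue:
                node = queue.popleft()
                cluster.append(node)
                for nb in self.adj[node]:
                    if nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            result.append(cluster)
        return result

    def shortest_path(self, src, dst):
        """Return a path with the fewest links, or None if unreachable."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                # Walk the parent pointers back to the source.
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nb in self.adj.get(node, ()):
                if nb not in parent:
                    parent[nb] = node
                    queue.append(nb)
        return None
```

Each node and link is visited a bounded number of times in both methods, which is consistent with the linear growth the abstract reports.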
Mean-field behavior of cluster dynamics
Persky, N.; Ben-Av, R.; Kanter, I.; Domany, E.
1996-09-01
The dynamic behavior of cluster algorithms is analyzed in the classical mean-field limit. Rigorous analytical results below Tc establish that the dynamic exponent has the value zSW=1 for the Swendsen-Wang algorithm and zW=0 for the Wolff algorithm. An efficient Monte Carlo implementation is introduced, adapted for using these algorithms for fully connected graphs. Extensive simulations both above and below Tc demonstrate scaling and evaluate the finite-size scaling function by means of a rather impressive collapse of the data.
An extended EM algorithm for subspace clustering
Institute of Scientific and Technical Information of China (English)
Lifei CHEN; Qingshan JIANG
2008-01-01
Clustering high dimensional data has become a challenge in data mining due to the curse of dimensionality. To solve this problem, subspace clustering has been defined as an extension of traditional clustering that seeks to find clusters in subspaces spanned by different combinations of dimensions within a dataset. This paper presents a new subspace clustering algorithm that calculates the local feature weights automatically in an EM-based clustering process. In the algorithm, the features are locally weighted by using a new unsupervised weighting method, as a means to minimize a proposed clustering criterion that takes into account both the average intra-cluster compactness and the average inter-cluster separation for subspace clustering. For the purposes of capturing accurate subspace information, an additional outlier detection process is presented to identify the possible local outliers of subspace clusters, and is embedded between the E-step and M-step of the algorithm. The method has been evaluated in clustering real-world gene expression data and high dimensional artificial data with outliers, and the experimental results have shown its effectiveness.
Data clustering theory, algorithms, and applications
Gan, Guojun; Wu, Jianhong
2007-01-01
Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results. The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB® programming languages.
Load Balancing Algorithm for Cache Cluster
Institute of Scientific and Technical Information of China (English)
刘美华; 古志民; 曹元大
2003-01-01
Based on the definition of cluster load, the request is taken as the granularity for computing load and implementing load balancing in a cache cluster. First, the processing power of a cache node is studied from four aspects: network bandwidth, memory capacity, disk access rate and CPU usage. Then, the weighted load of a cache node is defined. Based on this, a load-balancing algorithm that can be applied to the cache cluster is proposed. Finally, Polygraph is used as a benchmarking tool to test the cache cluster with the load-balancing algorithm and the cache cluster with the cache array routing protocol, respectively. The results show that the load-balancing algorithm can improve the performance of the cache cluster.
Semantic Based Cluster Content Discovery in Description First Clustering Algorithm
Directory of Open Access Journals (Sweden)
MUHAMMAD WASEEM KHAN
2017-01-01
In the field of data analytics, grouping of similar documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. Description first clustering algorithms belong to the category in which the meaningful description is deduced first and then relevant documents are assigned to that description. LINGO (Label Induction Grouping Algorithm) is an algorithm of the description first clustering category which is used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for induction of meaningful labels for clusters and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare results obtained from the said algorithm when it uses VSM and latent semantic analysis during the cluster content discovery phase.
Efficient Cluster Algorithm for CP(N-1) Models
Beard, B B; Riederer, S; Wiese, U J
2006-01-01
Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z = 0.
Efficient cluster algorithm for CP(N-1) models
Beard, B. B.; Pepe, M.; Riederer, S.; Wiese, U.-J.
2006-11-01
Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z=0.
Self-organization and clustering algorithms
Bezdek, James C.
1991-01-01
Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.
Non-convex polygons clustering algorithm
Directory of Open Access Journals (Sweden)
Kruglikov Alexey
2016-01-01
A clustering algorithm is proposed, to be used as a preliminary step in motion planning. It is tightly coupled to the applied problem statement, i.e. uses parameters meaningful only with respect to it. Use of geometrical properties for polygons clustering allows for a better calculation time as opposed to general-purpose algorithms. A special form of map optimized for quick motion planning is constructed as a result.
The Georgi Algorithms of Jet Clustering
Ge, Shao-Feng
2014-01-01
We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses and the jet function is generalized to $J^{(n)}_\beta$ with a jet function index $n$ in order to achieve more degrees of freedom. Based on three basic requirements that, the result of jet clustering is p...
Spatial cluster detection using dynamic programming
Directory of Open Access Journals (Sweden)
Sverchkov Yuriy
2012-03-01
Background: The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. Methods: We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP) estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of Influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. Results: When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on par with baseline methods in the task of Bayesian model averaging. Conclusions: We conclude that the dynamic
Optimal Hops-Based Adaptive Clustering Algorithm
Xuan, Xin; Chen, Jian; Zhen, Shanshan; Kuo, Yonghong
This paper proposes an optimal hops-based adaptive clustering algorithm (OHACA). The algorithm sets an energy selection threshold before the cluster forms so that the nodes with less energy are more likely to go to sleep immediately. In the setup phase, OHACA introduces an adaptive mechanism to adjust the cluster head and balance load. The optimal distance theory is applied to discover the practical optimal routing path that minimizes the total energy for transmission. Simulation results show that, because of energy balance, OHACA prolongs the life of the network, improves the utilization rate and transmits more data.
Issues Challenges and Tools of Clustering Algorithms
Directory of Open Access Journals (Sweden)
Parul Agarwal
2011-05-01
Clustering is an unsupervised technique of Data Mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure. This paper captures the problems that are faced in practice when clustering algorithms are implemented. It also considers the most extensively used tools, which are readily available and provide supporting functions that ease the programming. Once algorithms have been implemented, they also need to be tested for validity. Several validation indexes exist for testing performance and accuracy, and these are also discussed here.
Blockspin Cluster Algorithms for Quantum Spin Systems
Wiese, U J
1992-01-01
Cluster algorithms are developed for simulating quantum spin systems like the one- and two-dimensional Heisenberg ferro- and anti-ferromagnets. The corresponding two- and three-dimensional classical spin models with four-spin couplings are mapped to blockspin models with two-blockspin interactions. Clusters of blockspins are updated collectively. The efficiency of the method is investigated in detail for one-dimensional spin chains. In most cases the new algorithms solve the slowing-down problems from which standard algorithms suffer.
A New Clustering Algorithm for Face Classification
Directory of Open Access Journals (Sweden)
Shaker K. Ali
2016-06-01
In this paper, we propose a new clustering algorithm that builds on ideas from other clustering algorithms. The proposed algorithm computes a distance matrix, then excludes the matrix points that will be clustered by saving the location (row, column) of these points, determines the minimum distance for the points that will belong to the group (class), and keeps the other points that are not yet clustered. The proposed algorithm is applied to a database of human face images captured under different conditions (direction, angles, etc.). The data were collected from different sources (the ORL site and real images collected from a random sample of the Thi_Qar city population in Iraq). The algorithm was implemented with three types of distance for calculating the minimum distance between points (Euclidean, correlation, and Minkowski distance). The efficiency of the proposed algorithm varied with the database and threshold, and exceeded 96%. Matlab (2014) was used in this work.
A Survey of Grid Based Clustering Algorithms
Directory of Open Access Journals (Sweden)
MR ILANGO
2010-08-01
Cluster analysis, an automatic process for finding similar objects in a database, is a fundamental operation in data mining. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. Clustering techniques have been discussed extensively in similarity search, segmentation, statistics, machine learning, trend analysis, pattern recognition, and classification [1]. Clustering methods can be classified into (i) partitioning methods, (ii) hierarchical methods, (iii) density-based methods, (iv) grid-based methods, and (v) model-based methods. Grid-based methods quantize the object space into a finite number of cells (hyper-rectangles) and then perform the required operations on the quantized space. The main advantage of grid-based methods is their fast processing time, which depends on the number of cells in each dimension of the quantized space. In this paper, we survey some of the grid-based methods, such as CLIQUE (CLustering In QUEst) [2], STING (STatistical INformation Grid) [3], MAFIA (Merging of Adaptive Intervals Approach to Spatial Data Mining) [4], WaveCluster [5], and O-CLUSTER (Orthogonal partitioning CLUSTERing) [6], and compare their effectiveness in clustering data objects. We also present some of the latest developments in grid-based methods, such as the Axis Shifted Grid Clustering Algorithm [7] and Adaptive Mesh Refinement (Wei-Keng Liao et al.) [8], which improve the processing time of objects.
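The quantize-then-operate idea behind grid-based methods can be illustrated with a minimal Python sketch. This is not CLIQUE, STING, or any other surveyed algorithm: it only bins 2-D points into square cells, keeps cells holding at least a threshold number of points, and merges adjacent dense cells. The function name `grid_cluster` and the parameters `cell_size` and `min_points` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict, deque


def grid_cluster(points, cell_size, min_points):
    """Toy grid-based clustering of 2-D points (illustrative assumption only)."""
    # Quantize each point to the integer coordinates of its grid cell.
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # Keep only the cells that hold enough points to count as dense.
    dense = {c for c, members in cells.items() if len(members) >= min_points}

    # Merge dense cells that touch (8-neighbourhood) with a breadth-first search.
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            cx, cy = queue.popleft()
            component.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(component)
    return clusters
```

After the single binning pass over the points, all remaining work is done on cells rather than points, which is why the cost of this kind of method is governed mainly by the number of occupied cells.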
Cluster hybrid Monte Carlo simulation algorithms
Plascak, J. A.; Ferrenberg, Alan M.; Landau, D. P.
2002-06-01
We show that addition of Metropolis single spin flips to the Wolff cluster-flipping Monte Carlo procedure leads to a dramatic increase in performance for the spin-1/2 Ising model. We also show that adding Wolff cluster flipping to the Metropolis or heat bath algorithms in systems where just cluster flipping is not immediately obvious (such as the spin-3/2 Ising model) can substantially reduce the statistical errors of the simulations. A further advantage of these methods is that systematic errors introduced by the use of imperfect random-number generation may be largely healed by hybridizing single spin flips with cluster flipping.
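A hybrid update of the kind described above can be sketched for the two-dimensional spin-1/2 Ising model as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: coupling J = 1, periodic boundaries, and one Metropolis sweep per Wolff cluster flip (the abstract does not give the mixing ratio). The function names `neighbours`, `wolff_step`, `metropolis_sweep`, and `hybrid_update` are hypothetical.

```python
import math
import random


def neighbours(x, y, L):
    """Four nearest neighbours on an L x L lattice with periodic boundaries."""
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]


def wolff_step(spins, L, beta, rng):
    """Grow and flip one Wolff cluster; return the cluster size."""
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond probability for J = 1
    x0, y0 = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[x0][y0]
    spins[x0][y0] = -seed_spin  # flipping on visit also marks the site as done
    stack, size = [(x0, y0)], 1
    while stack:
        x, y = stack.pop()
        for nx, ny in neighbours(x, y, L):
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size


def metropolis_sweep(spins, L, beta, rng):
    """Attempt L*L single-spin flips with the Metropolis acceptance rule."""
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        nb_sum = sum(spins[a][b] for a, b in neighbours(x, y, L))
        delta_e = 2.0 * spins[x][y] * nb_sum  # energy change if this spin flips
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[x][y] = -spins[x][y]


def hybrid_update(spins, L, beta, rng):
    """One Wolff cluster flip followed by one Metropolis sweep (assumed ratio)."""
    size = wolff_step(spins, L, beta, rng)
    metropolis_sweep(spins, L, beta, rng)
    return size
```

A run would start from a lattice such as `spins = [[1] * L for _ in range(L)]` and call `hybrid_update` repeatedly with a seeded `random.Random` instance.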
Clustered Self Organising Migrating Algorithm for the Quadratic Assignment Problem
Davendra, Donald; Zelinka, Ivan; Senkerik, Roman
2009-08-01
An approach of population dynamics and clustering for permutative problems is presented in this paper. Diversity indicators are created from solution ordering and its mapping is shown as an advantage for population control in metaheuristics. Self Organising Migrating Algorithm (SOMA) is modified using this approach and vetted with the Quadratic Assignment Problem (QAP). Extensive experimentation is conducted on benchmark problems in this area.
Blockspin Scheme and Cluster Algorithm for Quantum Spin Systems
Ying, H P; Ying, He-Ping; Wiese, Uwe-Jens
1992-01-01
We present a numerical study using a cluster algorithm for the 1-d $S=1/2$ quantum Heisenberg models. The dynamical critical exponent for anti-ferromagnetic chains is $z=0.0(1)$ such that critical slowing down is eliminated.
Polyclonal clustering algorithm and its convergence
Institute of Scientific and Technical Information of China (English)
MA Li; JIAO Li-cheng; BAI Lin; CHEN Chang-guo
2008-01-01
Being characteristic of non-teacher learning, self-organization, memory, and noise resistance, the artificial immune system is a research focus in the field of intelligent information processing. Based on the basic principles of organism immune and clonal selection, this article presents a polyclonal clustering algorithm characteristic of self-adaptation. According to the core idea of the algorithm, various immune operators in the artificial immune system are employed in the clustering process; moreover, clustering numbers are adjusted in accordance with the affinity function. Introduction of the recombination operator can effectively enhance the diversity of the individual antibody in a generation population, so that the searching scope for solutions is enlarged and the premature phenomenon of the algorithm is avoided. Besides, introduction of the inconsistent mutation operator enhances the adaptability and optimizes the performance of local solution seeking. Meanwhile, the convergence of the algorithm is accelerated. In addition, the article also proves the convergence of the algorithm by employing the Markov chain. Results of the data simulation experiment show that the algorithm is capable of obtaining reasonable and effective cluster.
Research on optimization of dynamic nearest neighbor clustering algorithm
Institute of Scientific and Technical Information of China (English)
储岳中; 徐波
2011-01-01
To address the sensitivity of the nearest neighbor clustering algorithm to the clustering radius and the difficulty of obtaining the optimal solution, an optimization method based on the Bayesian information criterion (BIC) is proposed. First, the initial data set is preprocessed with the DBSCAN algorithm to remove noise data. Then the clustering radius is adjusted step by step within the parameter space, the nearest neighbor clustering algorithm is applied to the data set, and the BIC value of each clustering result is calculated. Finally, the BIC values of the clustering results are compared, and the clustering with the maximum BIC value is taken as the optimal clustering. Experimental results show that the optimized nearest neighbor clustering algorithm solves the problem of selecting a suitable clustering radius well.
Maximum-entropy clustering algorithm and its global convergence analysis
Institute of Scientific and Technical Information of China (English)
(none listed)
2001-01-01
By constructing a batch of differentiable entropy functions that uniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called the maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
An Adaptive Clustering Algorithm for Intrusion Detection
Institute of Scientific and Technical Information of China (English)
QIU Juli
2007-01-01
In this paper, we introduce an adaptive clustering algorithm for intrusion detection based on WaveCluster, which was introduced by Gholamhosein in 1999 and used with success in image processing. Because of the non-stationary characteristic of network traffic, we extend and develop an adaptive WaveCluster algorithm for intrusion detection. Using the multiresolution property of wavelet transforms, we can effectively identify arbitrarily shaped clusters at different scales and degrees of detail. Moreover, applying the wavelet transform removes noise from the original feature space and yields more accurate clusters. Experimental results on the KDD-99 intrusion detection dataset show the efficiency and accuracy of this algorithm. A detection rate above 96% and a false alarm rate below 3% are achieved.
Efficient Cluster Head Selection Algorithm for MANET
Directory of Open Access Journals (Sweden)
Khalid Hussain
2013-01-01
In mobile ad hoc networks (MANETs), cluster head selection is considered a major challenge. In wireless sensor networks, the LEACH protocol can be used to select the cluster head on the basis of energy, but selection remains an open issue in mobile ad hoc networks, especially when nodes are itinerant. In this paper we propose an efficient cluster head selection algorithm (ECHSA) for mobile ad hoc networks. We evaluate the proposed algorithm through simulation in OMNet++ as well as on a test bed, and the results agree with our assumptions. For further evaluation we also compare the proposed protocol with several other protocols, such as LEACH-C, and the results favor the proposed protocol.
Performance Analysis of Hierarchical Clustering Algorithm
Directory of Open Access Journals (Sweden)
K.Ranjini
2011-07-01
Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait, often proximity according to some defined distance measure. Data clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. This paper explains the implementation of agglomerative and divisive clustering algorithms applied to various types of data. Details of the victims of the 2004 tsunami in Thailand were taken as the test data. Visual programming is used for the implementation, and the running times of the algorithms with different linkages (agglomerative) on different types of data are analyzed.
Parallel Clustering Algorithms for Structured AMR
Energy Technology Data Exchange (ETDEWEB)
Gunney, B T; Wissink, A M; Hysom, D A
2005-10-26
We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.
Dynamical behavior of the Niedermayer algorithm applied to Potts models
Girardi, D.; Penna, T. J. P.; Branco, N. S.
2012-01-01
In this work we make a numerical study of the dynamic universality class of the Niedermayer algorithm applied to the two-dimensional Potts model with 2, 3, and 4 states. This algorithm updates clusters of spins and has a free parameter, $E_0$, which controls the size of these clusters, such that $E_0=1$ is the Metropolis algorithm and $E_0=0$ regains the Wolff algorithm, for the Potts model. For $-1
Analysis of Stemming Algorithm for Text Clustering
Directory of Open Access Journals (Sweden)
N.Sandhya
2011-09-01
Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In the bag-of-words representation of documents, the words that appear in documents often have many morphological variants, and in most cases these variants have similar semantic interpretations and can be considered equivalent for the purpose of clustering applications. For this reason, a number of stemming algorithms, or stemmers, have been developed, which attempt to reduce a word to its stem or root form. Thus, the key terms of a document are represented by stems rather than by the original words. In this work we study the impact of a stemming algorithm, along with four popular similarity measures (Euclidean, cosine, Pearson correlation, and extended Jaccard) in conjunction with different types of vector representation (Boolean, term frequency, and term frequency-inverse document frequency), on cluster quality. To cluster the documents we use the partitional clustering technique K-means. Performance is measured against a human-imposed classification of the Classic data set. We conducted a number of experiments and used the entropy measure to assess the statistical significance of the results. Cosine, Pearson correlation, and extended Jaccard similarities emerge as the best measures for capturing human categorization behavior, while the Euclidean measure performs poorly. After the stemming algorithm is applied, the Euclidean measure shows little improvement.
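The four measures compared in the abstract above have standard textbook definitions, sketched here in plain Python for dense term vectors of equal length. This is an illustration only, not the authors' code; the function names are hypothetical, and the vectors are assumed to be non-zero (and non-constant for the Pearson case) so that no denominator vanishes.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors; ignores vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Cosine similarity of the mean-centred vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


def extended_jaccard(a, b):
    """Extended Jaccard (Tanimoto) coefficient for real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)
```

For two term-frequency vectors that point in the same direction but differ in length, such as `[1, 2, 0]` and `[2, 4, 0]`, the cosine and Pearson measures report a perfect match while the Euclidean distance is non-zero. This length sensitivity is one commonly cited reason why Euclidean distance can behave differently on documents of unequal size.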
High-Performance Broadcasting Algorithms on Cluster
Institute of Scientific and Technical Information of China (English)
舒继武; 魏英霞; 王鼎兴
2004-01-01
In many clusters connected by high-speed communication networks, the exact structure of the underlying communication network and the latency difference between different sending and receiving pairs may be ignored when they broadcast, as in the approach adopted by the broadcasting method in MPICH, a widely used MPI implementation. However, the underlying network cluster topologies are becoming more and more complicated, and the performance of traditional broadcasting algorithms, such as MPICH's MPI_Bcast, is far from good. This paper analyzes the impact of communication latencies and the underlying topologies on the performance of broadcasting algorithms for multilevel clusters. A multilevel model was developed for broadcasting in clusters with complicated topologies, which divides the cluster topology into many levels based on the underlying topology. The multilevel model was used to develop a new broadcast algorithm, MLM broadcast-2 (MLMB-2), that adapts to a wide range of clusters. A comparison with the counterpart MPI operation MPI_Bcast shows that MLMB-2 outperforms MPI_Bcast, decreasing the broadcast running time by 60%-90%.
Cluster Algorithm Special Purpose Processor
Talapov, A. L.; Shchur, L. N.; Andreichenko, V. B.; Dotsenko, Vl. S.
We describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.
Cluster algorithm special purpose processor
Energy Technology Data Exchange (ETDEWEB)
Talapov, A.L.; Shchur, L.N.; Andreichenko, V.B.; Dotsenko, V.S. (Landau Inst. for Theoretical Physics, GSP-1 117940 Moscow V-334 (USSR))
1992-08-10
In this paper, the authors describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.
An Improved Heuristic Ant-Clustering Algorithm
Institute of Scientific and Technical Information of China (English)
Yunfei Chen; Yushu Liu; Jihai Zhao
2004-01-01
An improved heuristic ant-clustering algorithm (HAC) is presented in this paper. A 'memory bank' device is proposed, which provides heuristic knowledge that guides the ants as they move in the two-dimensional grid space. Experiments are carried out on real and synthetic data sets. The results demonstrate that HAC is superior to the classical algorithm in misclassification error rate and runtime.
A Task-parallel Clustering Algorithm for Structured AMR
Energy Technology Data Exchange (ETDEWEB)
Gunney, B N; Wissink, A M
2004-11-02
A new parallel algorithm, based on the Berger-Rigoutsos algorithm for clustering grid points into logically rectangular regions, is presented. The clustering operation is frequently performed in the dynamic gridding steps of structured adaptive mesh refinement (SAMR) calculations. A previous study revealed that although the cost of clustering is generally insignificant for smaller problems run on relatively few processors, the algorithm scaled inefficiently in parallel and its cost grows with problem size. Hence, it can become significant for large scale problems run on very large parallel machines, such as the new BlueGene system (which has O(10^4) processors). We propose a new task-parallel algorithm designed to reduce communication wait times. Performance was assessed using dynamic SAMR re-gridding operations on up to 16K processors of currently available computers at Lawrence Livermore National Laboratory. The new algorithm was shown to be up to an order of magnitude faster than the baseline algorithm and had better scaling trends.
Cluster-based control of nonlinear dynamics
Kaiser, Eurika; Spohn, Andreas; Cattafesta, Louis N; Morzynski, Marek
2016-01-01
The ability to manipulate and control fluid flows is of great importance in many scientific and engineering applications. Here, a cluster-based control framework is proposed to determine optimal control laws with respect to a cost function for unsteady flows. The proposed methodology frames high-dimensional, nonlinear dynamics into low-dimensional, probabilistic, linear dynamics which considerably simplifies the optimal control problem while preserving nonlinear actuation mechanisms. The data-driven approach builds upon a state space discretization using a clustering algorithm which groups kinematically similar flow states into a low number of clusters. The temporal evolution of the probability distribution on this set of clusters is then described by a Markov model. The Markov model can be used as predictor for the ergodic probability distribution for a particular control law. This probability distribution approximates the long-term behavior of the original system on which basis the optimal control law is de...
Dynamical Processes in Globular Clusters
McMillan, Stephen L W
2014-01-01
Globular clusters are among the most congested stellar systems in the Universe. Internal dynamical evolution drives them toward states of high central density, while simultaneously concentrating the most massive stars and binary systems in their cores. As a result, these clusters are expected to be sites of frequent close encounters and physical collisions between stars and binaries, making them efficient factories for the production of interesting and observable astrophysical exotica. I describe some elements of the competition among stellar dynamics, stellar evolution, and other processes that control globular cluster dynamics, with particular emphasis on pathways that may lead to the formation of blue stragglers.
A Fast Algorithm for Support Vector Clustering
Institute of Scientific and Technical Information of China (English)
吕常魁; 姜澄宇; 王宁生
2004-01-01
Support Vector Clustering (SVC) is a kernel-based unsupervised learning clustering method. The main drawback of SVC is its high computational complexity in obtaining the adjacency matrix that describes the connectivity of each pair of points. Based on the proximity graph model [3], the Euclidean distance in Hilbert space is calculated using a Gaussian kernel, which is the right criterion for generating a minimum spanning tree using Kruskal's algorithm. The cost of connectivity estimation is then lowered by checking only the linkages between the edges that construct the main stem of the MST (Minimum Spanning Tree), for which the non-compatibility degree is defined to support edge selection during linkage estimation. This new approach is analyzed experimentally. The results show that the revised algorithm performs better than the proximity graph model, with faster speed, optimized clustering quality, and a strong ability to suppress noise, which makes SVC scalable to large data sets.
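The first two ingredients named above, a feature-space distance derived from a Gaussian kernel and a minimum spanning tree built with Kruskal's algorithm, can be sketched as follows. This covers only those two steps; the main-stem selection and the non-compatibility degree are not reproduced. The kernel form K(x, y) = exp(-q ||x - y||^2), the width parameter `q`, and the function names are assumptions made for this illustration.

```python
import math


def kernel_distance(x, y, q):
    """Distance between two points after the Gaussian-kernel feature map."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    # ||phi(x) - phi(y)||^2 = K(x,x) + K(y,y) - 2 K(x,y) = 2 - 2 exp(-q ||x-y||^2)
    return math.sqrt(2.0 - 2.0 * math.exp(-q * sq))


def kruskal_mst(points, q):
    """Return the MST edges as (weight, i, j) tuples, using union-find."""
    n = len(points)
    # All pairwise edges, sorted by increasing feature-space distance.
    edges = sorted(
        (kernel_distance(points[i], points[j], q), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    mst = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so keep it
            parent[root_i] = root_j
            mst.append((weight, i, j))
    return mst
```

Because the kernel distance saturates at sqrt(2) for far-apart points, long edges in the tree all look alike, while short edges still reflect local structure.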
Fuzzy Rules for Ant Based Clustering Algorithm
Directory of Open Access Journals (Sweden)
Amira Hamdi
2016-01-01
This paper provides a new intelligent technique for the semisupervised data clustering problem that combines the Ant System (AS) algorithm with the fuzzy c-means (FCM) clustering algorithm. Our proposed approach, called the F-ASClass algorithm, is a distributed algorithm inspired by the foraging behavior observed in ant colonies. The ability of ants to find the shortest path forms the basis of our proposed approach. In the first step, several colonies of cooperating entities, called artificial ants, are used to find shortest paths in a complete graph that we call graph-data. The number of colonies used in F-ASClass is equal to the number of clusters in the dataset. In the second step, the partition matrix of the dataset found by the artificial ants is given to the fuzzy c-means technique in order to assign the objects left unclassified in the first step. The proposed approach is tested on artificial and real datasets, and its performance is compared with those of the K-means, K-medoid, and FCM algorithms. The experimental section shows that F-ASClass performs better according to the classification error rate, accuracy, and separation index.
A clustering method of Chinese medicine prescriptions based on modified firefly algorithm.
Yuan, Feng; Liu, Hong; Chen, Shou-Qiang; Xu, Liang
2016-12-01
This paper studies a clustering method for Chinese medicine (CM) medical cases. When processing prescriptions from CM medical cases, the traditional K-means clustering algorithm has shortcomings such as dependence of the results on the choice of initial values and trapping in local optima. Therefore, a new clustering method based on the collaboration of the firefly algorithm and the simulated annealing algorithm was proposed. This algorithm dynamically determines the iterations of the firefly algorithm and the sampling of the simulated annealing algorithm according to fitness changes, and increases the diversity of the swarm by expanding the scope of the sudden jump, thereby effectively avoiding premature convergence. The results of confirmatory experiments on CM medical cases suggested that, compared with traditional K-means clustering algorithms, this method greatly improved individual diversity and the resulting clusters. The computed results have a certain reference value for cluster analysis of CM prescriptions.
Single-cluster dynamics for the random-cluster model
Deng, Y.; Qian, X.; Blöte, H.W.J.
2009-01-01
We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the q-state Potts model to noninteger values q>1. Its results for static quantities are in a satisfactory agreement with those
Comparative study of several Clustering Algorithms
Directory of Open Access Journals (Sweden)
Prof. Neha Soni, Dr. Amit Ganatra
2012-12-01
Cluster analysis is the process of grouping objects, where an object can be physical, like a student, or abstract, such as the behaviour of a customer or the handwriting of a person. Cluster analysis is as old as human life and has its roots in many fields, such as statistics, machine learning, biology, and artificial intelligence. It is a form of unsupervised learning and faces many challenges, such as high dimensionality of the dataset, arbitrary cluster shapes, scalability, input parameters, domain knowledge, and noisy data. A large number of clustering algorithms have been proposed to date to address these challenges. No single algorithm can adequately handle all sorts of requirements, which makes it a great challenge for the user to select among the available algorithms for a specific task. The purpose of this paper is to provide a detailed analytical comparison of some of the very well-known clustering algorithms, which provides guidance for selecting a clustering algorithm for a specific application.
An incremental clustering algorithm based on Mahalanobis distance
Aik, Lim Eng; Choon, Tan Wee
2014-12-01
The classical fuzzy c-means clustering algorithm is insufficient for clustering non-spherical or elliptically distributed datasets. The paper replaces the Euclidean distance of classical fuzzy c-means clustering with the Mahalanobis distance and, for its merits, applies the Mahalanobis distance to incremental learning. A Mahalanobis-distance-based fuzzy incremental clustering learning algorithm is proposed. Experimental results show that the algorithm not only remedies this defect of the fuzzy c-means algorithm but also increases training accuracy.
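The reason for swapping the distance can be seen in a small two-dimensional sketch of the standard Mahalanobis distance, d(x)^2 = (x - m)^T S^{-1} (x - m), where m is a cluster mean and S its covariance matrix. This is only the distance itself, not the incremental fuzzy algorithm of the paper; the function names `covariance_2d` and `mahalanobis_2d` are hypothetical, and S is assumed to be non-singular.

```python
import math


def covariance_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of x from mean under a 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # assumed non-zero (non-singular covariance)
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)
```

For a cluster stretched along the x axis, for example covariance `(4.0, 0.0, 1.0)` about the origin, the points `(2, 0)` and `(0, 2)` are equally far from the centre in Euclidean terms, but their Mahalanobis distances are 1 and 2 respectively, so the point lying along the elongated direction is treated as closer to the cluster.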
CABOSFV algorithm for high dimensional sparse data clustering
Institute of Scientific and Technical Information of China (English)
Sen Wu; Xuedong Gao
2004-01-01
An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for the high dimensional clustering of binary sparse data. This algorithm compresses the data effectively by using a tool 'Sparse Feature Vector', thus reduces the data scale enormously, and can get the clustering result with only one data scan. Both theoretical analysis and empirical tests showed that CABOSFV is of low computational complexity. The algorithm finds clusters in high dimensional large datasets efficiently and handles noise effectively.
First Cluster Algorithm Special Purpose Processor
Talapov, A. L.; Andreichenko, V. B.; Dotsenko, Vl. S.; Shchur, L. N.
We describe the architecture of a special purpose processor built to realize in hardware the Wolff cluster algorithm, which is not hampered by critical slowing down. The processor simulates two-dimensional Ising-like spin systems. With minor changes the same very effective architecture, which can be defined as a Memory Machine, can be used to study phase transitions in a wide range of models in two or three dimensions.
Enhanced Unequal Clustering Algorithm for Wireless Sensor Networks
Talbi, Said; Zaouche, Lotfi
2015-01-01
Clustering is considered a solution for greater energy conservation during communications in wireless sensor networks. Recently, a new clustering algorithm named the Unequal Clustering Algorithm (UCA) was proposed to relieve the cluster heads located around the sink, which are burdened by traffic coming from other nodes that are far from the base station. This paper presents an Enhanced Unequal Clustering Algorithm called EUCA. This solution reduces the control traffic during a clusteri...
Institute of Scientific and Technical Information of China (English)
王宽全; 刘莉; 邬向前; 张大鹏
2003-01-01
Observing the palm is one of the diagnosis methods in Traditional Chinese Medicine and Holographic Medicine. Generally, the shape, color, ridge, and line features of the palm are all important for palm diagnosis. As a first attempt at automated palm diagnosis, color is used, and a new statistical color feature, the moment feature, is defined in this paper. A multi-central dynamic clustering algorithm based on the new feature is proposed to recognize cancerous palm images. When the approach is applied to the images in a palm database that includes all kinds of pathological and healthy palm images, the experimental results indicate that it is effective at recognizing cancerous palm images and superior to the K-means algorithm.
ITS Cluster Finding Algorithm on GPU
Changaival, Boonyarit
2014-01-01
The ITS cluster finding algorithm is one of the data reduction algorithms at ALICE. It needs to run fast because of the large amount of data read out from the detector. A variety of platforms were studied for the system design. My work is to design, implement, and benchmark this algorithm on a GPU platform. The GPU is one of many platforms that promote parallel computing. A high-end GPU can contain over 2000 processing cores, compared with commodity CPUs, which have only four cores. The program is written in C with the CUDA library. The throughput (number of events per second) is used as the performance metric. With the latest implementation, the throughput was increased by a factor of 5.
Epistemic communities and cluster dynamics
DEFF Research Database (Denmark)
Håkanson, Lars
2003-01-01
This paper questions the prevailing notions that firms within industrial clusters have privileged access to `tacit knowledge' that is unavailable - or available only at high cost - to firms located elsewhere, and that such access provides competitive advantages that help to explain the growth...... and development of both firms and regions. It outlines a model of cluster dynamics emphasizing two mutually interdependent processes: the concentration of specialized and complementary epistemic communities, on the one hand, and entrepreneurship and a high rate of new firm formation on the other....
A Cluster Maintenance Algorithm Based on Relative Mobility for Mobile Ad Hoc Network Management
Institute of Scientific and Technical Information of China (English)
SHEN Zhong; CHANG Yilin; ZHANG Xin
2005-01-01
The dynamic topology of mobile ad hoc networks makes network management significantly more challenging than in wireline networks. The traditional Client/Server (Manager/Agent) management paradigm does not work well in such a dynamic environment, while a hierarchical network management architecture based on clustering is more feasible. Although the movement of nodes makes the cluster structure changeable and introduces new challenges for network management, mobility is a relative concept. A node with high relative mobility is more prone to unstable behavior than a node with less relative mobility, so the relative mobility of a node can be used to predict future node behavior. This paper presents cluster availability, which provides a quantitative measurement of cluster stability. Furthermore, a cluster maintenance algorithm based on cluster availability is proposed. The simulation results show that, compared to the Minimum ID clustering algorithm, our algorithm successfully alleviates the influence of node mobility and makes network management more efficient.
Investigation of Melting Dynamics of Hafnium Clusters.
Ng, Wei Chun; Lim, Thong Leng; Yoon, Tiem Leong
2017-03-27
Melting dynamics of hafnium clusters are investigated using a novel approach based on the idea of the chemical similarity index. Ground state configurations of small hafnium clusters are first derived using Basin-Hopping and Genetic Algorithm in the parallel tempering mode, employing the COMB potential in the energy calculator. These assumed ground state structures are verified by using the Low Lying Structures (LLS) method. The melting process is carried out either by using the direct heating method or prolonged simulated annealing. The melting point is identified by a caloric curve. However, it is found that the global similarity index is far superior in locating the premelting and total melting points of hafnium clusters.
Hearing the clusters in a graph: A distributed algorithm
Sahai, Tuhin; Banaszuk, Andrzej
2009-01-01
We propose a novel distributed algorithm to decompose graphs or cluster data. The algorithm recovers the solution obtained from spectral clustering without the need for expensive eigenvalue/eigenvector computations. We demonstrate that by solving the wave equation on the graph, every node can assign itself to a cluster by performing a local fast Fourier transform. We prove the equivalence of our algorithm to spectral clustering, derive convergence rates and demonstrate it on examples.
A High-Order CFS Algorithm for Clustering Big Data
Fanyu Bu; Zhikui Chen; Peng Li; Tong Tang; Ying Zhang
2016-01-01
With the development of the Internet of Everything, including the Internet of Things, the Internet of People, and the Industrial Internet, big data is being generated. Clustering is a widely used technique for big data analytics and mining. However, most current algorithms are not effective at clustering heterogeneous data, which is prevalent in big data. In this paper, we propose a high-order CFS algorithm (HOCFS) to cluster heterogeneous data by combining the CFS clustering algorithm and the dropout deep learn...
Improvement and Parallelism of k-Means Clustering Algorithm
Institute of Scientific and Technical Information of China (English)
TIAN Jinlan; ZHU Lin; ZHANG Suqin; LIU Lu
2005-01-01
The k-means clustering algorithm is one of the most commonly used algorithms for clustering analysis. The traditional k-means algorithm is, however, inefficient while working on large numbers of data sets and improving the algorithm efficiency remains a problem. This paper focuses on the efficiency issues of cluster algorithms. A refined initial cluster centers method is designed to reduce the number of iterative procedures in the algorithm. A parallel k-means algorithm is also studied for the problem of the operation limitation of a single processor machine when given huge data sets. The analytical results demonstrate that these improvements can greatly enhance the efficiency of the k-means algorithm, i.e., allow the grouping of a large number of data sets more accurately and more quickly. The analysis has theoretical and practical importance for work on the improvement and parallelism of cluster algorithms.
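The abstract above does not give the refined initialization itself, so the following is only a minimal sketch of the standard k-means (Lloyd) iteration that it builds on. The function name `kmeans`, its arguments and the returned iteration count are illustrative assumptions, not the authors' method. The point of the sketch is that the starting centers determine both how many iterations are needed and which partition is reached.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration (illustrative sketch, not the paper's refined method).

    points  : list of equal-length numeric tuples
    centers : initial cluster centers, chosen by the caller
    Returns (centers, labels, iterations_used).
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:  # no center moved: converged
            return centers, labels, iteration
        centers = new_centers
    return centers, labels, max_iter
```

Calling `kmeans` on the same four points with two different pairs of starting centers can return two different partitions, which is the sensitivity that initialization schemes try to reduce.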
Innovation, learning and cluster dynamics
B. Nooteboom (Bart)
2004-01-01
This chapter offers a theory and method for the analysis of the dynamics, i.e. the development, of clusters for innovation. It employs an analysis of three types of embedding: institutional embedding, which is often localized, structural embedding (network structure), and relational
Institute of Scientific and Technical Information of China (English)
刘太洪; 赵永雷
2016-01-01
To improve the accuracy of transformer fault diagnosis, this paper proposes a dynamic weighted fuzzy c-means clustering algorithm based on a genetic algorithm. The algorithm uses a floating-point encoding in which the cluster centers serve as chromosomes; chromosome length is variable, with different lengths corresponding to different numbers of fault clusters. Weights are used to distinguish the degree to which different sample points influence the fault partition. The algorithm is applied to dissolved gas analysis (DGA) data from power transformer oil to diagnose transformer faults. Analysis of a large number of examples, and comparison of the results with other algorithms, show that the algorithm achieves high diagnostic precision.
Parallelization of Edge Detection Algorithm using MPI on Beowulf Cluster
Haron, Nazleeni; Amir, Ruzaini; Aziz, Izzatdin A.; Jung, Low Tan; Shukri, Siti Rohkmah
In this paper, we present the design of a parallel Sobel edge detection algorithm using Foster's methodology. The parallel algorithm is implemented using the MPI message passing library and a master/slave algorithm. Every processor performs the same sequential algorithm but on a different part of the image. Experimental results obtained on a Beowulf cluster are presented to demonstrate the performance of the parallel algorithm.
A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
Institute of Scientific and Technical Information of China (English)
Ohn Mar San; Van-Nam Huynh; Yoshiteru Nakamori
2003-01-01
Most of the earlier work on clustering focused on numeric data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve datasets that consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm which is based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is a method to update the "cluster centers" of clustering objects described by mixed numeric and categorical attributes in the clustering process to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely the credit approval and abalone databases.
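The abstract does not state the similarity measure or the center update rule, so the sketch below substitutes a common k-prototypes-style choice as an assumption: squared Euclidean distance on the numeric attributes plus a weighted count of categorical mismatches, with centers updated by the mean (numeric) and the most frequent value (categorical). The names `mixed_distance`, `update_center` and the weight `gamma` are hypothetical.

```python
from collections import Counter

def mixed_distance(a, b, numeric_idx, categorical_idx, gamma=1.0):
    """Dissimilarity between two records with mixed attributes.

    Numeric part: squared Euclidean distance.
    Categorical part: gamma times the number of mismatching attributes.
    (An assumed, k-prototypes-style measure; the paper's own may differ.)
    """
    numeric = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    mismatches = sum(1 for i in categorical_idx if a[i] != b[i])
    return numeric + gamma * mismatches

def update_center(members, numeric_idx, categorical_idx):
    """Cluster center: mean of numeric attributes, mode of categorical ones."""
    center = list(members[0])
    for i in numeric_idx:
        center[i] = sum(m[i] for m in members) / len(members)
    for i in categorical_idx:
        center[i] = Counter(m[i] for m in members).most_common(1)[0][0]
    return tuple(center)
```

With an empty `categorical_idx` the distance reduces to the squared Euclidean distance and the center to the mean, which matches the statement that the method coincides with k-means on purely numeric data.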
Dynamical behavior of the Niedermayer algorithm applied to Potts models
Girardi, D.; Penna, T. J. P.; Branco, N. S.
2012-08-01
In this work, we make a numerical study of the dynamic universality class of the Niedermayer algorithm applied to the two-dimensional Potts model with 2, 3, and 4 states. This algorithm updates clusters of spins and has a free parameter, E0, which controls the size of these clusters, such that E0=1 is the Metropolis algorithm and E0=0 regains the Wolff algorithm, for the Potts model. For -1<E0<0, only clusters of equal spins can be formed: we show that the mean size of the clusters of (possibly) turned spins initially grows with the linear size of the lattice, L, but eventually saturates at a given lattice size L˜, which depends on E0. For L≥L˜, the Niedermayer algorithm is in the same dynamic universality class as the Metropolis one, i.e., they have the same dynamic exponent. For E0>0, spins in different states may be added to the cluster but the dynamic behavior is less efficient than for the Wolff algorithm (E0=0). Therefore, our results show that the Wolff algorithm is the best choice for Potts models, when compared to Niedermayer's generalization.
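For orientation, the sketch below shows one Wolff single-cluster update for the q-state Potts model on an L×L periodic square lattice, i.e. the E0=0 case that the abstract identifies as the best choice. It is not the general Niedermayer algorithm. The function name `wolff_step`, the coupling `J`, and the energy convention (-J for each pair of equal neighbouring spins, giving bond probability 1 - exp(-beta*J)) are assumptions made for illustration.

```python
import math
import random

def wolff_step(spins, L, q, beta, J=1.0, rng=random):
    """One Wolff cluster update for the q-state Potts model (illustrative sketch).

    spins : L x L list of lists with integer states 0..q-1, modified in place
    Assumes energy -J per pair of equal neighbouring spins, so equal
    neighbours join the cluster with probability 1 - exp(-beta * J).
    Returns the number of spins in the flipped cluster.
    """
    p_add = 1.0 - math.exp(-beta * J)
    x, y = rng.randrange(L), rng.randrange(L)
    old = spins[x][y]
    new = rng.choice([s for s in range(q) if s != old])
    spins[x][y] = new            # flipping doubles as the "visited" mark
    stack = [(x, y)]
    size = 1
    while stack:
        cx, cy = stack.pop()
        neighbours = (
            ((cx + 1) % L, cy), ((cx - 1) % L, cy),
            (cx, (cy + 1) % L), (cx, (cy - 1) % L),
        )
        for nx, ny in neighbours:
            # Only spins still in the old state can join the cluster.
            if spins[nx][ny] == old and rng.random() < p_add:
                spins[nx][ny] = new
                stack.append((nx, ny))
                size += 1
    return size
```

The returned cluster size is the quantity whose growth with L the abstract discusses; averaging it over many updates at fixed temperature gives the mean cluster size.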
EFFICIENT ALGORITHM FOR MINING FREQUENT ITEMSETS USING CLUSTERING TECHNIQUES
Directory of Open Access Journals (Sweden)
D.Kerana Hanirex
2011-03-01
Association rules play an important role nowadays. The purchase of one product when another product is purchased represents an association rule. The Apriori algorithm is the basic algorithm for mining association rules. This paper presents an efficient Partition Algorithm for Mining Frequent Itemsets (PAFI) using clustering. This algorithm finds the frequent itemsets by partitioning the database transactions into clusters. Clusters are formed based on similarity measures between the transactions. It then finds the frequent itemsets directly from the transactions in the clusters using an improved Apriori algorithm, which further reduces the number of database scans and hence improves efficiency.
A hybrid monkey search algorithm for clustering analysis.
Chen, Xin; Zhou, Yongquan; Luo, Qifang
2014-01-01
Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis, and experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.
Directory of Open Access Journals (Sweden)
Yachun Pang
2012-01-01
This paper presents a novel two-step approach that incorporates fuzzy c-means (FCMs) clustering and the gradient vector flow (GVF) snake algorithm for lesion contour segmentation on breast magnetic resonance imaging (BMRI). Manual delineation of the lesions by expert MR radiologists was taken as a reference standard in evaluating the computerized segmentation approach. The proposed algorithm was also compared with the FCMs clustering based method. With a database of 60 mass-like lesions (22 benign and 38 malignant cases), the proposed method demonstrated sufficiently good segmentation performance. The morphological and texture features were extracted and used to classify the benign and malignant lesions based on the proposed computerized segmentation contour and radiologists' delineation, respectively. Features extracted by the computerized characterization method were employed to differentiate the lesions with an area under the receiver-operating characteristic curve (AUC) of 0.968, in comparison with an AUC of 0.914 based on the features extracted from radiologists' delineation. The proposed method can assist radiologists in delineating and characterizing BMRI lesions, for example by quantifying morphological and texture features, and can improve the objectivity and efficiency of BMRI interpretation, giving it a certain clinical value.
Gas Dynamics in Galaxy Clusters
McCourt, Michael Kingsley, Jr.
Galaxy clusters are the most massive structures in the universe and, in the hierarchical pattern of cosmological structure formation, the largest objects in the universe form last. Galaxy clusters are thus interesting objects for a number of reasons. Three examples relevant to this thesis are: 1. Constraining the properties of dark energy: Due to the hierarchical nature of structure formation, the largest objects in the universe form last. The cluster mass function is thus sensitive to the entire expansion history of the universe and can be used to constrain the properties of dark energy. This constraint complements others derived from the CMB or from Type Ia supernovae and provides an important, independent confirmation of such methods. In particular, clusters provide detailed information about the equation of state parameter w because they sample a large redshift range z ~ 0-1. 2. Probing galaxy formation: Clusters contain the most massive galaxies in the universe, and the most massive black holes; because clusters form so late, we can still witness the assembly of these objects in the nearby universe. Clusters thus provide a more detailed view of galaxy formation than is possible in studies of lower-mass objects. An important example comes from x-ray studies of clusters, which unexpectedly found that star formation in massive galaxies in clusters is closely correlated with the properties of the hot, virialized gas in their halos. This correlation persists despite the enormous separation in temperature, in dynamical time-scales, and in length-scales between the virialized gas in the halo and the star-forming regions in the galaxy. This remains a challenge to interpret theoretically. 3. Developing our knowledge of dilute plasmas: The masses and sizes of galaxy clusters imply that the plasma which permeates them is both very hot (~10^8 K) and very dilute (~10^-2 cm^-3). This plasma is collisional enough to be considered a fluid, but collisionless enough to
The Georgi algorithms of jet clustering
Ge, Shao-Feng
2015-05-01
We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing a firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses, and the jet function is generalized to J_β^(n) with a jet function index n in order to achieve more degrees of freedom. We impose three basic requirements: that the result of jet clustering is process-independent and hence logically consistent; that for softer subjets the inclusion cone is larger, to conform with the fact that the parton shower tends to emit softer partons at an earlier stage with a larger opening angle; and that the cone size cannot be too large, in order to avoid mixing up neighboring jets. From these we derive constraints on the jet function parameter β and index n, which are closely related to the cone size cutoff. Finally, we discuss how jet function values can be made invariant under Lorentz boost.
Molecular dynamics simulations of cluster fission and fusion processes
DEFF Research Database (Denmark)
Lyalin, Andrey G.; Obolensky, Oleg I.; Solov'yov, Ilia
2004-01-01
Results of molecular dynamics simulations of fission reactions Na_10^2+ --> Na_7^+ + Na_3^+ and Na_18^2+ --> 2Na_9^+ are presented. The dependence of the fission barriers on the isomer structure of the parent cluster is analyzed. It is demonstrated that the energy necessary for removing homothetic groups of atoms from the parent cluster is largely independent of the isomer form of the parent cluster. The importance of rearrangement of the cluster structure during the fission process is elucidated. This rearrangement may include transition to another isomer state of the parent cluster before actual separation of the daughter fragments begins and/or forming a "neck" between the separating fragments. A novel algorithm for modeling the cluster growth process is described. This approach is based on dynamic search for the most stable cluster isomers and allows one to find the optimized cluster geometries...
Energy Aware Clustering Algorithms for Wireless Sensor Networks
Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian
2011-09-01
The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire network is the main design consideration. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and compare them based on metrics such as clustering distribution, cluster load balancing, Cluster Head (CH) selection strategy, CH role rotation, node mobility, cluster overlapping, intra-cluster communications, reliability, security and location awareness.
A Novel Clustering Algorithm Inspired by Membrane Computing
Directory of Open Access Journals (Sweden)
Hong Peng
2015-01-01
P systems are a class of distributed parallel computing models. This paper presents a novel clustering algorithm, called the membrane clustering algorithm, which is inspired by the mechanism of a tissue-like P system with a loop structure of cells. The objects of the cells express the candidate centers of clusters and are evolved by the evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior or competitive to the k-means algorithm and several evolutionary clustering algorithms recently reported in the literature.
An energy efficient clustering routing algorithm for wireless sensor networks
Institute of Scientific and Technical Information of China (English)
LI Li; DONG Shu-song; WEN Xiang-ming
2006-01-01
This article proposes an energy efficient clustering routing (EECR) algorithm for wireless sensor networks. The algorithm divides a sensor network into a few clusters and selects a cluster head based on a weight value, which leads to more uniform energy dissipation among all sensor nodes. Simulation results show that the algorithm can reduce overall energy consumption and extend the lifetime of the wireless sensor network.
Cluster analysis of word frequency dynamics
Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.
2015-01-01
This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.
Introduction to Clustering Algorithms and Applications
Yang, Sibei; Tao, Liangde; Gong, Bingchen
2014-01-01
Data clustering is the process of identifying natural groupings or clusters within multidimensional data based on some similarity measure. Clustering is a fundamental process in many different disciplines. Hence, researchers from different fields are actively working on the clustering problem. This paper provides an overview of the different representative clustering methods. In addition, applications of clustering in different fields are briefly introduced.
PHC: A Fast Partition and Hierarchy-Based Clustering Algorithm
Institute of Scientific and Technical Information of China (English)
ZHOU HaoFeng(周皓峰); YUAN QingQing(袁晴晴); CHENG ZunPing(程尊平); SHI BaiLe(施伯乐)
2003-01-01
Cluster analysis is a process to classify data in a specified data set. In this field, much attention is paid to high-efficiency clustering algorithms. In this paper, the features of the current partition-based and hierarchy-based algorithms are reviewed, and a new hierarchy-based algorithm, PHC, is proposed by combining the advantages of both algorithms; it uses cohesion and closeness to amalgamate the clusters. Compared with similar algorithms, the performance of PHC is improved, and the quality of clustering is guaranteed. Both features are proved by the theoretical and experimental analyses in the paper.
Counterexamples to convergence theorem of maximum-entropy clustering algorithm
Institute of Scientific and Technical Information of China (English)
于剑; 石洪波; 黄厚宽; 孙喜晨; 程乾生
2003-01-01
In this paper, we surveyed the development of maximum-entropy clustering algorithm, pointed out that the maximum-entropy clustering algorithm is not new in essence, and constructed two examples to show that the iterative sequence given by the maximum-entropy clustering algorithm may not converge to a local minimum of its objective function, but a saddle point. Based on these results, our paper shows that the convergence theorem of maximum-entropy clustering algorithm put forward by Kenneth Rose et al. does not hold in general cases.
jClustering, an Open Framework for the Development of 4D Clustering Algorithms
Mateos-Pérez, José María; García-Villalba, Carmen; Pascau, Javier; Desco, Manuel; Vaquero, Juan J.
2013-01-01
We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary. PMID:23990913
An Incremental Algorithm of Text Clustering Based on Semantic Sequences
Institute of Scientific and Technical Information of China (English)
FENG Zhonghui; SHEN Junyi; BAO Junpeng
2006-01-01
This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation of semantic sequences and calculating the cover of the set of similar semantic sequences, the algorithm selects the candidate cluster with the minimum entropy overlap value as a result cluster at each step. The comparison of experimental results shows that the precision of the algorithm is higher than that of other algorithms under the same conditions, and that this is especially evident on sets of long documents.
A new efficient Cluster Algorithm for the Ising Model
Nyfeler, Matthias; Pepe, Michele; Wiese, Uwe-Jens
2005-01-01
Using D-theory we construct a new efficient cluster algorithm for the Ising model. The construction is very different from the standard Swendsen-Wang algorithm and related to worm algorithms. With the new algorithm we have measured the correlation function with high precision over a surprisingly large number of orders of magnitude.
URL Mining Using Agglomerative Clustering Algorithm
Directory of Open Access Journals (Sweden)
Chinmay R. Deshmukh
2015-02-01
The tremendous growth of the web has led to the application of data mining techniques to web logs. Data mining on the World Wide Web is an important and active area of research. Web log mining is the analysis of web log files containing sequences of web pages. Web mining is broadly classified into web content mining, web usage mining and web structure mining. Web usage mining is a technique to discover usage patterns from Web data in order to understand and better serve the needs of Web-based applications. URL mining refers to a subclass of Web mining that helps us to investigate the details of a Uniform Resource Locator. URL mining can be advantageous in the fields of security and protection. The paper introduces a technique for mining a collection of user transactions with an Internet search engine to discover clusters of similar queries and similar URLs. The information we exploit is clickthrough data: each record consists of a user's query to a search engine along with the URL which the user selected from among the candidates offered by the search engine. By viewing this dataset as a bipartite graph, with the vertices on one side corresponding to queries and on the other side to URLs, one can apply an agglomerative clustering algorithm to the graph's vertices to identify related queries and URLs.
A fingerprint identification algorithm by clustering similarity
Institute of Scientific and Technical Information of China (English)
TIAN Jie; HE Yuliang; CHEN Hong; YANG Xin
2005-01-01
This paper introduces a fingerprint identification algorithm by clustering similarity, with a view to overcoming the dilemmas encountered in fingerprint identification. To decrease multi-spectrum noise in a fingerprint, we first use a dyadic scale space (DSS) method for image enhancement. The second step describes the relative features among minutiae by building a minutia-simplex which contains a pair of minutiae and their local associated ridge information, with its transformation-variant and invariant relative features applied for comprehensive similarity measurement and for parameter estimation respectively. The clustering method is employed to estimate the transformation space. Finally, a multi-resolution technique is used to find an optimal transformation model for obtaining the maximal mutual information between the input and the template features. The experimental results, including the performance evaluation by the 2nd International Verification Competition in 2002 (FVC2002) over the four fingerprint databases of FVC2002, indicate that our method is promising in an automatic fingerprint identification system (AFIS).
Application of hybrid clustering using parallel k-means algorithm and DIANA algorithm
Umam, Khoirul; Bustamam, Alhadi; Lestari, Dian
2017-03-01
DNA is one of the carriers of genetic information in living organisms. Encoding, sequencing, and clustering DNA sequences have become key routine tasks in molecular biology, particularly in bioinformatics applications. There are two types of clustering: hierarchical clustering and partitioning clustering. In this paper, we combine the two types, K-Means (partitioning clustering) and DIANA (hierarchical clustering), in what we call hybrid clustering. Hybrid clustering using the Parallel K-Means algorithm and the DIANA algorithm is applied to cluster DNA sequences of Human Papillomavirus (HPV). The clustering process starts with collecting HPV DNA sequences from NCBI (National Center for Biotechnology Information), followed by feature extraction from the DNA sequences. The extracted features are stored in matrix form, the matrix is normalized using Min-Max normalization, and genetic distances are calculated using the Euclidean distance. The hybrid clustering is then applied using implementations of the Parallel K-Means and DIANA algorithms. The aim of hybrid clustering is to obtain better clusters. To validate the resulting clusters and find the optimum number of clusters, we use the Davies-Bouldin Index (DBI). In this study, Parallel K-Means clustering groups the data into 5 clusters with a minimal DBI value of 0.8741, and hybrid clustering groups the data into 13 sub-clusters with minimal DBI values of 0.8216, 0.6845, 0.3331, 0.1994 and 0.3952. The DBI values of hybrid clustering are lower than the DBI value of Parallel K-Means clustering alone, which is performed at the first stage. This means that hybrid clustering gives a better result for clustering HPV DNA sequences than Parallel K-Means clustering alone.
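The two generic building blocks named in this abstract, Min-Max normalization and the Davies-Bouldin Index computed with Euclidean distance, can be sketched as follows. This is a minimal illustration using the standard textbook definitions; the function names are hypothetical and nothing here reproduces the authors' parallel implementation or their HPV feature matrix.

```python
import math

def min_max_normalize(rows):
    """Rescale every column of a numeric matrix to the range [0, 1].

    A constant column (max == min) is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def centroid(points):
    """Component-wise mean of a list of points."""
    return [sum(axis) / len(points) for axis in zip(*points)]

def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points).

    S_i  = mean Euclidean distance of cluster i's points to its centroid
    M_ij = Euclidean distance between centroids i and j
    DBI  = mean over i of max over j != i of (S_i + S_j) / M_ij
    Needs at least two clusters with distinct centroids; lower is better.
    """
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k
```

Because a lower index means more compact, better separated clusters, comparing the index across candidate partitions is one way to choose the number of clusters, as the abstract describes.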
Local Community Detection Algorithm Based on Minimal Cluster
Directory of Open Access Journals (Sweden)
Yong Zhou
2016-01-01
In order to discover the structure of local communities more effectively, this paper puts forward a new local community detection algorithm based on the minimal cluster. Most local community detection algorithms begin from one node. The agglomeration ability of a single node is necessarily lower than that of multiple nodes, so in this algorithm the community extension no longer starts from the initial node alone but from a node cluster that contains this initial node and whose nodes are relatively densely connected with each other. The algorithm consists of two phases. First it detects the minimal cluster, and then it finds the local community extended from the minimal cluster. Experimental results show that the quality of the local community detected by our algorithm is much better than that of other algorithms, in both real and simulated networks.
A Load Balance Routing Algorithm Based on Uneven Clustering
Directory of Open Access Journals (Sweden)
Liang Yuan
2013-10-01
To address the problem of uneven load in clustered Wireless Sensor Networks (WSN), a load balance routing algorithm based on uneven clustering is proposed, which performs uneven clustering and calculates the optimal number of clusters. The algorithm prevents the number of common nodes under any one cluster head from becoming so large that the head is overloaded and dies, by balancing how nodes are assigned to clusters. It constructs an evaluation function that better reflects the residual energy distribution of nodes, as well as a routing evaluation function between cluster heads. The performance of the algorithm is simulated in MATLAB. Simulation results show that the routing established by this algorithm effectively improves the network's energy balance and lengthens the life cycle of the network.
Cluster-Based Distributed Algorithms for Very Large Linear Equations
Institute of Scientific and Technical Information of China (English)
(none)
2006-01-01
In many applications such as computational fluid dynamics and weather prediction, as well as image processing and the state of Markov chains, the order n of the matrix is often very large, and serial algorithms cannot solve the problems. A distributed cluster-based solution for very large linear equations is discussed, including the definitions of notations, the partition of the matrix, the communication mechanism, and a master-slave algorithm. The computing cost is O(n^3/N), the memory cost is O(n^2/N), the I/O cost is O(n^2/N), and the communication cost is O(Nn), where N is the number of computing nodes or processes. Tests show that the solution can effectively solve matrices of double type of size under 10^6×10^6.
Next Generation Suspension Dynamics Algorithms
Energy Technology Data Exchange (ETDEWEB)
Schunk, Peter Randall [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Higdon, Jonathon [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Chen, Steven [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2014-12-01
This research project has the objective to extend the range of application, improve the efficiency and conduct simulations with the Fast Lubrication Dynamics (FLD) algorithm for concentrated particle suspensions in a Newtonian fluid solvent. The research involves a combination of mathematical development, new computational algorithms, and application to processing flows of relevance in materials processing. The mathematical developments clarify the underlying theory, facilitate verification against classic monographs in the field and provide the framework for a novel parallel implementation optimized for an OpenMP shared memory environment. The project considered application to consolidation flows of major interest in high throughput materials processing and identified hitherto unforeseen challenges in the use of FLD in these applications. Extensions to the algorithm have been developed to improve its accuracy in these applications.
Analyzing Job Aware Scheduling Algorithm in Hadoop for Heterogeneous Cluster
Directory of Open Access Journals (Sweden)
Mayuri A Mehta
2015-12-01
A scheduling algorithm is required to efficiently manage cluster resources in a Hadoop cluster, thereby increasing resource utilization and reducing response time. The job aware scheduling algorithm schedules non-local map tasks of jobs based on job execution time, earliest deadline first, or workload of the job. In this paper, we present a performance evaluation of the job aware scheduling algorithm using the MapReduce WordCount benchmark. The experimental results are compared with the matchmaking scheduling algorithm. The results show that the job aware scheduling algorithm reduces average waiting time and memory wastage considerably compared with the matchmaking algorithm.
Institute of Scientific and Technical Information of China (English)
CHEN Yunkai; LU Zhengding; LI Ruixuan; LI Yuhua; SUN Xiaolin
2006-01-01
Given the constant growth of data in large databases such as wire transfer databases, incremental clustering algorithms play an increasingly important role in Data Mining (DM). However, few traditional clustering algorithms can both handle categorical data and explain their output clearly. Based on the idea of dynamic clustering, an incremental conceptual clustering algorithm is proposed in this paper. It introduces the Semantic Core Tree (SCT) to deal with large volumes of categorical wire transfer data for detecting money laundering. In addition, a rule generation algorithm is presented to express the clustering result in the form of knowledge. When this idea is applied in financial data mining, the efficiency of searching for the characteristics of money laundering data is improved.
Study of the Artificial Fish Swarm Algorithm for Hybrid Clustering
Directory of Open Access Journals (Sweden)
Hongwei Zhao
2015-06-01
The basic Artificial Fish Swarm (AFS) algorithm is a new type of heuristic swarm intelligence algorithm, but it is difficult to optimize to high precision because of the randomness of the artificial fish behavior. This paper presents an extended AFS algorithm, namely the Cooperative Artificial Fish Swarm (CAFS), which significantly improves on the original AFS in solving complex optimization problems. The K-medoids clustering algorithm is used to classify data, but the approach is sensitive to the initial selection of the centers and can produce low-quality clusters. A novel hybrid clustering method based on CAFS and K-medoids can be used for solving clustering problems. In this work, first, the CAFS algorithm is used to optimize six widely used benchmark functions, yielding comparative results for AFS and CAFS; then Particle Swarm Optimization (PSO) is studied. Second, the hybrid algorithm combining K-medoids and CAFS is used for data clustering on several benchmark data sets. The performance of the hybrid algorithm based on K-medoids and CAFS is compared with the AFS and CAFS algorithms on a clustering problem. The simulation results show that the proposed CAFS outperforms the other two algorithms in terms of accuracy and robustness.
Cluster fusion algorithm: application to Lennard-Jones clusters
DEFF Research Database (Denmark)
Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter
2008-01-01
We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
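The quantities named in this abstract can be illustrated with a minimal Python sketch. It assumes reduced Lennard-Jones units (epsilon = sigma = 1); the function names `lj_pair`, `cluster_energy` and `second_difference` are hypothetical and are not taken from the paper. It shows only the pair potential, the total energy of a configuration, and a discrete second derivative over cluster size, not the authors' cluster fusion procedure.

```python
import math
from itertools import combinations


def lj_pair(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def cluster_energy(positions):
    """Total energy: sum of the pair energy over all distinct atom pairs."""
    return sum(lj_pair(math.dist(a, b)) for a, b in combinations(positions, 2))


def second_difference(values):
    """Discrete second derivative f(N+1) - 2 f(N) + f(N-1) at interior points."""
    return [values[i + 1] - 2 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]


# Regular tetrahedron whose edge equals the pair-potential minimum 2**(1/6).
r_min = 2 ** (1 / 6)
s = r_min / (2 * math.sqrt(2))
tetrahedron = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
print(cluster_energy(tetrahedron))
```

Applying `second_difference` to a list of binding energies per atom, indexed by cluster size, gives the kind of curve whose peaks the abstract compares with magic numbers.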
Cluster fusion algorithm: application to Lennard-Jones clusters
DEFF Research Database (Denmark)
Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter
2006-01-01
We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...
Simulated annealing spectral clustering algorithm for image segmentation
Institute of Scientific and Technical Information of China (English)
Yifang Yang; and Yuping Wang
2014-01-01
The similarity measure is crucial to the performance of spectral clustering. The Gaussian kernel function based on the Euclidean distance is usually adopted as the similarity measure. However, the Euclidean distance measure cannot fully reveal the complex distribution data, and the result of spectral clustering is very sensitive to the scaling parameter. To solve these problems, a new manifold distance measure and a novel simulated annealing spectral clustering (SASC) algorithm based on the manifold distance measure are proposed. The simulated annealing based on genetic algorithm (SAGA), characterized by its rapid convergence to the global optimum, is used to cluster the sample points in the spectral mapping space. The proposed algorithm can not only reflect local and global consistency better, but also reduce the sensitivity of spectral clustering to the kernel parameter, which improves the algorithm's clustering performance. To efficiently apply the algorithm to image segmentation, the Nyström method is used to reduce the computation complexity. Experimental results show that compared with traditional clustering algorithms and those popular spectral clustering algorithms, the proposed algorithm can achieve better clustering performances on several synthetic datasets, texture images and real images.
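As a hedged illustration of the baseline this abstract criticizes, the sketch below builds the usual Gaussian-kernel affinity matrix from Euclidean distances and shows how strongly the entries depend on the scaling parameter. The function name `gaussian_affinity` and the sample points are assumptions made for illustration; the paper's manifold distance and SASC algorithm are not reproduced here.

```python
import math


def gaussian_affinity(points, sigma):
    """Affinity w[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)), zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
narrow = gaussian_affinity(points, sigma=1.0)   # far points look unrelated
wide = gaussian_affinity(points, sigma=10.0)    # far points look similar
print(narrow[0][2], wide[0][2])
```

With a small `sigma` the distant pair has almost zero affinity, while a large `sigma` makes all pairs look alike, which is the sensitivity to the scaling parameter that the abstract mentions.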
A Flocking Based algorithm for Document Clustering Analysis
Energy Technology Data Exchange (ETDEWEB)
Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL
2006-01-01
Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance than the K-means and the Ant clustering algorithms for real document clustering.
A scalable and practical one-pass clustering algorithm for recommender system
Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali
2015-12-01
K-Means clustering-based recommendation algorithms have been proposed with the claim of increasing the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates with the arrival of new data, making them unsuitable for dynamic environments. Following this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags
DEFF Research Database (Denmark)
Zong, Yu; Xu, Guandong; Jin, Pin
2011-01-01
... resulting from the severe difficulty of ambiguity, redundancy and less semantic nature of tags. Clustering method is a useful tool to address the aforementioned difficulties. Most of the researches on tag clustering are directly using traditional clustering algorithms such as K-means or Hierarchical... algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone...
Mercer Kernel Based Fuzzy Clustering Self-Adaptive Algorithm
Institute of Scientific and Technical Information of China (English)
李侃; 刘玉树
2004-01-01
A novel Mercer kernel based fuzzy clustering self-adaptive algorithm is presented. The Mercer kernel method is introduced into fuzzy c-means clustering; it implicitly maps the input data into a high-dimensional feature space through a nonlinear transformation. In fuzzy c-means and its variants, the number of clusters must be determined first. A self-adaptive algorithm is proposed in which the number of clusters, which is not given in advance, is obtained automatically by a validity measure function. Finally, experiments show that the kernel based fuzzy c-means self-adaptive algorithm gives better performance.
Android Malware Classification Using K-Means Clustering Algorithm
Hamid, Isredza Rahmi A.; Syafiqah Khalid, Nur; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Chai Wen, Chuah
2017-08-01
Malware is designed to gain access to or damage a computer system without the user's knowledge, and attackers exploit malware to commit crime or fraud. This paper proposes an Android malware classification approach based on the K-Means clustering algorithm. We evaluate the proposed model in terms of accuracy using machine learning algorithms. Two datasets, Virus Total and Malgenome, were selected to demonstrate the use of K-Means clustering. We classify the Android malware into three clusters: ransomware, scareware and goodware. Nine features were considered for each dataset: Lock Detected, Text Detected, Text Score, Encryption Detected, Threat, Porn, Law, Copyright and Moneypak. We used IBM SPSS Statistic software for data classification and WEKA tools to evaluate the built clusters. The proposed K-Means clustering algorithm shows promising results with high accuracy when tested using the Random Forest algorithm.
Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis
Directory of Open Access Journals (Sweden)
S. Muthurajkumar
2014-05-01
In this paper, we propose a hybrid clustering based classification algorithm, based on a mean approach, to mine ordered sequences (paths) from weblog data in order to perform social network analysis. In the system proposed in this work for social pattern analysis, sequences of human activities are typically analyzed by switching behaviors, which are likely to produce overlapping clusters. In the proposed system, a robust Modified Boosting algorithm is applied to hybrid clustering based classification for clustering the data. This work helps connect the aggregated features from the network data with traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and that it provides better classification accuracy when tested with a Weblog dataset. In addition, this algorithm improves predictive performance, especially for multiclass datasets, which can increase accuracy.
Functional Clustering Algorithm for High-Dimensional Proteomics Data
Directory of Open Access Journals (Sweden)
Halima Bensmail
2005-01-01
Clustering proteomics data is a challenging problem for any traditional clustering algorithm. Usually, the number of samples is much smaller than the number of protein peaks. A clustering algorithm that does not take into consideration the number of features of variables (here the number of peaks) is needed. An innovative hierarchical clustering algorithm may be a good approach. We propose here a new dissimilarity measure for hierarchical clustering combined with functional data analysis. We present a specific application of functional data analysis (FDA) to a high-throughput proteomics study. The performance of the proposed algorithm is compared with two popular dissimilarity measures in the clustering of samples from normal and human T-cell leukemia virus type 1 (HTLV-1)-infected patients.
A new hybrid imperialist competitive algorithm on data clustering
Indian Academy of Sciences (India)
Taher Niknam; Elahe Taherian Fard; Shervin Ehrampoosh; Alireza Rousta
2011-06-01
Clustering is a process for partitioning datasets. This technique is very useful for optimum solution. K-means is one of the simplest and most famous methods and is based on the square error criterion. This algorithm depends on its initial states and converges to local optima. Some recent research shows that the K-means algorithm has been successfully applied to combinatorial optimization problems for clustering. In this paper, we propose a novel algorithm that combines two clustering algorithms: K-means and the Modify Imperialist Competitive Algorithm. It is named hybrid K-MICA. In addition, we use a method called modified expectation maximization (EM) to determine the number of clusters. The experimental results show that the new method yields better results than ACO, PSO, Simulated Annealing (SA), Genetic Algorithm (GA), Tabu Search (TS), Honey Bee Mating Optimization (HBMO) and K-means.
Extension of K-Modes Algorithm for Generating Clusters Automatically
Directory of Open Access Journals (Sweden)
Anupama Chadha
2016-03-01
K-Modes is an eminent algorithm for clustering data sets with categorical attributes. This algorithm is known for its simplicity and speed. K-Modes is an extension of the K-Means algorithm for categorical data. Because K-Modes is used for categorical data, the ‘Simple Matching Dissimilarity’ measure is used instead of Euclidean distance, and the ‘Modes’ of clusters are used instead of ‘Means’. However, one major limitation of this algorithm is its dependency on prior input of the number of clusters K, and sometimes it is practically impossible to correctly estimate the optimum number of clusters in advance. In this paper we propose an algorithm that overcomes this limitation while maintaining the simplicity of the K-Modes algorithm.
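The two substitutions that distinguish K-Modes from K-Means, as described in this abstract, can be sketched in a few lines of Python. The helper names `simple_matching` and `cluster_mode` and the toy records are hypothetical; this shows only the dissimilarity measure and the mode computation, not the extension proposed in the paper.

```python
from collections import Counter


def simple_matching(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Mode of a cluster: the most frequent value of each attribute, taken column by column."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


records = [("red", "small"), ("red", "large"), ("blue", "small")]
mode = cluster_mode(records)
print(mode, [simple_matching(r, mode) for r in records])
```

In a full K-Modes loop these two helpers would replace the Euclidean distance and the mean in the assignment and update steps of K-Means.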
Resource Allocation in Public Cluster with Extended Optimization Algorithm
Akbar, Z.; Handoko, L. T.
2007-01-01
We introduce an optimization algorithm for resource allocation in the LIPI Public Cluster to optimize its usage according to incoming requests from users. The tool is an extended and modified genetic algorithm developed to match the specific nature of a public cluster. We present a detailed analysis of the optimization and compare the results with the exact calculation. We show that it would be very useful and could realize an automatic decision making system for public clusters.
Parallel algorithms for robot dynamics
Energy Technology Data Exchange (ETDEWEB)
Barhen, J.; Babcock, S.M.
1984-01-01
The Department of Energy recently established a Center for Engineering Systems Advanced Research (CESAR) at the Oak Ridge National Laboratory (ORNL). The Center's charter is to conduct long-range energy-related research in intelligent control systems. This paper reports initial results in developing parallel algorithms for efficiency enhancement in real-time solutions of manipulator dynamics equations. Two approaches to the solution of the inverse dynamics problem are discussed. The first is concerned with the implementation of Newton-Euler equations in multiprocessor architecture with emphasis on asynchronous algorithms and interprocess communication. The alternative approach is based on an explicit state description of the manipulator dynamics, obtained using computer-assisted analytic simplifications of the symbolic Lagrange-Euler equations. Multicomputer and multiprocessor implementations are discussed. The construction of a compact knowledge-base in terms of associative memories is also suggested, to allow solutions of the inverse dynamics based on similarity. Future directions are also outlined. This research is an integral part of a large systems integration effort with complementary tasks in strategy planning, sensor fusion, etc.
An ACO Algorithm for Effective Cluster Head Selection
Sampath, Amritha; Thampi, Sabu M; 10.4304/jait.2.1.50-56
2011-01-01
This paper presents an effective algorithm for selecting cluster heads in mobile ad hoc networks using ant colony optimization. A cluster in an ad hoc network consists of a cluster head and cluster members which are at one hop away from the cluster head. The cluster head allocates the resources to its cluster members. Clustering in MANET is done to reduce the communication overhead and thereby increase the network performance. A MANET can have many clusters in it. This paper presents an algorithm which is a combination of the four main clustering schemes- the ID based clustering, connectivity based, probability based and the weighted approach. An Ant colony optimization based approach is used to minimize the number of clusters in MANET. This can also be considered as a minimum dominating set problem in graph theory. The algorithm considers various parameters like the number of nodes, the transmission range etc. Experimental results show that the proposed algorithm is an effective methodology for finding out t...
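The abstract notes that choosing cluster heads whose members are one hop away can be viewed as a minimum dominating set problem. The sketch below uses a plain greedy heuristic for that problem, which is a deliberately simpler technique than the ant colony optimization used in the paper; the function name `greedy_cluster_heads` and the example topology are assumptions for illustration.

```python
def greedy_cluster_heads(adjacency):
    """Greedy dominating set: repeatedly pick the node covering the most uncovered nodes.

    adjacency maps each node to the set of its one-hop neighbours.
    """
    uncovered = set(adjacency)
    heads = []
    while uncovered:
        # A node covers itself and its one-hop neighbours; ties go to the smallest name.
        best = max(sorted(adjacency),
                   key=lambda n: len(({n} | adjacency[n]) & uncovered))
        heads.append(best)
        uncovered -= {best} | adjacency[best]
    return heads


network = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
heads = greedy_cluster_heads(network)
print(heads)
```

The greedy choice gives no optimality guarantee beyond the usual logarithmic approximation bound, which is one reason metaheuristics such as ant colony optimization are applied to this problem.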
Squeezer: An Efficient Algorithm for Clustering Categorical Data
Institute of Scientific and Technical Information of China (English)
何增有; 徐晓飞; 邓胜春
2002-01-01
This paper presents a new efficient algorithm for clustering categorical data, Squeezer, which can produce high quality clustering results and at the same time deserve good scalability. The Squeezer algorithm reads each tuple t in sequence, either assigning t to an existing cluster (initially none), or creating t as a new cluster, which is determined by the similarities between t and clusters. Due to its characteristics, the proposed algorithm is extremely suitable for clustering data streams, where given a sequence of points, the objective is to maintain consistently good clustering of the sequence so far, using a small amount of memory and time. Outliers can also be handled efficiently and directly in Squeezer. Experimental results on real-life and synthetic datasets verify the superiority of Squeezer.
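The one-pass scheme described in this abstract can be sketched as follows. This is a simplified illustration, not the published Squeezer: the similarity used here (the sum, over attributes, of the fraction of cluster members sharing the tuple's value) and the names `squeezer_like` and `threshold` are assumptions.

```python
from collections import Counter


def squeezer_like(tuples, threshold):
    """Read tuples once; join the most similar cluster or start a new one."""
    clusters = []  # each cluster keeps its members and per-attribute value counts
    for t in tuples:
        best, best_sim = None, -1.0
        for c in clusters:
            size = len(c["members"])
            # Assumed similarity: share of members agreeing with t on each attribute.
            sim = sum(c["counts"][i][v] / size for i, v in enumerate(t))
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None or best_sim < threshold:
            best = {"members": [], "counts": [Counter() for _ in t]}
            clusters.append(best)
        best["members"].append(t)
        for i, v in enumerate(t):
            best["counts"][i][v] += 1
    return [c["members"] for c in clusters]


data = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("a", "x")]
result = squeezer_like(data, threshold=1.0)
print(result)
```

Because each tuple is examined once and only per-cluster counts are kept, memory grows with the number of clusters rather than the number of tuples, which matches the data-stream setting mentioned in the abstract.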
Using Hyper Clustering Algorithms in Mobile Network Planning
Directory of Open Access Journals (Sweden)
Lamiaa F. Ibrahim
2011-01-01
Problem statement: With large amounts of data stored in spatial databases, people may want to find groups of data that share similar features, so cluster analysis has become an important area of research in data mining. Clustering analysis has been applied in many fields, for example in constructing a cluster served by a base station in a mobile network. Deciding on the optimum placement of base stations to achieve the best service while reducing cost is a complex task requiring vast computational resources. Approach: This study addresses the antenna placement problem, or cell planning problem, which involves locating and configuring infrastructure for mobile networks, by modifying the original Density-Based Spatial Clustering of Applications with Noise algorithm. The original Cluster Partitioning Around Medoids algorithm was modified, and a new algorithm proposed, by the authors in a recent work. In this study, the original Density-Based Spatial Clustering of Applications with Noise algorithm is modified and combined with the earlier algorithm to produce the hybrid Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids algorithm for solving problems in mobile network planning. Results: An application of this algorithm to a real case study is presented. Results demonstrate that the proposed algorithm has minimum run time, minimum cost and a high grade of service. Conclusion: The proposed hyper algorithm has the advantage of quickly dividing the area into clusters, since the density-based algorithm has a limited number of iterations, and the advantages of accuracy (no sampling method is used) and a high grade of service, due to moving the base station locations (medoids) toward the heavily loaded (weighted) nodes.
Comparing the biological coherence of network clusters identified by different detection algorithms
Institute of Scientific and Technical Information of China (English)
(no author listed)
2007-01-01
Protein-protein interaction networks serve to carry out basic molecular activity in the cell. Detecting the modular structures from the protein-protein interaction network is important for understanding the organization, function and dynamics of a biological system. In order to identify functional neighborhoods based on network topology, many network cluster identification algorithms have been developed. However, each algorithm might dissect a network from a different aspect and may provide different insight on the network partition. In order to objectively evaluate the performance of four commonly used cluster detection algorithms: molecular complex detection (MCODE), NetworkBlast, shortest-distance clustering (SDC) and Girvan-Newman (G-N) algorithm, we compared the biological coherence of the network clusters found by these algorithms through a uniform evaluation framework. Each algorithm was utilized to find network clusters in two different protein-protein interaction networks with various parameters. Comparison of the resulting network clusters indicates that clusters found by MCODE and SDC are of higher biological coherence than those by NetworkBlast and G-N algorithm.
Co-clustering models, algorithms and applications
Govaert, Gérard
2013-01-01
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixture
Visual verification and analysis of cluster detection for molecular dynamics.
Grottel, Sebastian; Reina, Guido; Vrabec, Jadran; Ertl, Thomas
2007-01-01
A current research topic in molecular thermodynamics is the condensation of vapor to liquid and the investigation of this process at the molecular level. Condensation is found in many physical phenomena, e.g. the formation of atmospheric clouds or the processes inside steam turbines, where a detailed knowledge of the dynamics of condensation processes will help to optimize energy efficiency and avoid problems with droplets of macroscopic size. The key properties of these processes are the nucleation rate and the critical cluster size. For the calculation of these properties it is essential to make use of a meaningful definition of molecular clusters, which currently is a not completely resolved issue. In this paper a framework capable of interactively visualizing molecular datasets of such nucleation simulations is presented, with an emphasis on the detected molecular clusters. To check the quality of the results of the cluster detection, our framework introduces the concept of flow groups to highlight potential cluster evolution over time which is not detected by the employed algorithm. To confirm the findings of the visual analysis, we coupled the rendering view with a schematic view of the clusters' evolution. This allows to rapidly assess the quality of the molecular cluster detection algorithm and to identify locations in the simulation data in space as well as in time where the cluster detection fails. Thus, thermodynamics researchers can eliminate weaknesses in their cluster detection algorithms. Several examples for the effective and efficient usage of our tool are presented.
Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning
Ntampaka, M; Sutherland, D J; Fromenteau, S; Poczos, B; Schneider, J
2015-01-01
We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark's publicly-available N-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power law scaling relation to infer cluster mass from galaxy line of sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with width = 0.87. Interlopers introduce additional scatter, significantly widening the error distribution further (width = 2.13). We employ the Support Distribution Machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to...
Ananke: temporal clustering reveals ecological dynamics of microbial communities
Directory of Open Access Journals (Sweden)
Michael W. Hall
2017-09-01
Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.
Institute of Scientific and Technical Information of China (English)
WANG ShunJin; ZHANG Hua
2007-01-01
Based on the exact analytical solution of ordinary differential equations, a truncation of the Taylor series of the exact solution to the Nth order leads to the Nth order algebraic dynamics algorithm. A detailed numerical comparison is presented with Runge-Kutta algorithm and symplectic geometric algorithm for 12 test models. The results show that the algebraic dynamics algorithm can better preserve both geometrical and dynamical fidelity of a dynamical system at a controllable precision, and it can solve the problem of algorithm-induced dissipation for the Runge-Kutta algorithm and the problem of algorithm-induced phase shift for the symplectic geometric algorithm.
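The idea of truncating the Taylor series of the exact solution at order N can be illustrated on the linear test equation dy/dt = lam * y, whose exact one-step propagator is exp(lam * h). The sketch below is an assumption-laden illustration for that single equation, with hypothetical names `taylor_step` and `integrate`; it is not the authors' algebraic dynamics algorithm for general systems.

```python
import math


def taylor_step(y, lam, h, order):
    """Advance dy/dt = lam*y by h using the Taylor series of exp(lam*h) up to `order`."""
    term, total = 1.0, 1.0
    for k in range(1, order + 1):
        term *= lam * h / k      # term is now (lam*h)**k / k!
        total += term
    return y * total


def integrate(y0, lam, h, steps, order):
    """Apply `steps` truncated-Taylor steps starting from y0."""
    y = y0
    for _ in range(steps):
        y = taylor_step(y, lam, h, order)
    return y


exact = math.exp(-1.0)
for order in (1, 2, 4, 8):
    approx = integrate(1.0, -1.0, 0.1, 10, order)
    print(order, abs(approx - exact))
```

Order 1 reduces to the explicit Euler method, and raising the order shrinks the error at fixed step size, which is the sense in which the precision is controllable through N.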
Directory of Open Access Journals (Sweden)
Jiang Ting
2010-01-01
We optimize the cluster structure to solve problems such as the uneven energy consumption of the radar sensor nodes and random cluster head selection in the traditional clustering routing algorithm. Using a cost function defined for clusters, we present a clustering algorithm based on radio free-space path loss. In addition, we propose energy and distance pheromones based on the residual energy and aggregation of the radar sensor nodes. Building on a bionic heuristic algorithm, a new ant colony-based clustering algorithm for radar sensor networks is also proposed. Simulation results show that this algorithm achieves a better balance of energy consumption and thereby remarkably prolongs the lifetime of the radar sensor network.
Cosine-Based Clustering Algorithm Approach
Directory of Open Access Journals (Sweden)
Mohammed A. H. Lubbad
2012-02-01
Because many applications require the management of spatial data, clustering large spatial databases is an important problem: it seeks the densely populated regions in the feature space for use in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shape. It must be insensitive to outliers (noise) and to the order of the input data. In this paper, Cosine Cluster, which is based on the cosine transform, is proposed to satisfy all of the above requirements. Using the multi-resolution property of cosine transforms, arbitrarily shaped clusters can be effectively identified at different degrees of accuracy. Cosine Cluster is also shown to be highly efficient in terms of time complexity. Experimental results on very large data sets are presented, which show the efficiency and effectiveness of the proposed approach compared to other recent clustering methods.
Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning
Ntampaka, M.; Trac, H.; Sutherland, D. J.; Fromenteau, S.; Póczos, B.; Schneider, J.
2016-11-01
We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark’s publicly available N-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with a width of Δε ≈ 0.87. Interlopers introduce additional scatter, significantly widening the error distribution further (Δε ≈ 2.13). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (Δε ≈ 0.67) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
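The "standard approach" named in this abstract, a power-law scaling relation between velocity dispersion and mass, can be sketched as a straight-line fit in log-log space. Everything specific below is an assumption for illustration: the noiseless synthetic data, the normalization, the exponent of 1/3, and the definition of the fractional error as (predicted - true) / true, which the abstract does not spell out.

```python
from math import exp, log

def fit_power_law(masses, sigmas):
    """Least-squares fit of log(sigma) = log(norm) + alpha * log(mass)."""
    xs = [log(m) for m in masses]
    ys = [log(s) for s in sigmas]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    alpha = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    norm = exp(my - alpha * mx)
    return norm, alpha

def mass_from_sigma(sigma, norm, alpha):
    """Invert sigma = norm * mass**alpha to predict a mass."""
    return (sigma / norm) ** (1.0 / alpha)

# Hypothetical, noiseless mock clusters: sigma = 0.01 * M^(1/3).
true_masses = [1e14, 3e14, 1e15]
sigmas = [0.01 * m ** (1 / 3) for m in true_masses]

norm, alpha = fit_power_law(true_masses, sigmas)
predicted = [mass_from_sigma(s, norm, alpha) for s in sigmas]
# Assumed definition of the fractional mass error.
fractional_errors = [(mp - mt) / mt for mp, mt in zip(predicted, true_masses)]
```

With real catalogs the dispersions scatter around the relation, and the width of the resulting `fractional_errors` distribution is the kind of quantity the abstract reports as Δε.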
Pixel Intensity Clustering Algorithm for Multilevel Image Segmentation
Directory of Open Access Journals (Sweden)
Oludayo O. Olugbara
2015-01-01
Image segmentation is an important problem that has received significant attention in the literature. Over the last few decades, many algorithms were developed to solve the image segmentation problem; prominent amongst these are the thresholding algorithms. However, the computational time complexity of thresholding increases exponentially with the number of desired thresholds. A wealth of alternative algorithms, notably those based on particle swarm optimization and evolutionary metaheuristics, were proposed to tackle the intrinsic challenges of thresholding. In addition, clustering-based algorithms were developed as multidimensional extensions of thresholding. While these algorithms have demonstrated successful results for fewer thresholds, their computational costs for a large number of thresholds are still a limiting factor. We propose a new clustering algorithm based on linear partitioning of the pixel intensity set and a between-cluster variance criterion function for multilevel image segmentation. The results of testing the proposed algorithm on real images from the Berkeley Segmentation Dataset and Benchmark show that the algorithm is comparable with state-of-the-art multilevel segmentation algorithms and consistently produces high-quality results. The attractive properties of the algorithm are its simplicity, generalization to a large number of clusters, and computational cost effectiveness.
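A between-cluster variance criterion over a partition of the intensity range can be sketched as follows. This is only an illustration under stated assumptions: the criterion is taken to be the usual weighted squared distance of cluster means from the global mean, the search is brute force (which shows the exponential cost the abstract mentions rather than the paper's faster partitioning scheme), and the histogram is invented.

```python
from itertools import combinations

def between_cluster_variance(hist, cuts):
    """hist[i] is the number of pixels with intensity i; cuts are sorted
    split points, so cluster k covers intensities edges[k] <= i < edges[k+1]."""
    total = sum(hist)
    mean_all = sum(i * h for i, h in enumerate(hist)) / total
    edges = [0, *cuts, len(hist)]
    variance = 0.0
    for lo, hi in zip(edges, edges[1:]):
        count = sum(hist[lo:hi])
        if count == 0:
            continue  # an empty cluster contributes nothing
        mean = sum(i * hist[i] for i in range(lo, hi)) / count
        variance += (count / total) * (mean - mean_all) ** 2
    return variance

def best_cuts(hist, n_clusters):
    """Exhaustive search over all split points; only viable for toy inputs."""
    candidates = combinations(range(1, len(hist)), n_clusters - 1)
    return max(candidates, key=lambda cuts: between_cluster_variance(hist, cuts))

# Hypothetical 11-level histogram with three well-separated intensity groups.
hist = [5, 6, 0, 0, 4, 5, 0, 0, 0, 7, 6]
cuts = best_cuts(hist, 3)
```

The number of candidate cut sets grows combinatorially with the number of clusters, which is why a cheaper partitioning strategy matters for many thresholds.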
A High-Order CFS Algorithm for Clustering Big Data
Directory of Open Access Journals (Sweden)
Fanyu Bu
2016-01-01
With the development of the Internet of Everything, such as the Internet of Things, Internet of People, and Industrial Internet, big data is being generated. Clustering is a widely used technique for big data analytics and mining. However, most current algorithms are not effective at clustering heterogeneous data, which is prevalent in big data. In this paper, we propose a high-order CFS algorithm (HOCFS) to cluster heterogeneous data by combining the CFS clustering algorithm and the dropout deep learning model, whose functionality rests on three pillars: (i) an adaptive dropout deep learning model to learn features from each type of data, (ii) a feature tensor model to capture the correlations of heterogeneous data, and (iii) a tensor distance-based high-order CFS algorithm to cluster heterogeneous data. Furthermore, we verify our proposed algorithm on different datasets by comparison with two other clustering schemes, namely HOPCM and CFS. Results confirm the effectiveness of the proposed algorithm in clustering heterogeneous data.
Meaningful Clustered Forest: an Automatic and Robust Clustering Algorithm
Tepper, Mariano; Almansa, Andrés
2011-01-01
We propose a new clustering method that can be regarded as a numerical method to compute the proximity gestalt. The method analyzes edge length statistics in the MST of the dataset and provides an a contrario cluster detection criterion. The approach is fully parametric on the chosen distance and can detect arbitrarily shaped clusters. The method is also automatic, in the sense that only a single parameter is left to the user. This parameter has an intuitive interpretation as it controls the expected number of false detections. We show that the iterative application of our method can (1) provide robustness to noise and (2) solve a masking phenomenon in which a highly populated and salient cluster dominates the scene and inhibits the detection of less-populated, but still salient, clusters.
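As a rough illustration of clustering from minimum spanning tree (MST) edge lengths, the sketch below builds the MST with Prim's algorithm and cuts every edge longer than a fixed threshold. The fixed `max_edge` threshold is a deliberate simplification standing in for the a contrario detection criterion described in the abstract, which is not reproduced here; the points are invented.

```python
from math import dist

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns a list of (length, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(float("inf"), -1)] * n  # (distance to tree, nearest tree node)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def clusters_from_mst(points, max_edge):
    """Drop MST edges longer than max_edge; return connected components."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for length, i, j in minimum_spanning_tree(points):
        if length <= max_edge:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
mst = minimum_spanning_tree(points)
clusters = clusters_from_mst(points, max_edge=2.0)
```

Because the grouping depends only on pairwise distances, any metric can be substituted for `dist`, and the resulting clusters need not be convex.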
The Ordered Clustered Travelling Salesman Problem: A Hybrid Genetic Algorithm
Directory of Open Access Journals (Sweden)
Zakir Hussain Ahmed
2014-01-01
The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which a set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least cost Hamiltonian tour in which vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining heuristic solutions to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solutions to some more symmetric TSPLIB instances.
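The two constraints in the problem statement (each cluster visited contiguously, clusters visited in the prespecified order) can be made concrete with a small feasibility check and cost function. This sketch covers only the problem definition, not the hybrid genetic algorithm; the function names and the five-vertex instance are hypothetical.

```python
def tour_cost(tour, dist):
    """Cost of the closed tour, including the return to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def is_feasible(tour, start, clusters):
    """True if the tour begins at `start` and then visits every cluster
    contiguously, in the order in which the clusters are listed."""
    if not tour or tour[0] != start:
        return False
    position = 1
    for cluster in clusters:
        segment = tour[position:position + len(cluster)]
        if sorted(segment) != sorted(cluster):
            return False
        position += len(cluster)
    return position == len(tour)

# Hypothetical instance: vertex 0 is the start, followed by two clusters.
clusters = [[1, 2], [3, 4]]
dist = [[abs(i - j) for j in range(5)] for i in range(5)]

good = [0, 2, 1, 4, 3]  # order inside a cluster is free
bad = [0, 1, 3, 2, 4]   # cluster {1, 2} is interrupted by vertex 3
```

A search heuristic such as 2-opt would then only need to consider moves that keep `is_feasible` true, for example reversals confined to a single cluster's segment.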
The Refinement Algorithm Consideration in Text Clustering Scheme Based on Multilevel Graph
Institute of Scientific and Technical Information of China (English)
CHEN Jian-bin; DONG Xiang-jun; SONG Han-tao
2004-01-01
To construct a highly efficient text clustering algorithm, the multilevel graph model and the refinement algorithm used in the uncoarsening phase are discussed. The model is applied to text clustering. The performance of the clustering algorithm is improved by applying the refinement algorithm. The experimental results demonstrate that the multilevel graph text clustering algorithm is effective.
Dynamic Clustering Of High Speed Data Streams
Directory of Open Access Journals (Sweden)
J. Chandrika
2012-03-01
We consider the problem of clustering data streams. A data stream can roughly be thought of as a transient, continuously increasing sequence of time-stamped data. In order to maintain an up-to-date clustering structure, it is necessary to analyze the incoming data in an online manner, tolerating but a constant time delay. The purpose of this study is to analyze the working of popular algorithms on clustering data streams and make a comparative analysis.
A Scalable Clustering Algorithm in Dense Mobile Sensor Networks
Directory of Open Access Journals (Sweden)
Jianbo Li
2011-03-01
Clustering offers a kind of hierarchical organization that provides scalability and a basic performance guarantee by partitioning the network into disjoint groups of nodes. In this paper a scalable and energy-efficient clustering algorithm is proposed for dense mobile sensor network scenarios. In the initial cluster formation phase, our proposed scheme features a simple execution process with polynomial time complexity, and eliminates the “frozen time” requirement by introducing some GPS-capable mobile nodes to act as cluster heads. In the following cluster maintenance stage, the maintenance of clusters is asynchronous and event-driven so as to thoroughly eliminate the “ripple effect” brought by node mobility. As a result, local changes in a cluster need not be seen and updated by the entire network, which greatly reduces communication overhead and makes the scheme well suited to high-mobility environments. Extensive simulations have been conducted, and the results reveal that our proposed algorithm achieves its goal of incurring much less clustering overhead while maintaining a much more stable cluster structure, as compared to the HCC (High Connectivity Clustering) algorithm.
Color Image Segmentation Method Based on Improved Spectral Clustering Algorithm
Dong Qin
2014-01-01
To address the high sparsity of image data and the difficulty of determining the number of clusters, we propose a color image segmentation algorithm that combines semi-supervised machine learning with spectral graph theory. Drawing on the theory and methods of spectral clustering algorithms, we introduce the concept of information entropy to design a method that automatically optimizes the scale parameter value. So it avoids the unstab...
Cluster Dynamics in a Circulating Fluidized Bed
Energy Technology Data Exchange (ETDEWEB)
Guenther, C.P.; Breault, R.W.
2006-11-01
A common hydrodynamic feature in industrial scale circulating fluidized beds is the presence of clusters. The continuous formation and destruction of clusters strongly influences particle hold-up, pressure drop, heat transfer at the wall, and mixing. In this paper fiber optic data are analyzed using discrete wavelet analysis to characterize the dynamic behavior of clusters. Five radial positions at three different axial locations under five different operating conditions were analyzed using discrete wavelets. Results are summarized with respect to cluster size and frequency.
Higher-order structure and epidemic dynamics in clustered networks
Ritchie, Martin; House, Thomas; Kiss, Istvan Z
2013-01-01
Clustering is typically measured by the ratio of triangles to all triples, open or closed. Generating clustered networks, and how clustering affects dynamics on networks, is reasonably well understood for certain classes of networks \\cite{vmclust, karrerclust2010}, e.g., networks composed of lines and non-overlapping triangles. In this paper we show that it is possible to generate networks which, despite having the same degree distribution and equal clustering, exhibit different higher-order structure, specifically, overlapping triangles and other order-four (a closed network motif composed of four nodes) structures. To distinguish and quantify these additional structural features, we develop a new network metric capable of measuring order-four structure which, when used alongside traditional network metrics, allows us to more accurately describe a network's topology. Three network generation algorithms are considered: a modified configuration model and two rewiring algorithms. By generating homogeneous netwo...
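The measure named in the first sentence of this abstract, the ratio of triangles to all triples, can be computed directly by counting connected triples (paths of length two) and checking which of them are closed. The function name and the small example graphs are illustrative; the paper's new order-four metric is not attempted here.

```python
from itertools import combinations

def global_clustering(edges):
    """Fraction of connected triples whose two end nodes are also linked.
    Each triangle closes three triples, one centred on each of its nodes."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    triples = closed = 0
    for centre, adjacent in neighbours.items():
        for a, b in combinations(sorted(adjacent), 2):
            triples += 1
            if b in neighbours[a]:
                closed += 1
    return closed / triples if triples else 0.0

triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
triangle_with_tail = [(0, 1), (1, 2), (0, 2), (2, 3)]
```

Because this number only looks at three-node patterns, two networks can share it (and the degree distribution) while differing in how triangles overlap, which is the gap the abstract's order-four metric is meant to fill.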
The Parallel Maximal Cliques Algorithm for Protein Sequence Clustering
Directory of Open Access Journals (Sweden)
Khalid Jaber
2009-01-01
Problem statement: Protein sequence clustering is a method used to discover relations between proteins. This method groups the proteins based on their common features. It is a core process in protein sequence classification. Graph theory has been used in protein sequence clustering as a means of partitioning the data into groups, where each group constitutes a cluster. Mohseni-Zadeh introduced a maximal cliques algorithm for protein clustering. Approach: In this study we adapted the maximal cliques algorithm of Mohseni-Zadeh to find cliques in protein sequences, and we then parallelized the algorithm to improve computation times and allow large protein databases to be processed. We used the N-Gram Hirschberg approach proposed by Abdul Rashid to calculate the distance between protein sequences. The task farming parallel program model was used to parallelize the enhanced cliques algorithm. Results: Our parallel maximal cliques algorithm was implemented on the stealth cluster using the C programming language and a hybrid approach that includes both the Message Passing Interface (MPI) library and POSIX threads (PThread) to accelerate protein sequence clustering. Conclusion: Our results showed a good speedup over sequential algorithms for cliques in protein sequences.
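To show what enumerating maximal cliques in a similarity graph involves, here is a sequential sketch using the classic Bron-Kerbosch algorithm with pivoting. This is a stand-in chosen for illustration: it is not the Mohseni-Zadeh algorithm the paper adapts, it is not parallel, and the graph (an edge meaning "these two sequences are closer than some threshold") is invented.

```python
def maximal_cliques(adjacency):
    """Bron-Kerbosch with pivoting.
    adjacency maps each node to the set of its neighbours."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(clique))  # nothing can extend this clique
            return
        # Pick the pivot with the most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(candidates | excluded,
                    key=lambda n: len(adjacency[n] & candidates))
        for node in list(candidates - adjacency[pivot]):
            expand(clique | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return sorted(cliques)

# Hypothetical similarity graph over five protein sequences.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
cliques = maximal_cliques(adjacency)
```

Each top-level branch of the recursion is independent of the others, which is the property a task-farming scheme could exploit by handing branches to separate workers.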
A New Method for Medical Image Clustering Using Genetic Algorithm
Directory of Open Access Journals (Sweden)
Akbar Shahrzad Khashandarag
2013-01-01
Segmentation is applied to medical images when their brightness becomes weak, which makes it difficult to recognize tissue borders. Exact segmentation of medical images is therefore an essential step in recognizing and treating an illness, and the purpose of clustering in medical images is the recognition of damaged areas in tissues. Different techniques have been introduced for clustering in different fields such as engineering, medicine, data mining and so on. However, there is no standard clustering technique that gives ideal results for all imaging applications. In this paper, a new method combining a genetic algorithm and the k-means algorithm is presented for clustering medical images. In this combined technique, a variable string length genetic algorithm (VGA) is used to determine the optimal cluster centers. The proposed algorithm has been compared with the k-means clustering algorithm. The advantage of the proposed method is its accuracy in selecting the optimal cluster centers compared with the k-means algorithm.
Centronit: Initial Centroid Designation Algorithm for K-Means Clustering
Directory of Open Access Journals (Sweden)
Ali Ridho Barakbah
2014-06-01
Clustering performance of K-means depends heavily on the correctness of the initial centroids. Usually the initial centroids for K-means clustering are determined randomly, so the algorithm may converge to the nearest local minimum rather than the global optimum. In this paper, we propose an algorithm, called Centronit, for the designation of initial centroids to optimize K-means clustering. The proposed algorithm is based on the calculation of the average distance of the nearest data inside the region of the minimum distance. The initial centroids can be designated by the lowest average distance of each data. The minimum distance is set by calculating the average distance between the data. This method is also robust to outliers. The experimental results show the effectiveness of the proposed method in improving the clustering results of K-means clustering. Keywords: K-means clustering, initial centroids, K-means optimization.
New clustering algorithm for interconnection of MANET and internet
Institute of Scientific and Technical Information of China (English)
万象; 姚尹雄; 王豪行
2004-01-01
This paper presents the core-agent based clustering (CBC) algorithm, a novel heuristic clustering scheme for the interconnection of MANET and the Internet that uses power, movement probability and hop length as constraints. CBC includes two phases: cluster initialization and cluster maintenance. In phase one, the selection of clusterheads obeys the first two constraints, whereas the father node of each clustering node is chosen according to all three. Phase two concerns node insertion and removal. Easy access and little alteration of conventional mobile IP are features of this algorithm. Simulation results demonstrate that CBC offers a smaller average hop length, good robustness and lower overhead, and that the clustered network architecture behaves stably when the topology changes.
The Effective Clustering Partition Algorithm Based on the Genetic Evolution
Institute of Scientific and Technical Information of China (English)
LIAO Qin; LI Xi-wen
2006-01-01
To address the difficulty of determining the number of clusters and the abnormal points with a clustering validity function, an effective clustering partition model based on the genetic algorithm is built in this paper. The solution to the problem is formed by combining the clustering partition with the encoded samples, and the fitness function is defined by the distances among and within clusters. The number of clusters and the samples in each cluster are determined, and the abnormal points are distinguished, by applying a triple random crossover operator and mutation. Based on known sample data, the results of the novel method and of the clustering validity function are compared. Numerical experiments are given, and the results show that the novel method is more effective.
An Extended Clustering Algorithm for Statistical Language Models
Ueberla, J P
1994-01-01
Statistical language models frequently suffer from a lack of training data. This problem can be alleviated by clustering, because it reduces the number of free parameters that need to be trained. However, clustered models have the following drawback: if there is "enough" data to train an unclustered model, then the clustered variant may perform worse. On currently used language modeling corpora, e.g. the Wall Street Journal corpus, how do the performances of a clustered and an unclustered model compare? While trying to address this question, we develop the following two ideas. First, to get a clustering algorithm with potentially high performance, an existing algorithm is extended to deal with higher order N-grams. Second, to make it possible to cluster large amounts of training data more efficiently, a heuristic to speed up the algorithm is presented. The resulting clustering algorithm can be used to cluster trigrams on the Wall Street Journal corpus and the language models it produces can compete with exi...
Cluster dynamics and universality of Ising lattice gases
Heringa, J. R.; Blöte, H. W. J.
Lattice gases with nearest-neighbour exclusion are studied by means of Monte Carlo simulations with an efficient cluster algorithm. The critical dynamics is consistent with a dynamical exponent z = 0 in the case of Wolff-like cluster updates for square and simple-cubic lattices in the studied range of lattice sizes. We find the critical activity z_c = 0.72020(4) for the body-centred cubic lattice. The critical exponents y_h = 2.475(8) and y_t = 1.61(6) disagree with an earlier study, but they do agree with the known values for the three-dimensional Ising universality class.
Segmentation of Medical Image using Clustering and Watershed Algorithms
M. C.J. Christ; R.M.S Parvathi
2011-01-01
Problem statement: Segmentation plays an important role in medical imaging. Segmentation of an image is the division or separation of the image into dissimilar regions of similar attribute. In this study we proposed a methodology that integrates clustering algorithm and marker controlled watershed segmentation algorithm for medical image segmentation. The use of the conservative watershed algorithm for medical image analysis is pervasive because of its advantages, such as always being able to...
Institute of Scientific and Technical Information of China (English)
王兴良; 王立宏; 武栓虎
2014-01-01
Since the eigenvectors corresponding to the k largest eigenvalues do not always achieve the optimal clustering results, clustering performance is improved by a selective ensemble approach for eigenvector groups, involving the selection of base eigenvector groups and a selective integration strategy. A constraint score based on the pair-wise constraint information of the training data is used to evaluate eigenvectors, and some preferable base eigenvector groups are obtained. For each testing datum, the clustering accuracy of its l-nearest neighbors in the training dataset is used to dynamically evaluate the eigenvector groups, and several accurate eigenvector groups are selected to vote. To test the obtained eigenvector groups, spectral clustering is carried out on the corresponding eigenvectors of the testing dataset. The clustering results are aligned and the final clustering results are obtained. The experimental results on UCI benchmark datasets show that the proposed algorithm improves the clustering performance on testing data.
Measuring Constraint-Set Utility for Partitional Clustering Algorithms
Davidson, Ian; Wagstaff, Kiri L.; Basu, Sugato
2006-01-01
Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.
Algebraic dynamics solution and algebraic dynamics algorithm of Burgers equations
Institute of Scientific and Technical Information of China (English)
2008-01-01
Algebraic dynamics solution and algebraic dynamics algorithm of nonlinear partial differential evolution equations in the functional space are applied to Burgers equation. The results indicate that the approach is effective for analytical solutions to Burgers equation, and the algorithm for numerical solutions of Burgers equation is more stable, with higher precision than other existing finite difference algorithms.
SURVEY ON CLUSTERING ALGORITHM AND SIMILARITY MEASURE FOR CATEGORICAL DATA
Directory of Open Access Journals (Sweden)
S. Anitha Elavarasi
2014-01-01
Learning is the process of generating useful information from a huge volume of data. Learning can be either supervised (e.g. classification) or unsupervised (e.g. clustering). Clustering is the process of grouping a set of physical objects into classes of similar objects. Objects in the real world consist of both numerical and categorical data. Categorical data are not analyzed as numerical data because of the absence of inherent ordering. This paper describes ten different clustering algorithms, their methodology and the factors influencing their performance. Each algorithm is evaluated using real-world datasets and its pros and cons are specified. The various similarity/dissimilarity measures applied to categorical data and their performance are also discussed. The time complexity defines the amount of time taken by an algorithm to perform the elementary operation. The time complexity of various algorithms is discussed and their performance on real-world data such as mushroom, zoo, soya bean, cancer, vote, car and iris is measured. In this survey, cluster accuracy and error rate are evaluated for four different clustering algorithms (K-modes, fuzzy K-modes, ROCK and Squeezer), two different similarity measures (DISC and Overlap), and DILCA applied to hierarchical and partitional algorithms.
A Geometric Clustering Algorithm with Applications to Structural Data
Xu, Shutan; Zou, Shuxue
2015-01-01
An important feature of structural data, especially those from structural determination and protein-ligand docking programs, is that their distribution could be mostly uniform. Traditional clustering algorithms developed specifically for nonuniformly distributed data may not be adequate for their classification. Here we present a geometric partitional algorithm that could be applied to both uniformly and nonuniformly distributed data. The algorithm is a top-down approach that recursively selects the outliers as the seeds to form new clusters until all the structures within a cluster satisfy a classification criterion. The algorithm has been evaluated on a diverse set of real structural data and six sets of test data. The results show that it is superior to the previous algorithms for the clustering of structural data and is similar to or better than them for the classification of the test data. The algorithm should be especially useful for the identification of the best but minor clusters and for speeding up an iterative process widely used in NMR structure determination. PMID:25517067
Research on retailer data clustering algorithm based on Spark
Huang, Qiuman; Zhou, Feng
2017-03-01
Big data analysis is a hot topic in the IT field now. Spark is a high-reliability and high-performance distributed parallel computing framework for big data sets. K-means algorithm is one of the classical partition methods in clustering algorithm. In this paper, we study the k-means clustering algorithm on Spark. Firstly, the principle of the algorithm is analyzed, and then the clustering analysis is carried out on the supermarket customers through the experiment to find out the different shopping patterns. At the same time, this paper proposes the parallelization of k-means algorithm and the distributed computing framework of Spark, and gives the concrete design scheme and implementation scheme. This paper uses the two-year sales data of a supermarket to validate the proposed clustering algorithm and achieve the goal of subdividing customers, and then analyze the clustering results to help enterprises to take different marketing strategies for different customer groups to improve sales performance.
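The reason k-means parallelizes naturally can be seen by writing one iteration as a map step (assign each point to its nearest centroid) followed by a reduce-by-key step (sum the points of each cluster and divide by the count). The sketch below is plain single-process Python written in that shape; it does not use Spark, and the customer features, initial centroids and function names are assumptions for illustration, not the paper's design.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2
                                 for p, c in zip(point, centroids[k])))

def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # "Map": emit (cluster index, point) pairs.
        assigned = [(nearest(p, centroids), p) for p in points]
        # "Reduce by key": accumulate per-cluster sums and counts.
        sums = defaultdict(lambda: [0.0] * len(points[0]))
        counts = defaultdict(int)
        for k, p in assigned:
            counts[k] += 1
            for d, value in enumerate(p):
                sums[k][d] += value
        # A cluster that received no points keeps its previous centroid.
        centroids = [
            tuple(s / counts[k] for s in sums[k]) if counts[k] else centroids[k]
            for k in range(len(centroids))
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical customers: (store visits per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (10, 50), (11, 52), (12, 49)]
centroids, labels = kmeans(customers, [(0, 0), (10, 40)])
```

In a distributed setting only the per-cluster sums and counts need to cross the network in each iteration, not the points themselves.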
Big Data Clustering Using Genetic Algorithm On Hadoop Mapreduce
Directory of Open Access Journals (Sweden)
Nivranshu Hans
2015-04-01
Cluster analysis is used to classify similar objects under the same group. It is one of the most important data mining methods. However, it fails to perform well for big data due to its huge time complexity. For such scenarios, parallelization is a better approach. MapReduce is a popular programming model which enables parallel processing in a distributed environment. But most clustering algorithms are not naturally parallelizable, for instance genetic algorithms, owing to their sequential nature. This paper introduces a technique to parallelize GA-based clustering by extending Hadoop MapReduce. An analysis of the proposed approach to evaluate performance gains with respect to a sequential algorithm is presented. The analysis is based on a real-life large data set.
Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.
He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej
2011-12-01
Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
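A multiplicative update for symmetric NMF, which seeks a nonnegative H with A ≈ H Hᵀ, can be sketched as below. The specific rule used, a damped update H ← H ∘ (1 - d + d · (AH) / (H HᵀH)) with d = 0.5, is a commonly used form and is an assumption here; it is not claimed to be the paper's Euclidean, α-SNMF or β-SNMF algorithm, and plain Python lists stand in for the BLAS calls. The matrix and starting point are invented.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def snmf_error(a, h):
    """Squared Frobenius norm of A - H H^T."""
    approx = matmul(h, transpose(h))
    return sum((x - y) ** 2
               for ra, rb in zip(a, approx) for x, y in zip(ra, rb))

def snmf(a, h, iterations=200, damping=0.5, eps=1e-12):
    """Damped multiplicative updates; H stays nonnegative because every
    factor it is multiplied by is positive."""
    for _ in range(iterations):
        numer = matmul(a, h)                        # A H
        denom = matmul(h, matmul(transpose(h), h))  # H (H^T H)
        h = [[h_ij * (1 - damping + damping * n_ij / (d_ij + eps))
              for h_ij, n_ij, d_ij in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(h, numer, denom)]
    return h

# Hypothetical similarity matrix with two blocks of mutually similar items.
a = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
h0 = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.7], [0.1, 0.9]]
h = snmf(a, h0)
# Read each row of H as soft cluster memberships; take the largest entry.
labels = [row.index(max(row)) for row in h]
```

Interpreting the rows of H as membership weights is what links the factorization to the probabilistic clustering mentioned in the abstract.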
An improved algorithm for clustering gene expression data.
Bandyopadhyay, Sanghamitra; Mukhopadhyay, Anirban; Maulik, Ujjwal
2007-11-01
Recent advancements in microarray technology allow simultaneous monitoring of the expression levels of a large number of genes over different time points. Clustering is an important tool for analyzing such microarray data, typical properties of which are its inherent uncertainty, noise and imprecision. In this article, a two-stage clustering algorithm, which employs a recently proposed variable string length genetic scheme and a multiobjective genetic clustering algorithm, is proposed. It is based on the novel concept of points having significant membership to multiple classes. An iterated version of the well-known Fuzzy C-Means is also utilized for clustering. The significant superiority of the proposed two-stage clustering algorithm as compared to the average linkage method, Self Organizing Map (SOM) and a recently developed weighted Chinese restaurant-based clustering method (CRC), widely used methods for clustering gene expression data, is established on a variety of artificial and publicly available real life data sets. The biological relevance of the clustering solutions is also analyzed.
Improved insensitive to input parameters trajectory clustering algorithm
Institute of Scientific and Technical Information of China (English)
Jiashun Chen; Dechang Pi
2013-01-01
The existing trajectory clustering algorithm (TRACLUS) is sensitive to the input parameters ε and MinLns: a small change in a parameter value can produce entirely different clustering results. To address this weakness, a shielding parameters sensitivity trajectory cluster (SPSTC) algorithm is proposed which is insensitive to the input parameters. Firstly, definitions of the core distance and reachable distance of a line segment are presented, and the algorithm generates a cluster sorting according to the core distance and reachable distance. Secondly, the reachable plots of line segment sets are constructed according to the cluster sorting and reachable distance. Thirdly, a parameterized sequence is extracted according to the reachable plot, and the final trajectory clusters are acquired based on the parameterized sequence. The parameterized sequence represents the inner cluster structure of the trajectory data. Experiments on real data sets and test data sets show that the SPSTC algorithm effectively reduces the sensitivity to the input parameters while obtaining better quality trajectory clusters.
Multilayer Traffic Network Optimized by Multiobjective Genetic Clustering Algorithm
Wen, Feng; Gen, Mitsuo; Yu, Xinjie
This paper introduces a multilayer traffic network model and a traffic network clustering method for solving the route selection problem (RSP) in car navigation systems (CNS). The purpose of the proposed method is to reduce the computation time of route selection substantially, with an acceptable loss of accuracy, by preprocessing the large traffic network into a new network form. Compared with the traditional hierarchical network method, the proposed approach preprocesses the traffic network further by means of clustering. The traffic network clustering considers two criteria. We specify a genetic clustering algorithm for traffic network clustering and use NSGA-II for calculating the multiple objective Pareto optimal set. The proposed method can overcome the size limitations when solving route selection in CNS. Solutions provided by the proposed algorithm are compared with the optimal solutions to analyze and quantify the loss of accuracy.
Critical slowing down of cluster algorithms for Ising models coupled to 2-d gravity
Bowick, Mark; Falcioni, Marco; Harris, Geoffrey; Marinari, Enzo
1994-02-01
We simulate single and multiple Ising models coupled to 2-d gravity using both the Swendsen-Wang and Wolff algorithms to update the spins. We study the integrated autocorrelation time and find that there is considerable critical slowing down, particularly in the magnetization. We argue that this is primarily due to the local nature of the dynamical triangulation algorithm and to the generation of a distribution of baby universes which inhibits cluster growth.
Critical Slowing Down of Cluster Algorithms for Ising Models Coupled to 2-d Gravity
Bowick, M; Harris, G; Marinari, E
1994-01-01
We simulate single and multiple Ising models coupled to 2-d gravity using both the Swendsen-Wang and Wolff algorithms to update the spins. We study the integrated autocorrelation time and find that there is considerable critical slowing down, particularly in the magnetization. We argue that this is primarily due to the local nature of the dynamical triangulation algorithm and to the generation of a distribution of baby universes which inhibits cluster growth.
A dynamic hierarchical clustering method for trajectory-based unusual video event detection.
Jiang, Fan; Wu, Ying; Katsaggelos, Aggelos K
2009-04-01
The proposed unusual video event detection method is based on unsupervised clustering of object trajectories, which are modeled by hidden Markov models (HMM). The novelty of the method includes a dynamic hierarchical process incorporated in the trajectory clustering algorithm to prevent model overfitting and a 2-depth greedy search strategy for efficient clustering.
Morphology of Open Clusters NGC 1857 and Czernik 20 using Clustering Algorithms
Bhattacharya, Souradeep; Pandaokar, Samay; Singh, Parikshit Kishor
2016-01-01
The morphology and cluster membership of the Galactic open clusters - Czernik 20 and NGC 1857 were analyzed using two different clustering algorithms. We present the maiden use of density-based spatial clustering of applications with noise (DBSCAN) to determine open cluster morphology from spatial distribution. The region of analysis has also been spatially classified using a statistical membership determination algorithm. We utilized near infrared (NIR) data for a suitably large region around the clusters from the United Kingdom Infrared Deep Sky Survey Galactic Plane Survey star catalogue database, and also from the Two Micron All Sky Survey star catalogue database. The densest regions of the cluster morphologies (1 for Czernik 20 and 2 for NGC 1857) thus identified were analyzed with a K-band extinction map and color-magnitude diagrams (CMDs). To address significant discrepancy in known distance and reddening parameters, we carried out field decontamination of these CMDs and subsequent isochrone fitting of...
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Emmons, Scott; Gallant, Mike; Börner, Katy
2016-01-01
Notions of community quality underlie network clustering. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms -- Blondel, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 o...
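To make two of the stand-alone metrics named in the abstract above concrete, here is a small sketch of modularity and conductance for an undirected, unweighted graph, using their standard textbook definitions. The function names, the edge-list input format, and the toy graph (two triangles joined by one edge) are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity: sum over communities of L_c/m - (d_c / 2m)^2.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

def conductance(edges, community, label):
    """Cut size of community `label` divided by the smaller side's volume.

    Lower values mean a better separated community. Assumes both sides of
    the cut have nonzero volume.
    """
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if community[node] == label:
                vol_in += 1
            else:
                vol_out += 1
        if (community[u] == label) != (community[v] == label):
            cut += 1
    return cut / min(vol_in, vol_out)

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, community)
phi = conductance(edges, community, "a")
```

For this toy graph the hand calculation gives modularity 2·(3/7 − (7/14)²) = 5/14 and conductance 1/7 for community "a", which is a quick way to check the implementation.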
Sampling Within k-Means Algorithm to Cluster Large Datasets
Energy Technology Data Exchange (ETDEWEB)
Bejarano, Jeremy [Brigham Young University; Bose, Koushiki [Brown University; Brannan, Tyler [North Carolina State University; Thomas, Anita [Illinois Institute of Technology; Adragni, Kofi [University of Maryland; Neerchal, Nagaraj [University of Maryland; Ostrouchov, George [ORNL
2011-08-01
Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the amount of data analyzed. We perform a simulation study to compare our sampling based k-means to the standard k-means algorithm by analyzing both the speed and accuracy of the two methods. Results show that our algorithm is significantly more efficient than the existing algorithm with comparable accuracy. Further work on this project might include a more comprehensive study both on more varied test datasets and on real weather datasets. This is especially important considering that this preliminary study was performed on rather tame datasets. Also, these datasets should be used to analyze the performance of the algorithm for varied values of k. Lastly, this paper showed that the algorithm was accurate for relatively low sample sizes. We would like to analyze this further to see how accurate the algorithm is for even lower sample sizes. We could find the lowest sample sizes, by manipulating width and confidence level, for which the algorithm would be acceptably accurate. In order for our algorithm to be a success, it needs to meet two benchmarks: match the accuracy of the standard k-means algorithm and significantly reduce runtime. Both goals are accomplished for all six datasets analyzed. However, on datasets of three and four dimensions, as the data becomes more difficult to cluster, both algorithms fail to obtain the correct classifications on some trials. Nevertheless, our algorithm consistently matches the performance of the standard algorithm while becoming remarkably more efficient with time. Therefore, we conclude that analysts can use our algorithm, expecting accurate results in considerably less time.
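The general idea described in the abstract above, running k-means on a sample and then applying the result to the full dataset, can be sketched as follows. This is an illustrative assumption about how such a scheme might look, not the authors' algorithm: the abstract does not say how the sample size is chosen or how the remaining points are handled, so `sample_size` is a plain parameter here and every point is simply assigned to its nearest sampled center. All function names and the synthetic data are hypothetical.

```python
import random

def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])))

def kmeans(points, k, iters=50, rng=None):
    """Plain Lloyd iteration on a list of coordinate tuples; returns the centers."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a group happens to be empty.
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sampled_kmeans(points, k, sample_size, seed=0):
    """Fit centers on a random sample, then label every point in the dataset."""
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)
    centers = kmeans(sample, k, rng=rng)
    labels = [nearest(p, centers) for p in points]
    return centers, labels

# Synthetic data: two well separated 2-D blobs of 200 points each.
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(200)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(200)]
centers, labels = sampled_kmeans(points, k=2, sample_size=40)
```

The Lloyd iterations here touch only the 40 sampled points, and the full dataset is scanned once at the end for labelling, which is where the runtime saving in a scheme like this would come from.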
GDCluster: A General Decentralized Clustering Algorithm
Mashayekhi, Hoda; Habibi, Jafar; Khalafbeigi, Tania; Voulgaris, Spyros; van Steen, Martinus Richardus
In many popular applications like peer-to-peer systems, large amounts of data are distributed among multiple sources. Analysis of this data and identifying clusters is challenging due to processing, storage, and transmission costs. In this paper, we propose GDCluster, a general fully decentralized
A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering
Chahine, Firas Safwan
2012-01-01
Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithm, but has a major…
Effective FCM noise clustering algorithms in medical images.
Kannan, S R; Devi, R; Ramathilagam, S; Takezawa, K
2013-02-01
The main motivation of this paper is to introduce a class of robust non-Euclidean distance measures for the original data space, in order to derive new objective functions and thus cluster the non-Euclidean structures in data, enhancing the robustness of the original clustering algorithms to noise and outliers. The new objective functions of the proposed algorithms are realized by incorporating the noise clustering concept into the entropy based fuzzy C-means algorithm, with a suitable noise distance that is employed to take the information about noisy data into account in the clustering process. This paper obtains initial cluster prototypes using a prototype initialization method, so that the final result can be reached in fewer iterations. To evaluate the performance of the proposed methods in reducing the noise level, experimental work has been carried out with a synthetic image which is corrupted by Gaussian noise. The superiority of the proposed methods has been examined through the experimental study on medical images. The experimental results show that the proposed algorithms perform significantly better than the standard existing algorithms. The accurate classification percentage of the proposed fuzzy C-means segmentation method is obtained using the silhouette validity index.
Robustness of the ATLAS pixel clustering neural network algorithm
The ATLAS collaboration
2016-01-01
Proton-proton collisions at the energy frontier put strong constraints on track reconstruction algorithms. In the ATLAS track reconstruction algorithm, an artificial neural network is utilised to identify and split clusters of neighbouring read-out elements in the ATLAS pixel detector created by multiple charged particles. The robustness of the neural network algorithm is presented, probing its sensitivity to uncertainties in the detector conditions. The robustness is studied by evaluating the stability of the algorithm's performance under a range of variations in the inputs to the neural networks. Within reasonable variation magnitudes, the neural networks prove to be robust to most variation types.
Study of the dynamic behavior of Niedermayer's algorithm
Girardi, Daniel; Branco, Nilton
2010-03-01
We calculate the dynamic exponent for the Niedermayer algorithm applied to the two-dimensional Ising and XY models, for various values of the free parameter E0. For E0=-1 we reobtain the Metropolis algorithm and for E0=1 we regain the Wolff algorithm. For -1L, the Niedermayer algorithm is equivalent to the Metropolis one, i.e., they have the same dynamic exponent. For a given size L, the correlation time is always greater for the Niedermayer algorithm than for Wolff's. For E0>1, the mean size of the islands of turned spins grows faster than a power of L and the correlation time is always greater than for the Wolff algorithm. Therefore, we show that the best choice of cluster algorithm is the Wolff one, when compared to the Niedermayer generalization. We also obtain the dynamic behavior of the Wolff algorithm: although not conclusive, we propose a scaling law for the dependence of the correlation time on L.
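For readers unfamiliar with the Wolff algorithm referred to in the abstract above, a single Wolff cluster update for the two-dimensional Ising model on a periodic square lattice can be sketched as below. This is the standard single-cluster update with bond probability 1 − exp(−2βJ), not the Niedermayer generalization studied in the paper. The function name, the list-of-lists lattice representation, and the default coupling `J = 1.0` are assumptions for illustration.

```python
import math
import random

def wolff_update(spins, beta, rng, J=1.0):
    """Grow and flip one Wolff cluster in place; return the cluster size.

    spins: L x L list of lists with entries +1 or -1 (periodic boundaries, L >= 3).
    beta:  inverse temperature.
    rng:   a random.Random instance.
    """
    L = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta * J)  # probability to add an aligned neighbor
    i, j = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[i][j]
    spins[i][j] = -seed_spin  # flip while growing; a flipped site is already in the cluster
    stack = [(i, j)]
    size = 1
    while stack:
        x, y = stack.pop()
        neighbors = (((x + 1) % L, y), ((x - 1) % L, y),
                     (x, (y + 1) % L), (x, (y - 1) % L))
        for nx, ny in neighbors:
            # Only sites still carrying the original seed orientation can join.
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size

# Example: a few updates on an 8 x 8 lattice near the 2-D critical coupling.
rng = random.Random(0)
lattice = [[1] * 8 for _ in range(8)]
sizes = [wolff_update(lattice, beta=0.44, rng=rng) for _ in range(10)]
```

Two limiting cases are easy to check by hand: at very large β every aligned neighbor is added, so a fully ordered lattice flips as a whole, and at β = 0 the cluster is always the single seed site.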
World Wide Web Metasearch Clustering Algorithm
Directory of Open Access Journals (Sweden)
Adina LIPAI
2008-01-01
As the storage capacity and the processing speed of search engines grow to keep up with the constant expansion of the World Wide Web, the user faces an ever longer list of results for a given query. A simple query composed of common words sometimes has hundreds or even thousands of results, making it practically impossible for the user to verify all of them in order to identify a particular site. Even when the list of results is presented to the user ordered by rank, most of the time this is not sufficient to help the user identify the most relevant sites for the query. The concept of search result clustering was introduced as a solution to this situation. The process of clustering search results consists of building up thematically homogeneous groups from the initial list of results provided by classic search tools, using characteristics present within the initial results, without any kind of predefined categories.
Lee, Chongdeuk; Jeong, Taegwon
2011-01-01
Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the node distribution found in dynamic properties due to mobility and flat structures and disturbance of the cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In the simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results showed that the proposed FRCA achieves better performance than that of the other existing mechanisms.
Summarizing Relational Data Using Semi-Supervised Genetic Algorithm-Based Clustering Techniques
Directory of Open Access Journals (Sweden)
Rayner Alfred
2010-01-01
Problem statement: In solving a classification problem in relational data mining, traditional methods, for example C4.5 and its variants, usually require data transformations from datasets stored in multiple tables into a single table. Unfortunately, we may lose some information when we join tables with a high degree of one-to-many association. Therefore, data transformation becomes tedious trial-and-error work and the classification result is often not very promising, especially when the number of tables and the degree of one-to-many association are large. Approach: We proposed a genetic semi-supervised clustering technique as a means of aggregating data stored in multiple tables to facilitate the task of solving a classification problem in a relational database. This algorithm is suitable for classification of datasets with a high degree of one-to-many associations. It can be used in two ways. One is user-controlled clustering, where the user may control the result of clustering by varying the compactness of the spherical cluster. The other is automatic clustering, where a non-overlap clustering strategy is applied. In this study, we use the latter method to dynamically cluster multiple instances, as a means of aggregating them, and illustrate the effectiveness of this method using the semi-supervised genetic algorithm-based clustering technique. Results: The experimental results showed that using the reciprocal of the Davies-Bouldin Index for cluster dispersion and the reciprocal of the Gini Index for cluster purity as the fitness function in the Genetic Algorithm (GA) finds solutions with much greater accuracy. The results obtained in this study showed that automatic clustering (seeding), by optimizing the cluster dispersion or cluster purity alone using GA, provides good results compared to traditional k-means clustering. However, the best result can be achieved by optimizing the combination values of both the cluster
Efficient Clustering of Web Search Results Using Enhanced Lingo Algorithm
Directory of Open Access Journals (Sweden)
M. Manikantan
2015-02-01
Web query optimization is the focus of recent research and development efforts. To fetch the required information, users rely on search engines and sometimes on website interfaces. One approach is search engine optimization, which website developers use to popularize their websites through search engine results. Clustering is a main task of the explorative data mining process and a common technique for grouping web search results into different categories based on the specific web contents. A clustering search engine called Lingo used only snippets to cluster the documents. Though this method takes less time to cluster the documents, it could not produce clusters of good quality. This study focuses on clustering all documents by applying semantic similarity between words and then applying a modified Lingo algorithm, in order to produce good quality clusters in less time.
A Novel Hybrid Data Clustering Algorithm Based on Artificial Bee Colony Algorithm and K-Means
Institute of Scientific and Technical Information of China (English)
TRAN Dang Cong; WU Zhijian; WANG Zelin; DENG Changshou
2015-01-01
To improve the performance of the K-means clustering algorithm, this paper presents a new hybrid approach of Enhanced artificial bee colony algorithm and K-means (EABCK). In EABCK, the original artificial bee colony algorithm (called ABC) is enhanced by a new mutation operation and guided by the global best solution (called EABC). Then, the best solution is updated by K-means in each iteration for data clustering. In the experiments, a set of benchmark functions was used to evaluate the performance of EABC with other comparative ABC variants. To evaluate the performance of EABCK on data clustering, eleven benchmark datasets were utilized. The experimental results show that EABC and EABCK outperform other comparative ABC variants and data clustering algorithms, respectively.
AN IMPROVED FUZZY CLUSTERING ALGORITHM FOR MICROARRAY IMAGE SPOTS SEGMENTATION
Directory of Open Access Journals (Sweden)
V.G. Biju
2015-11-01
An automatic cDNA microarray image processing method using an improved fuzzy clustering algorithm is presented in this paper. The proposed spot segmentation algorithm uses the gridding technique developed earlier by the authors to find the co-ordinates of each spot in an image. Automatic cropping of spots from the microarray image is done using these co-ordinates. The present paper proposes an improved fuzzy clustering algorithm, Possibility fuzzy local information c means (PFLICM), to segment the spot foreground (FG) from the background (BG). PFLICM improves the fuzzy local information c means (FLICM) algorithm by incorporating the typicality of a pixel along with gray level information and local spatial information. The performance of the algorithm is validated using a set of simulated cDNA microarray images with different levels of additive white Gaussian noise (AWGN). The strength of the algorithm is tested by computing parameters such as the Segmentation matching factor (SMF), Probability of error (pe), Discrepancy distance (D) and Normal mean square error (NMSE). The SMF value obtained for the PFLICM algorithm shows an improvement of 0.9 % and 0.7 % for high noise and low noise microarray images respectively compared to the FLICM algorithm. The PFLICM algorithm is also applied to real microarray images and gene expression values are computed.
Application of genetic algorithms to hydrogenated silicon clusters
Indian Academy of Sciences (India)
N Chakraborti; R Prasad
2003-01-01
We discuss the application of biologically inspired genetic algorithms to determine the ground state structures of a number of Si–H clusters. The total energy of a given configuration of a cluster has been obtained by using a non-orthogonal tight-binding model and the energy minimization has been carried out by using genetic algorithms and their recent variant differential evolution. Our results for ground state structures and cohesive energies for Si–H clusters are in good agreement with the earlier work conducted using the simulated annealing technique. We find that the results obtained by genetic algorithms turn out to be comparable and often better than the results obtained by the simulated annealing technique.
Spin chain simulations with a meron cluster algorithm
Energy Technology Data Exchange (ETDEWEB)
Boyer, T. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Ecole Normale Superieure de Cachan (France); Bietenholz, W. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC; Wuilloud, J. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Geneve Univ. (Switzerland). Dept. de Physique Theorique
2007-01-15
We apply a meron cluster algorithm to the XY spin chain, which describes a quantum rotor. This is a multi-cluster simulation supplemented by an improved estimator, which deals with objects of half-integer topological charge. This method is powerful enough to provide precise results for the model with a θ-term - it is therefore one of the rare examples, where a system with a complex action can be solved numerically. In particular we measure the correlation length, as well as the topological and magnetic susceptibility. We discuss the algorithmic efficiency in view of the critical slowing down. Due to the excellent performance that we observe, it is strongly motivated to work on new applications of meron cluster algorithms in higher dimensions.
Adaptive Weighted Clustering Algorithm for Mobile Ad-hoc Networks
Directory of Open Access Journals (Sweden)
Adwan Yasin
2016-04-01
In this paper we present a new algorithm for clustering MANETs that considers several parameters. It is a new adaptive load balancing technique for clustering Mobile Ad-hoc Networks (MANETs). A MANET is a special kind of wireless network in which no central management exists and the nodes cooperatively manage the network and maintain connectivity. The algorithm takes into account the local capabilities of each node, the remaining battery power, the degree of connectivity and finally the power consumption based on the average distance between the nodes and the candidate cluster head. The proposed algorithm efficiently decreases the overhead in the network, which enhances the overall MANET performance. Reducing the maintenance time of broken routes makes the network more stable and reliable. Saving the power of the nodes also helps to guarantee a consistent and reliable network.
A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis
Directory of Open Access Journals (Sweden)
Shaoning Li
2017-01-01
In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited by their computational complexity, noise resistance and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.
Energy Efficient Homogenous Clustering and Cluster Head Selection Algorithm for WSN
Directory of Open Access Journals (Sweden)
Ganeshayya I. Shidaganti
2013-02-01
Wireless sensor networks (WSNs) are energy and resource constrained networks made up of small electronic devices called sensor nodes. Each sensor node is capable of sensing, computing and transmitting data from one node to another until the data reach the base station. Each node monitors physical or environmental conditions, depending on the application, and communicates with nearby nodes via radio broadcast. Radio transmission and reception consume a lot of energy in a wireless sensor network, so one of the important issues is the inherently limited battery power of the sensor nodes. Battery power is therefore a crucial parameter in algorithm design for maximizing the lifespan of sensor nodes. Much research has been done in recent years in the area of low power routing protocols, but many design options are still open for improvement, and further research targeted at specific applications needs to be done. In this paper, we propose a new approach of an energy-efficient homogeneous clustering and cluster head selection algorithm for wireless sensor networks in which the lifespan of the network is increased by ensuring a homogeneous distribution of nodes in the clusters. In this clustering algorithm, energy efficiency is distributed and network performance is improved by selecting cluster heads on the basis of the residual energy of existing cluster heads, holdback value, and nearest hop distance of the node. In the proposed clustering algorithm, the cluster members are uniformly distributed and the life of the network is further extended
An empirical study of dynamic graph algorithms
Energy Technology Data Exchange (ETDEWEB)
Alberts, D. [Freie Universitaet Berlin (Germany); Cattaneo, G. [Universita di Salerno (Italy); Italiano, G.F. [Universita Ca' Foscari di Venezia (Italy)
1996-12-31
We conduct an empirical study on some dynamic graph algorithms which were developed recently. The following implementations were tested and compared with simple algorithms: dynamic connectivity, and dynamic minimum spanning tree based on sparsification by Eppstein et al.; dynamic connectivity based on a very recent paper by Henzinger and King. In our experiments, we considered both random and non-random inputs. Moreover, we present a simplified variant of the algorithm by Henzinger and King, which for random inputs was always faster than the original implementation. Indeed, this variant was among the fastest implementations for random inputs. For non-random inputs, sparsification was the fastest algorithm for small sequences of updates; for medium and large sequences of updates, the original algorithm by Henzinger and King was faster. Perhaps one of the main practical results of this paper is that our implementations of the sophisticated dynamic graph algorithms were faster than simpler algorithms for most practical values of the graph parameters, and competitive with simpler algorithms even in case of very small graphs (say graphs with less than a dozen vertices and edges). From the theoretical point of view, we analyze the average case running time of sparsification and prove that the logarithmic overhead for simple sparsification vanishes for dynamic random graphs.
A REAL-TIME C-V CLUSTERING ALGORITHM FOR WEB-MINING
Institute of Scientific and Technical Information of China (English)
Li Haiying; Zhuang Zhenquan; Li Bin; Wan Ke
2002-01-01
In this letter, a real-time C-V (Characteristic-Vector) clustering algorithm is put forth to handle the vast amount of action data that is dynamically collected from a web site. The algorithm uses the concept of the C-V to denote characteristics, and adopts two-valued [0,1] input and a self-defined vigilance parameter to design the clustering architecture. The Vector Degree of Matching (VDM) plays a key role in the clustering algorithm, as it determines the magnitude of a typical characteristic. By means of stability analysis, the classifications are confirmed to have a reliably hierarchical structure when the vigilance parameter shifts from 0.1 to 0.99. This non-linear relation between the vigilance parameter and the classification upper limit helps to mine representative classifications of net users according to the actual web resources; the administering system can then map them to the web resource space to implement intelligent configuration effectively and rapidly.
A Throughput-Driven Scheduling Algorithm of Differentiated Service for Web Cluster
Institute of Scientific and Technical Information of China (English)
None listed
2006-01-01
Request distribution is a key technology for Web cluster servers. This paper presents a throughput-driven scheduling algorithm (TDSA). The algorithm uses the throughput of the cluster back-ends to evaluate their load and employs a neural network model to predict the future load, so that the scheduling system features a self-learning capability and good adaptability to changes in load. Moreover, it separates static requests from dynamic requests to make full use of the CPU resources and takes the locality of requests into account to improve the cache hit ratio. Experimental results from the WebBench™ testing tool show better performance for a Web cluster server with TDSA than with traditional scheduling algorithms.
NCUBE - A clustering algorithm based on a discretized data space
Eigen, D. J.; Northouse, R. A.
1974-01-01
Cluster analysis involves the unsupervised grouping of data. The process provides an automatic procedure for generating known training samples for pattern classification. NCUBE, the clustering algorithm presented, is based upon the concept of imposing a gridwork on the data space. The NCUBE computer implementation of this concept provides an easily derived form of piecewise linear discrimination. This piecewise linear discrimination permits the separation of some types of data groups that are not linearly separable.
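The gridwork idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the NCUBE implementation: the function name `grid_cluster`, the fixed `cell_size`, and the rule that occupied cells touching at a face or corner belong to the same cluster are all hypothetical choices made for the example.

```python
from collections import deque
from itertools import product


def grid_cluster(points, cell_size):
    """Label points by connected groups of occupied grid cells (illustrative only)."""
    # Map each point to the integer coordinates of the cell that contains it.
    cells = {}
    for idx, p in enumerate(points):
        key = tuple(int(x // cell_size) for x in p)
        cells.setdefault(key, []).append(idx)

    # All neighbour offsets in the grid (faces, edges and corners), excluding zero.
    dim = len(points[0])
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    labels = [None] * len(points)
    seen = set()
    next_label = 0
    for start in cells:
        if start in seen:
            continue
        # Breadth-first search over neighbouring occupied cells.
        queue = deque([start])
        seen.add(start)
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = next_label
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in cells and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        next_label += 1
    return labels


points = [(0.1, 0.2), (0.4, 0.9), (1.2, 0.8), (5.1, 5.3), (5.9, 6.2)]
labels = grid_cluster(points, cell_size=1.0)
```

Because cluster boundaries follow cell faces, the resulting decision regions are piecewise linear, which is the property the abstract highlights.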
Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs
Choi, Woo-Yong; Chatterjee, Mainak
2015-03-01
With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that end, we propose a real-time cluster-based multipolling sequencing algorithm that eliminates more than 90% of the polling overhead, particularly when the dynamic search algorithm fails to derive the multipolling sequence in real time.
A Rough Set based Gene Expression Clustering Algorithm
Directory of Open Access Journals (Sweden)
J. J. Emilyn
2011-01-01
Problem statement: Microarray technology helps in monitoring the expression levels of thousands of genes across collections of related samples. Approach: The main goal in the analysis of large and heterogeneous gene expression datasets was to identify groups of genes that get expressed in a set of experimental conditions. Results: Several clustering techniques have been proposed for identifying gene signatures and understanding their role, and many of them have been applied to gene expression data, but with partial success. The main aim of this work was to develop a clustering algorithm that would successfully identify gene patterns. The proposed novel clustering technique (RCGED) provides an efficient way of finding the hidden and unique gene expression patterns. It overcomes the restriction of one object being placed in only one cluster. Conclusion/Recommendations: The proposed algorithm is termed intelligent because it automatically determines the optimum number of clusters. The proposed algorithm was tested on a colon cancer dataset and the results were compared with the Rough Fuzzy K Means algorithm.
Core Business Selection Based on Ant Colony Clustering Algorithm
Directory of Open Access Journals (Sweden)
Yu Lan
2014-01-01
Core business is the most important business of an enterprise with diversified operations. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current state of the company's development, its core business is identified by the ant colony clustering algorithm. The results indicate that the proposed method is an effective way to determine the core business of a company.
Research on Scheduling Algorithms in Web Cluster Servers
Institute of Scientific and Technical Information of China (English)
LEI YingChun (雷迎春); GONG YiLi (龚奕利); ZHANG Song (张松); LI GuoJie (李国杰)
2003-01-01
This paper analyzes quantitatively the impact of the load balance scheduling algorithms and the locality scheduling algorithms on the performance of Web cluster servers, and brings forward the Adaptive_LARD algorithm. Compared with the representative LARD algorithm, the advantages of the Adaptive_LARD are that: (1) it adjusts load distribution among the back-ends through the idea of load balancing to avoid learning steps in the LARD algorithm and reinforce its adaptability; (2) by distinguishing between TCP connections accessing disks and those accessing cache memory, it can estimate the impact of different connections on the back-ends' load more precisely. Performance evaluations suggest that the proposed method outperforms the LARD algorithm by up to 14.7%.
Identifying multiple influential spreaders by a heuristic clustering algorithm
Energy Technology Data Exchange (ETDEWEB)
Bao, Zhong-Kui [School of Mathematical Science, Anhui University, Hefei 230601 (China)]; Liu, Jian-Guo [Data Science and Cloud Service Research Center, Shanghai University of Finance and Economics, Shanghai, 200133 (China)]; Zhang, Hai-Feng, E-mail: haifengzhang1978@gmail.com [School of Mathematical Science, Anhui University, Hefei 230601 (China); Department of Communication Engineering, North University of China, Taiyuan, Shan'xi 030051 (China)]
2017-03-18
The problem of influence maximization in social networks has attracted much attention. However, traditional centrality indices are suitable for the case where a single spreader is chosen as the spreading source. Often, the spreading process is initiated by simultaneously choosing multiple nodes as the spreading sources. In this situation, choosing the top-ranked nodes as multiple spreaders is not an optimal strategy, since the chosen nodes are not sufficiently scattered in the network. The ideal situation for the multiple-spreader case is therefore that the spreaders are both influential and dispersively distributed in the network, but it is difficult to meet the two conditions together. In this paper, we propose a heuristic clustering (HC) algorithm based on the similarity index to classify nodes into different clusters, and the center nodes of the clusters are then chosen as the multiple spreaders. The HC algorithm not only ensures that the multiple spreaders are dispersively distributed in the network but also prevents the selected nodes from being “negligible”. Our experimental results on synthetic and real networks indicate that the HC method performs better on influence maximization than the traditional methods. - Highlights: • A heuristic clustering algorithm is proposed to identify the multiple influential spreaders in complex networks. • The algorithm can not only guarantee that the selected spreaders are sufficiently scattered but also prevent them from being “insignificant”. • The performance of our algorithm is generally better than that of other methods, on both real and synthetic networks.
Limited Random Walk Algorithm for Big Graph Data Clustering
Zhang, Honglei; Kiranyaz, Serkan; Gabbouj, Moncef
2016-01-01
Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attracting vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrass...
A Genetic Clustering Algorithm for Mean-Residual Vector Quantization
Institute of Scientific and Technical Information of China (English)
CHU Shuchuan; John F. Roddick; CHEN Tsongyi
2004-01-01
Vector quantization (VQ) is a useful tool for data compression and can be applied to compress the data vectors in a database. The quality of the recovered data vector depends on a good codebook. Mean-residual vector quantization (M/R VQ) has been shown to be efficient in encoding time and to need only a little storage. In this paper, genetic algorithms in combination with the generalized Lloyd algorithm (GLA) are applied to the codebook design of M/R VQ. The mean codebook and the residual codebook are trained separately using the GLA, and genetic algorithms (GA) are then used to evaluate and evolve the combined mean codebook and residual codebook. The parameters used in the proposed algorithm are chosen on the basis of experiments, and the proposed GA-based clustering algorithm for M/R VQ is robust to them. Experimental results demonstrate that the proposed genetic clustering algorithm applied to M/R VQ may improve the peak signal-to-noise ratio of the recovered data vector compared with the GLA.
Critical Dynamics Behavior of the Wolff Algorithm in the Site-Bond-Correlated Ising Model
Campos, P. R. A.; Onody, R. N.
Here we apply the Wolff single-cluster algorithm to the site-bond-correlated Ising model and study its critical dynamical behavior. We have verified that the autocorrelation time diminishes in the presence of dilution and correlation, showing that the Wolff algorithm performs even better in such situations. The critical dynamical exponents are also estimated.
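For readers unfamiliar with the Wolff single-cluster update, the sketch below shows one step on the plain two-dimensional Ising model with periodic boundaries. It is a minimal illustration, assuming coupling J = 1 and no dilution or site-bond correlation, so it is not the model studied in the abstract; the function name `wolff_step` and the flat-list lattice layout are choices made for the example.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """One Wolff update on an L x L periodic Ising lattice; returns the cluster size."""
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond probability for coupling J = 1
    seed = rng.randrange(L * L)
    s0 = spins[seed]
    spins[seed] = -s0          # flip on visit, so flipped sites cannot be re-added
    stack = [seed]
    size = 1
    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        neighbours = (
            ((x + 1) % L) + y * L,
            ((x - 1) % L) + y * L,
            x + ((y + 1) % L) * L,
            x + ((y - 1) % L) * L,
        )
        for nb in neighbours:
            # Only aligned neighbours can join, each with probability p_add.
            if spins[nb] == s0 and rng.random() < p_add:
                spins[nb] = -s0
                stack.append(nb)
                size += 1
    return size


rng = random.Random(0)
L = 8
spins = [1] * (L * L)
sizes = [wolff_step(spins, L, beta=0.44, rng=rng) for _ in range(100)]
```

Recording an observable such as the energy or the cluster size after each step gives the time series from which autocorrelation times are estimated.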
Efficient Algorithms for Langevin and DPD Dynamics.
Goga, N; Rzepiela, A J; de Vries, A H; Marrink, S J; Berendsen, H J C
2012-10-09
In this article, we present several algorithms for stochastic dynamics, including Langevin dynamics and different variants of Dissipative Particle Dynamics (DPD), applicable to systems with or without constraints. The algorithms are based on the impulsive application of friction and noise, thus avoiding the computational complexity of algorithms that apply continuous friction and noise. Simulation results on thermostat strength and diffusion properties for ideal gas, coarse-grained (MARTINI) water, and constrained atomic (SPC/E) water systems are discussed. We show that the measured thermal relaxation rates agree well with theoretical predictions. The influence of various parameters on the diffusion coefficient is discussed.
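The idea of applying friction and noise as an impulse can be illustrated with a single velocity update. The sketch below shows one common form of such a step, which keeps the Maxwell-Boltzmann velocity distribution stationary; it is not claimed to be the exact scheme of the article, and the function name `impulsive_thermostat` and the parameter names `gamma`, `dt`, `kT` and `mass` are assumptions made for the example.

```python
import math
import random


def impulsive_thermostat(v, gamma, dt, kT, mass, rng):
    """Apply friction and noise to one velocity component as a single impulse."""
    f = 1.0 - math.exp(-gamma * dt)               # fraction of velocity removed
    sigma = math.sqrt(f * (2.0 - f) * kT / mass)  # noise that restores kT / mass
    return (1.0 - f) * v + sigma * rng.gauss(0.0, 1.0)


rng = random.Random(1)
kT, mass = 2.0, 4.0
v, total, n_steps = 0.0, 0.0, 50000
for _ in range(n_steps):
    v = impulsive_thermostat(v, gamma=1.0, dt=0.5, kT=kT, mass=mass, rng=rng)
    total += v * v
mean_v2 = total / n_steps  # should fluctuate around kT / mass
```

The noise amplitude follows from requiring that a velocity variance of kT/m is unchanged by the step: (1 - f)^2 kT/m + f(2 - f) kT/m = kT/m.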
An Airborne Radar Clutter Tracking Algorithm Based on Multifractal and Fuzzy C-Mean Cluster
Institute of Scientific and Technical Information of China (English)
Wei Zhang; Sheng-Lin Yu; Gong Zhang
2007-01-01
For an airborne lookdown radar, clutter power often changes dynamically by about 80 dB with wide distributions as the platform moves. Therefore, clutter tracking techniques are required to guide the selection of constant false alarm rate (CFAR) schemes. In this work, clutter tracking is done in the image domain, and an algorithm combining multifractal analysis and fuzzy C-mean (FCM) clustering is proposed. The clutter, which has large dynamic distributions in power density, is converted to steady distributions of multifractal exponents by the multifractal transformation with the optimum moment. The main lobe and side lobe are then tracked from the multifractal exponents by the FCM clustering method.
A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.
Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing
2017-01-01
An important objective of a wireless sensor network is to prolong the network life cycle, and topology control is of great significance for extending the network life cycle. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, a fuzzy clustering algorithm is first used for the initial clustering of sensor nodes according to geographical location, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and the distance factors of the wireless sensor network. Finally, the cluster head nodes in the hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method reduces the mortality rate of nodes and extends the network life cycle.
Cluster dynamics largely shapes protoplanetary disc sizes
Vincke, Kirsten
2016-01-01
It is still an open question to what degree the cluster environment influences the sizes of protoplanetary discs surrounding young stars. This is particularly so for the short-lived clusters typical of the solar neighbourhood, in which the stellar density, and therefore the influence of the cluster environment, changes considerably over the first 10 Myr. Previous studies have often neglected the effect of the gas on the cluster dynamics; this is remedied here. Using the code NBody6++ we study the stellar dynamics in different developmental phases (embedded, expulsion, expansion) including the gas, and quantify the effect of fly-bys on the disc size. We concentrate on massive clusters ($M_{\text{cl}} \geq 10^3 - 6 \cdot 10^4 M_{\text{Sun}}$), which are representative of clusters like the Orion Nebula Cluster (ONC) or NGC 6611. We find that not only the stellar density but also the duration of the embedded phase matters. The densest clusters react fastest to the gas expulsion and drop quickly in density, here 98...
Dynamic Route Guidance Using Improved Genetic Algorithms
Directory of Open Access Journals (Sweden)
Zhanke Yu
2013-01-01
This paper presents an improved genetic algorithm (IGA) for dynamic route guidance. The proposed IGA uses a vicinity crossover technique and a greedy backward mutation technique to increase the population diversity and strengthen the local search ability. Steady-state reproduction is introduced to protect the optimized genetic individuals. Furthermore, the junction delay is introduced into the fitness function. The simulation results show the effectiveness of the proposed algorithm.
Clustering algorithms for Stokes space modulation format recognition
DEFF Research Database (Denmark)
Boada, Ricard; Borkowski, Robert; Tafur Monroy, Idelfonso
2015-01-01
Stokes space modulation format recognition (Stokes MFR) is a blind method enabling digital coherent receivers to infer modulation format information directly from a received polarization-division-multiplexed signal. A crucial part of the Stokes MFR is a clustering algorithm, which largely...
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.
Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
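As a concrete instance of a stand-alone quality metric, the sketch below computes Newman modularity for a given partition of a small undirected, unweighted graph. It is a minimal illustration rather than the study's code; the function name `modularity`, the edge-list input, and the toy graph (two triangles joined by one edge) are assumptions made for the example.

```python
def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph partition."""
    m = len(edges)
    internal = {}   # number of edges with both endpoints inside each community
    degree = {}     # summed node degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, community)
```

For this toy graph each community holds 3 of the 7 edges and half of the total degree, so Q = 2 (3/7 - 1/4) = 5/14. Note that modularity needs no ground truth, whereas the information recovery metrics in the abstract compare a partition against known labels.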
The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey
Energy Technology Data Exchange (ETDEWEB)
Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer,; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U.,
2005-03-01
We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers $\approx 2600$ square degrees of sky and ranges in redshift from z = 0.02 to z = 0.17. The mean cluster membership is 36 galaxies (with redshifts) brighter than r = 17.7, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity ($L_r$), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the $\Lambda$CDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is $\approx 90$% complete and 95% pure above $M_{200} = 1 \times 10^{14}\, h^{-1} M_{\odot}$ and within $0.03 \le z \le 0.12$. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within $0.03 \le z \le 0.12$. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the $L_r$ of a cluster is a more robust estimator of the halo mass ($M_{200}$) than the galaxy line-of-sight velocity dispersion or the richness of the cluster.
Clustering molecular dynamics trajectories for optimizing docking experiments.
De Paris, Renata; Quevedo, Christian V; Ruiz, Duncan D; Norberto de Souza, Osmar; Barros, Rodrigo C
2015-01-01
Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. Particularly, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments on a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand.
Clustering Molecular Dynamics Trajectories for Optimizing Docking Experiments
Directory of Open Access Journals (Sweden)
Renata De Paris
2015-01-01
Molecular dynamics simulations of protein receptors have become an attractive tool for rational drug discovery. However, the high computational cost of employing molecular dynamics trajectories in virtual screening of large repositories threatens the feasibility of this task. Computational intelligence techniques have been applied in this context, with the ultimate goal of reducing the overall computational cost so the task can become feasible. Particularly, clustering algorithms have been widely used as a means to reduce the dimensionality of molecular dynamics trajectories. In this paper, we develop a novel methodology for clustering entire trajectories using structural features from the substrate-binding cavity of the receptor in order to optimize docking experiments on a cloud-based environment. The resulting partition was selected based on three clustering validity criteria, and it was further validated by analyzing the interactions between 20 ligands and a fully flexible receptor (FFR) model containing a 20 ns molecular dynamics simulation trajectory. Our proposed methodology shows that taking into account features of the substrate-binding cavity as input for the k-means algorithm is a promising technique for accurately selecting ensembles of representative structures tailored to a specific ligand.
A Survey on Clustering Algorithms for Heterogeneous Wireless Sensor Networks
Directory of Open Access Journals (Sweden)
Vivek Katiyar
2011-01-01
In the last few years, potential uses of wireless sensor networks (WSNs) have emerged in fields such as disaster management, battlefield surveillance and border security surveillance. In such applications, a large number of sensor nodes are deployed, which are often unattended and work autonomously. Clustering is a key technique used to extend the lifetime of a sensor network by reducing energy consumption. It can also increase network scalability. Since research on WSNs began, sensor nodes have generally been assumed to be homogeneous, but some nodes may have different energy levels in order to prolong the lifetime of a WSN and improve its reliability. In this paper, we study the impact of node heterogeneity on the performance of WSNs. This paper surveys different clustering algorithms for heterogeneous WSNs, classifying the algorithms according to various clustering attributes.
A HYBRID HEURISTIC ALGORITHM FOR THE CLUSTERED TRAVELING SALESMAN PROBLEM
Directory of Open Access Journals (Sweden)
Mário Mestria
2016-04-01
This paper proposes a hybrid heuristic algorithm, based on the metaheuristics Greedy Randomized Adaptive Search Procedure, Iterated Local Search and Variable Neighborhood Descent, to solve the Clustered Traveling Salesman Problem (CTSP). The hybrid heuristic algorithm uses several variable neighborhood structures, combining intensification (using local search operators) and diversification (constructive heuristic and perturbation routine). In the CTSP, the vertices are partitioned into clusters and all vertices of each cluster have to be visited contiguously. The CTSP is NP-hard since it includes the well-known Traveling Salesman Problem (TSP) as a special case. Our hybrid heuristic is compared with three heuristics from the literature and an exact method. Computational experiments are reported for different classes of instances. Experimental results show that the proposed hybrid heuristic obtains competitive results within reasonable computational time.
Distributed Clustering Algorithm to Explore Selection Diversity in Wireless Sensor Networks
Kong, Hyung-Yun; Asaduzzaman, Hyung-Yun
This paper presents a novel cross-layer approach to exploiting selection diversity in distributed clustering-based wireless sensor networks (WSNs) by selecting a proper cluster-head. We develop and analyze a cluster-head selection algorithm based on instantaneous channel state information (CSI) for a distributed, dynamic and randomized clustering-based WSN. The proposed cluster-head selection scheme is also random and is capable of distributing the energy use among the nodes in the network. We present an analytical approach to evaluating the energy efficiency and system lifetime of our proposal. Analysis shows that the proposed scheme outperforms the performance of additive white Gaussian noise (AWGN) channel under Rayleigh fading environment. This proposal also outperforms the existing cooperative diversity protocols in terms of system lifetime and implementation complexity.
An Efficient Cluster Algorithm for CP(N-1) Models
Beard, B B; Riederer, S; Wiese, U J
2005-01-01
We construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a new regularization for CP(N-1) models in the framework of D-theory, which is an alternative non-perturbative approach to quantum field theory formulated in terms of discrete quantum variables instead of classical fields. Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard formulation of lattice field theory. In fact, there is even a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. We present various simulations for different correlation lengths, couplings and lattice sizes. We have simulated correlation lengths up to 250 lattice spacings on lattices as large as 640x640 and we detect no evidence for critical slowing down.
Morphology of open clusters NGC 1857 and Czernik 20 using clustering algorithms
Bhattacharya, S.; Mahulkar, V.; Pandaokar, S.; Singh, P. K.
2017-01-01
The morphology and cluster membership of the Galactic open clusters Czernik 20 and NGC 1857 were analyzed using two different clustering algorithms. We present the first use of density-based spatial clustering of applications with noise (DBSCAN) to determine open cluster morphology from spatial distribution. The region of analysis has also been spatially classified using a statistical membership determination algorithm. We utilized near infrared (NIR) data for a suitably large region around the clusters from the United Kingdom Infrared Deep Sky Survey Galactic Plane Survey star catalogue database, and also from the Two Micron All Sky Survey star catalogue database. The densest regions of the cluster morphologies (1 for Czernik 20 and 2 for NGC 1857) thus identified were analyzed with a K-band extinction map and color-magnitude diagrams (CMDs). To address a significant discrepancy in known distance and reddening parameters, we carried out field decontamination of these CMDs and subsequent isochrone fitting of the cleaned CMDs to obtain reliable distance and reddening parameters for the clusters (Czernik 20: D = 2900 pc, E(J-K) = 0.33; NGC 1857: D = 2400 pc, E(J-K) = 0.18-0.19). The isochrones were also used to convert the luminosity functions for the densest regions of Czernik 20 and NGC 1857 into mass functions, to derive their slopes. Additionally, a previously unknown over-density consistent with that of a star cluster is identified in the region of analysis.
A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network.
Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue
2016-02-19
Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency.
A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network
Directory of Open Access Journals (Sweden)
Yuzhong Chen
2016-02-01
Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency.
Cardiac mitochondria exhibit dynamic functional clustering
Directory of Open Access Journals (Sweden)
Felix Tobias Kurz
2014-09-01
Multi-oscillatory behavior of mitochondrial inner membrane potential ΔΨm in self-organized cardiac mitochondrial networks can be triggered by metabolic or oxidative stress. Spatio-temporal analyses of cardiac mitochondrial networks have shown that mitochondria are heterogeneously organized in synchronously oscillating clusters in which the mean cluster frequency and size are inversely correlated, thus suggesting a modulation of cluster frequency through local inter-mitochondrial coupling. In this study, we propose a method to examine the mitochondrial network's topology through quantification of its dynamic local clustering coefficients. Individual mitochondrial ΔΨm oscillation signals were identified for each cardiac myocyte and cross-correlated with all network mitochondria using previously described methods (Kurz et al., 2010). Time-varying inter-mitochondrial connectivity, defined for mitochondria in the whole network whose signals are at least 90% correlated at any given time point, allowed considering functional local clustering coefficients. It is shown that mitochondrial clustering in isolated cardiac myocytes changes dynamically and is significantly higher than for random mitochondrial networks that are constructed using the Erdős-Rényi model based on the same sets of vertices. The network's time-averaged clustering coefficient for cardiac myocytes was found to be 0.500 ± 0.051 (N=9) versus 0.061 ± 0.020 for random networks. Our results demonstrate that cardiac mitochondria constitute a network with dynamically connected constituents whose topological organization is prone to clustering. Cluster partitioning in networks of coupled oscillators has been observed in scale-free and chaotic systems and is therefore in good agreement with previous models of cardiac mitochondrial networks (Aon et al., 2008).
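The local clustering coefficient used above can be illustrated on a thresholded correlation matrix. The sketch below is a minimal example for a single time point, not the authors' pipeline: the function name `local_clustering`, the toy correlation values, and the convention of assigning zero to nodes with fewer than two neighbours are assumptions; only the 0.9 threshold is taken from the abstract.

```python
def local_clustering(corr, threshold=0.9):
    """Local clustering coefficient of each node in a thresholded correlation graph."""
    n = len(corr)
    # Two nodes are linked when their correlation reaches the threshold.
    adj = [
        {j for j in range(n) if j != i and corr[i][j] >= threshold}
        for i in range(n)
    ]
    coeffs = []
    for i in range(n):
        k = len(adj[i])
        if k < 2:
            coeffs.append(0.0)   # convention: undefined cases count as zero
            continue
        nbrs = sorted(adj[i])
        # Count links among the neighbours of node i.
        links = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if nbrs[b] in adj[nbrs[a]]
        )
        coeffs.append(2.0 * links / (k * (k - 1)))
    return coeffs


corr = [
    [1.00, 0.95, 0.93, 0.92],
    [0.95, 1.00, 0.97, 0.10],
    [0.93, 0.97, 1.00, 0.20],
    [0.92, 0.10, 0.20, 1.00],
]
coeffs = local_clustering(corr)
```

Averaging these coefficients over nodes and over time points gives a network-level value comparable in kind to the time-averaged coefficient quoted in the abstract.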
An efficient method of key-frame extraction based on a cluster algorithm.
Zhang, Qiang; Yu, Shao-Pei; Zhou, Dong-Sheng; Wei, Xiao-Peng
2013-12-18
This paper proposes a novel method of key-frame extraction for use with motion capture data. This method is based on an unsupervised cluster algorithm. First, the motion sequence is clustered into two classes by the similarity distance of the adjacent frames so that the thresholds needed in the next step can be determined adaptively. Second, a dynamic cluster algorithm called ISODATA is used to cluster all the frames and the frames nearest to the center of each class are automatically extracted as key-frames of the sequence. Unlike many other clustering techniques, the present improved cluster algorithm can automatically address different motion types without any need for specified parameters from users. The proposed method is capable of summarizing motion capture data reliably and efficiently. The present work also provides a meaningful comparison between the results of the proposed key-frame extraction technique and other previous methods. These results are evaluated in terms of metrics that measure reconstructed motion and the mean absolute error value, which are derived from the reconstructed data and the original data.
Evaluation of clustering algorithms for protein-protein interaction networks
Directory of Open Access Journals (Sweden)
van Helden Jacques
2006-11-01
Abstract Background Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism). In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies). High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions. The interactome can be represented as a graph where nodes correspond with proteins and edges with pairwise interactions. In recent years clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE). Results A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions. Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes. We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values. We also evaluated their robustness to alterations of the test graph. We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes. Conclusion This
A heuristic approach to possibilistic clustering algorithms and applications
Viattchenin, Dmitri A
2013-01-01
The present book outlines a new approach to possibilistic clustering in which the sought clustering structure of the set of objects is based directly on the formal definition of fuzzy cluster and the possibilistic memberships are determined directly from the values of the pairwise similarity of objects. The proposed approach can be used for solving different classification problems. Here, some techniques that might be useful for this purpose are outlined, including a methodology for constructing a set of labeled objects for a semi-supervised clustering algorithm, a methodology for reducing analyzed attribute space dimensionality and a method for asymmetric data processing. Moreover, a technique for constructing a subset of the most appropriate alternatives for a set of weak fuzzy preference relations, which are defined on a universe of alternatives, is described in detail, and a method for rapidly prototyping Mamdani fuzzy inference systems is introduced. This book addresses engineers, scientist...
Cluster dynamics transcending chemical dynamics toward nuclear fusion.
Heidenreich, Andreas; Jortner, Joshua; Last, Isidore
2006-07-11
Ultrafast cluster dynamics encompasses femtosecond nuclear dynamics, attosecond electron dynamics, and electron-nuclear dynamics in ultraintense laser fields (peak intensities 10^15-10^20 W cm^-2). Extreme cluster multielectron ionization produces highly charged cluster ions, e.g., (C^4+(D^+)_4)_n and (D^+I^22+)_n at I_M = 10^18 W cm^-2, that undergo Coulomb explosion (CE) with the production of high-energy (5 keV to 1 MeV) ions, which can trigger nuclear reactions in an assembly of exploding clusters. The laser intensity and the cluster size dependence of the dynamics and energetics of CE of (D_2)_n, (HT)_n, (CD_4)_n, (DI)_n, (CD_3I)_n, and (CH_3I)_n clusters were explored by electrostatic models and molecular dynamics simulations, quantifying energetic driving effects, and kinematic run-over effects. The optimization of table-top dd nuclear fusion driven by CE of deuterium containing heteroclusters is realized for light-heavy heteroclusters of the largest size, which allows for the prevalence of cluster vertical ionization at the highest intensity of the laser field. We demonstrate a 7-orders-of-magnitude enhancement of the yield of dd nuclear fusion driven by CE of light-heavy heteroclusters as compared with (D_2)_n clusters of the same size. Prospective applications for the attainment of table-top nucleosynthesis reactions, e.g., ^12C(p,γ)^13N driven by CE of (CH_3I)_n clusters, were explored.
A comparison of clustering algorithms in article recommendation system
Tantanasiriwong, Supaporn
2012-01-01
A recommendation system is a tool that can recommend to researchers resources suited to their research interests by using content-based filtering. In this paper, clustering is introduced as an unsupervised learning approach for grouping objects based on their selected features and similarities. Publication information from the Science Citation Index (SCI) is used as the dataset for clustering, with feature extraction performed as dimensionality reduction of the articles, comparing Latent Dirichlet Allocation (LDA), Principal Component Analysis (PCA), and K-Means to determine the best algorithm. In my experiment, the selected database consists of 2625 documents extracted from the SCI corpus from 2001 to 2009. Cluster counts of 50, 100, 200, and 250 are considered, and the three algorithms are evaluated against one another using the F-measure. The results show that the LDA technique gives an accuracy of up to 95.5%, the highest among the clustering techniques compared.
Nonlocalized cluster dynamics and nuclear molecular structure
Zhou, Bo; Horiuchi, Hisashi; Ren, Zhongzhou; Röpke, Gerd; Schuck, Peter; Tohsaki, Akihiro; Xu, Chang; Yamada, Taiichi
2013-01-01
A container picture is proposed for understanding cluster dynamics where the clusters make nonlocalized motion occupying the lowest orbit of the cluster mean-field potential characterized by the size parameter "B" in the THSR (Tohsaki-Horiuchi-Schuck-Röpke) wave function. The nonlocalized cluster aspects of the inversion-doublet bands in $^{20}$Ne which have been considered as a typical manifestation of localized clustering are discussed. So far unexplained puzzling features of the THSR wave function, namely that after angular-momentum projection for two cluster systems the prolate THSR wave function is almost 100% equivalent to an oblate THSR wave function, are clarified. It is shown that the true intrinsic two-cluster THSR configuration is nonetheless prolate. The proposal of the container picture is based on the fact that typical cluster systems, 2$\alpha$, 3$\alpha$, and $\alpha$+$^{16}$O, are all well described by a single THSR wave function. It will be shown for the case of linear-chain states w...
A Clustering Genetic Algorithm for Cylinder Drag Optimization
Milano, Michele; Koumoutsakos, Petros
2002-01-01
A real coded genetic algorithm is implemented for the optimization of actuator parameters for cylinder drag minimization. We consider two types of idealized actuators that are allowed either to move steadily and tangentially to the cylinder surface (“belts”) or to steadily blow/suck with a zero net mass constraint. The genetic algorithm we implement has the property of identifying minima basins, rather than single optimum points. The knowledge of the shape of the minimum basin enables further insights into the system properties and provides a sensitivity analysis in a fully automated way. The drag minimization problem is formulated as an optimal regulation problem. By means of the clustering property of the present genetic algorithm, a set of solutions producing drag reduction of up to 50% is identified. A comparison between the two types of actuators, based on the clustering property of the algorithm, indicates that blowing/suction actuation parameters are associated with larger tolerances when compared to optimal parameters for the belt actuators. The possibility of using a few strategically placed actuators to obtain a significant drag reduction is explored using the clustering diagnostics of this method. The optimal belt-actuator parameters obtained by optimizing the two-dimensional case are employed in three-dimensional simulations, by extending the actuators across the span of the cylinder surface. The three-dimensional controlled flow exhibits a strong two-dimensional character near the cylinder surface, resulting in significant drag reduction.
Robustness of the ATLAS pixel clustering neural network algorithm
Sidebo, Per Edvin; The ATLAS collaboration
2016-01-01
Proton-proton collisions at the energy frontier put strong constraints on track reconstruction algorithms. The algorithms depend heavily on accurate estimation of the position of particles as they traverse the inner detector elements. An artificial neural network algorithm is utilised to identify and split clusters of neighbouring read-out elements in the ATLAS pixel detector created by multiple charged particles. The method recovers otherwise lost tracks in dense environments where particles are separated by distances comparable to the size of the detector read-out elements. Such environments are highly relevant for LHC run 2, e.g. in searches for heavy resonances. Within the scope of run 2 track reconstruction performance and upgrades, the robustness of the neural network algorithm will be presented. The robustness has been studied by evaluating the stability of the algorithm’s performance under a range of variations in the pixel detector conditions.
Comparative Study of Clustering Algorithms in Text Mining Context
Directory of Open Access Journals (Sweden)
Abdennour Mohamed Jalil
2016-06-01
The spectacular increase in data is due to the appearance of networks and smartphones. About 42% of the world population uses the internet [1], which has created a problem of processing the exchanged data, whose volume is rising exponentially and which must be treated automatically. This paper presents a classical process of knowledge discovery in databases for treating textual data. This process is divided into three parts: preprocessing, processing and post-processing. In the processing step, we present a comparative study of several clustering algorithms such as KMeans, Global KMeans, Fast Global KMeans, Two Level KMeans and FWKmeans. The comparison between these algorithms is made on real textual data from the web using RSS feeds. Experimental results identified two problems: the first is the quality of results, which remains an issue for algorithms that converge rapidly; the second is the execution time, which needs to decrease for some algorithms.
Clustered volatility in multiagent dynamics
Youssefmir, M; Youssefmir, Michael; Huberman, Bernardo
1995-01-01
Large distributed multiagent systems are characterized by vast numbers of agents trying to gain access to limited resources in an unpredictable environment. Agents in these systems continuously switch strategies in order to opportunistically find improvements in their utilities. We have analyzed the fluctuations around equilibrium that arise from strategy switching and discovered the existence of a new phenomenon. It consists of the appearance of sudden bursts of activity that punctuate the fixed point, and is due to an effective random walk consistent with overall stability. This clustered volatility is followed by relaxation to the fixed point but with different strategy mixes from the previous one. This phenomenon is quite general for systems in which agents explore strategies in search of local improvements.
A REAL-TIME C-V CLUSTERING ALGORITHM FOR WEB-MINING
Institute of Scientific and Technical Information of China (English)
LiHaiying; ZuangZhenquan; et al.
2002-01-01
In this letter, a real-time C-V (Characteristic-Vector) clustering algorithm is put forth to treat vast action data which are dynamically collected from a web site. The algorithm uses the concept of C-V to denote characteristics; it also adopts two-valued [0,1] input and a self-defined vigilance parameter to design the clustering architecture. Vector Degree of Matching (VDM) plays a key role in the clustering algorithm, determining the magnitude of the typical characteristic. Using stability analysis, the classifications are confirmed to have a reliably hierarchical structure when the vigilance parameter shifts from 0.1 to 0.99. This non-linear relation between the vigilance parameter and the classification upper limit helps mine representative classifications of net users according to the actual web resources; the administering system can then map them to the web resource space to implement intelligent configuration effectively and rapidly.
clusterMaker: a multi-algorithm clustering plugin for Cytoscape
2011-01-01
Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions The Cytoscape plugin clusterMaker provides a number of clustering
clusterMaker: a multi-algorithm clustering plugin for Cytoscape
Directory of Open Access Journals (Sweden)
Morris John H
2011-11-01
Abstract Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions The Cytoscape plugin cluster
Exploring New Clustering Algorithms for the CMS Tracker FED
Gamboa Alvarado, Jose Leandro
2013-01-01
In the current Front End (FE) firmware, clusters of hits within the APV frames are found using a simple threshold comparison (which is made between the data and a 3 or 5 sigma strip noise cut) on reordered pedestal and Common Mode (CM) noise subtracted data. In addition, the CM noise subtraction requires the baseline of each APV frame to be approximately uniform. Therefore, the current algorithm will fail if the APV baseline exhibits large-scale non-uniform behavior. Under very high luminosity conditions the assumption of a uniform APV baseline breaks down and the FED is unable to maintain a high efficiency of cluster finding.
High-performance dynamic quantum clustering on graphics processors
Energy Technology Data Exchange (ETDEWEB)
Wittek, Peter, E-mail: peterwittek@acm.org [Swedish School of Library and Information Science, University of Boras, Boras (Sweden)
2013-01-15
Clustering methods in machine learning may benefit from borrowing metaphors from physics. Dynamic quantum clustering associates a Gaussian wave packet with the multidimensional data points and regards them as eigenfunctions of the Schrödinger equation. The clustering structure emerges by letting the system evolve and the visual nature of the algorithm has been shown to be useful in a range of applications. Furthermore, the method only uses matrix operations, which readily lend themselves to parallelization. In this paper, we develop an implementation on graphics hardware and investigate how this approach can accelerate the computations. We achieve a speedup of up to two orders of magnitude over a multicore CPU implementation, which proves that quantum-like methods and acceleration by graphics processing units have a great relevance to machine learning.
Institute of Scientific and Technical Information of China (English)
None listed
1995-01-01
This paper presents a neural network approach, based on high-order two-dimensional temporal and dynamically clustering competitive activation mechanisms, to implement a parallel searching algorithm and many other symbolic logic algorithms. This approach is superior in many respects to both the common sequential algorithms of symbolic logic and the common neural networks used for optimization problems. Simulations of problem-solving examples prove the effectiveness of the approach.
Accelerated Monte Carlo by Embedded Cluster Dynamics
Brower, R. C.; Gross, N. A.; Moriarty, K. J. M.
1991-07-01
We present an overview of the new methods for embedding Ising spins in continuous fields to achieve accelerated cluster Monte Carlo algorithms. The methods of Brower and Tamayo and Wolff are summarized and variations are suggested for the O(N) models based on multiple embedded Z2 spin components and/or correlated projections. Topological features are discussed for the XY model and numerical simulations presented for d=2, d=3 and mean field theory lattices.
Early dynamical evolution of young substructured clusters
Dorval, Julien; Boily, Christian
2017-03-01
Stellar clusters form with a high level of substructure, inherited from the molecular cloud and the star formation process. Evidence from observations and simulations also indicates that the stars in such young clusters form a subvirial system. The subsequent dynamical evolution can cause important mass loss, ejecting a large part of the birth population into the field. It can also imprint the stellar population and still be inferred from observations of evolved clusters. N-body simulations allow a better understanding of these early twists and turns, given realistic initial conditions. Nowadays, substructured, clumpy young clusters are usually obtained through pseudo-fractal growth and velocity inheritance. We introduce a new way to create clumpy initial conditions through a "Hubble expansion" which naturally produces clumps that are self-consistent in velocity. In-depth analysis of the resulting clumps shows consistency with hydrodynamical simulations of young star clusters. We use these initial conditions to investigate the dynamical evolution of young subvirial clusters. We find the collapse to be soft, with hierarchical merging leading to a high level of mass segregation. The subsequent evolution is less pronounced than the equilibrium achieved from a cold collapse formation scenario.
Nonlinear dynamics of electron-positron clusters
Manfredi, Giovanni; Haas, Fernando; 10.1088/1367-2630/14/7/075012
2012-01-01
Electron-positron clusters are studied using a quantum hydrodynamic model that includes Coulomb and exchange interactions. A variational Lagrangian method is used to determine their stationary and dynamical properties. The cluster static features are validated against existing Hartree-Fock calculations. In the linear response regime, we investigate both dipole and monopole (breathing) modes. The dipole mode is reminiscent of the surface plasmon mode usually observed in metal clusters. The nonlinear regime is explored by means of numerical simulations. We show that, by exciting the cluster with a chirped laser pulse with slowly varying frequency (autoresonance), it is possible to efficiently separate the electron and positron populations on a timescale of a few tens of femtoseconds.
FCM Clustering Algorithms for Segmentation of Brain MR Images
Directory of Open Access Journals (Sweden)
Yogita K. Dubey
2016-01-01
The study of brain disorders requires accurate tissue segmentation of magnetic resonance (MR) brain images, which is very important for detecting tumors, edema, and necrotic tissues. Segmentation of brain images, especially into three main tissue types: Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM), has an important role in computer aided neurosurgery and diagnosis. Brain images mostly contain noise, intensity inhomogeneity, and weak boundaries. Therefore, accurate segmentation of brain images is still a challenging area of research. This paper presents a review of fuzzy c-means (FCM) clustering algorithms for the segmentation of brain MR images. The review covers the detailed analysis of FCM based algorithms with intensity inhomogeneity correction and noise robustness. Different methods for the modification of the standard fuzzy objective function with updating of membership and cluster centroid are also discussed.
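For readers unfamiliar with the membership and centroid updates mentioned in this abstract, here is a minimal sketch of standard fuzzy c-means on one-dimensional intensities. It shows only the textbook alternating updates with fuzzifier m; the function names, the fixed iteration count, and the toy data are assumptions, and none of the inhomogeneity-correction or noise-robust variants reviewed in the paper are included.

```python
def memberships(xs, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2/(m-1))."""
    power = 2.0 / (m - 1.0)
    table = []
    for x in xs:
        # eps guards against division by zero when a point coincides with a center
        d = [max(abs(x - c), eps) for c in centers]
        table.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return table

def update_centers(xs, u, m=2.0):
    """Standard FCM centroid update: weighted mean with weights u_ki^m."""
    n_clusters = len(u[0])
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in u]
        centers.append(sum(w * x for w, x in zip(weights, xs)) / sum(weights))
    return centers

def fcm_1d(xs, init_centers, m=2.0, iters=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iters):
        u = memberships(xs, centers, m)
        centers = update_centers(xs, u, m)
    # memberships are recomputed so that they match the returned centers
    return centers, memberships(xs, centers, m)

# Hypothetical intensities drawn around two tissue classes.
intensities = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, u = fcm_1d(intensities, init_centers=[0.0, 6.0])
print(centers)
```

Each row of `u` sums to 1, so a pixel can belong partly to several tissue classes, which is the property that distinguishes FCM from hard k-means assignments.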
Mapping cultivable land from satellite imagery with clustering algorithms
Arango, R. B.; Campos, A. M.; Combarro, E. F.; Canas, E. R.; Díaz, I.
2016-07-01
Open data satellite imagery provides valuable data for the planning and decision-making processes related to environmental domains. Specifically, agriculture uses remote sensing in a wide range of services, ranging from monitoring the health of the crops to forecasting the spread of crop diseases. In particular, this paper focuses on a methodology for the automatic delimitation of cultivable land by means of machine learning algorithms and satellite data. The method uses a partition clustering algorithm called Partitioning Around Medoids and considers the quality of the clusters obtained for each satellite band in order to evaluate which one better identifies cultivable land. The proposed method was tested with vineyards using as input the spectral and thermal bands of the Landsat 8 satellite. The experimental results show the great potential of this method for cultivable land monitoring from remote-sensed multispectral imagery.
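The Partitioning Around Medoids idea named in this abstract can be sketched as a swap search: start from some medoids and keep exchanging a medoid with a non-medoid point whenever that lowers the total distance of points to their nearest medoid. The version below is a simplified illustration under stated assumptions (naive initialisation with the first k points, first-improvement swaps, toy two-dimensional points); it is not the implementation used in the paper and omits the BUILD phase of the classical algorithm.

```python
import math

def total_cost(points, medoids, dist):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k, dist=math.dist):
    """Simplified PAM: accept any medoid/non-medoid swap that lowers the cost."""
    medoids = list(range(k))  # assumption: naive start from the first k points
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:  # strict decrease guarantees termination
                    medoids, best, improved = trial, cost, True
    # assign each point to the index of its nearest medoid
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points]
    return medoids, labels, best

# Hypothetical pixel features forming two well-separated groups.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = pam(pixels, k=2)
print(medoids, labels, cost)
```

Because medoids are actual data points, the method only needs pairwise distances, which is one reason it is often preferred over k-means when a custom dissimilarity is used.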
Modeling, clustering, and segmenting video with mixtures of dynamic textures.
Chan, Antoni B; Vasconcelos, Nuno
2008-05-01
A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (e.g. fire, steam, water, vehicle and pedestrian traffic, etc.). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (e.g. optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
Advanced defect detection algorithm using clustering in ultrasonic NDE
Gongzhang, Rui; Gachagan, Anthony
2016-02-01
A range of materials used in industry exhibit scattering properties which limit ultrasonic NDE. Many algorithms have been proposed to enhance defect detection ability, such as the well-known Split Spectrum Processing (SSP) technique. Scattering noise usually cannot be fully removed and the remaining noise can be easily confused with real feature signals, hence becoming artefacts during the image interpretation stage. This paper presents an advanced algorithm to further reduce the influence of artefacts remaining in A-scan data after processing using a conventional defect detection algorithm. The raw A-scan data can be acquired from either traditional single transducer or phased array configurations. The proposed algorithm uses the concept of unsupervised machine learning to cluster segmental defect signals from pre-processed A-scans into different classes. The distinction and similarity between each class and the ensemble of randomly selected noise segments can be observed by applying a classification algorithm. Each class will then be labelled as 'legitimate reflector' or 'artefacts' based on this observation, and the expected probability of detection (PoD) and probability of false alarm (PFA) determined. To facilitate data collection and validate the proposed algorithm, a 5 MHz linear array transducer is used to collect A-scans from both austenitic steel and Inconel samples. Each pulse-echo A-scan is pre-processed using SSP and the subsequent application of the proposed clustering algorithm has provided an additional reduction to PFA while maintaining PoD for both samples compared with SSP results alone.
Core Business Selection Based on Ant Colony Clustering Algorithm
Yu Lan; Yan Bo; Yao Baozhen
2014-01-01
Core business is the most important business to the enterprise in diversified business. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current situation of the development of the company, the core business of the company can be acquired by ant c...
Dynamic Shortest Path Algorithms for Hypergraphs
2012-01-01
Comparison of cluster expansion fitting algorithms for interactions at surfaces
Herder, Laura M.; Bray, Jason M.; Schneider, William F.
2015-10-01
Cluster expansions (CEs) are Ising-type interaction models that are increasingly used to model interaction and ordering phenomena at surfaces, such as the adsorbate-adsorbate interactions that control coverage-dependent adsorption or surface-vacancy interactions that control surface reconstructions. CEs are typically fit to a limited set of data derived from density functional theory (DFT) calculations. The CE fitting process involves iterative selection of DFT data points to include in a fit set and selection of interaction clusters to include in the CE. Here we compare the performance of three CE fitting algorithms-the MIT Ab-initio Phase Stability code (MAPS, the default in ATAT software), a genetic algorithm (GA), and a steepest descent (SD) algorithm-against synthetic data. The synthetic data is encoded in model Hamiltonians of varying complexity motivated by the observed behavior of atomic adsorbates on a face-centered-cubic transition metal close-packed (111) surface. We compare the performance of the leave-one-out cross-validation score against the true fitting error available from knowledge of the hidden CEs. For these systems, SD achieves lowest overall fitting and prediction error independent of the underlying system complexity. SD also most accurately predicts cluster interaction energies without ignoring or introducing extra interactions into the CE. MAPS achieves good results in fewer iterations, while the GA performs least well for these particular problems.
Optimized algorithm for balancing clusters in wireless sensor networks
Institute of Scientific and Technical Information of China (English)
Mucheol KIM; Sun-hong KIM; Hyungjin BYUN; Sang-yong HAN
2009-01-01
Wireless sensor networks consist of hundreds or thousands of sensor nodes that are subject to numerous restrictions, including computation capability and battery capacity. Topology control is an important issue for achieving a balanced placement of sensor nodes. The clustering scheme is a widely known and efficient means of topology control for transmitting information to the base station in two hops. The automatic routing scheme of the self-organizing technique is another critical element of wireless sensor networks. In this paper we propose an optimal algorithm with cluster balance taken into consideration, and compare it with three well-known and widely used approaches, i.e., LEACH, MEER, and VAP-E, in performance evaluation. Experimental results show that the proposed approach increases the overall network lifetime, indicating that locating an optimal cluster reduces the amount of energy required for communication to the base station.
Optimized Bayesian dynamic advising theory and algorithms
Karny, Miroslav
2006-01-01
Written by one of the world's leading groups in the area of Bayesian identification, control, and decision making, this book provides the theoretical and algorithmic basis of optimized probabilistic advising. Starting from abstract ideas and formulations, and culminating in detailed algorithms, the book comprises a unified treatment of an important problem of the design of advisory systems supporting supervisors of complex processes. It introduces the theoretical and algorithmic basis of developed advising, relying on novel and powerful combination black-box modelling by dynamic mixture models
Cell Division, Differentiation and Dynamic Clustering
Kaneko, K; Kaneko, Kunihiko; Yomo, Tetsuya
1993-01-01
A novel mechanism for cell differentiation is proposed, based on the dynamic clustering in a globally coupled chaotic system. A simple model with metabolic reaction, active transport of chemicals from media, and cell division is found to show three successive stages with the growth of the number of cells: coherent growth, dynamic clustering, and fixed cell differentiation. At the last stage, disparity in activities, germ line segregation, somatic cell differentiation, and homeochaotic stability against external perturbation are found. Our results, consistent with the experiments of the preceding paper, imply that cell differentiation can occur without a spatial pattern. From a dynamical systems viewpoint, the new concept of "open chaos" is proposed as a novel and general scenario for systems with growing numbers of elements, also seen in economics and sociology.
The Dynamical Equilibrium of Galaxy Clusters
Carlberg, R. G.; Yee, H. K. C.; Ellingson, E.; Morris, S. L.; Abraham, R.; Gravel, P.; Pritchet, C. J.; Smecker-Hane, T.; Hartwick, F. D. A.; Hesser, J. E.; Hutchings, J. B.; Oke, J. B.
1997-02-01
If a galaxy cluster is effectively in dynamical equilibrium, then all galaxy populations within the cluster must have distributions in velocity and position that individually reflect the same underlying mass distribution, although the derived virial masses can be quite different. Specifically, within the Canadian Network for Observational Cosmology cluster sample, the virial radius of the red galaxy population is, on the average, a factor of 2.05 +/- 0.34 smaller than that of the blue population. The red galaxies also have a smaller rms velocity dispersion, a factor of 1.31 +/- 0.13 within our sample. Consequently, the virial mass calculated from the blue galaxies is 3.5 +/- 1.3 times larger than from the red galaxies. However, applying the Jeans equation of stellar hydrodynamic equilibrium to the red and blue subsamples separately gives statistically identical cluster mass profiles. This is strong evidence that these clusters are effectively equilibrium systems and therefore demonstrates empirically that the masses in the virialized region are reliably estimated using dynamical techniques.
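The quoted factor of 3.5 between the blue and red virial masses can be checked with a short calculation. The sketch below assumes the usual virial scaling, in which the mass is proportional to the velocity dispersion squared times the virial radius; the abstract does not state the estimator, so this scaling and the function name are assumptions made for illustration.

```python
# Assumed scaling (not stated in the abstract): M_vir is proportional to sigma^2 * r_vir.
RADIUS_RATIO = 2.05      # blue / red virial radius, from the abstract
DISPERSION_RATIO = 1.31  # blue / red rms velocity dispersion, from the abstract


def virial_mass_ratio(sigma_ratio, r_ratio):
    """Ratio of two virial masses given the ratios of dispersion and radius."""
    return sigma_ratio ** 2 * r_ratio


mass_ratio = virial_mass_ratio(DISPERSION_RATIO, RADIUS_RATIO)
print(f"{mass_ratio:.2f}")  # prints 3.52
```

Under that assumption the central values reproduce the quoted ratio; the sketch does not propagate the stated uncertainties.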
A cluster analysis on road traffic accidents using genetic algorithms
Saharan, Sabariah; Baragona, Roberto
2017-04-01
The analysis of road traffic accidents is increasingly important because of the cost of accidents and public road safety. The availability of large data sets makes the study of factors that affect the frequency and severity of accidents viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009, with a total of 26440 accidents. The data are binary, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because we are in the presence of a large unbalanced data set, and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and independent component analysis is performed to validate the results.
Dynamic algorithms for the Dyck languages
DEFF Research Database (Denmark)
Frandsen, Gudmund Skovbjerg; Husfeldt, Thore; Miltersen, Peter Bro;
1995-01-01
We study Dynamic Membership problems for the Dyck languages, the class of strings of properly balanced parentheses. We also study the Dynamic Word problem for the free group. We present deterministic algorithms and data structures which maintain a string under replacements of symbols, insertions, and deletions of symbols, and language membership queries. Updates and queries are handled in polylogarithmic time. We also give both Las Vegas- and Monte Carlo-type randomised algorithms to achieve better running times, and present lower bounds on the complexity for variants of the problems.
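To make the membership problem concrete, here is a minimal sketch of the static check for balanced parentheses, with a replacement operation that simply rebuilds the string. It is a linear-time baseline, not the polylogarithmic data structure the paper describes; the names `is_dyck` and `replace_symbol` and the choice of three bracket types are illustrative assumptions.

```python
# Closing bracket -> matching opening bracket (three bracket types assumed).
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_dyck(s):
    """Return True if s is a properly balanced bracket string."""
    stack = []
    for ch in s:
        if ch in PAIRS.values():
            stack.append(ch)          # opening bracket: remember it
        elif ch in PAIRS:
            # closing bracket: must match the most recent unmatched opener
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False              # symbol outside the alphabet
    return not stack                  # no unmatched openers may remain


def replace_symbol(s, i, ch):
    """Naive update: replace the symbol at position i (costs O(n))."""
    return s[:i] + ch + s[i + 1:]


print(is_dyck("([])"))                           # True
print(is_dyck("([)]"))                           # False
print(is_dyck(replace_symbol("(()(", 3, ")")))   # True, string becomes "(())"
```

Each query here rescans the whole string after an update, which is the cost the dynamic algorithms in the paper are designed to avoid.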
Community Clustering Algorithm in Complex Networks Based on Microcommunity Fusion
Directory of Open Access Journals (Sweden)
Jin Qi
2015-01-01
With further research in recent years on the physical meaning and digital features of community structure in complex networks, improving the effectiveness and efficiency of community mining algorithms has become an important subject in this area. This paper puts forward the concept of the microcommunity and obtains the final community mining results by fusing different microcommunities. It starts with the basic definition of the network community and applies Expansion to the microcommunity clustering, which provides the prerequisites for microcommunity fusion. Analysis of test results on a network data set shows that the proposed algorithm is more efficient and has higher solution quality than other similar algorithms.
Density-based cluster algorithms for the identification of core sets
Lemke, Oliver; Keller, Bettina G.
2016-10-01
The core-set approach is a discretization method for Markov state models of complex molecular dynamics. Core sets are disjoint metastable regions in the conformational space, which need to be known prior to the construction of the core-set model. We propose to use density-based cluster algorithms to identify the cores. We compare three different density-based cluster algorithms: the CNN, the DBSCAN, and the Jarvis-Patrick algorithm. While the core-set models based on the CNN and DBSCAN clustering are well-converged, constructing core-set models based on the Jarvis-Patrick clustering cannot be recommended. In a well-converged core-set model, the number of core sets is up to an order of magnitude smaller than the number of states in a conventional Markov state model with comparable approximation error. Moreover, using the density-based clustering one can extend the core-set method to systems which are not strongly metastable. This is important for the practical application of the core-set method because most biologically interesting systems are only marginally metastable. The key point is to perform a hierarchical density-based clustering while monitoring the structure of the metric matrix which appears in the core-set method. We test this approach on a molecular-dynamics simulation of a highly flexible 14-residue peptide. The resulting core-set models have a high spatial resolution and can distinguish between conformationally similar yet chemically different structures, such as register-shifted hairpin structures.
Clustering Algorithms: Their Application to Gene Expression Data
Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel
2016-01-01
Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a tremendous opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, and is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has proven useful in revealing the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867
Institute of Scientific and Technical Information of China (English)
许颖梅
2014-01-01
The sliding window is an approximation method that focuses on recent data in data streams. This paper proposes an optimization algorithm, SWStream, which processes data over a sliding window. In the online component, a sliding window tree is introduced to store the important statistical information of data streams and to adjust the sizes of sliding windows. The optimized algorithm can promptly eliminate expired tuples while processing newly arriving tuples continuously in real time, which can achieve more accurate results. In the offline component, macro-clustering is applied to the results of the online stage to generate the final clustering results. Compared with the clustering algorithm CluStream, this algorithm is more efficient in data processing and saves memory.
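The idea of discarding expired tuples while new ones arrive can be illustrated with a plain time-based window. The sketch below is not SWStream and does not implement its sliding window tree or adaptive window sizes; the class name, the fixed window width, and the running-mean statistic are assumptions chosen for illustration, and timestamps are assumed to be non-decreasing.

```python
from collections import deque


class SlidingWindow:
    """Fixed-width time window that keeps a running sum of recent values."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Insert a new tuple, then drop everything that has expired."""
        self.items.append((timestamp, value))
        self.total += value
        self._expire(timestamp)

    def _expire(self, now):
        # A tuple is expired once it is at least `width` time units old.
        while self.items and self.items[0][0] <= now - self.width:
            _, old_value = self.items.popleft()
            self.total -= old_value

    def mean(self):
        return self.total / len(self.items) if self.items else None


window = SlidingWindow(width=3)
for t, v in [(0, 1.0), (1, 2.0), (2, 3.0)]:
    window.add(t, v)
print(window.mean())  # 2.0
window.add(3, 4.0)    # the tuple with timestamp 0 expires here
print(window.mean())  # 3.0
```

Because expiry only ever touches the oldest end of the deque, each tuple is appended and removed once, so updates take amortized constant time.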
Kale, R.; Venturi, T.; Cassano, R.; Giacintucci, S.; Bardelli, S.; Dallacasa, D.; Zucca, E.
2015-09-01
Aims: First-ranked galaxies in clusters, usually referred to as brightest cluster galaxies (BCGs), show exceptional properties over the whole electromagnetic spectrum. They are the most massive elliptical galaxies and show the highest probability to be radio loud. Moreover, their special location at the centres of galaxy clusters raises the question of the role of the environment in shaping their radio properties. In the attempt to separate the effect of the galaxy mass and of the environment on their statistical radio properties, we investigate the possible dependence of the occurrence of radio loudness and of the fractional radio luminosity function on the dynamical state of the hosting cluster. Methods: We studied the radio properties of the BCGs in the Extended GMRT Radio Halo Survey (EGRHS), which consists of 65 clusters in the redshift range 0.2-0.4, with X-ray luminosity $L_X \geq 5 \times 10^{44}$ erg s$^{-1}$, and quantitative information on their dynamical state from high-quality Chandra imaging. We obtained a statistical sample of 59 BCGs, which we divided into two classes, depending on whether the dynamical state of the host cluster was merging (M) or relaxed (R). Results: Of the 59 BCGs, 28 are radio loud and 31 are radio quiet. The radio-loud sources are favourably located in relaxed clusters (71%), while the reverse is true for the radio-quiet BCGs, which are mostly located in merging systems (81%). The fractional radio luminosity function for the BCGs in merging and relaxed clusters is different, and it is considerably higher for BCGs in relaxed clusters, where the total fraction of radio loudness reaches almost 90%, to be compared to the ~30% in merging clusters. For relaxed clusters, we found a positive correlation between the radio power of the BCGs and the strength of the cool core, consistent with previous studies on local samples. Conclusions: Our study suggests that the radio loudness of the BCGs strongly depends on the cluster dynamics; their fraction is
Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal
2015-08-13
It is well known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While the topic of energy efficiency has been well investigated in conventional wireless sensor networks, the latter has not been extensively explored. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm, and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment towards achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach.
Structure and dynamics of cationic van-der-Waals clusters. II. Dynamics of protonated argon clusters
Ritschel, T.; Zuhrt, Ch.; Zülicke, L.; Kuntz, P. J.
2007-01-01
A diatomics-in-molecules (DIM) model with ab-initio input data, which in part I successfully described the structure and bonding properties of protonated argon clusters ArnH+, is used here to investigate some aspects of the dynamics of such aggregates for n up to 30. The simple triatomic ionic fragment, Ar2H+, is studied in some detail with respect to normal vibrations, characteristics of classical intramolecular dynamics as reflected in the Fourier spectra of dynamical variables, and accurate quantum states of the vibrational motion. For larger clusters ArnH+ (n ≤30), the normal vibrational frequencies (and displacement eigenvectors) are calculated and related to the cluster structure. In addition, the Fourier spectra are analyzed with respect to their variation with changing internal energy and cluster size. As expected, the clusters show some floppy character. Even a little vibrational excitation can lead to internal rearrangement and to Ar-atom evaporation from the clusters; this is studied in more detail for one small complex (n = 3). Electronic excitation to one of the low-lying excited states, which are all globally repulsive, leads to complete fragmentation (atomization) of the clusters. A variety of conceivable elementary collision processes involving protonated argon clusters are discussed. Some of these may play a role in the gas-phase formation of medium-sized ArnH+ aggregates.
Directory of Open Access Journals (Sweden)
Tcha Hong
2008-01-01
Background: Previous studies of genome-wide expression patterns show that a certain percentage of genes are cell cycle regulated. The expression data have been analyzed in a number of different ways to identify cell cycle dependent genes. In this study, we pose the hypothesis that cell cycle dependent genes can be considered as oscillating systems with a rhythm, i.e. systems producing response signals with period and frequency. Therefore, we are motivated to apply the theory of multivariate phase synchronization to clustering cell cycle specific genome-wide expression data. Results: We propose a strategy to find groups of genes according to the specific biological process by analyzing cell cycle specific gene expression data. To evaluate the proposed method, we use the modified Kuramoto model, which is a phase governing equation that provides the long-term dynamics of globally coupled oscillators. With this equation, we simulate two groups of expression signals, and the simulated signals from each group share their own common rhythm. Then, the simulated expression data are mixed with randomly generated expression data to be used as the input data set for the algorithm. Using these simulated expression data, it is shown that the algorithm is able to identify expression signals that are involved in the same oscillating process. We also evaluate the method with yeast cell cycle expression data. It is shown that the output clusters of the proposed algorithm include genes which are closely associated with each other by sharing significant Gene Ontology terms of biological process and/or having relatively many known biological interactions. Therefore, the evaluation analysis indicates that the method is able to identify expression signals according to the specific biological process. Our evaluation analysis also indicates that some portion of the output of the proposed algorithm is not obtainable by the traditional clustering algorithm with
Dynamical analysis of the cluster pair: A3407 + A3408
Nascimento, R S; Trevisan, M; Carrasco, E R; Plana, H; Dupke, R
2016-01-01
We carried out a dynamical study of the galaxy cluster pair A3407 \& A3408 based on a spectroscopic survey obtained with the 4 meter Blanco telescope at the CTIO, plus 6dF data, and ROSAT All-Sky-Survey. The sample consists of 122 member galaxies brighter than $m_R=20$. Our main goal is to probe the galaxy dynamics in this field and verify if the sample constitutes a single galaxy system or corresponds to an ongoing merging process. Statistical tests were applied to cluster members showing that both the composite system A3407 + A3408 as well as each individual cluster have Gaussian velocity distributions. A velocity gradient of $\sim 847\pm 114$ $\rm km\;s^{-1}$ was identified around the principal axis of the projected distribution of galaxies, indicating that the global field may be rotating. Applying the KMM algorithm to the distribution of galaxies we found that the solution with two clusters is better than the single unit solution at the 99\% c.l. This is consistent with the X-ray distribution around ...
Identifying multiple influential spreaders by a heuristic clustering algorithm
Bao, Zhong-Kui; Liu, Jian-Guo; Zhang, Hai-Feng
2017-03-01
The problem of influence maximization in social networks has attracted much attention. However, traditional centrality indices are suited to the case where a single spreader is chosen as the spreading source. Often, a spreading process is initiated by simultaneously choosing multiple nodes as the spreading sources. In this situation, choosing the top-ranked nodes as multiple spreaders is not an optimal strategy, since the chosen nodes are not sufficiently scattered in the network. The ideal situation for the multiple-spreader case is therefore that the spreaders are both influential and dispersively distributed in the network, but it is difficult to meet the two conditions together. In this paper, we propose a heuristic clustering (HC) algorithm based on the similarity index to classify nodes into different clusters, and the center nodes of the clusters are then chosen as the multiple spreaders. The HC algorithm not only ensures that the multiple spreaders are dispersively distributed in the network but also prevents the selected nodes from being "negligible". Our experimental results on synthetic and real networks indicate that the HC method performs better on influence maximization than the traditional methods.
Gravitation field algorithm and its application in gene cluster
Directory of Open Access Journals (Sweden)
Zheng Ming
2010-09-01
Background: Searching for optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel search optimization algorithm called the Gravitation Field Algorithm (GFA), which is derived from the well-known astronomical theory of planetary formation, the Solar Nebular Disk Model (SNDM). GFA simulates the gravitation field and outperforms GA and SA in some multimodal function optimization problems. GFA can also be used for unimodal functions. GFA clusters a dataset from the Gene Expression Omnibus well. Conclusions: The mathematical proof demonstrates that GFA converges to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.
Local rewiring algorithms to increase clustering and grow a small world
Alstott, Jeff; Pizza, Pamela B; Radcliffe, Mary
2016-01-01
Many real-world networks have high clustering among vertices: vertices that share neighbors are often also directly connected to each other. A network's clustering can be a useful indicator of its connectedness and community structure. Algorithms for generating networks with high clustering have been developed, but typically rely on adding or removing edges and nodes, sometimes from a completely empty network. Here, we introduce algorithms that create a highly clustered network by starting with an existing network and rearranging edges, without adding or removing them; these algorithms can preserve other network properties even as the clustering increases. These algorithms rely on local rewiring rules, in which a single edge changes one of its vertices in a way that is guaranteed to increase clustering. This greedy algorithm can be applied iteratively to transform a random network into a form with much higher clustering. Additionally, these algorithms grow the network's clustering faster than they increase it...
Sweeney, Timothy E; Chen, Albert C; Gevaert, Olivier
2015-11-19
In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.
Realization of R-tree for GIS on hybrid clustering algorithm
Institute of Scientific and Technical Information of China (English)
HUANG Ji-xian; BAO Guang-shu; LI Qing-song
2005-01-01
The characteristic of geographic information system(GIS) spatial data operation is that query is much more frequent than insertion and deletion, and a new hybrid spatial clustering method used to build R-tree for GIS spatial data was proposed in this paper. According to the aggregation of clustering method, R-tree was used to construct rules and specialty of spatial data. HCR-tree was the R-tree built with HCR algorithm. To test the efficiency of HCR algorithm, it was applied not only to the data organization of static R-tree but also to the nodes splitting of dynamic R-tree. The results show that R-tree with HCR has some advantages such as higher searching efficiency, less disk accesses and so on.
A Flow-Partitioned Unequal Clustering Routing Algorithm for Wireless Sensor Networks
Jian Peng; Xiaohai Chen; Tang Liu
2014-01-01
Energy efficiency and energy balance are two important issues for wireless sensor networks. In previous clustering routing algorithms, multihop transmission, sleep scheduling, and unequal clustering are always used to improve energy efficiency and energy balance. In these algorithms, only the cluster heads share the burden of data forwarding in each round. In this paper, we propose a flow-partitioned unequal clustering routing (FPUC) algorithm to achieve better energy efficiency and energy ba...
Development of Automatic Cluster Algorithm for Microcalcification in Digital Mammography
Energy Technology Data Exchange (ETDEWEB)
Choi, Seok Yoon [Dept. of Medical Engineering, Korea University, Seoul (Korea, Republic of); Kim, Chang Soo [Dept. of Radiological Science, College of Health Sciences, Catholic University of Pusan, Pusan (Korea, Republic of)
2009-03-15
Digital mammography is an efficient imaging technique for the detection and diagnosis of breast pathological disorders. Six mammographic criteria, namely the number of clusters, the number, size, extent and morphologic shape of microcalcifications, and the presence of a mass, were reviewed, and their correlation with pathologic diagnosis was evaluated. It is very important to find breast cancer early, when treatment can reduce deaths from breast cancer and breast incision. In screening for breast cancer, mammography is typically used to view the internal organization. Clustered microcalcifications on mammography represent an important feature of breast mass, especially that of intraductal carcinoma. Because microcalcification is highly correlated with breast cancer, a cluster of microcalcifications can be very helpful for the clinical doctor in predicting breast cancer. For this study, three steps of quantitative evaluation are proposed: DoG filter, adaptive thresholding, and expectation maximization. With the proposed algorithm, the number of calcifications and the length of each cluster in the microcalcification distribution could be measured, and these can also serve as primary-diagnosis indicators for automatically diagnosing breast cancer.
Clustering of User Behaviour based on Web Log data using Improved K-Means Clustering Algorithm
Directory of Open Access Journals (Sweden)
S.Padmaja
2016-02-01
The proposed work develops an improved K-means clustering algorithm for identifying internet user behaviour. Web data analysis includes the transformation and interpretation of web log data to find information and patterns and to support knowledge discovery. The efficiency of the algorithm is analyzed by considering certain parameters: date, time, S_id, CS_method, C_IP, User_agent and time taken. The research uses more than 2 years of real data collected from the web servers of two different groups of institutions. This dataset provides a better analysis of log data to identify internet user behaviour.
Dynamic task scheduling algorithm with load balancing for heterogeneous computing system
Directory of Open Access Journals (Sweden)
Doaa M. Abdelkader
2012-07-01
In parallel computation, scheduling and mapping tasks is considered the most critical problem. High Performance Computing (HPC) addresses it by breaking the problem into subtasks and working on those subtasks at the same time. The application subtasks are assigned to the underlying machines and ordered for execution according to their precedence, to guarantee efficient use of the available resources, such as minimizing execution time and balancing load between the processors of the underlying machine. The underlying infrastructure may be homogeneous or heterogeneous. A homogeneous infrastructure uses machines with the same power and performance, while a heterogeneous infrastructure includes machines that differ in performance, speed, and interconnection. In this paper, a new dynamic task scheduling algorithm for heterogeneous systems, called Clustering Based HEFT with Duplication (CBHD), has been developed. The CBHD algorithm is an amalgamation of the two most important task scheduling algorithms for heterogeneous machines, the Heterogeneous Earliest Finish Time (HEFT) and the Triplet Clustering algorithms. In the CBHD algorithm, duplication is required to improve the performance of the algorithm. A comparative study among the developed CBHD, the HEFT, and the Triplet Cluster algorithms has been done. According to the comparative results, the developed CBHD algorithm achieves better execution time than both the HEFT algorithm and the Triplet Cluster algorithm and, at the same time, achieves load balancing, which is considered one of the main performance factors in a dynamic environment.
Clustering Algorithms for Heterogeneous Wireless Sensor Networks - A Brief Survey
Directory of Open Access Journals (Sweden)
A.MeenaKowshalya
2011-09-01
Wireless sensor networks (WSN) are emerging in various fields like disaster management, battle field surveillance and border security surveillance. A large number of sensors in these applications are unattended and work autonomously. Clustering is a key technique to improve the network lifetime, reduce the energy consumption and increase the scalability of the sensor network. In this paper, we study the impact of heterogeneity of the nodes on the performance of WSN. This paper surveys the different clustering algorithms for heterogeneous WSN.
Classification of posture maintenance data with fuzzy clustering algorithms
Bezdek, James C.
1992-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
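For readers unfamiliar with fuzzy c-means, the following is a minimal sketch of the textbook alternating update (memberships, then centers) with fuzzifier m = 2. It is not the code used in the study; the helper names and the one-dimensional toy data are assumptions made for illustration.

```python
import math


def memberships(p, centers, m):
    """Membership of point p in each cluster (the values sum to 1)."""
    d = [math.dist(p, c) for c in centers]
    if any(x == 0.0 for x in d):
        # The point coincides with a center: give it full membership there.
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]


def update_centers(points, u, n_clusters, m):
    """Each center is the mean of all points weighted by membership**m."""
    dim = len(points[0])
    new_centers = []
    for i in range(n_clusters):
        w = [row[i] ** m for row in u]
        total = sum(w)
        new_centers.append(
            tuple(sum(wj * p[k] for wj, p in zip(w, points)) / total
                  for k in range(dim))
        )
    return new_centers


def fcm(points, centers, m=2.0, iters=50):
    """Run a fixed number of fuzzy c-means iterations."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        u = [memberships(p, centers, m) for p in points]
        centers = update_centers(points, u, len(centers), m)
    u = [memberships(p, centers, m) for p in points]
    return centers, u


# Hypothetical 1-D data with two obvious groups.
points = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]
centers, u = fcm(points, centers=[(0.0,), (1.0,)])
```

Unlike hard clustering, each row of `u` gives graded memberships, which is what makes the method suitable for identifying intermediate states.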
Intelligent control algorithm for ship dynamic positioning
Directory of Open Access Journals (Sweden)
Meng Wang
2014-12-01
Ship motion at sea involves complex nonlinear kinematics. The hydrodynamic coefficients of a ship model are very difficult to determine accurately. Establishing an accurate mathematical model of ship motion is difficult because of changing random factors in the marine environment. To find a control method that realizes ship positioning, intelligent control algorithms that utilize the operator's experience are adopted. A fuzzy controller and a neural network controller are designed respectively. Simulations and experiments show that the intelligent control algorithms can deal with the complex nonlinear motion and have good robustness. The ship dynamic positioning system with neural network control has high positioning accuracy and performance.
Synaptic dynamics: linear model and adaptation algorithm.
Yousefi, Ali; Dibazar, Alireza A; Berger, Theodore W
2014-08-01
In this research, temporal processing in brain neural circuitries is addressed by a dynamic model of synaptic connections in which the synapse model accounts for both pre- and post-synaptic processes determining its temporal dynamics and strength. Neurons, which are excited by the post-synaptic potentials of hundreds of synapses, build the computational engine capable of processing dynamic neural stimuli. Temporal dynamics in neural models with dynamic synapses are analyzed, and learning algorithms for synaptic adaptation of neural networks with hundreds of synaptic connections are proposed. The paper starts by introducing a linear approximate model for the temporal dynamics of synaptic transmission. The proposed linear model substantially simplifies the analysis and training of spiking neural networks. Furthermore, it is capable of replicating the synaptic response of the non-linear facilitation-depression model with an accuracy better than 92.5%. In the second part of the paper, a supervised spike-in-spike-out learning rule for synaptic adaptation in dynamic synapse neural networks (DSNN) is proposed. The proposed learning rule is a biologically plausible process, and it is capable of simultaneously adjusting both pre- and post-synaptic components of individual synapses. The last section of the paper starts by presenting a rigorous analysis of the learning algorithm in a system identification task with hundreds of synaptic connections, which confirms the learning algorithm's accuracy, repeatability and scalability. The DSNN is utilized to predict the spiking activity of cortical neurons and in pattern recognition tasks. The DSNN model is demonstrated to be a generative model capable of producing different cortical neuron spiking patterns and CA1 pyramidal neuron recordings. A single-layer DSNN classifier on a benchmark pattern recognition task outperforms a 2-layer neural network and GMM classifiers while having fewer free parameters and
Ortiz, Juan F; Rokas, Antonis
2017-01-01
Closely spaced clusters of tandemly duplicated genes (CTDGs) contribute to the diversity of many phenotypes, including chemosensation, snake venom, and animal body plans. CTDGs have traditionally been identified subjectively as genomic neighborhoods containing several gene duplicates in close proximity; however, CTDGs are often highly variable with respect to gene number, intergenic distance, and synteny. This lack of formal definition hampers the study of CTDG evolutionary dynamics and the discovery of novel CTDGs in the exponentially growing body of genomic data. To address this gap, we developed a novel homology-based algorithm, CTDGFinder, which formalizes and automates the identification of CTDGs by examining the physical distribution of individual members of families of duplicated genes across chromosomes. Application of CTDGFinder accurately identified CTDGs for many well-known gene clusters (e.g., Hox and beta-globin gene clusters) in the human, mouse and 20 other mammalian genomes. Differences between previously annotated gene clusters and our inferred CTDGs were due to the exclusion of nonhomologs that have historically been considered parts of specific gene clusters, the inclusion or absence of genes between the CTDGs and their corresponding gene clusters, and the splitting of certain gene clusters into distinct CTDGs. Examination of human genes showing tissue-specific enhancement of their expression by CTDGFinder identified members of several well-known gene clusters (e.g., cytochrome P450s and olfactory receptors) and revealed that they were unequally distributed across tissues. By formalizing and automating CTDG identification, CTDGFinder will facilitate understanding of CTDG evolutionary dynamics, their functional implications, and how they are associated with phenotypic diversity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e
Image Transformation using Modified Kmeans clustering algorithm for Parallel saliency map
Directory of Open Access Journals (Sweden)
Aman Sharma
2013-08-01
Depending on the transform chosen when designing an image transformation system, the input and output images may appear entirely different and have different interpretations. Image transformation is carried out with the help of modules such as input image, image cluster index, object in cluster, and color index transformation of the image. The K-means clustering algorithm is used to cluster the image for better segmentation. In the proposed method, a parallel saliency algorithm with K-means clustering is used to avoid local minima and to find the saliency map. The parallel saliency algorithm is reported to outperform the existing saliency algorithm.
Directory of Open Access Journals (Sweden)
Mingwei Leng
2013-01-01
The accuracy of most of the existing semisupervised clustering algorithms based on small size of labeled dataset is low when dealing with multidensity and imbalanced datasets, and labeling data is quite expensive and time consuming in many real-world applications. This paper focuses on active data selection and semisupervised clustering algorithm in multidensity and imbalanced datasets and proposes an active semisupervised clustering algorithm. The proposed algorithm uses an active mechanism for data selection to minimize the amount of labeled data, and it utilizes multithreshold to expand labeled datasets on multidensity and imbalanced datasets. Three standard datasets and one synthetic dataset are used to demonstrate the proposed algorithm, and the experimental results show that the proposed semisupervised clustering algorithm has a higher accuracy and a more stable performance in comparison to other clustering and semisupervised clustering algorithms, especially when the datasets are multidensity and imbalanced.
Nonequilibrium molecular dynamics theory, algorithms and applications
Todd, Billy D
2017-01-01
Written by two specialists with over twenty-five years of experience in the field, this valuable text presents a wide range of topics within the growing field of nonequilibrium molecular dynamics (NEMD). It introduces theories which are fundamental to the field - namely, nonequilibrium statistical mechanics and nonequilibrium thermodynamics - and provides state-of-the-art algorithms and advice for designing reliable NEMD code, as well as examining applications for both atomic and molecular fluids. It discusses homogeneous and inhomogeneous flows and pays considerable attention to highly confined fluids, such as nanofluidics. In addition to statistical mechanics and thermodynamics, the book covers the themes of temperature and thermodynamic fluxes and their computation, the theory and algorithms for homogeneous shear and elongational flows, response theory and its applications, heat and mass transport algorithms, applications in molecular rheology, highly confined fluids (nanofluidics), the phenomenon of slip and...
Cluster Monte Carlo and dynamical scaling for long-range interactions
Flores-Sola, Emilio; Kenna, Ralph; Berche, Bertrand
2016-01-01
Many spin systems affected by critical slowing down can be efficiently simulated using cluster algorithms. Where such systems have long-range interactions, suitable formulations can additionally bring down the computational effort for each update from $O(N^2)$ to $O(N\ln N)$ or even $O(N)$, thus promising an even more dramatic computational speed-up. Here, we review the available algorithms and propose a new and particularly efficient single-cluster variant. The efficiency and dynamical scaling of the available algorithms are investigated for the Ising model with power-law decaying interactions.
Clustering Algorithm Based on Crowding Niche
Institute of Scientific and Technical Information of China (English)
业宁; 董逸生
2003-01-01
A new clustering algorithm based on crowding niches is proposed in this paper. As organisms evolve, homogeneous individuals spontaneously resist heterogeneous ones. At the same time, individuals in the same class compete with each other for limited resources, and individuals with poor fitness are eliminated. We propose a clustering algorithm based on this idea. Experimental evaluation has demonstrated its efficiency.
A Heuristic Task Scheduling Algorithm for Heterogeneous Virtual Clusters
Directory of Open Access Journals (Sweden)
Weiwei Lin
2016-01-01
Cloud computing provides on-demand computing and storage services with high performance and high scalability. However, the rising energy consumption of cloud data centers has become a prominent problem. In this paper, we first introduce an energy-aware framework for task scheduling in virtual clusters. The framework consists of a task resource requirements prediction module, an energy estimate module, and a scheduler with a task buffer. Secondly, based on this framework, we propose a virtual machine power efficiency-aware greedy scheduling algorithm (VPEGS). As a heuristic algorithm, VPEGS estimates task energy by considering factors including task resource demands, VM power efficiency, and server workload before scheduling tasks in a greedy manner. We simulated a heterogeneous VM cluster and conducted experiments to evaluate the effectiveness of VPEGS. Simulation results show that VPEGS effectively reduced total energy consumption by more than 20% without producing large scheduling overheads. With a similar heuristic ideology, it outperformed Min-Min and RASA with respect to energy saving by about 29% and 28%, respectively.
Ternary alloy material prediction using genetic algorithm and cluster expansion
Energy Technology Data Exchange (ETDEWEB)
Chen, Chong [Iowa State Univ., Ames, IA (United States)
2015-12-01
This thesis summarizes our study on the crystal structure prediction of the Fe-V-Si system using a genetic algorithm and cluster expansion. Our goal is to explore and look for new stable compounds. We started from the ten currently known experimental phases and calculated the formation energies of those compounds using a density functional theory (DFT) package, namely VASP. The convex hull was generated based on the DFT calculations of the experimentally known phases. Then we did a random search on some metal-rich (Fe and V) compositions and found that the lowest-energy structures had a body-centered cubic (bcc) underlying lattice, on which we did our systematic computational searches using the genetic algorithm and cluster expansion. Among hundreds of the searched compositions, thirteen were selected and their DFT formation energies were obtained with VASP. The stability of those thirteen compounds was checked against the experimental convex hull. We found that the composition 24-8-16, i.e., Fe_{3}VSi_{2}, is a new stable phase, which may inspire future experiments.
Thermodynamic Casimir effect in films: the exchange cluster algorithm.
Hasenbusch, Martin
2015-02-01
We study the thermodynamic Casimir force for films with various types of boundary conditions and the bulk universality class of the three-dimensional Ising model. To this end, we perform Monte Carlo simulations of the improved Blume-Capel model on the simple cubic lattice. In particular, we employ the exchange or geometric cluster algorithm [Heringa and Blöte, Phys. Rev. E 57, 4976 (1998)]. In a previous work, we demonstrated that this algorithm allows us to compute the thermodynamic Casimir force for the plate-sphere geometry efficiently. It turns out that also for the film geometry a substantial reduction of the statistical error can be achieved. Concerning physics, we focus on (O,O) boundary conditions, where O denotes the ordinary surface transition. These are implemented by free boundary conditions on both sides of the film. Films with such boundary conditions undergo a phase transition in the universality class of the two-dimensional Ising model. We determine the inverse transition temperature for a large range of thicknesses $L_0$ of the film and study the scaling of this temperature with $L_0$. In the neighborhood of the transition, the thermodynamic Casimir force is affected by finite size effects, where finite size refers to a finite transversal extension $L$ of the film. We demonstrate that these finite size effects can be computed by using the universal finite size scaling function of the free energy of the two-dimensional Ising model.
Maximum-entropy clustering algorithm and its global convergence analysis
Institute of Scientific and Technical Information of China (English)
ZHANG; Zhihua
2001-01-01
An Improved Dynamic Bandwidth Allocation Algorithm for Ethernet PON
Institute of Scientific and Technical Information of China (English)
(no author listed)
2003-01-01
This paper proposes an improved Dynamic Bandwidth Allocation (DBA) algorithm for EPON, which combines static and traditional dynamic allocation schemes. Simulation results show that the proposed algorithm may effectively improve packet delay performance.
A mathematical programming approach for sequential clustering of dynamic networks
Silva, Jonathan C.; Bennett, Laura; Papageorgiou, Lazaros G.; Tsoka, Sophia
2016-02-01
A common analysis performed on dynamic networks is community structure detection, a challenging problem that aims to track the temporal evolution of network modules. An emerging area in this field is evolutionary clustering, where the community structure of a network snapshot is identified by taking into account both its current state and previous time points. Based on this concept, we have developed a mixed integer non-linear programming (MINLP) model, SeqMod, that sequentially clusters each snapshot of a dynamic network. The modularity metric is used to determine the quality of the community structure of the current snapshot, and the historical cost is accounted for by optimising the number of node pairs co-clustered at the previous time point that remain so in the current snapshot partition. Our method is tested on social networks of interactions among high school students, college students and members of the Brazilian Congress. We show that, for an adequate parameter setting, our algorithm detects the classes that these students belong to more accurately than partitioning each time step individually or partitioning the aggregated snapshots. Our method also detects drastic discontinuities in interaction patterns across network snapshots. Finally, we present comparative results with similar community detection methods for time-dependent networks from the literature. Overall, we illustrate the applicability of mathematical programming as a flexible, adaptable and systematic approach for these community detection problems. Contribution to the Topical Issue "Temporal Network Theory and Applications", edited by Petter Holme.
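The two quantities this abstract combines can be sketched as follows: the standard (Newman) modularity of a partition, and the count of node pairs that stay co-clustered from one snapshot to the next. This only evaluates a given partition; it does not reproduce the SeqMod MINLP optimisation, and the function names and the toy graph are hypothetical.

```python
from itertools import combinations


def modularity(edges, community):
    """Newman modularity of an undirected, unweighted graph partition.

    edges:     list of (u, v) pairs, each edge listed once
    community: node -> community label
    """
    m = len(edges)
    internal = {}  # edges with both endpoints inside a community
    degree = {}    # total degree of the nodes in a community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree
    )


def preserved_pairs(prev, curr):
    """Number of node pairs co-clustered in prev that remain so in curr."""
    nodes = sorted(set(prev) & set(curr))
    return sum(
        1
        for a, b in combinations(nodes, 2)
        if prev[a] == prev[b] and curr[a] == curr[b]
    )


# Hypothetical snapshot: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
prev = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
curr = {0: "A", 1: "A", 2: "B", 3: "B", 4: "B", 5: "B"}

q_prev = modularity(edges, prev)
kept = preserved_pairs(prev, curr)
```

An evolutionary clustering objective would trade `modularity` of the current snapshot against a term such as `preserved_pairs`; how the abstract's model weights the two is not stated here.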
IoT Service Clustering for Dynamic Service Matchmaking.
Zhao, Shuai; Yu, Le; Cheng, Bo; Chen, Junliang
2017-07-27
With the adoption of service-oriented paradigms in the IoT (Internet of Things) environment, real-world devices will open their capabilities through service interfaces, which enable other functional entities to interact with them. In an IoT application, it is indispensable to find suitable services for satisfying users' requirements or replacing unavailable services. However, from the perspective of performance, it is inappropriate to find desired services from the service repository online directly. Instead, it is necessary to cluster services offline according to their similarity and to matchmake or discover services online within a limited set of clusters. This paper proposes a multidimensional model-based approach to measure the similarity between IoT services. Then, density-peaks-based clustering is employed to gather similar services together according to the result of the similarity measurement. Based on the service clustering, the algorithms for dynamic service matchmaking, discovery, and replacement can be performed efficiently. Evaluation experiments are conducted to validate the performance of the proposed approaches, and the results are promising.
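The core of density-peaks clustering can be sketched in a few lines: for each item compute a local density (neighbours within a cutoff) and the distance to the nearest item of higher density; items where both are large are candidate cluster centers. The sketch below uses Euclidean distance between points as a stand-in for the paper's service dissimilarity, which is an assumption; the cutoff `d_c` and the data are hypothetical.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for every point.

    rho[i]:   number of other points closer than the cutoff d_c
    delta[i]: distance to the nearest point with higher rho, or the
              largest distance from i if no point has higher rho
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical data: two squares, each with a point at its middle.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (10.5, 10.5),
]
rho, delta = density_peaks(points, d_c=1.0)

# Rank candidate centers by the product rho * delta.
gamma = [r * d for r, d in zip(rho, delta)]
centers = sorted(sorted(range(len(points)), key=lambda i: -gamma[i])[:2])
```

Remaining items would then be assigned to the cluster of their nearest higher-density neighbour, which is the step that makes online matchmaking within a few clusters possible.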
A Request Distribution Algorithm for Web Server Cluster
Directory of Open Access Journals (Sweden)
Wei Zhang
2011-12-01
With the explosive increase in the workloads of web-based applications, web server clusters face challenges in request response time. Request distribution among the servers in a web server cluster is the key to addressing this challenge, especially under heavy workloads. In this paper, we propose a new request distribution algorithm named llac (least load active cache) for the load balancing switch in a web server cluster. The goal of llac is to improve the cache hit rate and reduce response time. Packets are parsed at the IP level, and back-end servers are notified to cache hot files using link change technology, without changing URL information or modifying the service program. This avoids the switching overhead between user mode and kernel mode. The load balancing switch creates a connection directly with the selected server, avoiding the overhead of migrating connections. The policy estimates the current composite load of each server and selects the server with the least load to serve the request. It also improves the resource utilization of web servers. Experimental results show that llac achieves better performance for web applications than wrr (weight round robin), which is a popular request distribution algorithm.
Gong, Lina; Xu, Tao; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen
2017-03-01
The traditional microblog recommendation algorithm suffers from low efficiency and modest effect in the era of big data. To solve these issues, this paper proposes a mixed recommendation algorithm with user clustering. The paper first introduces the state of the microblog marketing industry. It then elaborates the user interest modeling process and the detailed advertisement recommendation methods. Finally, it compares the mixed recommendation algorithm with the traditional classification algorithm and with the mixed recommendation algorithm without user clustering. The results show that the mixed recommendation algorithm with user clustering has good accuracy and recall rate in microblog advertisement promotion.
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm
Directory of Open Access Journals (Sweden)
Serge Thomas Mickala Bourobou
2015-05-01
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen’s temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home.
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-05-21
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen's temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home.
Textural defect detect using a revised ant colony clustering algorithm
Zou, Chao; Xiao, Li; Wang, Bingwen
2007-11-01
We propose a novel method based on a revised ant colony clustering algorithm (ACCA) to explore the topic of textural defect detection. In this algorithm, our efforts are mainly devoted to the definition of a local irregularity measurement and the implementation of the revised ACCA. The local irregularity measurement evaluates the local textural inconsistency of each pixel against its mini-environment. In our revised ACCA, the behavior of each ant is divided into two steps: release pheromone and act. The quantity of pheromone released is proportional to the irregularity measurement; the next actions of the ants are chosen independently of each other in a stochastic way according to some evaluated heuristic knowledge. The independence of the ants implies the inherently parallel computation architecture of this algorithm. We apply the proposed method to some typical textural images with defects. From the series of pheromone distribution maps (PDM), it can be clearly seen that the pheromone distribution gradually approaches the textural defects. After some post-processing, the final distribution of pheromone demonstrates the shape and area of the defects well.
Kale, Ruta; Cassano, Rossella; Giacintucci, Simona; Bardelli, Sandro; Dallacasa, Daniele; Zucca, Elena
2015-01-01
Brightest Cluster Galaxies (BCGs) show exceptional properties over the whole electromagnetic spectrum. Their special location at the centres of galaxy clusters raises the question of the role of the environment in their radio properties. To decouple the effects of the galaxy mass and of the environment on their statistical radio properties, we investigate the possible dependence of the occurrence of radio loudness and of the fractional radio luminosity function on the dynamical state of the hosting cluster. We studied the radio properties of the BCGs in the Extended GMRT Radio Halo Survey (EGRHS). We obtained a statistical sample of 59 BCGs, which was divided into two classes depending on the dynamical state of the host cluster, i.e. merging (M) and relaxed (R). Among the 59 BCGs, 28 are radio-loud and 31 are radio-quiet. The radio-loud sources are favourably located in relaxed clusters (71%), while the reverse is true for the radio-quiet BCGs, which are mostly located in merging systems (81%). The fraction...
Self-Expanded Clustering Algorithm Based on Density Units with Evaluation Feedback Section
Institute of Scientific and Technical Information of China (English)
YU Yongqian; ZHAO Xiangguo; CHEN Hengyue; WANG Bin; YU Ge; WANG Guoren
2006-01-01
This paper presents an effective clustering mode and a novel mode for evaluating clustering results. The clustering mode has two bounded integer parameters. The evaluating mode evaluates clustering results and gives each a mark. The higher the mark a clustering result gains, the higher its quality. By organizing the two modes in different ways, we can build two clustering algorithms: SECDU (Self-Expanded Clustering Algorithm Based on Density Units) and SECDUF (Self-Expanded Clustering Algorithm Based on Density Units with Evaluation Feedback Section). SECDU enumerates all value pairs of the two parameters of the clustering mode to process the data set repeatedly and evaluates every clustering result with the evaluating mode. SECDU then outputs the clustering result that has the highest evaluation mark. By applying a hill-climbing algorithm, SECDUF improves clustering efficiency greatly. Both algorithms adapt well to data sets with different distribution features. SECDU and SECDUF can output high-quality clustering results. SECDUF tunes the parameters of the clustering mode automatically, and no human intervention is involved in the whole process. In addition, SECDUF has high clustering performance.
Numerical simulation study of the dynamical behavior of the Niedermayer algorithm
Girardi, D
2010-01-01
We calculate the dynamic critical exponent for the Niedermayer algorithm applied to the two-dimensional Ising and XY models, for various values of the free parameter $E_0$. For $E_0=-1$ we regain the Metropolis algorithm and for $E_0=1$ we regain the Wolff algorithm. For $-1$ [...] $\widetilde{L}$, the Niedermayer algorithm is equivalent to the Metropolis one, i.e., they have the same dynamic exponent. For $E_0>1$, the autocorrelation time is always greater than for $E_0=1$ (Wolff) and, more importantly, it also grows faster than a power of $L$. Therefore, we show that the best choice of cluster algorithm is the Wolff one, when compared to the Niedermayer generalization. We also obtain the dynamic behavior of the Wolff algorithm: although not conclusive, we propose a scaling law for the dependence of the autocorrelation time on $L$.
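To make the Wolff limit ($E_0=1$) of this abstract concrete, here is a minimal sketch of one Wolff single-cluster update for the two-dimensional Ising model with periodic boundaries. It assumes coupling $J=1$ and $k_B=1$, so aligned neighbours join the cluster with probability $1-e^{-2\beta}$; it does not implement the Niedermayer generalization, and the function name and lattice size are hypothetical.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """Grow and flip one Wolff cluster in place; return the cluster size."""
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond probability for J = 1
    seed = (rng.randrange(L), rng.randrange(L))
    s0 = spins[seed[0]][seed[1]]
    cluster = {seed}
    stack = [seed]
    while stack:
        x, y = stack.pop()
        neighbours = [
            ((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L),
        ]
        for nx, ny in neighbours:
            # Only aligned spins not yet in the cluster may be added.
            if (nx, ny) in cluster or spins[nx][ny] != s0:
                continue
            if rng.random() < p_add:
                cluster.add((nx, ny))
                stack.append((nx, ny))
    for x, y in cluster:
        spins[x][y] = -s0
    return len(cluster)


rng = random.Random(0)
L = 8
spins = [[1] * L for _ in range(L)]
# Deep in the ordered phase the cluster spans the whole aligned lattice.
size_cold = wolff_step(spins, L, beta=50.0, rng=rng)
```

Autocorrelation times such as those discussed above would be estimated from a time series of an observable (for example the energy) recorded over many such steps.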
An improved scheduling algorithm for 3D cluster rendering with platform LSF
Xu, Wenli; Zhu, Yi; Zhang, Liping
2013-10-01
High-quality photorealistic rendering of 3D models needs powerful computing systems. To meet this demand, highly efficient management of cluster resources has developed rapidly. This paper aims to improve the efficiency of 3D rendering tasks in a cluster. It focuses on a dynamic feedback load balance (DFLB) algorithm, the working principle of the load sharing facility (LSF), and optimization of the external scheduler plug-in. The algorithm can be applied in the match and allocation phases of a scheduling cycle. Candidate hosts are prepared in sequence in the match phase, and the scheduler makes allocation decisions for each job in the allocation phase. With the dynamic mechanism, a new weight is assigned to each candidate host for rearrangement, and the most suitable host is dispatched for rendering. A new plug-in module for this algorithm has been designed and integrated into the internal scheduler. Simulation experiments demonstrate that the improved plug-in module is superior to the default one for rendering tasks. It can help avoid load imbalance among servers, increase system throughput, and improve system utilization.
Dynamic programming algorithms for biological sequence comparison.
Pearson, W R; Miller, W
1992-01-01
Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N^2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N^2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
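The time and memory trade-off described in this abstract can be seen in a minimal sketch of global alignment scoring with a linear gap penalty g = rk. Only two rows of the dynamic programming table are kept, so memory grows with one sequence length while time grows with the product of both. The scoring values (match, mismatch, r) are hypothetical defaults, not those of the authors' programs.

```python
def global_score(a, b, match=1, mismatch=-1, r=2):
    """Optimal global alignment score with gap penalty g = r * k.

    Keeps only the previous and current rows of the DP table, so memory
    is O(len(b)) while time is O(len(a) * len(b)).
    """
    prev = [-r * j for j in range(len(b) + 1)]  # row 0: b aligned to gaps
    for i, ca in enumerate(a, start=1):
        curr = [-r * i]  # column 0: a[:i] aligned to gaps
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] - r        # gap in b
            left = curr[j - 1] - r  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


identical = global_score("GATTACA", "GATTACA")
one_gap = global_score("ACGT", "AGT")
```

This returns the score only; recovering the alignment itself within linear memory requires a divide-and-conquer scheme such as Hirschberg's, which is not shown.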
An efficient hybrid evolutionary optimization algorithm based on PSO and SA for clustering
Institute of Scientific and Technical Information of China (English)
Taher NIKNAM; Babak AMIRI; Javad OLAMAEI; Ali AREFI
2009-01-01
The K-means algorithm is one of the most popular techniques in clustering. Nevertheless, the performance of the K-means algorithm depends highly on the initial cluster centers, and the algorithm can converge to local minima. This paper proposes a hybrid evolutionary programming based clustering algorithm, called PSO-SA, by combining particle swarm optimization (PSO) and simulated annealing (SA). The basic idea is to search around the global solution by SA and to increase the information exchange among particles using a mutation operator to escape local optima. Three datasets, Iris, Wisconsin Breast Cancer, and Ripley's Glass, have been considered to show the effectiveness of the proposed clustering algorithm in providing optimal clusters. The simulation results show that the PSO-SA clustering algorithm not only has a better response but also converges more quickly than the K-means, PSO, and SA algorithms.
An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets
Directory of Open Access Journals (Sweden)
Kang Zhang
2014-01-01
Clustering has been widely used in different fields of science, technology, social science, and so forth. In the real world, numeric as well as categorical features are usually used to describe the data objects. Accordingly, many clustering methods can process datasets that are either numeric or categorical. Recently, algorithms that can handle the mixed data clustering problems have been developed. The affinity propagation (AP) algorithm is an exemplar-based clustering method which has demonstrated good performance on a wide variety of datasets. However, it has limitations on processing mixed datasets. In this paper, we propose a novel similarity measure for mixed-type datasets and an adaptive AP clustering algorithm to cluster the mixed datasets. Several real-world datasets are studied to evaluate the performance of the proposed algorithm. Comparisons with other clustering algorithms demonstrate that the proposed method works well not only on mixed datasets but also on pure numeric and categorical datasets.
Dynamic Clustering in Object-Oriented Databases: An Advocacy for Simplicity
Darmont, Jérôme; Régnier, Stéphane; Gruenwald, Le; Schneider, Michel
2007-01-01
We present in this paper three dynamic clustering techniques for Object-Oriented Databases (OODBs). The first two, Dynamic, Statistical & Tunable Clustering (DSTC) and StatClust, exploit both comprehensive usage statistics and the inter-object reference graph. They are quite elaborate. However, they are also complex to implement and induce a high overhead. The third clustering technique, called Detection & Reclustering of Objects (DRO), is based on the same principles, but is much simpler to implement. These three clustering algorithms have been implemented in the Texas persistent object store and compared in terms of clustering efficiency (i.e., overall performance increase) and overhead using the Object Clustering Benchmark (OCB). The results obtained showed that DRO induced a lighter overhead while still achieving better overall performance.
Directory of Open Access Journals (Sweden)
G. Abel Thangaraja
2014-11-01
The need for data mining arises from the explosive growth of data from terabytes to petabytes. Data mining preprocessing aims to produce quality mining results in descriptive and predictive analysis. The quality of a clustering result depends on both the similarity measure used by the method and its implementation. A straightforward way to combine structural and attribute similarities is to use a weighted distance function. Clustering results are arrived at based on attribute similarities. The clusters balance the attribute and structural similarities. The existing Structural and Attribute cluster algorithm is analyzed and a new algorithm is proposed. Both algorithms are compared and the results are analyzed. It is found that the modified algorithm gives better quality clusters.
Combined Density-based and Constraint-based Algorithm for Clustering
Institute of Scientific and Technical Information of China (English)
CHEN Tung-shou; CHEN Rong-chang; LIN Chih-chiang; CHIU Yung-hsing
2006-01-01
We propose a new clustering algorithm that helps researchers analyze data quickly and accurately. We call this algorithm Combined Density-based and Constraint-based Algorithm (CDC). CDC consists of two phases. In the first phase, CDC employs the idea of density-based clustering to split the original data into a number of fragmented clusters. At the same time, CDC cuts off the noise and outliers. In the second phase, CDC employs the concept of K-means clustering to select a greater cluster to be the center. Then, the greater cluster merges some smaller clusters which satisfy some constraint rules. Because the clusters are merged around the center cluster, the clustering results show high accuracy. Moreover, CDC reduces the calculations and speeds up the clustering process. In this paper, the accuracy of CDC is evaluated and compared with those of K-means, hierarchical clustering, and the genetic clustering algorithm (GCA) proposed in 2004. Experimental results show that CDC has better performance.
Robust K-Median and K-Means Clustering Algorithms for Incomplete Data
Directory of Open Access Journals (Sweden)
Jinhua Li
2016-01-01
Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
Directory of Open Access Journals (Sweden)
Muthukkumar R.
2016-07-01
Cognitive Radio (CR) is a promising technique that enables secondary users (SUs, or unlicensed users) to effectively exploit the unused spectrum resources possessed by primary users (PUs, or licensed users). The proven clustering approach is used to organize nodes in the network into logical groups to attain energy efficiency, network scalability, and stability, improving the sensing accuracy in CR through cooperative spectrum sensing (CSS). In this paper, a distributed dynamic load balanced clustering (DDLBC) algorithm is proposed. In this algorithm, each member in the cluster calculates the cooperative gain, residual energy, distance, and sensing cost from the neighboring clusters to make the optimal decision. Each member in a cluster participates in selecting a cluster head (CH) through cooperative gain and residual energy, which minimises network energy consumption and enhances channel sensing. First, we form the number of clusters using the Markov decision process (MDP) model to reduce the energy consumption in a network. In this algorithm, CR users effectively utilize the PUs' reporting time slots of unavailability. The simulation results reveal that cluster convergence, energy efficiency, and accuracy of channel sensing increased considerably by using the proposed algorithm.
Directory of Open Access Journals (Sweden)
Guohua Zou
2016-12-01
New medical imaging technology, such as Computed Tomography and Magnetic Resonance Imaging (MRI), has been widely used in all aspects of medical diagnosis. The purpose of these imaging techniques is to obtain various qualitative and quantitative data of the patient comprehensively and accurately, and provide correct digital information for diagnosis, treatment planning and evaluation after surgery. MR has a good imaging diagnostic advantage for brain diseases. However, as the requirements of the brain image definition and quantitative analysis are always increasing, it is necessary to have better segmentation of MR brain images. The FCM (Fuzzy C-means) algorithm is widely applied in image segmentation, but it has some shortcomings, such as long computation time and poor anti-noise capability. In this paper, firstly, the Ant Colony algorithm is used to determine the cluster centers and the number of clusters for the FCM algorithm so as to improve its running speed. Then an improved Markov random field model is used to improve the algorithm's anti-noise ability. Experimental results show that the algorithm put forward in this paper has obvious advantages in image segmentation speed and segmentation effect.
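For readers unfamiliar with FCM, the following minimal sketch shows one iteration of the textbook fuzzy C-means update on one-dimensional data such as pixel intensities. It is not the ant-colony or Markov random field variant proposed in the paper; the function name, the fuzzifier default m = 2 and the small eps guard are assumptions made for illustration.

```python
def fcm_step(values, centers, m=2.0, eps=1e-12):
    """One standard fuzzy C-means iteration on 1-D data.

    Returns (memberships, new_centers). memberships[i][k] is the degree
    to which values[i] belongs to cluster k; each row sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in values:
        # eps avoids division by zero when a value coincides with a center.
        dists = [abs(x - c) + eps for c in centers]
        row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
               for d_i in dists]
        memberships.append(row)
    new_centers = []
    for k in range(len(centers)):
        # Each center is the mean of the data weighted by membership ** m.
        weights = [row[k] ** m for row in memberships]
        total = sum(w * x for w, x in zip(weights, values))
        new_centers.append(total / sum(weights))
    return memberships, new_centers
```

Repeating this step until the centers stop moving gives the basic algorithm. Its cost per iteration grows with the number of pixels times the number of clusters, which is one reason a good choice of initial centers matters for running time.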
CLUSTER SYNCHRONIZATION IN A COMPLEX DYNAMICAL NETWORK WITH TWO NONIDENTICAL CLUSTERS
Institute of Scientific and Technical Information of China (English)
Liang CHEN; Jun'an LU
2008-01-01
This paper further investigates cluster synchronization in a complex dynamical network with two clusters. Each cluster contains a number of identical dynamical systems; however, the sub-systems composing the two clusters can be different, i.e., the individual dynamical system in one cluster can differ from that in the other cluster. Complete synchronization within each cluster is possible only if each node from one cluster receives the same input from nodes in the other cluster. In this case, the stability condition of one-cluster synchronization is known to contain two terms: the first accounts for the contribution of the inner-cluster coupling structure while the second is simply an extra linear term, which can be deduced by the "same-input" condition. Applying the connection graph stability method, the authors obtain an upper bound of input strength for one cluster if the first term is known, by which the synchronizability of the cluster can be scaled. For different clusters, there are different upper bounds of input strength by virtue of different dynamics and the corresponding cluster structure. Moreover, two illustrative examples are presented and the numerical simulations coincide with the theoretical analysis.
An improved genetic algorithm with dynamic topology
Cai, Kai-Quan; Tang, Yan-Wu; Zhang, Xue-Jun; Guan, Xiang-Min
2016-12-01
The genetic algorithm (GA) is a nature-inspired evolutionary algorithm to find optima in search space via the interaction of individuals. Recently, researchers demonstrated that the interaction topology plays an important role in information exchange among individuals of an evolutionary algorithm. In this paper, we investigate the effect of different network topologies adopted to represent the interaction structures. It is found that a GA with a high-density topology is more likely to end up with an unsatisfactory solution; conversely, a low-density topology can impede convergence. Consequently, we propose an improved GA with dynamic topology, named DT-GA, in which the topology structure varies dynamically along with the fitness evolution. Several experiments executed with 15 well-known test functions have illustrated that DT-GA outperforms other test GAs in balancing convergence speed and optimum quality. Our work may have implications in the combination of complex networks and computational intelligence. Project supported by the National Natural Science Foundation for Young Scientists of China (Grant No. 61401011), the National Key Technologies R & D Program of China (Grant No. 2015BAG15B01), and the National Natural Science Foundation of China (Grant No. U1533119).
Fundamental algorithms in computational fluid dynamics
Pulliam, Thomas H
2014-01-01
Intended as a textbook for courses in computational fluid dynamics at the senior undergraduate or graduate level, this book is a follow-up to the book Fundamentals of Computational Fluid Dynamics by the same authors, which was published in the series Scientific Computation in 2001. Whereas the earlier book concentrated on the analysis of numerical methods applied to model equations, this new book concentrates on algorithms for the numerical solution of the Euler and Navier-Stokes equations. It focuses on some classical algorithms as well as the underlying ideas based on the latest methods. A key feature of the book is the inclusion of programming exercises at the end of each chapter based on the numerical solution of the quasi-one-dimensional Euler equations and the shock-tube problem. These exercises can be included in the context of a typical course, and sample solutions are provided in each chapter, so readers can confirm that they have coded the algorithms correctly.
Mobility-Aware and Load Balancing Based Clustering Algorithm for Energy Conservation in MANET
Institute of Scientific and Technical Information of China (English)
XU Li; ZHENG Bao-yu; GUO Gong-de
2005-01-01
The mobile ad hoc network (MANET) is a wireless communication network architecture that has received a lot of attention. MANET is characterized by dynamic network topology and limited energy. With a mobility-aware and load balancing based clustering algorithm (MLCA), this paper proposes a new topology management strategy to conserve energy. Performance simulation results show that the proposed MLCA strategy can balance the traffic load inside the whole network, so as to prolong the network lifetime, while at the same time achieving a higher throughput ratio and network stability.
Park, Sang Ha; Lee, Seokjin; Sung, Koeng-Mo
Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.
Enhanced Dynamic Algorithm of Genome Sequence Alignments
Directory of Open Access Journals (Sweden)
Arabi E. keshk
2014-05-01
The merging of biology and computer science has created a new field called computational biology, or bioinformatics, that explores the capacities of computers to gain knowledge from biological data. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the regions of similarity and the relationship between sequences. This paper introduces an enhancement of the dynamic algorithm of genome sequence alignment, which is called EDAGSA. It fills the three main diagonals without filling the entire matrix with unused data. It obtains the optimal solution while decreasing the execution time, and therefore the performance is increased. To illustrate the effectiveness of optimizing the performance of the proposed algorithm, it is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman and longest common subsequence algorithms. Also, a database is implemented for using the algorithm in multi-sequence alignments to search for the optimal sequence that matches the given sequence.
Energy Efficient Backoff Hierarchical Clustering Algorithms for Multi-Hop Wireless Sensor Networks
Institute of Scientific and Technical Information of China (English)
Jun Wang; Yong-Tao Cao; Jun-Yuan Xie; Shi-Fu Chen
2011-01-01
Compared with flat routing protocols, clustering is a fundamental performance improvement technique in wireless sensor networks, which can increase network scalability and lifetime. In this paper, we integrate the multi-hop technique with a backoff-based clustering algorithm to organize sensors. By using an adaptive backoff strategy, the algorithm not only realizes load balance among sensor nodes, but also achieves fairly uniform cluster head distribution across the network. Simulation results also demonstrate that our algorithm is more energy-efficient than classical ones. Our algorithm is also easily extended to generate a hierarchy of cluster heads to obtain better network management and energy efficiency.
Extension of K-Means Algorithm for clustering mixed data | Onuodu ...
African Journals Online (AJOL)
In this work, a new hybrid method has been proposed which extends the K-means algorithm to the categorical domain and mixed-type ...
Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm
DEFF Research Database (Denmark)
Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte
2006-01-01
Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods...... analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset....... The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering...
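The co-occurrence matrix mentioned in this abstract can be sketched as follows. Given the label vectors from several clustering runs, the sketch records for every pair of items the fraction of runs in which the two items fell into the same cluster. The function name and the list-of-lists representation are assumptions for illustration; the published method adds further steps that the truncated abstract does not spell out.

```python
def co_occurrence(labelings):
    """Fraction of runs in which each pair of items shares a cluster.

    labelings is a list of runs; each run is a list giving the cluster
    label of every item. Label values need not agree between runs.
    """
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]
```

Because only equality of labels within a run is used, the matrix is unaffected by the arbitrary numbering of clusters in each run, which is what allows results from different algorithms to be combined.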
Vinitsky, Sergue; Chuluunbaatar, Ochbadrakh; Rostovtsev, Vitaly; Hai, Luong Le; Derbov, Vladimir; Krassovitskiy, Pavel
2013-01-01
A model for quantum tunnelling of a cluster comprising A identical particles, coupled by oscillator-type potential, through short-range repulsive potential barriers is introduced for the first time in the new symmetrized-coordinate representation and studied within the s-wave approximation. The symbolic-numerical algorithms for calculating the effective potentials of the close-coupling equations in terms of the cluster wave functions and the energy of the barrier quasistationary states are formulated and implemented using the Maple computer algebra system. The effect of quantum transparency, manifesting itself in nonmonotonic resonance-type dependence of the transmission coefficient upon the energy of the particles, the number of the particles A=2,3,4, and their symmetry type, is analyzed. It is shown that the resonance behavior of the total transmission coefficient is due to the existence of barrier quasistationary states imbedded in the continuum.
MST-BASED CLUSTERING TOPOLOGY CONTROL ALGORITHM FOR WIRELESS SENSOR NETWORKS
Institute of Scientific and Technical Information of China (English)
Cai Wenyu; Zhang Meiyan
2010-01-01
In this paper, we propose a novel clustering topology control algorithm named Minimum Spanning Tree (MST)-based Clustering Topology Control (MCTC) for Wireless Sensor Networks (WSNs), which uses a hybrid approach to adjust sensor nodes' transmission power in two-tiered hierarchical WSNs. The MCTC algorithm employs a one-hop Maximum Energy & Minimum Distance (MEMD) clustering algorithm to decide clustering status. Each cluster exchanges information between its own Cluster Members (CMs) locally and then delivers information to the Cluster Head (CH). Moreover, CHs exchange information with each other and afterwards transmit the aggregated information to the base station. The intra-cluster topology control scheme uses MST to decide the CMs' transmission radius; similarly, the inter-cluster topology control scheme applies MST to decide the CHs' transmission radius. Since the intra-cluster topology control is a fully distributed approach and the inter-cluster topology control is a purely centralized approach performed by the base station, the MCTC algorithm is a hybrid clustering topology control algorithm and can obtain a scalable topology and strong connectivity guarantees simultaneously. As a result, the network topology will be reduced by the MCTC algorithm so that network energy efficiency will be improved. The simulation results verify that MCTC outperforms traditional topology control schemes such as LMST, DRNG and MEMD in terms of average node degree, average node power radius and network lifetime.
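One plausible reading of using an MST to decide a transmission radius is sketched below: build a minimum spanning tree over node positions with Prim's algorithm and give each node a radius equal to its longest incident tree edge, so that every tree link remains reachable. This is an assumption-laden illustration, not the MCTC procedure itself; the function name and the use of Euclidean distance between 2-D coordinates are hypothetical choices.

```python
import math

def mst_radii(points):
    """Return each node's radius: the length of its longest incident MST edge."""
    n = len(points)
    radii = [0.0] * n
    if n < 2:
        return radii
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n       # tree endpoint of that cheapest edge
    best[0] = 0.0
    for _ in range(n):
        # Prim's step: add the node outside the tree with the cheapest link.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            # The new tree edge may enlarge the radius of both endpoints.
            radii[u] = max(radii[u], best[u])
            radii[parent[u]] = max(radii[parent[u]], best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return radii
```

For example, `mst_radii([(0, 0), (1, 0), (3, 0)])` keeps the edges of length 1 and 2 and drops the direct link of length 3, so no node needs a radius larger than 2.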
An Improved Artificial Immune Algorithm with a Dynamic Threshold
Institute of Scientific and Technical Information of China (English)
Zhang Qiao; Xu Xu; Liang Yan-chun
2006-01-01
An improved artificial immune algorithm with a dynamic threshold is presented. The calculation of the affinity function in the real-valued coding artificial immune algorithm is modified by considering the antibody's fitness and setting a dynamic threshold value. Numerical experiments show that, compared with the genetic algorithm and the original real-valued coding artificial immune algorithm, the improved algorithm converges quickly and performs well in preventing premature convergence.
Scale-Invariant Correlations in Dynamic Bacterial Clusters
Chen, Xiao; Dong, Xu; Be'er, Avraham; Swinney, Harry L.; Zhang, H. P.
2012-04-01
In Bacillus subtilis colonies, motile bacteria move collectively, spontaneously forming dynamic clusters. These bacterial clusters share similarities with other systems exhibiting polarized collective motion, such as bird flocks or fish schools. Here we study experimentally how velocity and orientation fluctuations within clusters are spatially correlated. For a range of cell density and cluster size, the correlation length is shown to be 30% of the spatial size of clusters, and the correlation functions collapse onto a master curve after rescaling the separation with correlation length. Our results demonstrate that correlations of velocity and orientation fluctuations are scale invariant in dynamic bacterial clusters.
Identifying prototypical components in behaviour using clustering algorithms.
Directory of Open Access Journals (Sweden)
Elke Braun
Quantitative analysis of animal behaviour is a requirement to understand the task solving strategies of animals and the underlying control mechanisms. The identification of repeatedly occurring behavioural components is thereby a key element of a structured quantitative description. However, the complexity of most behaviours makes the identification of such behavioural components a challenging problem. We propose an automatic and objective approach for determining and evaluating prototypical behavioural components. Behavioural prototypes are identified using clustering algorithms and finally evaluated with respect to their ability to represent the whole behavioural data set. The prototypes allow for a meaningful segmentation of behavioural sequences. We applied our clustering approach to identify prototypical movements of the head of blowflies during cruising flight. The results confirm the previously established saccadic gaze strategy by the set of prototypes being divided into either predominantly translational or rotational movements, respectively. The prototypes reveal additional details about the saccadic and intersaccadic flight sections that could not be unravelled so far. Successful application of the proposed approach to behavioural data shows its ability to automatically identify prototypical behavioural components within a large and noisy database and to evaluate these with respect to their quality and stability. Hence, this approach might be applied to a broad range of behavioural and neural data obtained from different animals and in different contexts.
A Heuristic Clustering Algorithm for Mining Communities in Signed Networks
Institute of Scientific and Technical Information of China (English)
Bo Yang; Da-You Liu
2007-01-01
A signed network is an important kind of complex network, which includes both positive relations and negative relations. Communities of a signed network are defined as the groups of vertices, within which positive relations are dense and between which negative relations are also dense. Being able to identify communities of signed networks is helpful for analysis of such networks. Hitherto many algorithms for detecting network communities have been developed. However, most of them are designed exclusively for networks including only positive relations and are not suitable for signed networks. So the problem of mining communities of signed networks quickly and correctly has not been solved satisfactorily. In this paper, we propose a heuristic algorithm to address this issue. Compared with major existing methods, our approach has three distinct features. First, it is very fast, with a roughly linear time with respect to network size. Second, it exhibits a good clustering capability and especially can work well with complex networks without well-defined community structures. Finally, it is insensitive to its built-in parameters and requires no prior knowledge.
IMPROVING THE CLUSTER PERFORMANCE BY COMBINING PSO AND K-MEANS ALGORITHM
Directory of Open Access Journals (Sweden)
G. Komarasamy
2011-04-01
Clustering is a technique that can divide data objects into groups based on information found in the data that describes the objects and their relationships. This paper describes how to improve clustering performance by combining Particle Swarm Optimization (PSO) and the K-means algorithm. The PSO algorithm successfully converges during the initial stages of a global search, but around the global optimum the search process becomes very slow. In contrast, the K-means algorithm can achieve faster convergence to the optimum solution. Unlike the K-means method, the new algorithm does not require a specific number of clusters to be given before performing the clustering process, and it is able to find the locally optimal number of clusters during the clustering process. In each iteration, the inertia weight is changed based on the current iteration and best fitness. The experimental results show that the new algorithm performs better on different data sets.
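The role of the inertia weight can be seen in a minimal sketch of the standard PSO velocity and position update. The abstract does not give the paper's weight formula, so the linearly decreasing schedule below is a common textbook choice used here as an assumption; the function names and the default constants (w_max, w_min, c1, c2) are likewise illustrative.

```python
import random

def inertia(iteration, max_iter, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight (an assumed schedule)."""
    return w_max - (w_max - w_min) * iteration / max_iter

def pso_step(pos, vel, pbest, gbest, w, c1=2.0, c2=2.0, rng=random):
    """Standard PSO update for one particle.

    pos, vel, pbest and gbest are equal-length lists of floats; w scales
    how much of the previous velocity is kept.
    """
    new_vel = [w * v
               + c1 * rng.random() * (pb - x)   # pull toward personal best
               + c2 * rng.random() * (gb - x)   # pull toward global best
               for x, v, pb, gb in zip(pos, vel, pbest, gbest)]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

A large weight early on favours exploration, and a small weight later favours fine search near the best known solution. In a clustering setting each particle position would encode a set of candidate cluster centers, which is a design choice outside this sketch.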
A new-style clustering algorithm based on swarm intelligent theory
Institute of Scientific and Technical Information of China (English)
CHEN Zhuo; LIU Xiang-shuang
2007-01-01
Traditional clustering algorithms generally have some problems, such as sensitivity to the initial parameters, difficulty in finding the optimal clustering result, and questionable validity of the clustering. In this paper, a FSM and a mathematical model of a new-style clustering algorithm based on swarm intelligence are provided. In this algorithm, the clustering main body moves in a three-dimensional space and has the abilities of memory, communication, analysis, judgment and coordinating information. Experimental results confirm that this algorithm has many merits, such as being insensitive to the order of the data and being capable of dealing with exceptional, high-dimensional or complicated data. The algorithm can be used in the fields of Web mining, incremental clustering, economic analysis, pattern recognition, document classification and so on.
Domain decomposition algorithms and computational fluid dynamics
Chan, Tony F.
1988-01-01
Some of the new domain decomposition algorithms are applied to two model problems in computational fluid dynamics: the two-dimensional convection-diffusion problem and the incompressible driven cavity flow problem. First, a brief introduction to the various approaches of domain decomposition is given, and a survey of domain decomposition preconditioners for the operator on the interface separating the subdomains is then presented. For the convection-diffusion problem, the effect of the convection term and its discretization on the performance of some of the preconditioners is discussed. For the driven cavity problem, the effectiveness of a class of boundary probe preconditioners is examined.
Local and cluster critical dynamics of the 3d random-site Ising model
Ivaneyko, D.; Ilnytskyi, J.; Berche, B.; Holovatch, Yu.
2006-10-01
We present the results of Monte Carlo simulations for the critical dynamics of the three-dimensional site-diluted quenched Ising model. Three different dynamics are considered, corresponding to the local update Metropolis scheme as well as to the Swendsen-Wang and Wolff cluster algorithms. The lattice sizes of L=10-96 are analysed by a finite-size-scaling technique. The site dilution concentration p=0.85 was chosen to minimize the correction-to-scaling effects. We calculate numerical values of the dynamical critical exponents for the integrated and exponential autocorrelation times for energy and magnetization. As expected, cluster algorithms are characterized by lower values of the dynamical critical exponent than the local one: also in the case of dilution, critical slowing down is more pronounced for the Metropolis algorithm. However, the striking feature of our estimates is that they suggest that dilution leads to a decrease of the dynamical critical exponent for the cluster algorithms. This phenomenon is quite opposite to the local dynamics, where dilution enhances critical slowing down.
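The integrated autocorrelation time referred to here is commonly estimated from a time series of measurements as one half plus the sum of the normalized autocorrelation function. The sketch below uses a fixed summation window supplied by the caller, which is a simplifying assumption; practical analyses usually choose the window self-consistently. Function names are illustrative and are not taken from the paper.

```python
def autocorr(series, t):
    """Normalized autocorrelation of a series at lag t (equal to 1 at t = 0)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series) / n
    ct = sum((series[i] - mean) * (series[i + t] - mean)
             for i in range(n - t)) / (n - t)
    return ct / c0

def tau_int(series, window):
    """Integrated autocorrelation time, summing lags 1..window."""
    return 0.5 + sum(autocorr(series, t) for t in range(1, window + 1))
```

Measuring this quantity at the critical point for several lattice sizes L and fitting it to a power law in L is the usual route to a dynamical critical exponent. The series must have nonzero variance, otherwise the normalization divides by zero.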
Numerical simulation study of the dynamical behavior of the Niedermayer algorithm
Girardi, D.; Branco, N. S.
2010-01-01
We calculate the dynamic critical exponent for the Niedermayer algorithm applied to the two-dimensional Ising and XY models, for various values of the free parameter $E_0$. For $E_0=-1$ we regain the Metropolis algorithm and for $E_0=1$ we regain the Wolff algorithm. For $-1 < E_0 < 1$, ... For $E_0 > 1$, the autocorrelation time is always greater than for $E_0=1$ (Wolff) and, more important, it also grows faster than a power of $L$. Therefore, we show that the best choice of cluster algorithm is the Wolff one, when com...
Directory of Open Access Journals (Sweden)
Noha Negm
2013-06-01
Document clustering is one of the main themes in text mining. It refers to the process of grouping documents with similar contents or topics into clusters to improve both availability and reliability of text mining applications. Some of the recent algorithms address the problem of high dimensionality of the text by using frequent termsets for clustering. Despite the drawbacks of the Apriori algorithm, it is still the basic algorithm for mining frequent termsets. This paper presents an approach for Clustering Web Documents based on a Hashing algorithm for mining Frequent Termsets (CWDHFT). It introduces an efficient Multi-Tire Hashing algorithm for mining Frequent Termsets (MTHFT) instead of the Apriori algorithm. The algorithm uses a new methodology for generating frequent termsets by building the multi-tire hash table while scanning the documents only once. To avoid hash collisions, the Multi Tire technique is utilized in the proposed hashing algorithm. Based on the generated frequent termsets, the documents are partitioned and the clustering occurs by grouping the partitions through the descriptive keywords. By using the MTHFT algorithm, the scanning cost and computational cost are improved; moreover, performance is considerably increased and the clustering process is sped up. The CWDHFT approach improved accuracy, scalability and efficiency when compared with existing clustering algorithms like Bisecting K-means and FIHC.
Oxidation dynamics of nanophase aluminum clusters : a molecular dynamics study.
Energy Technology Data Exchange (ETDEWEB)
Ogata, S.
1998-01-27
Oxidation of an aluminum nanocluster (252,158 atoms) of radius 100 Å placed in gaseous oxygen (530,727 atoms) is investigated by performing molecular-dynamics simulations on parallel computers. The simulation takes into account the effect of charge transfer between Al and O based on the electronegativity equalization principles. We find that the oxidation starts at the surface of the cluster and the oxide layer grows to a thickness of approximately 28 Å. Evolutions of local temperature and densities of Al and O are investigated. The surface oxide melts because of the high temperature resulting from the release of energy associated with Al-O bondings. Amorphous surface-oxides are obtained by quenching the cluster. Vibrational density-of-states for the surface oxide is analyzed through comparisons with those for crystalline Al, Al nanocluster, and α-Al₂O₃.
HYBRID APPROACH FOR OPTIMAL CLUSTER HEAD SELECTION IN WSN USING LEACH AND MONKEY SEARCH ALGORITHMS
Directory of Open Access Journals (Sweden)
T. SHANKAR
2017-02-01
Wireless Sensor Networks (WSNs) are widely used with low-cost, low-power, multifunction sensors, enabled by developments in wireless communication that have opened up a wide variety of new applications. The main concern in a WSN is that each node has a limited battery and is constrained in energy consumption; hence energy and lifetime are of paramount importance. To achieve high energy efficiency and prolong network lifetime in WSNs, clustering techniques have been widely adopted. The proposed algorithm is a hybridization of the well-known Low-Energy Adaptive Clustering Hierarchy (LEACH) algorithm with a distinctive Monkey Search (MS) algorithm, an optimization algorithm used for optimal cluster head selection. The proposed hybrid algorithm exhibits high throughput, residual energy and improved lifetime. The proposed hybrid algorithm is compared with the well-known cluster-based protocols for WSNs, namely LEACH and the monkey search algorithm, individually.
Adaptation dynamics in densely clustered chemoreceptors.
Directory of Open Access Journals (Sweden)
William Pontius
In many sensory systems, transmembrane receptors are spatially organized in large clusters. Such arrangement may facilitate signal amplification and the integration of multiple stimuli. However, this organization likely also affects the kinetics of signaling since the cytoplasmic enzymes that modulate the activity of the receptors must localize to the cluster prior to receptor modification. Here we examine how these spatial considerations shape signaling dynamics at rest and in response to stimuli. As a model system, we use the chemotaxis pathway of Escherichia coli, a canonical system for the study of how organisms sense, respond, and adapt to environmental stimuli. In bacterial chemotaxis, adaptation is mediated by two enzymes that localize to the clustered receptors and modulate their activity through methylation-demethylation. Using a novel stochastic simulation, we show that distributive receptor methylation is necessary for successful adaptation to stimulus and also leads to large fluctuations in receptor activity in the steady state. These fluctuations arise from noise in the number of localized enzymes combined with saturated modification kinetics between the localized enzymes and the receptor substrate. An analytical model explains how saturated enzyme kinetics and large fluctuations can coexist with an adapted state robust to variation in the expression levels of the pathway constituents, a key requirement to ensure the functionality of individual cells within a population. This contrasts with the well-mixed covalent modification system studied by Goldbeter and Koshland in which mean activity becomes ultrasensitive to protein abundances when the enzymes operate at saturation. Large fluctuations in receptor activity have been quantified experimentally and may benefit the cell by enhancing its ability to explore empty environments and track shallow nutrient gradients. Here we clarify the mechanistic relationship of these large
Dynamical Evolution of Young Embedded Clusters: A Parameter Space Survey
Proszkow, Eva-Marie
2009-01-01
This paper investigates the dynamical evolution of embedded stellar clusters from the protocluster stage, through the embedded star-forming phase, and out to ages of 10 Myr -- after the gas has been removed from the cluster. The relevant dynamical properties of young stellar clusters are explored over a wide range of possible star formation environments using N-body simulations. Many realizations of equivalent initial conditions are used to produce robust statistical descriptions of cluster evolution including the cluster bound fraction, radial probability distributions, as well as the distributions of close encounter distances and velocities. These cluster properties are presented as a function of parameters describing the initial configuration of the cluster, including the initial cluster membership N, initial stellar velocities, cluster radii, star formation efficiency, embedding gas dispersal time, and the degree of primordial mass segregation. The results of this parameter space survey, which includes ab...
Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms
Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel
2016-04-01
Analyzing Big Data with Dynamic Quantum Clustering
Weinstein, M; Hume, A; Sciau, Ph; Shaked, G; Hofstetter, R; Persi, E; Mehta, A; Horn, D
2013-01-01
How does one search for a needle in a multi-dimensional haystack without knowing what a needle is and without knowing if there is one in the haystack? This kind of problem requires a paradigm shift - away from hypothesis driven searches of the data - towards a methodology that lets the data speak for itself. Dynamic Quantum Clustering (DQC) is such a methodology. DQC is a powerful visual method that works with big, high-dimensional data. It exploits variations of the density of the data (in feature space) and unearths subsets of the data that exhibit correlations among all the measured variables. The outcome of a DQC analysis is a movie that shows how and why sets of data-points are eventually classified as members of simple clusters or as members of - what we call - extended structures. This allows DQC to be successfully used in a non-conventional exploratory mode where one searches data for unexpected information without the need to model the data. We show how this works for big, complex, real-world dataset...
Dynamic Correction Algorithm of Rolling Force in Plate Rolling
Institute of Scientific and Technical Information of China (English)
QIU Hong-lei; WANG Jun; HU Xian-lei; WANG Zhao-dong; WANG Guo-dong
2005-01-01
Based on the Shougang plate mill project, an on-line dynamic correction algorithm was analyzed. This algorithm adjusts the model coefficients more effectively because the correction is based on both the measured and the calculated rolling force. On-site application results show that this on-line dynamic correction algorithm is effective.
A LEAP-FROG ALGORITHM FOR STOCHASTIC DYNAMICS
Van Gunsteren, W. F.; Berendsen, H. J. C.
1988-01-01
A third-order algorithm for stochastic dynamics (SD) simulations is proposed, identical to the powerful molecular dynamics leapfrog algorithm in the limit of infinitely small friction coefficient gamma. It belongs to the class of SD algorithms, in which the integration time step Delta t is not limit
Clustering of tethered satellite system simulation data by an adaptive neuro-fuzzy algorithm
Mitra, Sunanda; Pemmaraju, Surya
1992-01-01
Recent developments in neuro-fuzzy systems indicate that the concepts of adaptive pattern recognition, when used to identify appropriate control actions corresponding to clusters of patterns representing system states in dynamic nonlinear control systems, may result in innovative designs. A modular, unsupervised neural network architecture, in which fuzzy learning rules have been embedded, is used for on-line identification of similar states. The architecture and control rules involved in Adaptive Fuzzy Leader Clustering (AFLC) allow this system to be incorporated in control systems for identification of system states corresponding to specific control actions. We have used this algorithm to cluster the simulation data of the Tethered Satellite System (TSS) to estimate the range of delta voltages necessary to maintain the desired length rate of the tether. The AFLC algorithm is capable of on-line estimation of the appropriate control voltages from the corresponding length error and length rate error without a priori knowledge of their membership functions or familiarity with the behavior of the Tethered Satellite System.
Dynamic airspace configuration by genetic algorithm
Directory of Open Access Journals (Sweden)
Marina Sergeeva
2017-06-01
With the continuous air traffic growth and limits of resources, there is a need for reducing the congestion of the airspace systems. Nowadays, several projects are launched, aimed at modernizing the global air transportation system and air traffic management. In recent years, special interest has been paid to the solution of the dynamic airspace configuration problem. Airspace sector configurations need to be dynamically adjusted to provide maximum efficiency and flexibility in response to changing weather and traffic conditions. The main objective of this work is to automatically adapt the airspace configurations according to the evolution of traffic. In order to reach this objective, the airspace is considered to be divided into predefined 3D airspace blocks which have to be grouped or ungrouped depending on the traffic situation. The airspace structure is represented as a graph and each airspace configuration is created using a graph partitioning technique. We optimize airspace configurations using a genetic algorithm. The developed algorithm generates a sequence of sector configurations for one day of operation with the minimized controller workload. The overall methodology is implemented and successfully tested with air traffic data taken for one day and for several different airspace control areas of Europe.
Dynamic Data Updating Algorithm for Image Superresolution Reconstruction
Institute of Scientific and Technical Information of China (English)
TAN Bing; XU Qing; ZHANG Yan; XING Shuai
2006-01-01
A dynamic data updating algorithm for image superresolution is proposed. Based on Delaunay triangulation and its local updating property, this algorithm can update the changed region directly when only a part of the source images has changed. Owing to its high efficiency and adaptability, this algorithm can serve as a fast algorithm for image superresolution reconstruction.
DYNAMIC K-MEANS ALGORITHM FOR OPTIMIZED ROUTING IN MOBILE AD HOC NETWORKS
Directory of Open Access Journals (Sweden)
Zahra Zandieh Shirazi
2016-04-01
In this paper, a dynamic K-means algorithm to improve the routing process in Mobile Ad-Hoc Networks (MANETs) is presented. Mobile ad-hoc networks are a collection of mobile wireless nodes that can operate without focal access points, pre-existing infrastructure, or a centralized management point. In MANETs, the rapid motion of nodes changes the network topology. This feature of MANETs leads to various problems in the routing process, such as an increase in overhead messages and inefficient routing between the nodes of the network. A large variety of clustering methods have been developed to establish an efficient routing process in MANETs. Routing is one of the crucial topics with a significant impact on MANET performance. The K-means algorithm is one of the effective clustering methods aimed at reducing routing difficulties related to bandwidth, throughput and power consumption. This paper proposes a new K-means clustering algorithm to find the optimal path from the source node to the destination node in MANETs. The main goal of the proposed approach, called the dynamic K-means clustering method, is to overcome limitations of the basic K-means method such as permanent cluster heads and fixed cluster members. The experimental results demonstrate that using the dynamic K-means scheme enhances the performance of the routing process in mobile ad-hoc networks.
Directory of Open Access Journals (Sweden)
Lynne Cameron
2010-05-01
Metaphor is examined in the very different discourse contexts of the classroom and of reconciliation talk to highlight the neglected affective dimension. The distribution of metaphors across discourse shows clustering at certain points, often where speakers are engaged in critical interpersonal discourse activity. Clusters in classroom talk co-occur with sequences of agenda management where teachers prepare students for upcoming lessons and with giving feedback to students, both of which require careful management of interpersonal and affective issues. Clusters in reconciliation talk co-occur with discourse management and with two situations with significant affective dynamics: appropriation of metaphor and exploration of alternative scenarios.
A Novel Distributed Clustering Algorithm for Mobile Ad-hoc Networks
Directory of Open Access Journals (Sweden)
Sahar Adabi
2008-01-01
This paper proposes a new Distributed Score Based Clustering Algorithm (DSBCA) for Mobile Ad-hoc Networks (MANETs). In MANETs, selecting suitable nodes in clusters as cluster heads is very important. The proposed clustering algorithm considers the Battery Remaining, Number of Neighbors, Number of Members, and Stability in order to calculate each node's score with a linear algorithm. After each node calculates its score independently, the neighbors of the node must be notified of it. Each node then selects the neighbor with the highest score as its cluster head; the selection of cluster heads is therefore performed in a distributed manner, using the most recent information about the current status of neighbor nodes. The proposed algorithm was compared with the Weighted Clustering Algorithm and the Distributed Weighted Clustering Algorithm in terms of number of clusters, number of re-affiliations, lifespan of nodes in the system, end-to-end throughput and overhead. The simulation results showed that the proposed algorithm achieved its goals.
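The scoring and selection steps described in this abstract can be sketched as follows. The abstract states only that the score is a linear function of the four metrics and that each node picks its highest-scoring neighbor; the weights, the normalization of each metric to [0, 1], the tie-break on node ID, and all names (`WEIGHTS`, `Node`, `score`, `choose_cluster_head`) are assumptions made for illustration, not values from the paper.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract only says the score is linear in the metrics.
WEIGHTS = {"battery": 0.4, "neighbors": 0.2, "members": 0.2, "stability": 0.2}


@dataclass
class Node:
    node_id: int
    battery: float    # remaining battery, assumed normalized to [0, 1]
    neighbors: float  # number of neighbors, assumed normalized to [0, 1]
    members: float    # number of members, assumed normalized to [0, 1]
    stability: float  # stability, assumed normalized to [0, 1]


def score(node):
    """Linear combination of the four metrics named in the abstract."""
    return (WEIGHTS["battery"] * node.battery
            + WEIGHTS["neighbors"] * node.neighbors
            + WEIGHTS["members"] * node.members
            + WEIGHTS["stability"] * node.stability)


def choose_cluster_head(neighbor_ids, nodes):
    """Pick the neighbor with the highest score; ties go to the lower ID (assumption)."""
    return max(neighbor_ids, key=lambda i: (score(nodes[i]), -i))


# Hypothetical three-node network in which every node hears every other node.
nodes = {
    1: Node(1, 0.9, 0.5, 0.5, 0.5),
    2: Node(2, 0.2, 0.5, 0.5, 0.5),
    3: Node(3, 0.9, 0.5, 0.5, 0.5),
}
adjacency = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
heads = {nid: choose_cluster_head(nbrs, nodes) for nid, nbrs in adjacency.items()}
```

Because each node needs only the scores announced by its own neighbors, the selection runs locally at every node without a central coordinator, which is the distributed property the abstract emphasizes.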
User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.
Gordon, Michael D.
1991-01-01
Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
Contributions to "k"-Means Clustering and Regression via Classification Algorithms
Salman, Raied
2012-01-01
The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold; first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…
A Cluster Algorithm for the 2-D SU(3) × SU(3) Chiral Model
Ji, Da-ren; Zhang, Jian-bo
1996-07-01
To extend the cluster algorithm to SU(N) × SU(N) chiral models, a variant of Wolff's cluster algorithm is proposed and tested for the 2-dimensional SU(3) × SU(3) chiral model. The results show that the new method can reduce the critical slowing down in the SU(3) × SU(3) chiral model.
Lowest-ID with Adaptive ID Reassignment: A Novel Mobile Ad-Hoc Networks Clustering Algorithm
Gavalas, Damianos; Konstantopoulos, Charalampos; Mamalis, Basilis
2011-01-01
Clustering is a promising approach for building hierarchies and simplifying the routing process in mobile ad-hoc network environments. The main objective of clustering is to identify suitable node representatives, i.e. cluster heads (CHs), to store routing and topology information and maximize cluster stability. Traditional clustering algorithms suggest CH election exclusively based on node IDs or location information and involve frequent broadcasting of control packets, even when network topology remains unchanged. More recent works take into account additional metrics (such as energy and mobility) and optimize initial clustering. However, in many situations (e.g. in relatively static topologies) the re-clustering procedure is hardly ever invoked; hence initially elected CHs soon reach battery exhaustion. Herein, we introduce an efficient distributed clustering algorithm that uses both mobility and energy metrics to provide stable cluster formations. CHs are initially elected based on the time and cost-efficien...
Dynamical evolution of globular-cluster systems in clusters of galaxies
Energy Technology Data Exchange (ETDEWEB)
Muzzio, J.C.
1987-04-01
The dynamical processes that affect globular-cluster systems in clusters of galaxies are analyzed. Two-body and impulsive approximations are utilized to study dynamical friction, drag force, tidal stripping, tidal radii, globular-cluster swapping, tidal accretion, and galactic cannibalism. The evolution of galaxies and the collision of galaxies are simulated numerically; the steps involved in the simulation are described. The simulated data are compared with observations. Consideration is given to the number of galaxies, halo extension, location of the galaxies, distribution of the missing mass, nonequilibrium initial conditions, mass dependence, massive central galaxies, globular-cluster distribution, and lost globular clusters. 116 references.
Combinatorial Clustering Algorithm of Quantum-Behaved Particle Swarm Optimization and Cloud Model
Directory of Open Access Journals (Sweden)
Mi-Yuan Shan
2013-01-01
We propose a combinatorial clustering algorithm of cloud model and quantum-behaved particle swarm optimization (COCQPSO) to solve the stochastic problem. The algorithm employs a novel probability model as well as a permutation-based local search method. We set the parameters of COCQPSO based on the design of experiment. In a comprehensive computational study, we scrutinize the performance of COCQPSO on a set of widely used benchmark instances. By benchmarking the combinatorial clustering algorithm against state-of-the-art algorithms, we show that its performance compares very favorably. The fuzzy combinatorial optimization algorithm of cloud model and quantum-behaved particle swarm optimization (FCOCQPSO) in vague sets (IVSs) is more expressive than the other fuzzy sets. Finally, numerical examples demonstrate the remarkable clustering effectiveness of the COCQPSO and FCOCQPSO clustering algorithms.
A Heuristic Clustering Algorithm for Intrusion Detection Based on Information Entropy
Institute of Scientific and Technical Information of China (English)
(no author listed)
2006-01-01
This paper studies the clustering problem for intrusion detection using the theory of information entropy. It argues that the clustering problem for exact intrusion detection based on information entropy is NP-complete; therefore, a heuristic algorithm for the clustering problem in intrusion detection is designed. The algorithm is incremental and can handle databases containing large numbers of connection records from the Internet.
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters
Wang, Zhihao; Yi, Jing
2016-01-01
To address the shortcoming that the fuzzy c-means algorithm (FCM) needs to know the number of clusters in advance, this paper proposes a new self-adaptive method to determine the optimal number of clusters. Firstly, a density-based algorithm is put forward. According to the characteristics of the dataset, the algorithm automatically determines the possible maximum number of clusters instead of using the empirical rule $\sqrt{n}$, and obtains the optimal initial cluster centroids, overcoming the limitation of FCM that randomly selected cluster centroids lead the convergence result to a local minimum. Secondly, by introducing a penalty function, this paper proposes a new fuzzy clustering validity index based on fuzzy compactness and separation. The index ensures that, when the number of clusters approaches the number of objects in the dataset, its value does not decrease monotonically toward zero, which would deprive the optimal number of clusters of its robustness and decision function. Then, based on these studies, a self-adaptive FCM algorithm is put forward to estimate the optimal number of clusters by an iterative trial-and-error process. Finally, experiments on the UCI, KDD Cup 1999, and synthetic datasets show that the method not only effectively determines the optimal number of clusters, but also reduces the number of FCM iterations while giving a stable clustering result. PMID:28042291
Dynamics of clusters and molecules in contact with an environment
Dinh, P M; Suraud, E
2009-01-01
We present recent theoretical investigations on the dynamics of metal clusters in contact with an environment, deposited or embedded. This concerns soft deposition as well as irradiation of the deposited/embedded clusters by intense laser pulses. We discuss examples of applications for two typical test cases, Na clusters deposited on an MgO(001) surface and Na clusters in/on an Ar substrate. Both environments are insulators with sizeable polarizability. They differ in their geometrical and mechanical properties.
Parallelization of the Wolff single-cluster algorithm
Kaupužs, J.; Rimšāns, J.; Melnik, R. V. N.
2010-02-01
A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of applications in which a sophisticated shuffling scheme is used to generate high-quality pseudorandom numbers and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n≥2. Furthermore, the developed OpenMP code allows us to simulate larger lattices due to the greater operative (shared) memory available.
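For orientation, a serial Wolff single-cluster update for the 3D Ising model can be sketched as below. This is a textbook-style illustration, not the authors' OpenMP code or their shuffling scheme for random numbers; the function name `wolff_step`, the flat-list lattice layout, the coupling J = 1 and the use of Python's `random.Random` are assumptions made here.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """One Wolff single-cluster update on an L*L*L periodic Ising lattice.

    spins is a flat list of +1/-1 values indexed as x + y*L + z*L*L.
    Returns the number of spins in the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond probability for coupling J = 1
    seed = rng.randrange(L ** 3)
    cluster_spin = spins[seed]
    spins[seed] = -cluster_spin  # flipping on entry also marks the site as visited
    stack = [seed]
    size = 1
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while stack:
        site = stack.pop()
        x, y, z = site % L, (site // L) % L, site // (L * L)
        for dx, dy, dz in steps:
            nb = ((x + dx) % L) + ((y + dy) % L) * L + ((z + dz) % L) * L * L
            # Add an aligned, not yet flipped neighbor with probability p_add.
            if spins[nb] == cluster_spin and rng.random() < p_add:
                spins[nb] = -cluster_spin
                stack.append(nb)
                size += 1
    return size
```

The cluster grows from a single seed through a data-dependent stack, so the work per update is inherently sequential and irregular. That is one reason why parallel speedups for this algorithm, such as those reported in the abstract, are harder to obtain than for local update schemes.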
Using Clustering Algorithms to Identify Brown Dwarf Characteristics
Choban, Caleb
2016-06-01
Brown dwarfs are stars that are not massive enough to sustain core hydrogen fusion, and thus fade and cool over time. The molecular composition of brown dwarf atmospheres can be determined by observing absorption features in their infrared spectrum, which can be quantified using spectral indices. Comparing these indices to one another, we can determine what kind of brown dwarf it is, and if it is young or metal-poor. We explored a new method for identifying these subgroups through the expectation-maximization machine learning clustering algorithm, which provides a quantitative and statistical way of identifying index pairs which separate rare populations. We specifically quantified two statistics, completeness and concentration, to identify the best index pairs. Starting with a training set, we defined selection regions for young, metal-poor and binary brown dwarfs, and tested these on a large sample of L dwarfs. We present the results of this analysis, and demonstrate that new objects in these classes can be found through these methods.
A multi-sequential number-theoretic optimization algorithm using clustering methods
Institute of Scientific and Technical Information of China (English)
XU Qing-song; LIANG Yi-zeng; HOU Zhen-ting
2005-01-01
A multi-sequential number-theoretic optimization method based on clustering was developed and applied to the optimization of functions with many local extrema. Details of the procedure for generating the clusters and the sequential schedules are given. The algorithm was assessed by comparing its performance with that of the generalized simulated annealing algorithm on a difficult instructive example and a D-optimum experimental design problem. The presented algorithm is shown to be more effective and reliable on these two examples.
Comparison and evaluation of network clustering algorithms applied to genetic interaction networks.
Hou, Lin; Wang, Lin; Berg, Arthur; Qian, Minping; Zhu, Yunping; Li, Fangting; Deng, Minghua
2012-01-01
The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards the understanding of large-scale biological networks. With numerous recent advances in biotechnologies, large-scale genetic interactions are widely available, but there is a limited understanding of which clustering algorithms may be most effective. In order to address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms in analyzing genetic interaction networks, and investigated influencing factors in choosing algorithms. The algorithms considered in this comparison include hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms is measured by the Jaccard index in comparing predicted gene modules with benchmark gene sets. The results suggest that the choice differs according to the network topology and evaluation criteria. Hierarchical clustering proved best at predicting protein complexes; Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets; the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
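The Jaccard index used as the accuracy measure in this abstract is the size of the intersection of two sets divided by the size of their union. The sketch below computes it for gene sets; the aggregation in `mean_best_jaccard` (score each predicted module by its best-matching benchmark set, then average) is an assumption for illustration, since the abstract does not say how per-module scores were combined, and the gene names are placeholders.

```python
def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two collections of gene names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_best_jaccard(predicted, benchmark):
    """Average, over predicted modules, of the best Jaccard match in the benchmark.

    The aggregation rule is an assumption made for this sketch.
    """
    if not predicted or not benchmark:
        return 0.0
    best = [max(jaccard(module, ref) for ref in benchmark) for module in predicted]
    return sum(best) / len(best)


# Placeholder gene names, for illustration only.
predicted_modules = [{"A", "B"}, {"C", "D"}]
benchmark_sets = [{"A", "B"}, {"C", "E"}]
overall = mean_best_jaccard(predicted_modules, benchmark_sets)
```

A value of 1 means a predicted module coincides exactly with a benchmark set, and 0 means the two share no genes.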
Sonar Image Detection Algorithm Based on Two-Phase Manifold Partner Clustering
Institute of Scientific and Technical Information of China (English)
Xingmei Wang; Zhipeng Liu; Jianchuang Sun; Shu Liu
2015-01-01
According to the characteristics of sonar image data with manifold features, a sonar image detection method based on a two-phase manifold partner clustering algorithm is proposed. Firstly, K-means block clustering based on Euclidean distance is proposed to reduce the data set. Mean value, standard deviation, and gray minimum value are taken as three features based on the relationship between the clustering model and the data structure. Then a K-means clustering algorithm based on manifold distance is applied to cluster the reduced data set again, improving detection efficiency. In the K-means clustering algorithm based on manifold distance, the line segment length on the manifold is analyzed, and a new power-function line segment length is proposed to decrease the computational complexity. To calculate the manifold distance quickly, a new all-source shortest path is proposed as the preprocessing step of an efficient algorithm. Based on this, the spatial feature of the image block is added to the three features to obtain the final precise partner clustering algorithm. Comparison with other typical clustering algorithms demonstrates that the proposed algorithm achieves good detection results, and experiments on different real sonar images show that it has better adaptability.
PERFORMANCE OF K-MEANS CLUSTERING AND BIRD FLOCKING ALGORITHM FOR GROUPING THE WEB LOG FILES
Directory of Open Access Journals (Sweden)
R. SUGUNA
2012-10-01
Data mining is the process of analyzing interesting patterns and knowledge from different perspectives and summarizing them into useful information drawn from large amounts of data. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Vast amounts of unlabeled data can be grouped using clustering or classification algorithms. Cluster analysis, or clustering, is the task of assigning a set of objects to groups called clusters, so that the objects in the same cluster are more similar to each other than to those in other clusters. Many researchers have evaluated the performance of the familiar K-means clustering algorithm and attempted to improve its efficiency. This paper analyzes the performance of the K-means clustering algorithm against a biologically inspired algorithm, the Bird flocking algorithm, for grouping web logs. Web logs are unformatted text files that contain information about the user's browser details. The proposed system takes web log files as input and groups the web sites based on the interest rate of the users. The performance is evaluated in terms of number of clusters, CPU utilization time and accuracy.
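As a reminder of what the baseline K-means algorithm does, a minimal sketch of Lloyd's iteration follows. It is not the paper's implementation: taking the first k points as initial centroids, the iteration cap, and the toy two-dimensional points (standing in for features extracted from web logs) are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain K-means (Lloyd's iteration) on a list of equal-length numeric tuples.

    Returns (labels, centroids).
    """
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members)
                                           for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors, for illustration only.
sample = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
labels, centroids = kmeans(sample, k=2)
```

The result depends on the initial centroids, which is one of the known weaknesses that motivates comparing K-means with alternative methods.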
Optimized dynamical decoupling via genetic algorithms
Quiroz, Gregory; Lidar, Daniel A.
2013-11-01
We utilize genetic algorithms aided by simulated annealing to find optimal dynamical decoupling (DD) sequences for a single-qubit system subjected to a general decoherence model under a variety of control pulse conditions. We focus on the case of sequences with equal pulse intervals and perform the optimization with respect to pulse type and order. In this manner, we obtain robust DD sequences, first in the limit of ideal pulses, then when including pulse imperfections such as finite-pulse duration and qubit rotation (flip-angle) errors. Although our optimization is numerical, we identify a deterministic structure that underlies the top-performing sequences. We use this structure to devise DD sequences which outperform previously designed concatenated DD (CDD) and quadratic DD (QDD) sequences in the presence of pulse errors. We explain our findings using time-dependent perturbation theory and provide a detailed scaling analysis of the optimal sequences.
Domain decomposition algorithms and computation fluid dynamics
Chan, Tony F.
1988-01-01
In the past several years, domain decomposition has been a very popular topic, partly motivated by its potential for parallelization. While a large body of theory and algorithms has been developed for model elliptic problems, they are only recently starting to be tested on realistic applications. The application of some of these methods to two model problems in computational fluid dynamics is investigated. The examples are two-dimensional convection-diffusion problems and the incompressible driven cavity flow problem. The construction and analysis of efficient preconditioners for the interface operator, to be used in the iterative solution of the interface problem, are described. For the convection-diffusion problems, the effect of the convection term and its discretization on the performance of some of the preconditioners is discussed. For the driven cavity problem, the effectiveness of a class of boundary probe preconditioners is discussed.
Optimized Dynamical Decoupling via Genetic Algorithms
Quiroz, Gregory
2013-01-01
We utilize genetic algorithms to find optimal dynamical decoupling (DD) sequences for a single-qubit system subjected to a general decoherence model under a variety of control pulse conditions. We focus on the case of sequences with equal pulse intervals and perform the optimization with respect to pulse type and order. In this manner we obtain robust DD sequences, first in the limit of ideal pulses, then when including pulse imperfections such as finite pulse duration and qubit rotation (flip-angle) errors. Although our optimization is numerical, we identify a deterministic structure that underlies the top-performing sequences. We use this structure to devise DD sequences which outperform previously designed concatenated DD (CDD) and quadratic DD (QDD) sequences in the presence of pulse errors. We explain our findings using time-dependent perturbation theory and provide a detailed scaling analysis of the optimal sequences.
Online Assignment Algorithms for Dynamic Bipartite Graphs
Sahai, Ankur
2011-01-01
This paper analyzes the problem of assigning weights to edges incrementally in a dynamic complete bipartite graph consisting of producer and consumer nodes. The objective is to minimize the overall cost while satisfying certain constraints. The cost and constraints are functions of attributes of the edges, nodes and online service requests. Novelty of this work is that it models real-time distributed resource allocation using an approach to solve this theoretical problem. This paper studies variants of this assignment problem where the edges, producers and consumers can disappear and reappear or their attributes can change over time. Primal-Dual algorithms are used for solving these problems and their competitive ratios are evaluated.
An efficient dynamic load balancing algorithm
Lagaros, Nikos D.
2014-01-01
In engineering problems, randomness and uncertainties are inherent. Robust design procedures, formulated in the framework of multi-objective optimization, have been proposed in order to take into account sources of randomness and uncertainty. These design procedures require orders of magnitude more computational effort than conventional analysis or optimum design processes, since a very large number of finite element analyses must be performed. It is therefore imperative to exploit the capabilities of computing resources in order to deal with problems of this kind. In particular, parallel computing can be implemented at the level of metaheuristic optimization, by exploiting the physical parallelization feature of the nondominated sorting evolution strategies method, as well as at the level of the repeated structural analyses required for assessing the behavioural constraints and for calculating the objective functions. In this study an efficient dynamic load balancing algorithm for optimum exploitation of available computing resources is proposed and, without loss of generality, is applied to computing the desired Pareto front. In such problems the computation of the complete Pareto front with feasible designs only constitutes a very challenging task. The proposed algorithm achieves linear speedup factors and almost 100% speedup factor values with reference to the sequential procedure.
Improved FIFO Scheduling Algorithm Based on Fuzzy Clustering in Cloud Computing
Directory of Open Access Journals (Sweden)
Jian Li
2017-02-01
In cloud computing, under the First-In-First-Out (FIFO) scheduling algorithm, some large tasks may occupy too many resources and some small tasks may wait for a long time. To reduce tasks' waiting time, we propose a task scheduling algorithm based on fuzzy clustering algorithms. We construct a task model and a resource model, analyze tasks' preferences, and then classify resources with fuzzy clustering algorithms. Based on the parameters of cloud tasks, the algorithm calculates the resource expectation and assigns tasks to different resource clusters, which decreases the complexity of resource selection. As a result, the algorithm reduces tasks' waiting time and improves resource utilization. The experimental results show that the proposed algorithm shortens the execution time of tasks and increases resource utilization.
VR-Cluster: Dynamic Migration for Resource Fragmentation Problem in Virtual Router Platform
Directory of Open Access Journals (Sweden)
Xianming Gao
2016-01-01
Network virtualization technology is regarded as one of the gradual schemes for network architecture evolution. With the development of network functions virtualization, operators have made considerable efforts to achieve router virtualization using general servers. To ensure high performance, a virtual router platform usually adopts a cluster of general servers, which can also be regarded as a special cloud computing environment. However, the frequent creation and deletion of router instances may generate a large amount of resource fragmentation, which prevents the platform from establishing new router instances. To solve this "resource fragmentation problem," we first propose VR-Cluster, which introduces two extra function planes: a switching plane and a resource management plane. The switching plane is mainly used to support seamless migration of router instances without packet loss; the resource management plane can dynamically move router instances from one server to another by using VR-mapping algorithms. In addition, three VR-mapping algorithms are proposed based on VR-Cluster: a first-fit mapping algorithm, a best-fit mapping algorithm, and a worst-fit mapping algorithm. Finally, we establish a VR-Cluster prototype system using general X86 servers, evaluate its migration time, and further analyze the advantages and disadvantages of our proposed VR-mapping algorithms in solving the resource fragmentation problem.
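The difference between the three fit policies named in this abstract can be illustrated with a minimal sketch. It uses the generic textbook definitions of first-fit, best-fit and worst-fit applied to a single scalar capacity per server; the abstract does not describe the resource model of the VR-mapping algorithms, so the function `pick_server` and its arguments are hypothetical.

```python
def pick_server(free, demand, policy):
    """Return the index of the server chosen for `demand`, or None if none fits.

    free   -- free capacity of each server (a single scalar resource, assumed)
    policy -- "first", "best" or "worst"
    """
    candidates = [i for i, f in enumerate(free) if f >= demand]
    if not candidates:
        return None
    if policy == "first":
        # First server in index order that can host the instance.
        return candidates[0]
    if policy == "best":
        # Server left with the least spare capacity afterwards.
        return min(candidates, key=lambda i: free[i] - demand)
    if policy == "worst":
        # Server left with the most spare capacity afterwards.
        return max(candidates, key=lambda i: free[i] - demand)
    raise ValueError(f"unknown policy: {policy}")

free = [8, 10, 4, 2]
for policy in ("first", "best", "worst"):
    print(policy, pick_server(free, 3, policy))
```

Best-fit tends to pack instances tightly and leave large free blocks elsewhere, while worst-fit spreads load; which trade-off suits a virtual router platform is what the paper's evaluation addresses.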
Karayiannis, Nicolaos B; Randolph-Gips, Mary M
2005-03-01
This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on a weighted norm to measure the distance between the feature vectors and their prototypes. The development of LVQ and clustering algorithms is based on the minimization of a reformulation function under the constraint that the generalized mean of the norm weights be constant. According to the proposed formulation, the norm weights can be computed from the data in an iterative fashion together with the prototypes. An error analysis provides some guidelines for selecting the parameter involved in the definition of the generalized mean in terms of the feature variances. The algorithms produced from this formulation are easy to implement and are almost as fast as clustering algorithms relying on the Euclidean norm. An experimental evaluation on four data sets indicates that the proposed algorithms consistently outperform clustering algorithms relying on the Euclidean norm and are strong competitors to non-Euclidean algorithms, which are computationally more demanding.
Karjee, Jyotirmoy
2011-01-01
Objective: The main objective of this paper is to construct a distributed clustering algorithm based on spatial data correlation among sensor nodes and to evaluate data accuracy for each distributed cluster at its cluster head node. Design Procedure/Approach: Because sensor nodes are deployed at high density in the sensor field, spatial data are highly correlated among sensor nodes in the spatial domain. Based on this high data correlation, we propose a non-overlapping irregular distributed clustering algorithm with clusters of different sizes to collect the most accurate or precise data at the cluster head node of each distributed cluster. To collect the most accurate data at the cluster head node of each distributed cluster in the sensor field, we propose a Data accuracy model and compare the results with the Information accuracy model. Finding: Simulation results show that our proposed Data accuracy model collects more accurate data and gives better performance than Informati...
Wireless Meter Reading Based Energy-Balanced Steady Clustering Routing Algorithm for Sensor Networks
Directory of Open Access Journals (Sweden)
TANG, Z.
2011-05-01
According to the characteristics of wireless meter reading systems, an energy-balanced and energy-efficient steady clustering routing algorithm (EBSC, Energy-Balanced Steady Clustering) is proposed. In the clustering mechanism, the current cluster head nodes determine the cluster head nodes for the next round according to the residual energy of the cluster members. In the next round, each non-cluster-head node decides which cluster it will join according to an energy-distance function. The cluster head nodes send data to the base station by single-hop or multi-hop communication, chosen according to the criterion of minimum energy consumption. In the EBSC algorithm, the number of cluster head nodes generated in each round is very steady, and EBSC combines the advantages of both distributed and centralized clustering algorithms. Experimental results show that the proposed routing algorithm not only uses the limited energy of network nodes efficiently, but also balances the energy consumption of all nodes well and significantly prolongs network lifetime.
EZDCP:A new static task scheduling algorithm with edge-zeroing based on dynamic critical paths
Institute of Scientific and Technical Information of China (English)
陈志刚; 华强胜
2003-01-01
A new static task scheduling algorithm named edge-zeroing based on dynamic critical paths is proposed. The main ideas of the algorithm are as follows: first, suppose that all of the tasks are in different clusters; second, select one of the critical paths of the partially clustered directed acyclic graph; third, try to zero one of the graph's communication edges; fourth, repeat the above three steps until all edges are zeroed; finally, check the generated clusters to see whether some of them can be merged further without increasing the parallel time. Comparisons with previous algorithms show that the new algorithm not only has low complexity but also achieves performance comparable to, or on average even better than, heuristic algorithms of much higher complexity.
Directory of Open Access Journals (Sweden)
Tanti Octavia
2003-01-01
A modified Giffler and Thompson algorithm combined with dynamic slack time is used to allocate machine resources in a dynamic environment. It is compared with a Real Time Order Promising (RTP) algorithm. The performance of the modified Giffler and Thompson and RTP algorithms is measured by mean tardiness. The results show that the modified Giffler and Thompson algorithm combined with dynamic slack time gives significantly better results than the RTP algorithm in terms of mean tardiness.
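For readers unfamiliar with the performance measure used here, mean tardiness is commonly defined as the average over jobs of max(0, completion time - due date). The short sketch below computes it under that standard definition; the function name and the sample numbers are illustrative and are not taken from the paper.

```python
def mean_tardiness(completion, due):
    """Average of max(0, completion - due) over all jobs."""
    if len(completion) != len(due):
        raise ValueError("completion and due must have the same length")
    # A job finished early or on time contributes zero tardiness.
    return sum(max(0, c - d) for c, d in zip(completion, due)) / len(due)

# Illustrative schedule: four jobs with completion times and due dates.
completion = [5, 9, 14, 10]
due = [6, 8, 10, 10]
print(mean_tardiness(completion, due))  # jobs are late by 0, 1, 4 and 0
```

Because early completion earns no credit, two schedules with the same average completion time can differ widely in mean tardiness.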
The formation and dynamical evolution of young star clusters
Fujii, Michiko
2015-01-01
Recent observations have revealed a variety of young star clusters, including embedded systems, young massive clusters, and associations. We study the formation and dynamical evolution of these clusters using a combination of simulations and theoretical models. Our simulations start with a turbulent molecular cloud that collapses under its own gravity. The stars are assumed to form in the densest regions of the collapsing cloud after an initial free-fall time of the molecular cloud. The dynamical evolution of these stellar distributions is continued by means of direct $N$-body simulations. The molecular clouds typical of the Milky Way Galaxy tend to form embedded clusters which evolve to resemble open clusters. The associations were initially considerably more clumpy, but lost their irregularity in about a dynamical time scale due to the relaxation process. The densest molecular clouds, which are absent in the Milky Way but are typical in starburst galaxies, form massive young star clusters. They indeed ar...
Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji
2014-01-01
An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of AICOE algorithm and the classical clustering algorithms. Our clustering model based on artificial immune system is also applied to the case of public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and better clustering effect.
Implementation of Clustering Algorithms for real datasets in Medical Diagnostics using MATLAB
Directory of Open Access Journals (Sweden)
B. Venkataramana
2017-03-01
In the medical field, diagnosing a disease requires samples. The samples are analyzed by a doctor or a pharmacist. As the number of patients increases, the number of samples also increases, and more time is needed to analyze the samples and decide the stage of the disease. Analyzing each sample requires a skilled person. The samples can instead be classified by applying clustering algorithms to them. Data clustering has been considered the most important raw data analysis method used in data mining technology. Most clustering techniques have proved their efficiency in many applications, such as decision-making systems, medical sciences, and earth sciences. Partition-based clustering is one of the main approaches to clustering. There are various data clustering algorithms, and each has its own advantages and disadvantages. This work reports the classification performance of three widely used algorithms: K-means (KM), Fuzzy c-means (FCM), and Fuzzy Possibilistic c-Means (FPCM). To analyze these algorithms, three known data sets from the UCI machine learning repository are used: thyroid, liver, and wine. The efficiency of the clustering output is compared in terms of classification performance, measured as percentage of correctness. The experimental results show that K-means and FCM give the same performance for the liver data, and that FCM and FPCM give the same performance for the thyroid and wine data. FPCM has more efficient classification performance in all the given data sets.
Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm
Frisca, Bustamam, Alhadi; Siswantining, Titin
2017-03-01
Clustering is a data analysis method that aims to place data with similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, the spectral clustering method emerged from the concepts of spectral graph theory. The spectral clustering method needs a partitioning algorithm. Available partitioning methods include PAM, SOM, Fuzzy c-means, and k-means. According to research by Capital and Choudhury in 2013, when Euclidean distance is used the k-means algorithm provides better accuracy than the PAM algorithm, so in this paper we use k-means as our partitioning algorithm. The major advantage of spectral clustering is that it reduces the data dimension, in this case the dimension of a large microarray dataset. A microarray is a small chip made of a glass plate containing thousands, or even tens of thousands, of kinds of genes in DNA fragments derived from doubling cDNA. Microarray data are widely used to detect cancer, for example carcinoma, in which cancer cells express abnormalities in their genes. The purpose of this research is to place data that have high similarity in the same group and data that have low similarity in different groups. In this research, the carcinoma microarray data comprise 7457 genes. The result of partitioning using the k-means algorithm is two clusters.
Genetic algorithm based two-mode clustering of metabolomics data
Hageman, J.A.; Berg, R.A. van den; Westerhuis, J.A.; Werf, M.J. van der; Smilde, A.K.
2008-01-01
Metabolomics and other omics tools are generally characterized by large data sets with many variables obtained under different environmental conditions. Clustering methods and more specifically two-mode clustering methods are excellent tools for analyzing this type of data. Two-mode clustering metho
Text clustering based on fusion of ant colony and genetic algorithms
Institute of Scientific and Technical Information of China (English)
Yun ZHANG; Boqin FENG; Shouqiang MA; Lianmeng LIU
2009-01-01
Focusing on the problem that the ant colony algorithm easily stagnates and cannot fully search the solution space, a text clustering approach based on the fusion of the ant colony and genetic algorithms is proposed. The four parameters that influence the performance of the ant colony algorithm are encoded as chromosomes, and the fitness function and the selection, crossover and mutation operators are designed to find the optimal combination of parameters over a number of iterations; the result is then applied to text clustering. The simulation results show that, compared with classical k-means clustering and the basic ant colony clustering algorithm, the proposed algorithm performs better, and the F-Measure value is enhanced by 5.69%, 48.60% and 69.60%, respectively, on 3 test datasets. Therefore, it is more suitable for processing a larger dataset.
Sakaguchi, Hidetsugu; Maeyama, Satomi
2013-02-01
A model of clustering dynamics is proposed for a population of spatially distributed active rotators. A transition from excitable to oscillatory dynamics is induced by the increase of the local density of active rotators. It is interpreted as dynamical quorum sensing. In the oscillation regime, phase waves propagate without decay, which generates an effectively long-range interaction in the clustering dynamics. The clustering process becomes facilitated and only one dominant cluster appears rapidly as a result of the dynamical quorum sensing. An exact localized solution is found to a simplified model equation, and the competitive dynamics between two localized states is studied numerically.
A highly efficient multi-core algorithm for clustering extremely large datasets
Directory of Open Access Journals (Sweden)
Kraus Johann M
2010-04-01
Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches to parallelizing algorithms rely largely on network communication protocols that connect, and therefore require, multiple computers. One answer to this problem is to use the intrinsic capabilities of current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms, based on the design principles of transactional memory, for clustering gene expression microarray-type data and categorical SNP data. Our new shared-memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. The computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy, compared to single-core implementations and a recently published network-based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer.
Force-Based Incremental Algorithm for Mining Community Structure in Dynamic Network
Institute of Scientific and Technical Information of China (English)
Bo Yang; Da-You Liu
2006-01-01
Community structure is an important property of networks. Being able to identify communities can provide invaluable help in exploiting and understanding both social and non-social networks. Several algorithms have been developed so far. However, all these algorithms work well only with small or moderate networks, with on the order of 10^4 vertices. Besides, all the existing algorithms are off-line and cannot work well with highly dynamic networks such as the web, in which web pages are updated frequently. When an already clustered network is updated, the entire network, including the original and incremental parts, has to be recalculated, even though only slight changes are involved. To address this problem, an incremental algorithm is proposed, which allows for mining community structure in large-scale and dynamic networks. Based on the community structure detected previously, the algorithm takes little time to reclassify the entire network, including both the original and incremental parts. Furthermore, the algorithm is faster than most of the existing algorithms, such as Girvan and Newman's algorithm and its improved versions. The algorithm can also help to visualize these community structures in a network and provides a new approach to research on the evolving process of dynamic networks.
Molecular dynamical simulations of melting behaviors of metal clusters
Directory of Open Access Journals (Sweden)
Ilyar Hamid
2015-04-01
The melting behaviors of metal clusters are studied over a wide range by molecular dynamics simulations. The calculated results show that the heat capacity curves of some metal clusters fluctuate because of strong structural competition. For the 13-, 55- and 147-atom clusters, the variations of the melting points with atomic number are almost the same. It is found that, for different metal clusters, the dynamical stabilities of the octahedral structures can in general be inferred from a criterion proposed earlier by F. Baletto et al. [J. Chem. Phys. 116, 3856 (2002)] for the statically stable structures.
Influence of Dynamical Change of Edges on Clustering Coefficients
Directory of Open Access Journals (Sweden)
Yuhong Ruan
2015-01-01
The clustering coefficient is a very important measure in complex networks: it describes the average ratio between the edges that actually exist and the edges that could exist among the neighbors of a vertex. In a complex network, dynamic changes of edges directly drive the evolution of the network and in turn affect the clustering coefficients. In this paper, we therefore investigate the effects of dynamic edge changes on the clustering coefficients. It is shown that the increase and decrease of the clustering coefficient can be effectively controlled by adding or deleting several edges during the evolution of a complex network.
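The effect described in this abstract can be seen on a tiny graph. The sketch below uses the standard definition of the local clustering coefficient (existing links among a vertex's neighbors divided by the possible links) and averages it over all vertices; the helper names and the star-graph example are illustrative and do not come from the paper.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of possible links among the neighbors of v that exist."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: undefined cases count as zero
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local clustering coefficients over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

def add_edge(adj, a, b):
    """Insert an undirected edge into the adjacency-set representation."""
    adj[a].add(b)
    adj[b].add(a)

# Star graph: center 0 with leaves 1, 2, 3 (no triangles at all).
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
before = average_clustering(adj)
add_edge(adj, 1, 2)  # closes the triangle 0-1-2
after = average_clustering(adj)
print(before, after)
```

A single added edge raises the average coefficient from 0 to 7/12 here, because it completes a triangle for three of the four vertices at once.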
A New Cooperative Algorithm Based on PSO and K-Means for Data Clustering
Directory of Open Access Journals (Sweden)
Mehdi Sargolzaei
2012-01-01
Problem statement: Data clustering has been applied in multiple fields such as machine learning, data mining, wireless sensor networks and pattern recognition. One of the most famous clustering approaches is K-means, which has been used effectively in many clustering problems, but this algorithm has drawbacks such as convergence to local optima and sensitivity to initial points. Approach: The Particle Swarm Optimization (PSO) algorithm is a swarm intelligence algorithm that is applied to determine the optimal cluster centers. In this study, a cooperative algorithm based on PSO and k-means is presented. Result: The proposed algorithm uses both the global search ability of PSO and the local search ability of k-means. The proposed algorithm, PSO, PSO with Contraction Factor (CF-PSO), k-means and the KPSO hybrid algorithm have been used to cluster six datasets, and their efficiencies are compared with each other. Conclusion: Experimental results show that the proposed algorithm has acceptable efficiency and robustness.
Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan
2014-01-01
Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.
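The dependence of K-means on its initial centroids, which motivates the hybrids discussed in this abstract, can be reproduced with a minimal sketch of the standard Lloyd iteration. The data set and the two initializations below are illustrative assumptions chosen to expose a local optimum; none of the nature-inspired search steps from the paper are implemented.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))

def kmeans(points, centroids, iters=100):
    """Plain Lloyd iteration from given initial centroids; returns (centroids, SSE)."""
    centroids = list(centroids)
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Recompute centroids; an empty cluster keeps its old centroid.
        new = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse

# Corners of a wide rectangle: the natural split is left pair vs right pair.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
_, good_sse = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
_, bad_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
print(good_sse, bad_sse)
```

The second initialization is already a fixed point of the iteration, so K-means stops at a bottom/top split whose error is sixteen times larger; a global search over centroid positions is one way to escape such configurations.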
An Efficient Data Aggregation Algorithm for Cluster-based Sensor Network
Directory of Open Access Journals (Sweden)
Mohammad Mostafizur Rahman Mozumdar
2009-09-01
Data aggregation in wireless sensor networks eliminates redundancy to improve bandwidth utilization and the energy efficiency of sensor nodes. One node, called the cluster leader, collects data from surrounding nodes and then sends the summarized information to upstream nodes. In this paper, we propose an algorithm to select a cluster leader that will perform data aggregation in a partially connected sensor network. The algorithm reduces the traffic flow inside the network by adaptively selecting the shortest route for packet routing to the cluster leader. We also describe a simulation framework for functional analysis of WSN applications, taking our proposed algorithm as an example.
Institute of Scientific and Technical Information of China (English)
CHUShuchuan; JohnF.Roddick
2003-01-01
In this paper, a cluster generation algorithm for vector quantization using a tabu search approach with simulated annealing is proposed. The main idea of this algorithm is to use the tabu search approach to generate non-local moves for the clusters and apply the simulated annealing technique to select the current best solution, thus improving the cluster generation and reducing the mean squared error. Preliminary experimental results demonstrate that the proposed approach is superior to the tabu search approach with Generalised Lloyd algorithm.
Scaling up the DBSCAN Algorithm for Clustering Large Spatial Databases Based on Sampling Technique
Institute of Scientific and Technical Information of China (English)
(no author listed)
2001-01-01
Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and two sampling-based DBSCAN (SDBSCAN) algorithms are developed. One algorithm introduces the sampling technique inside DBSCAN, and the other uses a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data
Directory of Open Access Journals (Sweden)
Cheng Lu
2015-01-01
The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be applied directly to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then adapted to handle the interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
Molecular dynamics simulations of cluster fission and fusion processes
DEFF Research Database (Denmark)
Lyalin, Andrey G.; Obolensky, Oleg I.; Solov'yov, Ilia
2004-01-01
Results of molecular dynamics simulations of fission reactions Na_10^2+ --> Na_7^+ +Na_3^+ and Na_18^2+ --> 2Na_9^+ are presented. The dependence of the fission barriers on the isomer structure of the parent cluster is analyzed. It is demonstrated that the energy necessary for removing homothetic...... groups of atoms from the parent cluster is largely independent of the isomer form of the parent cluster. The importance of rearrangement of the cluster structure during the fission process is elucidated. This rearrangement may include transition to another isomer state of the parent cluster before actual...
Gündüç, Semra; Dilaver, Mehmet; Aydın, Meral; Gündüç, Yiğit
2005-02-01
In this work we have studied the dynamic scaling behavior of two scaling functions and we have shown that scaling functions obey the dynamic finite size scaling rules. Dynamic finite size scaling of scaling functions opens possibilities for a wide range of applications. As an application we have calculated the dynamic critical exponent (z) of Wolff's cluster algorithm for 2-, 3- and 4-dimensional Ising models. Configurations with vanishing initial magnetization are chosen in order to avoid complications due to initial magnetization. The observed dynamic finite size scaling behavior during early stages of the Monte Carlo simulation yields z for Wolff's cluster algorithm for 2-, 3- and 4-dimensional Ising models with vanishing values which are consistent with the values obtained from the autocorrelations. Especially, the vanishing dynamic critical exponent we obtained for d=3 implies that the Wolff algorithm is more efficient in eliminating critical slowing down in Monte Carlo simulations than previously reported.
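For readers unfamiliar with the update whose dynamic exponent is measured above, the following is a minimal sketch of one Wolff single-cluster flip for the two-dimensional Ising model. It assumes a square lattice with periodic boundaries, coupling J = 1 and the standard bond probability 1 - exp(-2*beta); the function name and data layout are illustrative choices, not the authors' code, and the measurement of z itself is not shown.

```python
import math
import random

def wolff_update(spins, L, beta, rng):
    """One Wolff single-cluster flip on an L x L periodic Ising lattice (J = 1).

    spins is a flat list of +1/-1 values of length L*L, modified in place.
    Returns the number of spins in the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)   # probability of adding an aligned neighbour
    seed = rng.randrange(L * L)           # random seed site for the cluster
    s0 = spins[seed]
    spins[seed] = -s0                     # flipping on visit also marks the site as visited
    stack = [seed]
    size = 1
    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        neighbours = (
            (x + 1) % L + y * L,
            (x - 1) % L + y * L,
            x + ((y + 1) % L) * L,
            x + ((y - 1) % L) * L,
        )
        for nb in neighbours:
            # Only spins still carrying the original orientation can join.
            if spins[nb] == s0 and rng.random() < p_add:
                spins[nb] = -s0
                stack.append(nb)
                size += 1
    return size

if __name__ == "__main__":
    L = 16
    rng = random.Random(1)
    # Start from a random configuration, which has near-vanishing magnetization.
    lattice = [rng.choice((-1, 1)) for _ in range(L * L)]
    beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))   # exact 2D critical coupling
    sizes = [wolff_update(lattice, L, beta_c, rng) for _ in range(100)]
    print("mean cluster size:", sum(sizes) / len(sizes))
```

Because every flipped cluster is accepted, the cost of an update scales with the cluster size, which is why autocorrelation times for this algorithm are usually quoted in units that account for the mean cluster size.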
Application of a Dynamic Programming Algorithm for Weapon Target Assignment
2016-02-01
Lloyd Hammond. Weapons and...Combat Systems Division, Defence Science and Technology Group. DST Group-TR-3221. ABSTRACT: Threat evaluation and weapon assignment...dynamic programming algorithm for Weapon Target Assignment which, after more rigorous testing, could be used as a concept demonstrator and as an auxiliary
Genetic Algorithms in Dynamical Systems Optimisation and Adaptation
Reus, N.M. de; Visser, E.K.; Bruggeman, B.
1998-01-01
The use of genetic algorithms is worth studying both in the design of dynamical systems, ranging from control systems to state estimators, and in the adaptation of these systems. This paper presents some approaches for using genetic algorithms in dynamical systems. The layouts and specific uses are di
Hierarchical trie packet classification algorithm based on expectation-maximization clustering
Bi, Xia-an; Zhao, Junxia
2017-01-01
With the growth of computer network bandwidth, packet classification algorithms that can deal with large-scale rule sets are urgently needed. Among the existing algorithms, research on packet classification algorithms based on the hierarchical trie has become an important branch of packet classification research because of their wide practical use. Although the hierarchical trie saves a large amount of storage space, it has several shortcomings, such as the existence of backtracking and empty nodes. This paper proposes a new packet classification algorithm, Hierarchical Trie Algorithm Based on Expectation-Maximization Clustering (HTEMC). Firstly, this paper formalizes the packet classification problem by mapping the rules and data packets into a two-dimensional space. Secondly, this paper uses the expectation-maximization algorithm to cluster the rules based on their aggregate characteristics, thereby forming diversified clusters. Thirdly, this paper proposes a hierarchical trie based on the results of expectation-maximization clustering. Finally, this paper conducts both simulation experiments and real-environment experiments to compare the performance of our algorithm with that of other typical algorithms, and analyzes the results of the experiments. The hierarchical trie structure in our algorithm not only adopts trie path compression to eliminate backtracking, but also solves the problem of inefficient trie updates, which greatly improves the performance of the algorithm. PMID:28704476
Numerical simulation study of the dynamical behavior of the Niedermayer algorithm
Girardi, D.; Branco, N. S.
2010-04-01
We calculate the dynamic critical exponent for the Niedermayer algorithm applied to the two-dimensional Ising and XY models, for various values of the free parameter E0. For E0 = -1 we regain the Metropolis algorithm and for E0 = 1 we regain the Wolff algorithm. For -1 < E0 < 1, the mean size of the clusters of (possibly) turned spins initially grows with the linear size of the lattice, L, but eventually saturates at a given lattice size $\widetilde{L}$, which depends on E0. For $L > \widetilde{L}$, the Niedermayer algorithm is equivalent to the Metropolis one, i.e., they have the same dynamic exponent. For E0 > 1, the autocorrelation time is always greater than for E0 = 1 (Wolff) and, more importantly, it also grows faster than a power of L. Therefore, we show that the best choice of cluster algorithm is the Wolff one, when comparing against the Niedermayer generalization. We also obtain the dynamic behavior of the Wolff algorithm: although not conclusively, we propose a scaling law for the dependence of the autocorrelation time on L.
Anandakrishnan, Ramu; Onufriev, Alexey
2008-03-01
In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between error bound and root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms for practical applications. An example of error analysis for such an application (computation of the average charge of ionizable amino acids in proteins) is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
Novel density-based and hierarchical density-based clustering algorithms for uncertain data.
Zhang, Xianchao; Liu, Han; Zhang, Xiaotong
2017-09-01
Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we firstly propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN from the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulties for clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing
Constructing a graph of connections in clustering algorithm of complex objects
Directory of Open Access Journals (Sweden)
Татьяна Шатовская
2015-05-01
The article describes the results of modifying the Chameleon algorithm. This hierarchical multi-level algorithm consists of several phases: graph construction, coarsening, partitioning and recovery. Various approaches and algorithms can be used in each phase. The main aim of the work is to study the quality of clustering on different data sets using combinations of algorithms at the different stages, and to improve the construction stage by optimizing the choice of k when building the k-nearest-neighbor graph.
Two Parallel Swendsen-Wang Cluster Algorithms Using Message-Passing Paradigm
Lin, Shizeng
2008-01-01
In this article, we present two different parallel Swendsen-Wang Cluster (SWC) algorithms using the message-passing interface (MPI). One is based on the Master-Slave Parallel Model (MSPM) and the other is based on the Data-Parallel Model (DPM). A speedup of 24 with 40 processors and 16 with 37 processors is achieved with the DPM and MSPM respectively. The speedup of both algorithms at different temperatures and system sizes is carefully examined both experimentally and theoretically, and a comparison of their efficiency is made. In the last section, based on these two parallel SWC algorithms, two parallel probability changing cluster (PCC) algorithms are proposed.
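The speedup figures quoted in the abstract above can be turned into parallel efficiencies with a short worked calculation. The sketch below assumes the usual definition, efficiency = speedup / number of processors; the function name is illustrative, and the abstract itself does not state which efficiency measure the authors compare.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors

# Figures taken from the abstract: DPM reaches 24x on 40 processors,
# MSPM reaches 16x on 37 processors.
DPM_EFFICIENCY = parallel_efficiency(24, 40)
MSPM_EFFICIENCY = parallel_efficiency(16, 37)

if __name__ == "__main__":
    print(f"DPM:  {DPM_EFFICIENCY:.3f}")
    print(f"MSPM: {MSPM_EFFICIENCY:.3f}")
```

Under this definition the data-parallel model uses its processors at about 60% of the ideal rate, against roughly 43% for the master-slave model.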
Mobile robot dynamic path planning based on improved genetic algorithm
Wang, Yong; Zhou, Heng; Wang, Ying
2017-08-01
In a dynamic, unknown environment, the dynamic path planning of mobile robots is a difficult problem. In this paper, a dynamic path planning method based on a genetic algorithm is proposed. A reward value model is designed to estimate the probability of dynamic obstacles on the path, and the reward value function is applied to the genetic algorithm. Unique coding techniques reduce the computational complexity of the algorithm. The fitness function of the genetic algorithm fully considers three factors: the security of the path, the shortest distance of the path and the reward value of the path. The simulation results show that the proposed genetic algorithm is efficient in all kinds of complex dynamic environments.
Advances in molecular vibrations and collision dynamics molecular clusters
Bacic, Zatko
1998-01-01
This volume focuses on molecular clusters, bound by van der Waals interactions and hydrogen bonds. Twelve chapters review a wide range of recent theoretical and experimental advances in the areas of cluster vibrations, spectroscopy, and reaction dynamics. The authors are leading experts, who have made significant contributions to these topics. The first chapter describes exciting results and new insights in the solvent effects on the short-time photofragmentation dynamics of small molecules, obtained by combining heteroclusters with femtosecond laser excitation. The second is on theoretical work on the effects of a single solvent (argon) atom on the photodissociation dynamics of the solute H2O molecule. The next two chapters cover experimental and theoretical aspects of the energetics and vibrations of small clusters. Chapter 5 describes diffusion quantum Monte Carlo calculations and nonadditive three-body potential terms in molecular clusters. The next six chapters deal with hydrogen-bonded clusters, refle...
Directory of Open Access Journals (Sweden)
Bohui Zhu
2013-01-01
This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. This diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering of ECG arrhythmias. First, the raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and the waveform of the ECG signal is detected; then, features are extracted from the ECG signal to cluster different types of arrhythmias by the IEMMC algorithm. Three performance indicators (sensitivity, specificity, and accuracy) are used to assess the effect of the IEMMC method for ECG arrhythmias. Compared with the K-means and iterSVR algorithms, the IEMMC algorithm performs better not only in clustering results but also in global search ability and convergence ability, which proves its effectiveness for the detection of ECG arrhythmias.
A fast SVM training algorithm based on the set segmentation and k-means clustering
Institute of Scientific and Technical Information of China (English)
YANG Xiaowei; LIN Daying; HAO Zhifeng; LIANG Yanchun; LIU Guirong; HAN Xu
2003-01-01
At present, training algorithms for support vector machines (SVM) are an important topic in the field of machine learning. It is a challenging task to improve the efficiency of the algorithm without reducing the generalization performance of SVM. To face this challenge, a new SVM training algorithm based on set segmentation and k-means clustering is presented in this paper. The new idea is to divide the original training data into many subsets, cluster each subset using k-means clustering, and finally train the SVM on the new data set formed by the cluster centroids. Considering that decomposition algorithms such as SVMlight are one of the major methods for solving support vector machines, SVMlight is used in our experiments. Simulations on different types of problems show that the proposed method can efficiently solve not only large linear classification problems but also large nonlinear ones.
Scheduling algorithm of dual-armed cluster tools with residency time and reentrant constraints
Institute of Scientific and Technical Information of China (English)
周炳海; 高忠顺; 陈佳
2014-01-01
To solve the scheduling problem of dual-armed cluster tools for wafer fabrication with residency time and reentrant constraints, a heuristic scheduling algorithm was developed. Firstly, on the basis of formulating the scheduling problem domain of dual-armed cluster tools, a non-integer programming model was set up with the objective of minimizing the makespan. Combining the characteristics of residency time and reentrant constraints, a scheduling algorithm was presented that searches for the optimal operation path of the dual-armed transport module among the many possible robotic scheduling paths for dual-armed cluster tools. Finally, experiments were designed to evaluate the proposed algorithm. The results show that the proposed algorithm is feasible and efficient for obtaining an optimal scheduling solution of dual-armed cluster tools with residency time and reentrant constraints.
A Load Balancing Algorithm Based on Maximum Entropy Methods in Homogeneous Clusters
Directory of Open Access Journals (Sweden)
Long Chen
2014-10-01
In order to solve the problems of ill-balanced task allocation, long response time, low throughput rate and poor performance when the cluster system is assigning tasks, we introduce the concept of entropy in thermodynamics into load balancing algorithms. This paper proposes a new load balancing algorithm for homogeneous clusters based on the Maximum Entropy Method (MEM). By calculating the entropy of the system and using the maximum entropy principle to ensure that each scheduling and migration step follows the increasing tendency of the entropy, the system can reach a load-balanced state as soon as possible, shorten the task execution time and enable high performance. The results of simulation experiments show that this algorithm outperforms traditional algorithms in both the time needed to reach load balance and the degree of balance achieved in the homogeneous cluster system. It also provides a novel line of thought for solving the load balancing problem of homogeneous cluster systems.
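The idea of scheduling along the direction of increasing entropy can be sketched as follows. This is a minimal illustration that assumes the system entropy is the Shannon entropy of the normalised per-node loads, which is largest when the load is spread evenly; the abstract does not give the paper's exact entropy definition, and the function names and greedy placement rule are assumptions.

```python
import math

def load_entropy(loads):
    """Shannon entropy (in nats) of the normalised load distribution."""
    total = sum(loads)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in loads if l > 0)

def assign_task(loads, cost):
    """Return the index of the node whose choice maximises entropy after adding `cost`."""
    best_node, best_h = None, -1.0
    for i in range(len(loads)):
        # Hypothetical placement of the task on node i.
        trial = loads[:i] + [loads[i] + cost] + loads[i + 1:]
        h = load_entropy(trial)
        if h > best_h:
            best_node, best_h = i, h
    return best_node

if __name__ == "__main__":
    loads = [5.0, 1.0, 3.0]
    node = assign_task(loads, 2.0)
    print("place task on node", node)
```

For identical nodes the entropy reaches its maximum, ln(n), exactly when all n nodes carry the same load, so a step that raises the entropy moves the cluster toward balance.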
Solving the Capacitated Vehicle Routing Problem Based on Improved Ant-clustering Algorithm
Directory of Open Access Journals (Sweden)
Zhang Jiashan
2015-01-01
Capacitated vehicle routing problems (CVRP) are NP-hard. Most approaches can solve only small-scale case studies to optimality, and they are time-consuming. To overcome this limitation, this paper presents a novel three-phase heuristic approach for the capacitated vehicle routing problem. The first phase aims to identify sets of cost-effective feasible clusters through an improved ant-clustering algorithm, in which an adaptive strategy is adopted. The second phase assigns clusters to vehicles and sequences them on each tour. The third phase uses a genetic algorithm to order the nodes within clusters for every tour. The simulation indicates that the algorithm attains high-quality results in a short time.
Kernel Clustering with a Differential Harmony Search Algorithm for Scheme Classification
Directory of Open Access Journals (Sweden)
Yu Feng
2017-01-01
This paper presents a kernel fuzzy clustering method with a novel differential harmony search algorithm for classifying water diversion scheduling schemes. First, we employed a self-adaptive solution generation strategy and a differential evolution-based population update strategy to improve the classical harmony search. Second, we applied the differential harmony search algorithm to kernel fuzzy clustering to help the clustering method obtain better solutions. Finally, the combination of kernel fuzzy clustering and differential harmony search is applied to water diversion scheduling in East Lake. The proposed method has been compared with other methods. The results show that kernel clustering with the differential harmony search algorithm performs well on water diversion scheduling problems.
New Approach to Cluster Synchronization in Complex Dynamical Networks
Institute of Scientific and Technical Information of China (English)
LU Xin-Biao; QIN Bu-Zhi; LU Xin-Yu
2009-01-01
In this paper, a distributed control strategy is proposed to make a complex dynamical network achieve cluster synchronization, which means that nodes in the same group achieve the same synchronization state, while nodes in different groups achieve different synchronization states. The local and global stability of the cluster synchronization state are analyzed. Moreover, simulation results verify the effectiveness of the new approach.
DYNER: A DYNamic ClustER for Education and Research
Kehagias, Dimitris; Grivas, Michael; Mamalis, Basilis; Pantziou, Grammati
2006-01-01
Purpose: The purpose of this paper is to evaluate the use of a non-expensive dynamic computing resource, consisting of a Beowulf class cluster and a NoW, as an educational and research infrastructure. Design/methodology/approach: Clusters, built using commodity-off-the-shelf (COTS) hardware components and free, or commonly used, software, provide…
Thermodynamics of small clusters of atoms: A molecular dynamics simulation
DEFF Research Database (Denmark)
Damgaard Kristensen, W.; Jensen, E. J.; Cotterill, Rodney M J
1974-01-01
The thermodynamic properties of clusters containing 55, 135, and 429 atoms have been calculated using the molecular dynamics method. Structural and vibrational properties of the clusters were examined at different temperatures in both the solid and the liquid phase. The nature of the melting...
Detectability thresholds and optimal algorithms for community structure in dynamic networks
Ghasemian, Amir; Clauset, Aaron; Moore, Cristopher; Peel, Leto
2015-01-01
We study the fundamental limits on learning latent community structure in dynamic networks. Specifically, we study dynamic stochastic block models where nodes change their community membership over time, but where edges are generated independently at each time step. In this setting (which is a special case of several existing models), we are able to derive the detectability threshold exactly, as a function of the rate of change and the strength of the communities. Below this threshold, we claim that no algorithm can identify the communities better than chance. We then give two algorithms that are optimal in the sense that they succeed all the way down to this limit. The first uses belief propagation (BP), which gives asymptotically optimal accuracy, and the second is a fast spectral clustering algorithm, based on linearizing the BP equations. We verify our analytic and algorithmic results via numerical simulation, and close with a brief discussion of extensions and open questions.
Optimization of self-interstitial clusters in 3C-SiC with genetic algorithm
Ko, Hyunseok; Kaczmarowski, Amy; Szlufarska, Izabela; Morgan, Dane
2017-08-01
Under irradiation, SiC develops damage commonly referred to as black spot defects, which are speculated to be self-interstitial atom clusters. To understand the evolution of these defect clusters and their impacts (e.g., through radiation-induced swelling) on the performance of SiC in nuclear applications, it is important to identify the cluster composition, structure, and shape. In this work the genetic algorithm code StructOpt was utilized to identify ground-state cluster structures in 3C-SiC. The genetic algorithm was used to explore clusters of up to ∼30 interstitials of C-only, Si-only, and Si-C mixtures embedded in the SiC lattice. We performed the structure search using Hamiltonians from both density functional theory and empirical potentials. The thermodynamic stability of clusters was investigated in terms of their composition (with a focus on Si-only, C-only, and stoichiometric) and shape (spherical vs. planar), as a function of the cluster size (n). Our results suggest that large Si-only clusters are likely unstable, and clusters are predominantly C-only for n ≤ 10 and stoichiometric for n > 10. The results imply that there is an evolution of the shape of the most stable clusters, where small clusters are stable in more spherical geometries while larger clusters are stable in more planar configurations. We also provide an estimated energy vs. size relationship, E(n), for use in future analysis.
Sun, Xu; Yang, Lina; Gao, Lianru; Zhang, Bing; Li, Shanshan; Li, Jun
2015-01-01
Center-oriented hyperspectral image clustering methods have been widely applied to hyperspectral remote sensing image processing; however, the drawbacks are obvious, including the over-simplicity of computing models and underutilized spatial information. In recent years, some studies have been conducted trying to improve this situation. We introduce the artificial bee colony (ABC) and Markov random field (MRF) algorithms to propose an ABC-MRF-cluster model to solve the problems mentioned above. In this model, a typical ABC algorithm framework is adopted in which cluster centers and iteration conditional model algorithm's results are considered as feasible solutions and objective functions separately, and MRF is modified to be capable of dealing with the clustering problem. Finally, four datasets and two indices are used to show that the application of ABC-cluster and ABC-MRF-cluster methods could help to obtain better image accuracy than conventional methods. Specifically, the ABC-cluster method is superior when used for a higher power of spectral discrimination, whereas the ABC-MRF-cluster method can provide better results when used for an adjusted random index. In experiments on simulated images with different signal-to-noise ratios, ABC-cluster and ABC-MRF-cluster showed good stability.
Directory of Open Access Journals (Sweden)
Dong Yumin
2014-01-01
A quantum optimization scheme for task scheduling in network cluster servers is proposed. We explore the distribution theory of the energy field in quantum mechanics and, in particular, apply it to data clustering. We compare the quantum optimization method with the genetic algorithm (GA), ant colony optimization (ACO), and the simulated annealing algorithm (SAA). We also demonstrate its validity and rationality by simulation and experiment.
A Novel Image Fusion Algorithm for Visible and PMMW Images based on Clustering and NSCT
Xiong Jintao; Xie Weichao; Yang Jianyu; Fu Yanlong; Hu Kuan; Zhong Zhibin
2016-01-01
Aiming at the fusion of visible and Passive Millimeter Wave (PMMW) images, a novel algorithm based on clustering and NSCT (Nonsubsampled Contourlet Transform) is proposed. It takes advantages of the particular ability of PMMW image in presenting metal target and uses the clustering algorithm for PMMW image to extract the potential target regions. In the process of fusion, NSCT is applied to both input images, and then the decomposition coefficients on different scale are combined using differ...
Directory of Open Access Journals (Sweden)
D. A. Viattchenin
2009-01-01
The paper proposes a method for constructing a subset of labeled objects to be used in a heuristic algorithm of possibilistic clustering with partial supervision. The method is based on data preprocessing by the heuristic algorithm of possibilistic clustering using a transitive closure of a fuzzy tolerance. The efficiency of the method is demonstrated with an illustrative example.
A Chinese Web Page Clustering Algorithm Based on the Suffix Tree
Institute of Scientific and Technical Information of China (English)
YANG Jian-wu
2004-01-01
In this paper, an improved algorithm, named STC-I, is proposed for Chinese Web page clustering based on Chinese language characteristics. It adopts a new unit choice principle and a novel suffix tree construction policy. The experimental results show that the new algorithm keeps the advantages of STC, and is better than STC in precision and speed when they are used to cluster Chinese Web pages.
Single cluster dynamics for the infinite range O(n) model
Brower, R. C.; Gross, N. A.; Moriarty, K. J. M.; Tamayo, P.
1994-03-01
This paper presents a study of Wolff's single cluster acceleration algorithm for O(n) models in the infinite range or mean-field limit. Numerical results for n = 2, 3 and 4 are consistent with the complete elimination of critical slowing down. Also a heuristic argument is advanced to support the value of z = 0 for the dynamic critical exponent. A new cluster growth algorithm is formulated for the infinite range model that has optimal efficiency of O(ln N) in the system size N for the Swendsen-Wang update scheme. Using an asymptotically correct version of this cluster method, we are able to perform simulations for the Wolff update scheme up to 262,144 spins for 10^5 time steps for the O(N) models.
A Coupled User Clustering Algorithm Based on Mixed Data for Web-Based Learning Systems
Directory of Open Access Journals (Sweden)
Ke Niu
2015-01-01
Traditional Web-based learning systems offer insufficient analysis of learning behaviors and insufficient personalized study guidance, so a few user clustering algorithms have been introduced. When analyzing behaviors with these algorithms, researchers generally focus on continuous data but easily neglect discrete data, both of which are generated from online learning actions. Moreover, there are implicit coupled interactions among the data, which the introduced algorithms frequently ignore. As a result, a large amount of significant information that can positively affect clustering accuracy is neglected. To solve these issues, we propose a coupled user clustering algorithm for Web-based learning systems that takes into account both discrete and continuous data, as well as the intracoupled and intercoupled interactions of the data. The experimental results in this paper demonstrate that the proposed algorithm outperforms the alternatives.
Semi-Supervised Clustering Fingerprint Positioning Algorithm Based on Distance Constraints
Institute of Scientific and Technical Information of China (English)
Ying Xia; Zhongzhao Zhang; Lin Ma; Yao Wang
2015-01-01
With the rapid development of WLAN (Wireless Local Area Network) technology, an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation. This paper proposes a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering (APC) based on distance function constraints. We show that, by employing affinity propagation techniques, it is possible to use a fraction of labeled data to adjust the similarity matrix of the signal space and cluster reference points with high accuracy. The semi-supervised APC uses a combination of machine learning, clustering analysis and a fingerprinting algorithm. By collecting data and testing our algorithm in a realistic indoor WLAN environment, the experimental results indicate that the proposed algorithm can improve positioning accuracy while reducing the online localization computation, as compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.
Institute of Scientific and Technical Information of China (English)
Wu Naixing; Liao Jianxin; Zhu Xiaomin
2006-01-01
Based on the system features of a softswitch-based heterogeneous clustered media server, this paper proposes a limited resource vector load-balancing algorithm. The purpose of the algorithm is to balance the load of clusters by utilizing all system resources effectively and to avoid violent fluctuations of the system performance. Many simulations on the Petri net model of the load balance system are conducted, and the algorithm is compared with some traditional algorithms on balancing ability for heterogeneity, system throughput, request response time and performance stability. The simulation results show that the algorithm achieves higher system performance and has an excellent ability to deal with the heterogeneity of the clustered media server.
Cluster fusion-fission dynamics in the Singapore stock exchange
Teh, Boon Kin; Cheong, Siew Ann
2015-10-01
In this paper, we investigate how the cross-correlations between stocks in the Singapore stock exchange (SGX) evolve over 2008 and 2009 within overlapping one-month time windows. In particular, we examine how these cross-correlations change before, during, and after the Sep-Oct 2008 Lehman Brothers Crisis. To do this, we extend the complete-linkage hierarchical clustering algorithm to obtain robust clusters of stocks with stronger intracluster correlations and weaker intercluster correlations. After we identify the robust clusters in all time windows, we visualize how these change in the form of a fusion-fission diagram. Such a diagram depicts graphically how the cluster sizes evolve, the exchange of stocks between clusters, as well as how strongly the clusters mix. From the fusion-fission diagram, we see a giant cluster growing and disintegrating in the SGX, up until the Lehman Brothers Crisis in September 2008 and the market crashes of October 2008. After the Lehman Brothers Crisis, clusters in the SGX remain small for a few months before giant clusters emerge once again. In the aftermath of the crisis, we also find strong mixing of component stocks between clusters. As a result, the correlations between initially strongly correlated pairs of stocks decay exponentially with an average lifetime of about a month. These observations strongly affect how portfolios and trading strategies should be formulated.
Genetic Algorithms Applied to Multi-Class Clustering for Gene Expression Data
Institute of Scientific and Technical Information of China (English)
Haiyan Pan; Jun Zhu; Danfu Han
2003-01-01
A hybrid GA (genetic algorithm)-based clustering (HGACLUS) schema, combining merits of the Simulated Annealing, was described for finding an optimal or near-optimal set of medoids. This schema maximized the clustering success by achieving internal cluster cohesion and external cluster isolation. The performance of HGACLUS and other methods was compared by using simulated data and open microarray gene-expression datasets. HGACLUS was generally found to be more accurate and robust than other methods discussed in this paper by the exact validation strategy and the explicit cluster number.
Variance Clustering Improved Dynamic Conditional Correlation MGARCH Estimators
Gian Piero Aielli; Massimiliano Caporin
2011-01-01
It is well-known that the estimated GARCH dynamics exhibit common patterns. Starting from this fact we extend the Dynamic Conditional Correlation (DCC) model by allowing for a clustering structure of the univariate GARCH parameters. The model can be estimated in two steps, the first devoted to the clustering structure, and the second focusing on correlation parameters. Differently from the traditional two-step DCC estimation, we get large system feasibility of the joint estimation of the wh...
Clustering Algorithm As A Planning Support Tool For Rural Electrification Optimization
Directory of Open Access Journals (Sweden)
Ronaldo Pornillosa Parreno Jr
2015-08-01
In this study, a clustering algorithm was developed to optimize electrification plans by screening and grouping potential customers to be supplied with electricity. The algorithm takes a different approach to the clustering problem: it combines conceptual and distance-based clustering algorithms to analyze potential clusters using a spanning tree with the shortest possible edge weight, and creates the final cluster trees based on a test of inconsistency for the edges. The clustering criteria consist of a commonly used distance measure with the addition of household information as the basis for the ability-to-pay (ATP) value. The combination of these two parameters resulted in more significant and realistic clusters, since a distance measure alone could not capture the effect of household characteristics when screening for the most sensible groupings of households. In addition, the implications of varying geographical features were incorporated in the algorithm by using a routing index across the locations of the households. This new approach of connecting the households in an area was applied in an actual case study of one village, or barangay, that was not yet energized. The clustering algorithm generated cluster trees which could become the theoretical basis for power utilities to plan the initial network arrangement of electrification. Scenario analysis conducted on the two strategies of clustering the households provided different alternatives for optimizing the cost of electrification. Furthermore, the benefits associated with the two strategies formulated from the two scenarios were evaluated using the benefit-cost ratio (BC) to determine which is more economically advantageous. The results of the study showed that the clustering algorithm proved effective in solving the electrification optimization problem and serves its purpose as a planning support tool which can facilitate electrification in rural areas and achieve cost-effectiveness.
A Hybrid Distributed Mutual Exclusion Algorithm for Cluster-Based Systems
Directory of Open Access Journals (Sweden)
Moharram Challenger
2013-01-01
Distributed mutual exclusion is a fundamental problem which arises in various systems such as grid computing, mobile ad hoc networks (MANETs), and distributed databases. Reducing key metrics such as the message count per critical section (CS) and the delay between two CS entrances, known as synchronization delay, is a great challenge for this problem. Various algorithms use either permission-based or token-based protocols. Token-based algorithms offer better communication costs and synchronization delay. Raymond's and Suzuki-Kasami's algorithms are well-known token-based ones. Raymond's algorithm needs only O(log2(N)) messages per CS, and Suzuki-Kasami's algorithm needs just one message delivery time between two CS entrances. Nevertheless, each algorithm is weak in the other metric: synchronization delay and message complexity, respectively. In this work, a new hybrid algorithm is proposed which gains from the strong aspects of both algorithms. Raysuz's algorithm (the proposed algorithm) uses a clustered graph and executes Suzuki-Kasami's algorithm within clusters and Raymond's algorithm between clusters. This leads to better message complexity than that of pure Suzuki-Kasami's algorithm and better synchronization delay than that of pure Raymond's algorithm, resulting in an overall efficient DMX algorithm.
An Improved Fuzzy c-Means Clustering Algorithm Based on Shadowed Sets and PSO
Directory of Open Access Journals (Sweden)
Jian Zhang
2014-01-01
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes the vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. This new method uses the Xie-Beni index as the cluster validity measure and automatically finds the optimal cluster number within a specific range, with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect.
Clustered K nearest neighbor algorithm for daily inflow forecasting
Akbari, M.; Van Overloop, P.J.A.T.M.; Afshar, A.
2010-01-01
Instance based learning (IBL) algorithms are a common choice among data driven algorithms for inflow forecasting. They are based on the similarity principle and prediction is made by the finite number of similar neighbors. In this sense, the similarity of a query instance is estimated according to
Distributed multicast routing algorithm with dynamic performance in multimedia networks
Institute of Scientific and Technical Information of China (English)
Zhu Baoping; Zhang Kun
2009-01-01
The delay and DVBMT problem is known to be NP-complete. In this paper, an efficient distributed dynamic multicast routing algorithm was proposed to produce routing trees with delay and delay variation constraints. The proposed algorithm is fully distributed, and supports the dynamic reorganizing of the multicast tree in response to changes for the destination. Simulations demonstrate that our algorithm is better in terms of tree delay and routing success ratio as compared with other existing algorithms, and performs excellently in delay variation performance under lower time complexity, which ensures it to support the requirements of real-time multimedia communications more effectively.
Tracing the Cluster Internal Dynamics with Member Galaxies
Biviano, Andrea
2001-01-01
The analysis of the spatial distribution and kinematics of galaxies in clusters allows one to determine the cluster internal dynamics. In this paper, I review the state of the art of this topic. In particular, I summarize what we have learned so far about galaxy orbits in clusters, and about the cluster mass distribution. I then compare four methods that have recently been used in the literature, by applying them to the same data-set. The results stress the importance of reducing systematic b...
A New Dynamical Evolutionary Algorithm Based on Statistical Mechanics
Institute of Scientific and Technical Information of China (English)
LI YuanXiang(李元香); ZOU XiuFen(邹秀芬); KANG LiShan(康立山); Zbigniew Michalewicz
2003-01-01
In this paper, a new dynamical evolutionary algorithm (DEA) is presented based on the theory of statistical mechanics. The novelty of this kind of dynamical evolutionary algorithm is that all individuals in a population (called particles in a dynamical system) are running and searching with their population evolving driven by a new selecting mechanism. This mechanism simulates the principle of molecular dynamics, which is easy to design and implement. A basic theoretical analysis for the dynamical evolutionary algorithm is given and as a consequence two stopping criteria of the algorithm are derived from the principle of energy minimization and the law of entropy increasing. In order to verify the effectiveness of the scheme, DEA is applied to solving some typical numerical function minimization problems which are poorly solved by traditional evolutionary algorithms. The experimental results show that DEA is fast and reliable.
Directory of Open Access Journals (Sweden)
Jing Chen
2015-06-01
This study takes the concept of food logistics distribution as its starting point. With the aim of optimizing food logistics distribution routes, it analyzes the optimization model of the food logistics route, explains the genetic algorithm, and discusses the optimization of the food logistics distribution route based on a genetic and cluster scheme algorithm.
Algebraic dynamics solution to and algebraic dynamics algorithm for nonlinear advection equation
Institute of Scientific and Technical Information of China (English)
2008-01-01
Algebraic dynamics approach and algebraic dynamics algorithm for the solution of nonlinear partial differential equations are applied to the nonlinear advection equation. The results show that the approach is effective for the exact analytical solution and the algorithm has higher precision than other existing algorithms in numerical computation for the nonlinear advection equation.
GenClust: A genetic algorithm for clustering gene expression data
Directory of Open Access Journals (Sweden)
Raimondi Alessandra
2005-12-01
Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. Conclusion: Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology.
GenClust: a genetic algorithm for clustering gene expression data.
Di Gesú, Vito; Giancarlo, Raffaele; Lo Bosco, Giosué; Raimondi, Alessandra; Scaturro, Davide
2005-12-07
Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology.
The Loop-Cluster Algorithm for the Case of the 6 Vertex Model
Evertz, H G
1993-01-01
We present the loop algorithm, a new type of cluster algorithm that we recently introduced for the F model. Using the framework of Kandel and Domany, we show how to generalize the algorithm to the arrow flip symmetric 6 vertex model. We propose the principle of least possible freezing as the guide to choosing the values of free parameters in the algorithm. Finally, we briefly discuss the application of our algorithm to simulations of quantum spin systems. In particular, all necessary information is provided for the simulation of spin-$\frac{1}{2}$ Heisenberg and $XXZ$ models.
CMA: an efficient index algorithm of clustering supporting fast retrieval of large image databases
Institute of Scientific and Technical Information of China (English)
(no author listed)
2005-01-01
To realize content-based retrieval of large image databases, it is required to develop an efficient index and retrieval scheme. This paper proposes a clustering-based index algorithm called CMA, which supports fast retrieval of large image databases. CMA takes advantage of k-means and self-adaptive algorithms. It is simple and works without any user interaction. There are two main stages in this algorithm. In the first stage, it classifies images in a database into several clusters and automatically obtains the necessary parameters for the next stage, the k-means iteration. The CMA algorithm is tested on a large database of more than ten thousand images and compared with the k-means algorithm. Experimental results show that this algorithm is effective in both precision and retrieval time.
Diffusion Dynamics of Cux Cluster on Cu(111) Surface
Institute of Scientific and Technical Information of China (English)
Jian-feng Tang; Mai-chang Xu; Xue-song Li; Wo-yun Long
2008-01-01
The diffusion dynamics of small two-dimensional atomic clusters Cux (1≤x≤8) on Cu(111) surface were studied using the molecular dynamics simulations and a modified analytic embedded-atom method in the temperature range from 200 K to 800 K. The cluster size and temperature dependence of the diffusion coefficients and migration energies are presented. Our simulations show that the diffusion migration energy of the Cu7 cluster is the highest and the prefactor for the Cu7 cluster is almost three orders of magnitude larger than that for single atom diffusion. This conclusion is consistent with the experimental results for similar metals. In addition, the dependence of cluster diffusion on film growth is also discussed.
DYNAMIC LABELING BASED FPGA DELAY OPTIMIZATION ALGORITHM
Institute of Scientific and Technical Information of China (English)
吕宗伟; 林争辉; 张镭
2001-01-01
DAG-MAP is an FPGA technology mapping algorithm for delay optimization and the labeling phase is the algorithm's kernel. This paper studied the labeling phase and presented an improved labeling method. It is shown through the experimental results on MCNC benchmarks that the improved method is more effective than the original method while the computation time is almost the same.
Partially dynamic vehicle routing - models and algorithms
DEFF Research Database (Denmark)
Larsen, Allan; Madsen, Oli B.G.; Solomon, M.
2002-01-01
In this paper we propose a framework for dynamic routing systems based on their degree of dynamism. Next, we consider its impact on solution methodology and quality. Specifically, we introduce the Partially Dynamic Travelling Repairman Problem and describe several dynamic policies to minimize rou...
Pueyo, Adrián Gómez; Castro, Alberto
2016-01-01
We present an implementation of optimal control theory for the first-principles non-adiabatic Ehrenfest Molecular Dynamics model, which describes a condensed matter system by considering classical point-particle nuclei, and quantum electrons, handled in our case with time-dependent density-functional theory. The scheme is demonstrated by optimizing the Coulomb explosion of small Sodium clusters: the algorithm is set to find the optimal femtosecond laser pulses that disintegrate the clusters, for a given total pulse duration, fluence, and cut-off frequency. We describe the numerical details and difficulties of the methodology.
Femtosecond Excited State Dynamics of Size Selected Neutral Molecular Clusters.
Montero, Raúl; León, Iker; Fernández, José A; Longarte, Asier
2016-07-21
The work describes a novel experimental approach to track the relaxation dynamics of an electronically excited distribution of neutral molecular clusters formed in a supersonic expansion, by pump-probe femtosecond ionization. The introduced method overcomes fragmentation issues and makes it possible to retrieve the dynamical signature of a particular cluster from each mass channel, by associating it to an IR transition of the targeted structure. We have applied the technique to study the nonadiabatic relaxation of pyrrole homoclusters. The results obtained exciting at 243 nm, near the origin of the bare pyrrole electronic absorption, allow us to identify the dynamical signature of the dimer (Py)2, which exhibits a distinctive lifetime of τ1 ∼ 270 fs, considerably longer than the decays recorded for the monomer and bigger size clusters (Py)n>2. A possible relationship between the measured lifetime and the cluster geometries is tentatively discussed.
Binaries and the dynamical mass of star clusters
Kouwenhoven, M B N
2007-01-01
The total mass of a distant star cluster is often derived from the virial theorem, using line-of-sight velocity dispersion measurements and half-light radii, under the implicit assumption that all stars are single (although it is known that most stars form part of binary systems). The components of binary stars exhibit orbital motion, which increases the measured velocity dispersion, resulting in a dynamical mass overestimation. In this article we quantify the effect of neglecting the binary population on the derivation of the dynamical mass of a star cluster. We find that the presence of binaries plays an important role for clusters with total mass M < 10^5 Msun; for clusters with M > 10^5 Msun, binaries do not affect the dynamical mass estimation significantly, provided that the cluster is significantly compact (half-mass radius < 5 pc).
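The overestimation described above can be illustrated with a minimal sketch. It assumes the commonly quoted virial estimator M_dyn = eta * r_hp * sigma_los^2 / G with eta of about 9.75 (r_hp is the projected half-light radius), and it assumes that binary orbital motion adds in quadrature to the cluster's own dispersion. Neither the prefactor nor the quadrature rule is stated in the abstract, and the function names are hypothetical.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun.
G = 4.30091e-3


def dynamical_mass(sigma_los_kms, r_hp_pc, eta=9.75):
    """Virial mass estimate in Msun (eta = 9.75 is an assumed prefactor)."""
    return eta * r_hp_pc * sigma_los_kms ** 2 / G


def observed_dispersion(sigma_cluster_kms, sigma_binary_kms):
    """Assumed model: binary orbital motion adds in quadrature."""
    return math.sqrt(sigma_cluster_kms ** 2 + sigma_binary_kms ** 2)


def overestimation_factor(sigma_cluster_kms, sigma_binary_kms):
    """Ratio of the mass inferred from the inflated dispersion to the true mass."""
    sigma_obs = observed_dispersion(sigma_cluster_kms, sigma_binary_kms)
    return (sigma_obs / sigma_cluster_kms) ** 2
```

Under these assumptions the overestimation factor depends only on the ratio of the binary to the cluster dispersion, so it becomes small for massive, compact clusters whose intrinsic dispersion is large.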
Scheme for Implementing Quantum Search Algorithm in a Cluster State Quantum Computer
Institute of Scientific and Technical Information of China (English)
ZHANG Da-Li; WANG Yan-Hui; ZHANG Yong
2008-01-01
Using a cluster state and single-qubit measurements, one can perform one-way quantum computation. Here we give a detailed scheme for realizing a modified Grover search algorithm using measurements on a cluster state. We give the measurement pattern for the cluster-state realization of the algorithm and estimate the number of measurements needed for its implementation. It is found that O(2^(3n/2) n^2) single-qubit measurements are required for its realization in a cluster-state quantum computer.
Cluster Based Hybrid Niche Mimetic and Genetic Algorithm for Text Document Categorization
Directory of Open Access Journals (Sweden)
A. K. Santra
2011-09-01
An efficient cluster-based hybrid niche mimetic and genetic algorithm for text document categorization is presented to improve the retrieval rate of relevant documents. The proposal minimizes the processing needed to structure the document through better feature selection using the hybrid algorithm. In addition, the restructuring of feature words to associated documents is reduced, which in turn increases the document clustering rate. The performance of the proposed work is measured in terms of cluster object accuracy, term weight, term frequency and inverse document frequency. Experimental results demonstrate that it achieves very good performance on both feature selection and text document categorization, compared to other classifier methods.
Melting behaviour of gold-platinum nanoalloy clusters by molecular dynamics simulations
Energy Technology Data Exchange (ETDEWEB)
Ong, Yee Pin; Yoon, Tiem Leong [School of Physics, Universiti Sains Malaysia, 11800 USM, Penang (Malaysia); Lim, Thong Leng [Faculty of Engineering and Technology, Multimedia University, Melaka Campus, 75450 Melaka (Malaysia)
2015-04-24
The melting behavior of bimetallic gold-platinum nanoclusters is studied by applying Brownian-type isothermal molecular dynamics (MD) simulation, a program modified from the cubic coupling scheme (CCS). The process begins with the ground-state structures obtained from a global minimum search algorithm and proceeds with the investigation of the effect of temperature on the thermal properties of gold-platinum nanoalloy clusters. An N-body Gupta potential has been employed in order to account for the interactions between gold and platinum atoms. The ground states of the nanoalloy clusters, which are core-shell segregated, are heated until they become thermally segregated. The detailed melting mechanism of the nanoalloy clusters is studied via this approach to provide insight into the thermal stability of the nanoalloy clusters.
Coevolutionary dynamics with clustering behaviors on cyclic competition
Dong, Linrong; Yang, Guangcan
2012-05-01
We propose a dynamic model for describing clustering behaviors in a cyclic game, in which individuals of the same species form a cluster to compete. The rates of consuming the prey depend not only on the individual competing ability v, but also on the sizes of the two interacting clusters. The fragmentation and coagulation rates of the clusters are related to the cohesive strength among the individuals. A new parameter u is introduced to indicate the uniting degree. We find that the probability distribution of the cluster sizes is almost a power law in a large regime specified by the two parameters, which reflects the scale-free behavior in complex systems. In addition, the magnitudes of the exponents are mostly in the range of real social systems. Our simulation shows that clustering promotes biodiversity. At steady state, the populations of the three species fluctuate strongly with asymmetric periods; the aggregation of large clusters to compete is evident and shows on-off intermittency.
AN IMPROVED ALGORITHM FOR SUPERVISED FUZZY C-MEANS CLUSTERING OF REMOTELY SENSED DATA
Institute of Scientific and Technical Information of China (English)
(no author listed)
2000-01-01
This paper describes an improved algorithm for fuzzy c-means clustering of remotely sensed data, by which the degree of fuzziness of the resultant classification is decreased compared with that of a conventional algorithm; that is, the classification accuracy is increased. This is achieved by incorporating covariance matrices at the level of individual classes rather than assuming a global one. Empirical results from a fuzzy classification of an Edinburgh suburban land cover confirmed the improved performance of the new algorithm for fuzzy c-means clustering, in particular when fuzziness is also accommodated in the assumed reference data.
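For orientation, the conventional fuzzy c-means membership update that the abstract takes as its baseline can be sketched as follows. This is the textbook Euclidean form, not the paper's improved variant: the per-class covariance matrices it describes would replace the squared Euclidean distance with a class-specific distance, which is not shown here. The function name and the fuzzifier default m = 2 are assumptions.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means memberships: u_ik proportional to d_ik^(-2/(m-1))."""
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for point in points:
        # Squared Euclidean distance from this point to every cluster center.
        d2 = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
        if 0.0 in d2:
            # The point coincides with a center: give it crisp membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        weights = [(1.0 / d) ** exponent for d in d2]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships
```

Each row of the result sums to one; values closer to 0 or 1 correspond to the lower degree of fuzziness that the abstract reports for the improved algorithm.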
A randomized algorithm for two-cluster partition of a set of vectors
Kel'manov, A. V.; Khandeev, V. I.
2015-02-01
A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes according to the minimum sum-of-squared-distances criterion. It is assumed that the centroid of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The centroid of the other cluster is fixed at the origin. For an established parameter value, the algorithm finds an approximate solution of the problem in time that is linear in the space dimension and the input size of the problem for given values of the relative error and failure probability. The conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.
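The objective described above can be made concrete with a small sketch. It evaluates the sum-of-squared-distances criterion for a chosen cluster (free centroid) against the remaining vectors (centroid fixed at the origin), and finds the optimum by exhaustive search. The exhaustive search is only a reference for tiny inputs and is not the randomized algorithm of the paper, whose details the abstract does not give; both function names are hypothetical.

```python
from itertools import combinations


def objective(vectors, cluster_idx):
    """Cost of putting cluster_idx in the free cluster and the rest at the origin."""
    chosen = set(cluster_idx)
    members = [vectors[i] for i in chosen]
    dim = len(vectors[0])
    # Centroid of the free cluster is the mean of its members.
    centroid = [sum(v[k] for v in members) / len(members) for k in range(dim)]
    cost = 0.0
    for i, v in enumerate(vectors):
        if i in chosen:
            cost += sum((v[k] - centroid[k]) ** 2 for k in range(dim))
        else:
            # The second cluster's centroid is fixed at the origin.
            cost += sum(x * x for x in v)
    return cost


def brute_force_partition(vectors, m):
    """Exhaustively pick the size-m free cluster with minimum cost."""
    candidates = combinations(range(len(vectors)), m)
    return min(candidates, key=lambda c: objective(vectors, c))
```

The exhaustive search examines every size-m subset, which is why an approximation algorithm with running time linear in the input size is of interest.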
A Comparison of Algorithms for the Construction of SZ Cluster Catalogues
Melin, J -B; Bartelmann, M; Bartlett, J G; Betoule, M; Bobin, J; Carvalho, P; Chon, G; Delabrouille, J; Diego, J M; Harrison, D L; Herranz, D; Hobson, M; Kneissl, R; Lasenby, A N; Jeune, M Le; Lopez-Caniego, M; Mazzotta, P; Rocha, G M; Schaefer, B M; Starck, J -L; Waizmann, J -C; Yvon, D
2012-01-01
We evaluate the construction methodology of an all-sky catalogue of galaxy clusters detected through the Sunyaev-Zel'dovich (SZ) effect. We perform an extensive comparison of twelve algorithms applied to the same detailed simulations of the millimeter and submillimeter sky based on a Planck-like case. We present the results of this "SZ Challenge" in terms of catalogue completeness, purity, astrometric and photometric reconstruction. Our results provide a comparison of a representative sample of SZ detection algorithms and highlight important issues in their application. In our study case, we show that the exact expected number of clusters remains uncertain (about a thousand cluster candidates at |b| > 20 deg with 90% purity) and that it depends on the SZ model, on the detailed sky simulations, and on the algorithmic implementation of the detection methods. We also estimate the astrometric precision of the cluster candidates, which is found to be of the order of ~2 arcmin on average, and the photometric uncertainty of...
An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing
Directory of Open Access Journals (Sweden)
Luo Zhong
2014-01-01
With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, the traditional clustering algorithm cannot handle the mass data of the tunnel. To solve this problem, an improved parallel clustering algorithm based on k-means has been proposed. It is a clustering algorithm that uses MapReduce within cloud computing to process the data. It not only can handle mass data but also is more efficient. Moreover, it is able to compute the average dissimilarity degree of each cluster in order to clean the abnormal data.
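How a k-means iteration decomposes into map and reduce steps can be sketched in plain Python. This is a single-process illustration of the general pattern under assumed names (map_point, reduce_cluster, kmeans_iteration); it does not use a real MapReduce framework and does not reproduce the paper's improvements or its dissimilarity-based data cleaning.

```python
from collections import defaultdict


def map_point(point, centroids):
    """Map step: emit (index of nearest centroid, point)."""
    distances = [sum((p - c) ** 2 for p, c in zip(point, centroid))
                 for centroid in centroids]
    return distances.index(min(distances)), point


def reduce_cluster(points):
    """Reduce step: average all points that share a centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One k-means iteration expressed as map, group by key, reduce."""
    groups = defaultdict(list)
    for point in points:
        key, value = map_point(point, centroids)
        groups[key].append(value)
    # Keep the old centroid when no point was assigned to it.
    return [reduce_cluster(groups[k]) if groups[k] else centroids[k]
            for k in range(len(centroids))]
```

Because each map call touches one point and each reduce call touches one cluster, both steps can in principle be distributed across workers, which is the property a MapReduce implementation relies on.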
A reliable cluster detection technique using photometric redshifts: introducing the 2TecX algorithm
van Breukelen, Caroline
2009-01-01
We present a new cluster detection algorithm designed for finding high-redshift clusters using optical/infrared imaging data. The algorithm has two main characteristics. First, it utilises each galaxy's full redshift probability function, instead of an estimate of the photometric redshift based on the peak of the probability function and an associated Gaussian error. Second, it identifies cluster candidates by cross-checking the results of two substantially different selection techniques (the name 2TecX representing the cross-check of the two techniques). These are adaptations of the Voronoi Tessellation and Friends-Of-Friends methods. Monte Carlo simulations of mock catalogues show that cross-checking the cluster candidates found by the two techniques significantly reduces the detection of spurious sources. Furthermore, we examine the selection effects and the relative strengths and weaknesses of each method. The simulations also allow us to fine-tune the algorithm's parameters, and define completeness an...
GPU-based single-cluster algorithm for the simulation of the Ising model
Komura, Yukihiro; Okabe, Yutaka
2012-02-01
We present the GPU calculation with the common unified device architecture (CUDA) for the Wolff single-cluster algorithm of the Ising model. Proposing an algorithm for a quasi-block synchronization, we realize the Wolff single-cluster Monte Carlo simulation with CUDA. We perform parallel computations for the newly added spins in the growing cluster. As a result, the GPU calculation speed for the two-dimensional Ising model at the critical temperature with the linear size L = 4096 is 5.60 times as fast as the calculation speed on a current CPU core. For the three-dimensional Ising model with the linear size L = 256, the GPU calculation speed is 7.90 times as fast as the CPU calculation speed. The idea of quasi-block synchronization can be used not only in the cluster algorithm but also in many fields where the synchronization of all threads is required.
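For orientation, the Wolff single-cluster update named in this abstract can be sketched serially as follows. This is a minimal CPU illustration in plain Python, not the authors' CUDA implementation and not their quasi-block synchronization scheme; the function name `wolff_step`, the flat-list lattice layout, and the coupling J = 1 are assumptions made for the example.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """One Wolff single-cluster update on an L x L periodic Ising lattice (J = 1).

    `spins` is a flat list of +1/-1 values indexed as x + L * y (assumed layout).
    The cluster is flipped while it grows; returns the number of flipped spins.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)  # probability of adding an aligned neighbour
    seed = rng.randrange(L * L)
    seed_spin = spins[seed]
    spins[seed] = -seed_spin  # flipping doubles as the "already in cluster" mark
    stack = [seed]
    size = 1
    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        neighbours = (
            (x + 1) % L + L * y,
            (x - 1) % L + L * y,
            x + L * ((y + 1) % L),
            x + L * ((y - 1) % L),
        )
        for n in neighbours:
            # Only spins still pointing in the seed direction can join the cluster.
            if spins[n] == seed_spin and rng.random() < p_add:
                spins[n] = -seed_spin
                stack.append(n)
                size += 1
    return size


if __name__ == "__main__":
    rng = random.Random(1)
    L = 16
    lattice = [1] * (L * L)
    beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))  # exact 2D critical coupling
    print(wolff_step(lattice, L, beta_c, rng))
```

The stack-based growth is inherently sequential, which is why a GPU version needs some way to process the newly added spins in parallel, as the abstract describes.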
GPU-based single-cluster algorithm for the simulation of the Ising model
Komura, Yukihiro
2011-01-01
We present the GPU calculation with the common unified device architecture (CUDA) for the Wolff single-cluster algorithm of the Ising model. Proposing an algorithm for a quasi-block synchronization, we realize the Wolff single-cluster Monte Carlo simulation with CUDA. We perform parallel computations for the newly added spins in the growing cluster. As a result, the GPU calculation speed for the two-dimensional Ising model at the critical temperature with the linear size L=4096 is 5.60 times as fast as the calculation speed on a current CPU core. For the three-dimensional Ising model with the linear size L=256, the GPU calculation speed is 7.90 times as fast as the CPU calculation speed. The idea of quasi-block synchronization can be used not only in the cluster algorithm but also in many fields where the synchronization of all threads is required.
An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network
Directory of Open Access Journals (Sweden)
C. Vimalarani
2016-01-01
A Wireless Sensor Network (WSN) is a network formed of a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, in temperature monitoring, water level monitoring, pressure monitoring, health care, and various military applications. Most sensor nodes are equipped with a self-supported battery through which they can perform adequate operations and communicate with neighboring nodes. To maximize the lifetime of Wireless Sensor Networks, energy conservation measures are essential for improving the performance of WSNs. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm with the aim of minimizing power consumption in the WSN. The performance metrics are evaluated and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption.
An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network.
Vimalarani, C; Subramanian, R; Sivanandam, S N
2016-01-01
A Wireless Sensor Network (WSN) is a network formed of a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, in temperature monitoring, water level monitoring, pressure monitoring, health care, and various military applications. Most sensor nodes are equipped with a self-supported battery through which they can perform adequate operations and communicate with neighboring nodes. To maximize the lifetime of Wireless Sensor Networks, energy conservation measures are essential for improving the performance of WSNs. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm with the aim of minimizing power consumption in the WSN. The performance metrics are evaluated and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption.
Evaluation of clustering algorithms for gene expression data using gene ontology annotations
Institute of Scientific and Technical Information of China (English)
MA Ning; ZHANG Zheng-guo
2012-01-01
Background: Clustering is a useful exploratory technique for interpreting gene expression data to reveal groups of genes sharing common functional attributes. Biologists frequently face the problem of choosing an appropriate algorithm. We aimed to provide a standalone, easily accessible and biologically oriented criterion for evaluating expression data clustering. Methods: An external criterion utilizing annotation-based similarities between genes is proposed in this work. Gene ontology information is employed as the annotation source. Comparisons among six widely used clustering algorithms over various types of gene expression data sets were carried out based on the proposed criterion. Results: The ranking of these algorithms given by the criterion coincides with our common knowledge. Single-linkage has significantly poorer performance, even worse than the random algorithm. Ward's method achieves the best performance in most cases. Conclusions: The proposed criterion has a strong ability to distinguish among different clustering algorithms with different distance measurements. It is also demonstrated that analyzing the main contributors to the criterion may offer some guidelines for finding local compact clusters. In addition, we suggest using Ward's algorithm for gene expression data analysis.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Owing to advances in sensor technology, the growing volume of large medical image data makes it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing irrelevant features are sparse clustering algorithms that use a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which are difficult to determine in practice. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Directory of Open Access Journals (Sweden)
Nan Lin
Owing to advances in sensor technology, the growing volume of large medical image data makes it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing irrelevant features are sparse clustering algorithms that use a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which are difficult to determine in practice. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis.
Directory of Open Access Journals (Sweden)
Ashim Kumar Ghosh
2011-12-01
Wireless sensor nodes are used in most embedded computing applications. A multihop cluster hierarchy has been presented for large wireless sensor networks (WSNs) that can provide scalable routing, data aggregation, and querying. The energy consumption rate of sensors in a WSN varies greatly depending on the protocols the sensors use for communication. In this paper we present a cluster-based routing algorithm. One of our main goals is to design an energy-efficient routing protocol that addresses the usual problems of WSNs. The efficiency of a WSN depends on the distance between each node and the base station and on the amount of data to be transferred, and the performance of clustering is greatly influenced by the selection of cluster heads, which are in charge of creating clusters and controlling member nodes. The algorithm makes the best use of a node with a low number of cluster heads, known as a super node. We divide the full region into four equal zones, and the centre area of the region is used to select the super node. Each zone is considered separately and may or may not be divided further, depending on the density of nodes in that zone and the capability of the super node. The algorithm forms multilayer communication; the number of layers depends on the current network load and statistics. The algorithm is easily extended to generate a hierarchy of cluster heads to obtain better network management and energy efficiency.
UNSUPERVISED DATA AND HISTOGRAM CLUSTERING USING INCLINED PLANES SYSTEM OPTIMIZATION ALGORITHM
Directory of Open Access Journals (Sweden)
Mohammad Hamed Mozaffari
2014-03-01
Within the last decades, clustering has gained significant recognition as one of the data mining methods, especially in the relatively new field of medical engineering for diagnosing cancer. Clustering is used to automatically group items in a database that have similar characteristics. The researchers introduce a novel and powerful algorithm known as Inclined Planes system Optimization (IPO), with the capacity to overcome clustering problems. In the proposed method, each agent of the algorithm indicates the centroids of the clusters, and the number of centroids is selected automatically in each time interval (unsupervised clustering). The evaluation method for clustering is based on the Davies-Bouldin index (DBi), which measures cluster validity. The researchers compare known algorithms on a series of databases from various studies to demonstrate the power and capability of the proposed method. These datasets are popular for pattern recognition and vary in space dimension. The method's performance was also tested on standard images as a dataset. The study results show a significant advantage of the method over other algorithms.
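As a point of reference for the validity measure mentioned in this abstract, the standard Davies-Bouldin index can be computed as in the sketch below. It uses the common textbook definition (mean distance to the centroid as the within-cluster scatter, Euclidean distance between centroids); whether the paper uses exactly this variant is an assumption, and the function name `davies_bouldin` is hypothetical. Lower values indicate more compact, better separated clusters.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a hard clustering (lower is better).

    `points` is a sequence of equal-length coordinate tuples and `labels`
    gives the cluster of each point. Clusters with coincident centroids
    are not handled and would raise ZeroDivisionError.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids, scatter = {}, {}
    for lab, members in clusters.items():
        dim = len(members[0])
        c = tuple(sum(m[axis] for m in members) / len(members) for axis in range(dim))
        centroids[lab] = c
        # Within-cluster scatter: mean distance of the members to their centroid.
        scatter[lab] = sum(math.dist(m, c) for m in members) / len(members)

    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(davies_bouldin(pts, [0, 0, 1, 1]))  # well-separated grouping
    print(davies_bouldin(pts, [0, 1, 0, 1]))  # poor grouping gives a larger value
```

An optimizer that searches over centroid positions and cluster counts could use such a function as its fitness, which is the role the abstract assigns to the index.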
Congested Link Inference Algorithms in Dynamic Routing IP Network
Directory of Open Access Journals (Sweden)
Yu Chen
2017-01-01
The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades noticeably in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of the dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP) and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Considering that multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validate our results through analogy, simulation, and actual Internet experiments.
Directory of Open Access Journals (Sweden)
Lamiaa F. Ibrahim
2011-01-01
Problem statement: The process of network planning is divided into two substeps. The first step is determining the location of the Multi Service Access Node (MSAN). The second step is the construction of subscriber network lines from the MSAN to subscribers so as to satisfy optimization criteria and design constraints. Owing to the complexity of this process, artificial intelligence and clustering techniques have been successfully deployed to solve many problems. The problems of the locations of MSANs, the cabling layout, and the computation of optimum cable network layouts are addressed in this study. The proposed algorithm, Clustering density-Based Spatial of Applications with Noise original, minimal Spanning tree and modified Ant-Colony-Based algorithm (CBSCAN-SPANT), uses two clustering algorithms, density-based and agglomerative, with shortest-path distances and subject to the network constraints. The algorithm uses wired and wireless technology to serve subscriber demand and place the switches in a truly optimal location. Approach: The original Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was modified, and a new algorithm (the NetPlan algorithm) was proposed by the author in recent work to solve the first step of the network planning problem. In the present study, the NetPlan algorithm is modified by introducing the modified Ant-Colony-Based algorithm to find the optimal path between any node and the corresponding MSAN node in the first step of the network planning process, in order to determine the nodes belonging to each cluster. The second step of the network planning process is also introduced in the present study. For each cluster, the optimal cabling layout from each MSAN to the subscriber premises is determined using Prim's algorithm, which constructs a minimal spanning tree. Results: Experimental results and analysis indicate that the
Multidistribution Center Location Based on Real-Parameter Quantum Evolutionary Clustering Algorithm
Directory of Open Access Journals (Sweden)
Huaixiao Wang
2014-01-01
To determine the multidistribution center location and the distribution scope of each distribution center with high efficiency, the real-parameter quantum-inspired evolutionary clustering algorithm (RQECA) is proposed. RQECA is applied to choose the multidistribution center location on the basis of the conventional fuzzy C-means clustering algorithm (FCM). The combination of the real-parameter quantum-inspired evolutionary algorithm (RQIEA) and FCM can overcome the local search defect of FCM and make the optimization result independent of the choice of initial values. A comparison of FCM, clustering based on the simulated annealing genetic algorithm (CSAGA), and RQECA indicates that RQECA converges as well as CSAGA, but the search efficiency of RQECA is better than that of CSAGA. Therefore, RQECA is more efficient at solving the multidistribution center location problem.
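The conventional FCM baseline that this abstract builds on alternates a membership update with a center update. The sketch below shows only that plain alternation; it does not implement RQECA or any quantum-inspired or evolutionary component, and the names `memberships` and `fcm`, the fuzzifier default m = 2, and the fixed iteration count are assumptions made for illustration. Its dependence on the initial centers is the local search defect the abstract refers to.

```python
import math


def memberships(points, centers, m=2.0, eps=1e-12):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for x in points:
        # eps guards against a point that coincides with a center.
        dists = [max(math.dist(x, c), eps) for c in centers]
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fcm(points, centers, m=2.0, iters=50):
    """Plain fuzzy C-means started from the given initial centers."""
    dim = len(points[0])
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            norm = sum(weights)
            # Each center is the membership-weighted mean of all points.
            new_centers.append(
                tuple(
                    sum(w * x[axis] for w, x in zip(weights, points)) / norm
                    for axis in range(dim)
                )
            )
        centers = new_centers
    return centers, memberships(points, centers, m)


if __name__ == "__main__":
    customers = [(0.0,), (0.2,), (0.1,), (5.0,), (5.1,), (4.9,)]
    found, u = fcm(customers, [(1.0,), (4.0,)])
    print(found)
```

In a location setting, the points would be demand coordinates and the returned centers the candidate distribution center sites; a global search wrapped around this loop would vary the initial centers.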
New two-dimensional fuzzy C-means clustering algorithm for image segmentation
Institute of Scientific and Technical Information of China (English)
None
2008-01-01
To solve the problem of the poor anti-noise performance of the traditional fuzzy C-means (FCM) algorithm in image segmentation, a novel two-dimensional FCM clustering algorithm for image segmentation was proposed. In this method, image segmentation was converted into an optimization problem. A fitness function containing neighbor information was set up based on the gray information and the neighbor relations between the pixels described by the improved two-dimensional histogram. By making use of the global searching ability of predator-prey particle swarm optimization, the optimal cluster center could be obtained by iterative optimization, and the image segmentation could be accomplished. The simulation results show that the segmentation accuracy ratio of the proposed method is above 99%. The proposed algorithm has strong anti-noise capability, high clustering accuracy and a good segmentation effect, indicating that it is an effective algorithm for image segmentation.
Energy Technology Data Exchange (ETDEWEB)
Uy, D.L.
1996-02-01
An algorithm for the detection and identification of image clusters or "blobs" based on color information for an autonomous mobile robot is developed. The input image data are first processed using a crisp color fuzzifier, a binary smoothing filter, and a median filter. The processed image data are then input to the image cluster detection and identification program. The program employs the concept of an "elastic rectangle" that stretches in such a way that the whole blob is finally enclosed in a rectangle. A C program is developed to test the algorithm. The algorithm is tested only on image data of size 8x8 with different numbers of blobs in them. The algorithm works very well in detecting and identifying image clusters.
Workload dynamics on clusters and grids
Li, H.
2009-01-01
This paper presents a comprehensive statistical analysis of a variety of workloads collected on production clusters and Grids. The applications are mostly computationally intensive, and each task requires a single CPU for processing data; such tasks dominate the workloads on current production Grid systems. T
An approximation polynomial-time algorithm for a sequence bi-clustering problem
Kel'manov, A. V.; Khamidullin, S. A.
2015-06-01
We consider a strongly NP-hard problem of partitioning a finite sequence of vectors in Euclidean space into two clusters using the criterion of the minimal sum of the squared distances from the elements of the clusters to the centers of the clusters. The center of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The center of the other cluster is fixed at the origin. Moreover, the partition is such that the difference between the indices of two successive vectors in the first cluster is bounded above and below by prescribed constants. A 2-approximation polynomial-time algorithm is proposed for this problem.
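The problem described in this abstract can be written out as follows. This is a reading of the verbal description only; the symbols y_n, N, q, the index set \mathcal{M}, and the gap bounds T_min and T_max are notation assumed here, not taken from the paper.

```latex
% Input: a sequence y_1, \dots, y_N of vectors in \mathbb{R}^q.
% Decision: an index set \mathcal{M} = \{n_1 < n_2 < \dots < n_M\} \subseteq \{1, \dots, N\}
% for the cluster whose center is optimized; all other vectors form the
% cluster whose center is fixed at the origin.
\[
  \min_{\mathcal{M}} \; S(\mathcal{M})
  = \sum_{j \in \mathcal{M}} \bigl\| y_j - \bar{y}(\mathcal{M}) \bigr\|^2
  + \sum_{i \notin \mathcal{M}} \| y_i \|^2 ,
  \qquad
  \bar{y}(\mathcal{M}) = \frac{1}{|\mathcal{M}|} \sum_{j \in \mathcal{M}} y_j ,
\]
\[
  \text{subject to} \quad
  T_{\min} \le n_{m+1} - n_m \le T_{\max}, \qquad m = 1, \dots, M - 1 .
\]
```

A 2-approximation algorithm in this sense returns a feasible set \mathcal{M} whose objective value is at most twice the optimal value S(\mathcal{M}^*).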