WorldWideScience

Sample records for hybrid clustering algorithm

  1. Cluster hybrid Monte Carlo simulation algorithms

    Science.gov (United States)

    Plascak, J. A.; Ferrenberg, Alan M.; Landau, D. P.

    2002-06-01

    We show that addition of Metropolis single spin flips to the Wolff cluster-flipping Monte Carlo procedure leads to a dramatic increase in performance for the spin-1/2 Ising model. We also show that adding Wolff cluster flipping to the Metropolis or heat bath algorithms in systems where just cluster flipping is not immediately obvious (such as the spin-3/2 Ising model) can substantially reduce the statistical errors of the simulations. A further advantage of these methods is that systematic errors introduced by the use of imperfect random-number generation may be largely healed by hybridizing single spin flips with cluster flipping.
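
As a rough illustration of the hybridization described in this abstract, the sketch below alternates one Wolff cluster flip with one Metropolis sweep on a two-dimensional spin-1/2 Ising lattice. It is not the authors' code: the square lattice, periodic boundaries, coupling J = 1, and the function names `metropolis_sweep`, `wolff_update` and `hybrid_step` are all assumptions made for this example.

```python
import math
import random


def metropolis_sweep(spins, L, beta, rng):
    """One sweep (L*L attempts) of single-spin-flip Metropolis updates."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        d_e = 2.0 * spins[i][j] * nn  # energy change of flipping, assuming J = 1
        if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
            spins[i][j] = -spins[i][j]


def wolff_update(spins, L, beta, rng):
    """Grow one Wolff cluster from a random seed and flip it; return its size."""
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond activation probability for J = 1
    i, j = rng.randrange(L), rng.randrange(L)
    seed = spins[i][j]
    spins[i][j] = -seed  # flipping on visit also marks the site as visited
    stack = [(i, j)]
    size = 1
    while stack:
        x, y = stack.pop()
        neighbours = (((x + 1) % L, y), ((x - 1) % L, y),
                      (x, (y + 1) % L), (x, (y - 1) % L))
        for nx, ny in neighbours:
            if spins[nx][ny] == seed and rng.random() < p_add:
                spins[nx][ny] = -seed
                stack.append((nx, ny))
                size += 1
    return size


def hybrid_step(spins, L, beta, rng, n_wolff=1, n_metropolis=1):
    """One hybrid step: n_wolff cluster flips, then n_metropolis sweeps."""
    for _ in range(n_wolff):
        wolff_update(spins, L, beta, rng)
    for _ in range(n_metropolis):
        metropolis_sweep(spins, L, beta, rng)
```

The ratio `n_wolff : n_metropolis` is a free tuning choice here; the abstract does not state which mix the authors used.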

  2. A hybrid monkey search algorithm for clustering analysis.

    Science.gov (United States)

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.
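
The dependence of k-means on its initial solution, which motivates this paper, can be seen with a plain Lloyd's-iteration sketch. The code below is not the hybrid monkey algorithm; it only shows the baseline plus the simplest workaround (several random restarts, keeping the lowest sum of squared errors). The names `kmeans` and `best_of_restarts` are assumptions for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's k-means from a random start; returns (centers, labels, sse)."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for it in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if it > 0 and new_labels == labels:
            break  # assignments stable: a (possibly local) optimum was reached
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


def best_of_restarts(points, k, n_restarts=10, seed=0):
    """Run k-means from several random starts and keep the lowest-SSE result."""
    rng = random.Random(seed)
    runs = (kmeans(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[2])
```

Restarts only reduce the chance of a poor local optimum; metaheuristic hybrids such as the one in this record aim to search the space of initial solutions more systematically.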

  3. A Hybrid Monkey Search Algorithm for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Xin Chen

    2014-01-01

    Full Text Available Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.

  4. Study of the Artificial Fish Swarm Algorithm for Hybrid Clustering

    Directory of Open Access Journals (Sweden)

    Hongwei Zhao

    2015-06-01

    Full Text Available The basic Artificial Fish Swarm (AFS) algorithm is a new type of heuristic swarm intelligence algorithm, but the randomness of the artificial fish behavior makes it difficult to optimize to high precision. This paper presents an extended AFS algorithm, namely the Cooperative Artificial Fish Swarm (CAFS), which significantly improves on the original AFS in solving complex optimization problems. The K-medoids clustering algorithm is used to classify data, but the approach is sensitive to the initial selection of the centers, which can lead to low-quality clusters. A novel hybrid clustering method based on CAFS and K-medoids can be used for solving clustering problems. In this work, first, the CAFS algorithm is used to optimize six widely used benchmark functions, yielding comparative results for AFS and CAFS; Particle Swarm Optimization (PSO) is then studied. Second, the hybrid algorithm combining K-medoids and CAFS is used for data clustering on several benchmark data sets. The performance of the hybrid algorithm based on K-medoids and CAFS is compared with the AFS and CAFS algorithms on a clustering problem. The simulation results show that the proposed CAFS outperforms the other two algorithms in terms of accuracy and robustness.

  5. Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis

    Directory of Open Access Journals (Sweden)

    S. Muthurajkumar

    2014-05-01

    Full Text Available In this paper, we propose a hybrid clustering-based classification algorithm, based on a mean approach, to effectively classify and mine the ordered sequences (paths) from weblog data in order to perform social network analysis. In the system proposed in this work for social pattern analysis, the sequences of human activities are typically analyzed by switching behaviors, which are likely to produce overlapping clusters. In this proposed system, a robust Modified Boosting algorithm is applied to hybrid clustering-based classification for clustering the data. This work is useful for connecting the aggregated features from the network data with traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and hence that it provides better classification accuracy when tested with a weblog dataset. In addition, this algorithm improves predictive performance, especially for multiclass datasets, which can increase accuracy.

  6. A new hybrid imperialist competitive algorithm on data clustering

    Indian Academy of Sciences (India)

    Taher Niknam; Elahe Taherian Fard; Shervin Ehrampoosh; Alireza Rousta

    2011-06-01

    Clustering is a process for partitioning datasets. This technique is very useful for finding optimum solutions. K-means is one of the simplest and most famous methods and is based on the square error criterion. This algorithm depends on initial states and converges to local optima. Some recent research shows that the K-means algorithm has been successfully applied to combinatorial optimization problems for clustering. In this paper, we propose a novel algorithm that combines two clustering algorithms: K-means and the Modified Imperialist Competitive Algorithm. It is named hybrid K-MICA. In addition, we use a method called modified expectation maximization (EM) to determine the number of clusters. The experimental results show that the new method yields better results than ACO, PSO, Simulated Annealing (SA), Genetic Algorithm (GA), Tabu Search (TS), Honey Bee Mating Optimization (HBMO) and K-means.

  7. A HYBRID HEURISTIC ALGORITHM FOR THE CLUSTERED TRAVELING SALESMAN PROBLEM

    Directory of Open Access Journals (Sweden)

    Mário Mestria

    2016-04-01

    Full Text Available This paper proposes a hybrid heuristic algorithm, based on the metaheuristics Greedy Randomized Adaptive Search Procedure, Iterated Local Search and Variable Neighborhood Descent, to solve the Clustered Traveling Salesman Problem (CTSP). The hybrid heuristic algorithm uses several variable neighborhood structures, combining intensification (using local search operators) and diversification (constructive heuristic and perturbation routine). In the CTSP, the vertices are partitioned into clusters and all vertices of each cluster have to be visited contiguously. The CTSP is NP-hard since it includes the well-known Traveling Salesman Problem (TSP) as a special case. Our hybrid heuristic is compared with three heuristics from the literature and an exact method. Computational experiments are reported for different classes of instances. Experimental results show that the proposed hybrid heuristic obtains competitive results within reasonable computational time.

  8. Application of hybrid clustering using parallel k-means algorithm and DIANA algorithm

    Science.gov (United States)

    Umam, Khoirul; Bustamam, Alhadi; Lestari, Dian

    2017-03-01

    DNA is one of the carriers of genetic information in living organisms. Encoding, sequencing, and clustering DNA sequences have become key, routine tasks in molecular biology, in particular in bioinformatics applications. There are two types of clustering: hierarchical clustering and partitioning clustering. In this paper, we combine the two types, i.e. K-Means (partitioning clustering) and DIANA (hierarchical clustering), and therefore call the approach hybrid clustering. Hybrid clustering using the Parallel K-Means algorithm and the DIANA algorithm is applied to cluster DNA sequences of Human Papillomavirus (HPV). The clustering process starts with collecting DNA sequences of HPV from NCBI (National Center for Biotechnology Information), followed by characteristics extraction from the DNA sequences. The characteristics extraction result is stored in matrix form; this matrix is then normalized using Min-Max normalization, and genetic distance is calculated using Euclidean distance. Next, hybrid clustering is applied using implementations of the Parallel K-Means algorithm and the DIANA algorithm. The aim of using hybrid clustering is to obtain better clustering results. To validate the resulting clusters and to obtain the optimum number of clusters, we use the Davies-Bouldin Index (DBI). In this study, Parallel K-Means clustering grouped the data into 5 clusters with a minimal DBI value of 0.8741, and hybrid clustering grouped the data into 13 sub-clusters with minimal DBI values of 0.8216, 0.6845, 0.3331, 0.1994 and 0.3952. The DBI values of hybrid clustering are lower than the DBI value of Parallel K-Means clustering, which is performed only at the first stage. This means that hybrid clustering gives a better result for clustering HPV DNA sequences than Parallel K-Means clustering alone.
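
Two steps named in this abstract, Min-Max normalization and cluster validation with the Davies-Bouldin Index, are easy to make concrete. The sketch below assumes the common DBI definition (cluster scatter taken as the mean Euclidean distance to the centroid); the abstract does not state which variant the authors used, and the function names are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


def davies_bouldin(rows, labels):
    """Davies-Bouldin index of a labelled data set (lower is better).

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for row, lab in zip(rows, labels):
        clusters.setdefault(lab, []).append(row)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")
    centroid = {c: [sum(col) / len(m) for col in zip(*m)]
                for c, m in clusters.items()}
    scatter = {c: sum(math.dist(p, centroid[c]) for p in m) / len(m)
               for c, m in clusters.items()}
    total = 0.0
    for i in clusters:
        # worst-case similarity of cluster i to any other cluster
        total += max((scatter[i] + scatter[j]) / math.dist(centroid[i], centroid[j])
                     for j in clusters if j != i)
    return total / len(clusters)
```

For example, `davies_bouldin([[0, 0], [0, 2], [10, 0], [10, 2]], [0, 0, 1, 1])` evaluates to 0.2: each cluster has scatter 1 and the centroids are 10 apart.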

  9. The Ordered Clustered Travelling Salesman Problem: A Hybrid Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Zakir Hussain Ahmed

    2014-01-01

    Full Text Available The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which a set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least cost Hamiltonian tour in which vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining a heuristic solution to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solutions to some more symmetric TSPLIB instances.

  10. The ordered clustered travelling salesman problem: a hybrid genetic algorithm.

    Science.gov (United States)

    Ahmed, Zakir Hussain

    2014-01-01

    The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which a set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least cost Hamiltonian tour in which vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining a heuristic solution to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solutions to some more symmetric TSPLIB instances.
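
The feasibility conditions stated in this abstract (start vertex first, each cluster visited contiguously, clusters in the prespecified order) can be written as a small checker. This is only a sketch of the problem constraints, not of the hybrid genetic algorithm; it assumes disjoint clusters given as a list of sets in visiting order, and the names `tour_cost` and `is_feasible_octsp` are hypothetical.

```python
def tour_cost(tour, cost):
    """Cost of the closed tour, including the edge back to the first vertex."""
    return sum(cost[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def is_feasible_octsp(tour, start, clusters):
    """Check the ordered clustered TSP constraints for a tour (list of vertices).

    clusters is a list of disjoint sets in the required visiting order;
    the start vertex belongs to no cluster.
    """
    if not tour or tour[0] != start:
        return False
    rest = tour[1:]
    expected = set().union(*clusters) if clusters else set()
    if len(rest) != len(set(rest)) or set(rest) != expected:
        return False  # a vertex is missing, repeated, or unknown
    pos = 0
    for cluster in clusters:
        block = rest[pos:pos + len(cluster)]
        if set(block) != set(cluster):
            return False  # cluster not contiguous, or visited out of order
        pos += len(cluster)
    return True
```

A genetic algorithm for this problem would typically apply such a check (or a repair step) after crossover and 2-opt moves, since both can break contiguity.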

  11. A Novel Hybrid Data Clustering Algorithm Based on Artificial Bee Colony Algorithm and K-Means

    Institute of Scientific and Technical Information of China (English)

    TRAN Dang Cong; WU Zhijian; WANG Zelin; DENG Changshou

    2015-01-01

    To improve the performance of K-means clustering algorithm, this paper presents a new hybrid approach of Enhanced artificial bee colony algorithm and K-means (EABCK). In EABCK, the original artificial bee colony algorithm (called ABC) is enhanced by a new mutation operation and guided by the global best solution (called EABC). Then, the best solution is updated by K-means in each iteration for data clustering. In the experiments, a set of benchmark functions was used to evaluate the performance of EABC with other comparative ABC variants. To evaluate the performance of EABCK on data clustering, eleven benchmark datasets were utilized. The experimental results show that EABC and EABCK outperform other comparative ABC variants and data clustering algorithms, respectively.

  12. HYBRID APPROACH FOR OPTIMAL CLUSTER HEAD SELECTION IN WSN USING LEACH AND MONKEY SEARCH ALGORITHMS

    Directory of Open Access Journals (Sweden)

    T. SHANKAR

    2017-02-01

    Full Text Available Wireless Sensor Networks (WSNs) are being widely used with low-cost, low-power, multifunction sensors based on the development of wireless communication, which has enabled a wide variety of new applications. In a WSN, the main concern is that each node has a limited-power battery and is constrained in energy consumption; hence energy and lifetime are of paramount importance. To achieve high energy efficiency and prolong network lifetime in WSNs, clustering techniques have been widely adopted. The proposed algorithm is a hybridization of the well-known Low-Energy Adaptive Clustering Hierarchy (LEACH) algorithm with a distinctive Monkey Search (MS) algorithm, which is an optimization algorithm used for optimal cluster head selection. The proposed hybrid algorithm exhibits high throughput, high residual energy and improved lifetime. The proposed hybrid algorithm is compared with the well-known cluster-based protocols for WSNs, namely LEACH and the monkey search algorithm, individually.

  13. An efficient hybrid evolutionary optimization algorithm based on PSO and SA for clustering

    Institute of Scientific and Technical Information of China (English)

    Taher NIKNAM; Babak AMIRI; Javad OLAMAEI; Ali AREFI

    2009-01-01

    The K-means algorithm is one of the most popular techniques in clustering. Nevertheless, the performance of the K-means algorithm depends highly on the initial cluster centers, and the algorithm may converge to local minima. This paper proposes a hybrid evolutionary programming based clustering algorithm, called PSO-SA, by combining particle swarm optimization (PSO) and simulated annealing (SA). The basic idea is to search around the global solution by SA and to increase the information exchange among particles using a mutation operator to escape local optima. Three datasets, Iris, Wisconsin Breast Cancer, and Ripley's Glass, have been considered to show the effectiveness of the proposed clustering algorithm in providing optimal clusters. The simulation results show that the PSO-SA clustering algorithm not only has a better response but also converges more quickly than the K-means, PSO, and SA algorithms.

  14. Cluster Based Hybrid Niche Mimetic and Genetic Algorithm for Text Document Categorization

    Directory of Open Access Journals (Sweden)

    A. K. Santra

    2011-09-01

    Full Text Available An efficient cluster based hybrid niche mimetic and genetic algorithm for text document categorization is presented, with the aim of improving the retrieval rate of relevant documents. The proposal minimizes the processing needed to structure the documents through better feature selection using the hybrid algorithm. In addition, restructuring of feature words to associated documents is reduced, which in turn increases the document clustering rate. The performance of the proposed work is measured in terms of cluster objects accuracy, term weight, term frequency and inverse document frequency. Experimental results demonstrate that it achieves very good performance on both feature selection and text document categorization, compared to other classifier methods.

  15. Wavelet neural networks initialization using hybridized clustering and harmony search algorithm: Application in epileptic seizure detection

    Science.gov (United States)

    Zainuddin, Zarita; Lai, Kee Huong; Ong, Pauline

    2013-04-01

    Artificial neural networks (ANNs) are powerful mathematical models that are used to solve complex real world problems. Wavelet neural networks (WNNs), which were developed based on the wavelet theory, are a variant of ANNs. During the training phase of WNNs, several parameters need to be initialized; including the type of wavelet activation functions, translation vectors, and dilation parameter. The conventional k-means and fuzzy c-means clustering algorithms have been used to select the translation vectors. However, the solution vectors might get trapped at local minima. In this regard, the evolutionary harmony search algorithm, which is capable of searching for near-optimum solution vectors, both locally and globally, is introduced to circumvent this problem. In this paper, the conventional k-means and fuzzy c-means clustering algorithms were hybridized with the metaheuristic harmony search algorithm. In addition to obtaining the estimation of the global minima accurately, these hybridized algorithms also offer more than one solution to a particular problem, since many possible solution vectors can be generated and stored in the harmony memory. To validate the robustness of the proposed WNNs, the real world problem of epileptic seizure detection was presented. The overall classification accuracy from the simulation showed that the hybridized metaheuristic algorithms outperformed the standard k-means and fuzzy c-means clustering algorithms.

  16. A Hybrid Distributed Mutual Exclusion Algorithm for Cluster-Based Systems

    Directory of Open Access Journals (Sweden)

    Moharram Challenger

    2013-01-01

    Full Text Available Distributed mutual exclusion is a fundamental problem which arises in various systems such as grid computing, mobile ad hoc networks (MANETs), and distributed databases. Reducing key metrics like message count per critical section (CS) and delay between two CS entrances, which is known as synchronization delay, is a great challenge for this problem. Various algorithms use either permission-based or token-based protocols. Token-based algorithms offer better communication costs and synchronization delay. Raymond's and Suzuki-Kasami's algorithms are well-known token-based ones. Raymond's algorithm needs only O(log2(N)) messages per CS, and Suzuki-Kasami's algorithm needs just one message delivery time between two CS entrances. Nevertheless, each algorithm is weak in the other metric: synchronization delay and message complexity, respectively. In this work, a new hybrid algorithm is proposed which gains from the powerful aspects of both algorithms. Raysuz's algorithm (the proposed algorithm) uses a clustered graph and executes Suzuki-Kasami's algorithm within clusters and Raymond's algorithm between clusters. This yields better message complexity than pure Suzuki-Kasami's algorithm and better synchronization delay than pure Raymond's algorithm, resulting in an overall efficient DMX algorithm.

  17. Hybrid Swarm Intelligence Energy Efficient Clustered Routing Algorithm for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Rajeev Kumar

    2016-01-01

    Full Text Available Currently, wireless sensor networks (WSNs) are used in many applications, namely, environment monitoring, disaster management, industrial automation, and medical electronics. Sensor nodes have many limitations, such as low battery life, small memory space, and limited computing capability. To make wireless sensor networks more energy efficient, swarm intelligence techniques have been applied to resolve many optimization issues in WSNs. In many existing clustering techniques, an artificial bee colony (ABC) algorithm is utilized to collect information from the field periodically. Nevertheless, in event-based applications, ant colony optimization (ACO) is a good solution to enhance the network lifespan. In this paper, we combine both algorithms (i.e., ABC and ACO) and propose a new hybrid ABCACO algorithm to solve a Nondeterministic Polynomial (NP) hard and finite problem of WSNs. The ABCACO algorithm is divided into three main parts: (i) selection of the optimal number of subregions and further subregion parts, (ii) cluster head selection using the ABC algorithm, and (iii) efficient data transmission using the ACO algorithm. We use a hierarchical clustering technique for data transmission; the data is transmitted from member nodes to the subcluster heads and then from subcluster heads to the elected cluster heads based on some threshold value. Cluster heads use an ACO algorithm to discover the best route for data transmission to the base station (BS). The proposed approach is very useful in designing the framework for forest fire detection and monitoring. The simulation results show that the ABCACO algorithm enhances the stability period by 60% and also improves the goodput by 31% against LEACH and WSNCABC, respectively.

  18. Realization of R-tree for GIS on hybrid clustering algorithm

    Institute of Scientific and Technical Information of China (English)

    HUANG Ji-xian; BAO Guang-shu; LI Qing-song

    2005-01-01

    The characteristic of geographic information system(GIS) spatial data operation is that query is much more frequent than insertion and deletion, and a new hybrid spatial clustering method used to build R-tree for GIS spatial data was proposed in this paper. According to the aggregation of clustering method, R-tree was used to construct rules and specialty of spatial data. HCR-tree was the R-tree built with HCR algorithm. To test the efficiency of HCR algorithm, it was applied not only to the data organization of static R-tree but also to the nodes splitting of dynamic R-tree. The results show that R-tree with HCR has some advantages such as higher searching efficiency, less disk accesses and so on.

  19. Clustering and Genetic Algorithm Based Hybrid Flowshop Scheduling with Multiple Operations

    Directory of Open Access Journals (Sweden)

    Yingfeng Zhang

    2014-01-01

    Full Text Available This research is motivated by a flowshop scheduling problem of our collaborative manufacturing company for aeronautic products. The heat-treatment stage (HTS) and precision forging stage (PFS) of the case are selected as a two-stage hybrid flowshop system. In HTS, there are four parallel machines and each machine can process a batch of jobs simultaneously. In PFS, there are two machines. Each machine can install any of the four modules for processing workpieces of different sizes. The problem is characterized by many constraints, such as batching operation, blocking environment, and setup time and working time limitations of modules. To deal with these special characteristics, a clustering and genetic algorithm is used to compute a good solution for the two-stage hybrid flowshop problem. The clustering is used to group the jobs according to the processing ranges of the different modules of PFS. The genetic algorithm is used to schedule the optimal sequence of the grouped jobs for the HTS and PFS. Finally, a case study is used to demonstrate the efficiency and effectiveness of the designed genetic algorithm.

  20. Implementation of hybrid clustering based on partitioning around medoids algorithm and divisive analysis on human Papillomavirus DNA

    Science.gov (United States)

    Arimbi, Mentari Dian; Bustamam, Alhadi; Lestari, Dian

    2017-03-01

    Data clustering can be executed through partition or hierarchical methods for many types of data, including DNA sequences. Both clustering methods can be combined by running a partition algorithm in the first level and a hierarchical algorithm in the second level, which is called hybrid clustering. In the partition phase, popular methods such as PAM, K-means, or Fuzzy c-means could be applied. In this study we selected partitioning around medoids (PAM) for our partition stage. Following the partition algorithm, in the hierarchical stage we applied the divisive analysis algorithm (DIANA) in order to obtain more specific cluster and sub-cluster structures. The number of main clusters is determined using the Davies-Bouldin Index (DBI) value; we choose the number of clusters that minimizes the DBI value. In this work, we conduct the clustering on 1252 HPV DNA sequences from GenBank. Characteristic extraction is performed first, followed by normalization and genetic distance calculation using Euclidean distance. In our implementation, we used the hybrid PAM and DIANA using the R open source programming tool. In our results, we obtained 3 main clusters with an average DBI value of 0.979, using PAM in the first stage. After executing DIANA in the second stage, we obtained 4 sub-clusters for Cluster-1, 9 sub-clusters for Cluster-2 and 2 sub-clusters for Cluster-3, with DBI values of 0.972, 0.771, and 0.768 for each main cluster respectively. Since the second stage produces lower DBI values compared to the DBI value in the first stage, we conclude that this hybrid approach can improve the accuracy of our clustering results.

  1. A Novel Cluster-head Selection Algorithm Based on Hybrid Genetic Optimization for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Lejiang Guo

    2011-05-01

    Full Text Available Wireless Sensor Networks (WSN) represent a new dimension in the field of network research. The cluster algorithm can significantly reduce the energy consumption of wireless sensor networks and prolong the network lifetime. This paper uses neurons to describe the WSN nodes and constructs a neural network model for WSN. The neural network model includes three aspects: the WSN node neuron model, the WSN node control model and the WSN node connection model. By studying the framework of cluster algorithms for wireless sensor networks, this paper presents a weighted-average cluster-head selection algorithm based on an improved Genetic Optimization, which makes the node weights directly related to the decision-making predictions. The algorithm consists of two stages: single-parent evolution and population evolution. The initial population is formed in the single-parent evolution stage by using a gene pool; the algorithm then continues to the next evolution process, and finally the best solution is generated and saved in the population. The simulation results illustrate that the new algorithm has a high convergence speed and good global searching capacity. It effectively balances the network energy consumption, improves the network life-cycle, ensures the communication quality and provides a certain theoretical foundation for the applications of neural networks.

  2. A Hybrid LBFGS-DE Algorithm for Global Optimization of the Lennard-Jones Cluster Problem

    Directory of Open Access Journals (Sweden)

    Ernesto Padernal Adorio

    2004-12-01

    Full Text Available The Lennard-Jones cluster conformation problem is to determine a configuration of n atoms in three-dimensional space where the sum of the nonlinear pairwise potential function is at a minimum. In this formula, r_{i,j} is the distance between atoms i and j. This optimization problem is a severe test for global optimization algorithms due to its computational complexity: the number of local minima grows exponentially large as the number of atoms in the cluster is increased. As a specific test case, a better cluster configuration than the previously published putative minimum for the 38-atom case was found in the mid-1990s.
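
The objective function referred to in this abstract did not survive in the record, so the sketch below assumes the widely used pair potential 4ε[(σ/r)^12 − (σ/r)^6] summed over all atom pairs. The paper may use a different scaling (for example the reduced form r^-12 − 2r^-6), and the name `lj_energy` with its parameters is hypothetical.

```python
import math
from itertools import combinations


def lj_energy(coords, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy of a cluster.

    coords is a list of (x, y, z) positions; the assumed pair potential is
    4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6).
    """
    total = 0.0
    for a, b in combinations(coords, 2):
        r = math.dist(a, b)
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total
```

With this scaling a pair at separation 2^(1/6)·σ contributes exactly −ε, so an equilateral triangle with that side length has energy −3ε. A hybrid optimizer such as the one in the title would pass a function like this both to the local minimizer and to the global search as the quantity to minimize.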

  3. Hybrid cluster identification

    Science.gov (United States)

    Martín-Herrero, J.

    2004-10-01

    I present a hybrid method for the labelling of clusters in two-dimensional lattices, which combines the recursive approach with iterative scanning to reduce the stack size required by the pure recursive technique, while keeping its benefits: single pass and straightforward cluster characterization and percolation detection parallel to the labelling. While the capacity to hold the entire lattice in memory is usually regarded as the major constraint for the applicability of the recursive technique, the required stack size is the real limiting factor. Resorting to recursion only for the transverse direction greatly reduces the recursion depth and therefore the required stack. It also enhances the overall performance of the recursive technique, as is shown by results on a set of uniform random binary lattices and on a set of samples of the Ising model. I also show how this technique may replace the recursive technique in Wolff's cluster algorithm, decreasing the risk of stack overflow and increasing its speed, and the Hoshen-Kopelman algorithm in the Swendsen-Wang cluster algorithm, allowing effortless characterization during generation of the samples and increasing its speed.
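
The stack-depth concern raised in this abstract can be illustrated with a labelling routine that avoids language-level recursion altogether. The sketch below is not the hybrid recursive/scanning method of the paper; it is a plain flood fill with an explicit stack, assuming 4-connectivity and free (non-periodic) boundaries, and the name `label_clusters` is hypothetical.

```python
def label_clusters(lattice):
    """Label 4-connected clusters of occupied (truthy) sites in a 2D lattice.

    Returns (labels, sizes): labels has the lattice's shape, with 0 for empty
    sites and 1, 2, ... for clusters; sizes maps each label to its site count.
    """
    rows, cols = len(lattice), len(lattice[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for i in range(rows):
        for j in range(cols):
            if not lattice[i][j] or labels[i][j]:
                continue  # empty site, or already part of a labelled cluster
            next_label += 1
            labels[i][j] = next_label
            stack = [(i, j)]  # explicit stack instead of recursive calls
            size = 0
            while stack:
                x, y = stack.pop()
                size += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < rows and 0 <= ny < cols
                            and lattice[nx][ny] and not labels[nx][ny]):
                        labels[nx][ny] = next_label
                        stack.append((nx, ny))
            sizes[next_label] = size
    return labels, sizes
```

Because cluster sizes are accumulated while labelling, characterization happens in the same single pass, which is the benefit the abstract attributes to the recursive family of methods.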

  4. A Hybrid Spectral Clustering and Deep Neural Network Ensemble Algorithm for Intrusion Detection in Sensor Networks

    Directory of Open Access Journals (Sweden)

    Tao Ma

    2016-10-01

    Full Text Available The development of intrusion detection systems (IDS) that are adapted to allow routers and network defence systems to detect malicious network traffic disguised as network protocols or normal access is a critical challenge. This paper proposes a novel approach called SCDNN, which combines spectral clustering (SC) and deep neural network (DNN) algorithms. First, the dataset is divided into k subsets based on sample similarity using cluster centres, as in SC. Next, the distance between data points in a testing set and the training set is measured based on similarity features and is fed into the deep neural network algorithm for intrusion detection. Six KDD-Cup99 and NSL-KDD datasets and a sensor network dataset were employed to test the performance of the model. These experimental results indicate that the SCDNN classifier performs better than backpropagation neural network (BPNN), support vector machine (SVM), random forest (RF) and Bayes tree models in detection accuracy and in the types of abnormal attacks found. It also provides an effective tool for the study and analysis of intrusion detection in large networks.

  5. A Hybrid Spectral Clustering and Deep Neural Network Ensemble Algorithm for Intrusion Detection in Sensor Networks.

    Science.gov (United States)

    Ma, Tao; Wang, Fen; Cheng, Jianjun; Yu, Yang; Chen, Xiaoyun

    2016-10-13

    The development of intrusion detection systems (IDS) that are adapted to allow routers and network defence systems to detect malicious network traffic disguised as network protocols or normal access is a critical challenge. This paper proposes a novel approach called SCDNN, which combines spectral clustering (SC) and deep neural network (DNN) algorithms. First, the dataset is divided into k subsets based on sample similarity using cluster centres, as in SC. Next, the distance between data points in a testing set and the training set is measured based on similarity features and is fed into the deep neural network algorithm for intrusion detection. Six KDD-Cup99 and NSL-KDD datasets and a sensor network dataset were employed to test the performance of the model. These experimental results indicate that the SCDNN classifier performs better than backpropagation neural network (BPNN), support vector machine (SVM), random forest (RF) and Bayes tree models in detection accuracy and in the types of abnormal attacks found. It also provides an effective tool for the study and analysis of intrusion detection in large networks.

  6. Partitional clustering algorithms

    CERN Document Server

    2015-01-01

    This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...

  7. Scalable classification by clustering: Hybrid can be better than Pure

    Institute of Scientific and Technical Information of China (English)

    Deng Shengchun; He Zengyou; Xu Xiaofei

    2007-01-01

    The problem of scalable classification by clustering in large databases is discussed. A clustering-based classification method first generates clusters using clustering algorithms. To classify newly arriving data points, it finds the k nearest clusters of each data point as its neighbors and assigns the data point to the dominant class of these neighbors. Existing algorithms incorporate class information in making clustering decisions and produce pure clusters (each cluster associated with only one class). We present hybrid cluster-based algorithms, which produce clusters by unsupervised clustering and allow each cluster to be associated with multiple classes. Experimental results show that hybrid cluster-based algorithms outperform pure ones in both classification accuracy and training speed.
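
    The classification step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the representation of a cluster as a (centroid, class counts) pair, and the rule of summing class counts over the k nearest clusters to find the dominant class are all assumptions made for the example.

```python
import math
from collections import Counter


def classify_by_clusters(point, clusters, k=3):
    """Assign `point` to the dominant class among its k nearest clusters.

    `clusters` is a list of (centroid, class_counts) pairs, where
    class_counts is a Counter of the class labels of the training
    points in that cluster. A "hybrid" cluster may hold several classes.
    """
    # Rank clusters by Euclidean distance from the point to the centroid.
    nearest = sorted(clusters, key=lambda c: math.dist(point, c[0]))[:k]
    # Assumption: the dominant class is the one with the largest total
    # count over the k nearest clusters.
    votes = Counter()
    for _centroid, class_counts in nearest:
        votes.update(class_counts)
    return votes.most_common(1)[0][0]


clusters = [
    ((0.0, 0.0), Counter(a=5, b=1)),   # mostly class "a"
    ((1.0, 0.0), Counter(a=2, b=2)),   # mixed cluster
    ((10.0, 10.0), Counter(b=9)),      # pure class "b"
]
label = classify_by_clusters((0.2, 0.1), clusters, k=2)
```

    With pure clusters each Counter would hold a single label; allowing several labels per cluster is what the abstract calls the hybrid variant.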

  8. The implementation of hybrid clustering using fuzzy c-means and divisive algorithm for analyzing DNA human Papillomavirus cause of cervical cancer

    Science.gov (United States)

    Andryani, Diyah Septi; Bustamam, Alhadi; Lestari, Dian

    2017-03-01

    Clustering aims to classify different patterns into groups called clusters. In this clustering method, we use n-mer frequencies to calculate the distance matrix, which is considered more accurate than using DNA alignment. The clustering results can be used to discover biologically important sub-sections and groups of genes. Many clustering methods have been developed; hard clustering methods are considered less accurate than fuzzy clustering methods, especially when the data contain outliers. Among fuzzy clustering methods, fuzzy c-means is one of the best known for its accuracy and simplicity. Fuzzy c-means clustering uses a membership function variable, which describes how likely a data point is to belong to a cluster, and works by minimizing an objective function. The membership function parameter used as a weighting factor is also called the fuzzifier. In this study we implement hybrid clustering using fuzzy c-means and a divisive algorithm, which can improve the accuracy of cluster membership compared to a traditional partitional approach alone. Fuzzy c-means is used in the first step to find the partition results. A divisive algorithm then runs in the second step to find sub-clusters and the dendrogram of the phylogenetic tree. The best number of clusters is determined using the minimum value of the Davies-Bouldin Index (DBI) of the cluster results. The results show that the method introduced in this paper is better than other partitioning methods. We found 3 clusters with a DBI value of 1.126628 at the first step of clustering. Moreover, the DBI values after the second step of clustering are always smaller than those obtained using the first step only. This indicates that the hybrid approach in this study produces better cluster results in terms of DBI values.
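
    For readers unfamiliar with fuzzy c-means, one iteration of the standard textbook update (memberships, then centres) can be sketched as below. This is not the paper's code: the function names, the toy 2-D points and the fuzzifier value m = 2 are assumptions chosen for illustration, and the divisive second step and the DBI computation are not shown.

```python
import math


def fcm_memberships(points, centres, m=2.0):
    """Return u[j][i], the membership of point j in cluster i."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, v) for v in centres]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a centre belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        # Standard FCM rule: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)).
        memberships.append(
            [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
        )
    return memberships


def fcm_centres(points, memberships, m=2.0):
    """Return centres as means of the points weighted by u^m."""
    dim = len(points[0])
    centres = []
    for i in range(len(memberships[0])):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centres.append(
            tuple(sum(w * x[d] for w, x in zip(weights, points)) / total
                  for d in range(dim))
        )
    return centres


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
u = fcm_memberships(points, [(0.0, 0.5), (5.0, 5.5)])
new_centres = fcm_centres(points, u)
```

    Alternating the two functions until the centres stop moving minimizes the fuzzy c-means objective to a local optimum; the fuzzifier m controls how soft the memberships are.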

  9. Parallel Wolff Cluster Algorithms

    Science.gov (United States)

    Bae, S.; Ko, S. H.; Coddington, P. D.

    The Wolff single-cluster algorithm is the most efficient method known for Monte Carlo simulation of many spin models. Due to the irregular size, shape and position of the Wolff clusters, this method does not easily lend itself to efficient parallel implementation, so that simulations using this method have thus far been confined to workstations and vector machines. Here we present two parallel implementations of this algorithm, and show that one gives fairly good performance on a MIMD parallel computer.

  10. A Hybrid Constrained Semi-Supervised Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    李雪梅; 王立宏; 宋宜斌

    2011-01-01

    A hybrid constrained semi-supervised clustering algorithm (HCC) is proposed based on a consistency algorithm. To get a better clustering result, both labeled data and pairwise constraints are considered, so that the two types of prior knowledge act in different ways during clustering and supplement each other. The theoretical derivation, the algorithm steps, and the experiments and analysis are presented in detail. Experimental results show that labeled data outperform pairwise constraints in promoting the quality of clustering. Additionally, for many indices, such as CRI, number of clusters and running time, HCC is better than the comparative algorithms.

  11. Intuitionistic Fuzzy Possibilistic C Means Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    Arindam Chaudhuri

    2015-01-01

    Full Text Available Intuitionistic fuzzy sets (IFSs) provide a mathematical framework based on fuzzy sets to describe vagueness in data. They find interesting and promising applications in different domains. Here, we develop an intuitionistic fuzzy possibilistic C means (IFPCM) algorithm to cluster IFSs by hybridizing concepts of FPCM, IFSs, and distance measures. IFPCM resolves inherent problems encountered with information regarding membership values of objects to each cluster by generalizing membership and nonmembership with hesitancy degree. The algorithm is extended for clustering interval valued intuitionistic fuzzy sets (IVIFSs), leading to interval valued intuitionistic fuzzy possibilistic C means (IVIFPCM). The clustering algorithm has membership and nonmembership degrees as intervals. Information regarding membership and typicality degrees of samples to all clusters is given by the algorithm. The experiments are performed on both real and simulated datasets. The algorithm generates valuable information and produces overlapped clusters with different membership degrees. It takes into account the inherent uncertainty in information captured by IFSs. Some advantages of the algorithms are simplicity, flexibility, and low computational complexity. The algorithm is evaluated through cluster validity measures. Its clustering accuracy is investigated on classification datasets with labeled patterns. The algorithm maintains appreciable performance compared to other methods in terms of pureness ratio.

  12. Recovery Rate of Clustering Algorithms

    NARCIS (Netherlands)

    Li, Fajie; Klette, Reinhard; Wada, T; Huang, F; Lin, S

    2009-01-01

    This article provides a simple and general way for defining the recovery rate of clustering algorithms using a given family of old clusters for evaluating the performance of the algorithm when calculating a family of new clusters. Under the assumption of dealing with simulated data (i.e., known old

  13. Data clustering algorithms and applications

    CERN Document Server

    Aggarwal, Charu C

    2013-01-01

    Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains. The book focuses on three primary aspects of data clustering: Methods, describing key techniques commonly used for clustering, such as fea

  14. Two-Step Hybrid PSO-Based Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    王纵虎; 刘志镜; 陈东辉

    2012-01-01

    To solve the problems of existing PSO (particle swarm optimization) K-means algorithms, i.e., slow computation and unstable clustering results when samples are high-dimensional, high-quality sub-clusters are first generated by hierarchical agglomerative clustering. These sub-clusters are used as the search space of candidate centroids for the PSO K-means in the second clustering step. To reduce the computational complexity when the sample dimension is high, a simplified particle encoding method is proposed. In addition, the idea of chaos is introduced to keep the diversity of the particle swarm and avoid premature convergence. The two-step hybrid clustering combines the advantages of hierarchical clustering, partitional clustering and PSO. The experimental results on several UCI data sets show that, compared with the best results of several contrastive algorithms, the purity of the clustering result increases by 1% to 8% and the time consumed is reduced by at least 50%.

  15. Kernel Generalized Noise Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    WU Xiao-hong; ZHOU Jian-jiang

    2007-01-01

    To deal with the nonlinearly separable problem, the generalized noise clustering (GNC) algorithm is extended to a kernel generalized noise clustering (KGNC) model. Unlike the fuzzy c-means (FCM) model and the GNC model, which are based on Euclidean distance, the presented model is based on a kernel-induced distance obtained with the kernel method. With the kernel method, the input data are nonlinearly and implicitly mapped into a high-dimensional feature space, where the nonlinear pattern appears linear and the GNC algorithm is performed. It is unnecessary to calculate in the high-dimensional feature space because the kernel function can do it just in input space. The effectiveness of the proposed algorithm is verified by experiments on three data sets. It is concluded that the KGNC algorithm has better clustering accuracy than FCM and GNC in clustering data sets containing noisy data.
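
    The remark that the kernel function does the work in input space rests on the standard kernel-trick identity for squared distances, written out below. The symbols are assumptions introduced for this note and do not come from the abstract: \Phi is the implicit feature map, K the kernel, x_k a data point and v_i a cluster prototype, here assumed to live in input space.

```latex
\begin{align}
  d^2(x_k, v_i)
    &= \lVert \Phi(x_k) - \Phi(v_i) \rVert^2 \\
    &= \langle \Phi(x_k), \Phi(x_k) \rangle
       - 2\,\langle \Phi(x_k), \Phi(v_i) \rangle
       + \langle \Phi(v_i), \Phi(v_i) \rangle \\
    &= K(x_k, x_k) - 2\,K(x_k, v_i) + K(v_i, v_i).
\end{align}
% For a Gaussian kernel K(x, y) = \exp(-\lVert x - y \rVert^2 / \sigma^2),
% K(x, x) = 1 for every x, so the distance reduces to
\begin{equation}
  d^2(x_k, v_i) = 2\,\bigl(1 - K(x_k, v_i)\bigr).
\end{equation}
```

    Every term on the right-hand side is a kernel evaluation on input-space vectors, so \Phi never has to be computed explicitly.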

  16. Cluster Synchronization Algorithms

    NARCIS (Netherlands)

    Xia, Weiguo; Cao, Ming

    2010-01-01

    This paper presents two approaches to achieving cluster synchronization in dynamical multi-agent systems. In contrast to the widely studied synchronization behavior, where all the coupled agents converge to the same value asymptotically, in the cluster synchronization problem studied in this paper,

  17. Intuitionistic fuzzy hierarchical clustering algorithms

    Institute of Scientific and Technical Information of China (English)

    Xu Zeshui

    2009-01-01

    Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful to describe vagueness and uncertainty. However, it seems that little attention has been focused on the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, the normalized Hamming distance, the weighted Hamming distance, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally, the algorithm and its extended form are applied to the classification of building materials and enterprises, respectively.
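
    As an illustration of the distance measures listed above, the sketch below computes a normalized Hamming and a normalized Euclidean distance between two IFSs. It assumes the commonly used form that includes the hesitancy degree pi = 1 - mu - nu in each term; the abstract does not state which exact formulas the paper uses, so the definitions, function names and sample values here are assumptions.

```python
import math


def _term_differences(a, b):
    """Yield (|d_mu|, |d_nu|, |d_pi|) for each element of two IFSs.

    Each IFS is a list of (mu, nu) pairs over the same universe, with
    mu + nu <= 1; the hesitancy degree is pi = 1 - mu - nu.
    """
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        yield abs(mu_a - mu_b), abs(nu_a - nu_b), abs(pi_a - pi_b)


def normalized_hamming(a, b):
    """(1 / 2n) * sum of |d_mu| + |d_nu| + |d_pi| over the n elements."""
    total = sum(d_mu + d_nu + d_pi for d_mu, d_nu, d_pi in _term_differences(a, b))
    return total / (2 * len(a))


def normalized_euclidean(a, b):
    """sqrt((1 / 2n) * sum of d_mu^2 + d_nu^2 + d_pi^2)."""
    total = sum(d_mu ** 2 + d_nu ** 2 + d_pi ** 2
                for d_mu, d_nu, d_pi in _term_differences(a, b))
    return math.sqrt(total / (2 * len(a)))


A = [(0.5, 0.3), (0.2, 0.6)]
B = [(0.4, 0.4), (0.2, 0.6)]
d_h = normalized_hamming(A, B)
d_e = normalized_euclidean(A, B)
```

    Both distances lie in [0, 1]; the weighted variants replace the uniform 1/n factor with per-element weights that sum to one.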

  18. Fuzzy clustering, genetic algorithms and neuro-fuzzy methods compared for hybrid fuzzy-first principles modeling

    NARCIS (Netherlands)

    van Lith, Pascal; van Lith, P.F.; Betlem, Bernardus H.L.; Roffel, B.

    2002-01-01

    Hybrid fuzzy-first principles models can be a good alternative if a complete physical model is difficult to derive. These hybrid models consist of a framework of dynamic mass and energy balances, supplemented by fuzzy submodels describing additional equations, such as mass transformation and

  19. Fuzzy Clustering, Genetic Algorithms and Neuro-Fuzzy Methods Compared for Hybrid Fuzzy-First Principles Modeling

    NARCIS (Netherlands)

    Lith, Pascal F. van; Betlem, Ben H.L.; Roffel, Brian

    2002-01-01

    Hybrid fuzzy-first principles models can be a good alternative if a complete physical model is difficult to derive. These hybrid models consist of a framework of dynamic mass and energy balances, supplemented by fuzzy submodels describing additional equations, such as mass transformation and

  20. Fuzzy Clustering, Genetic Algorithms and Neuro-Fuzzy Methods Compared for Hybrid Fuzzy-First Principles Modeling

    NARCIS (Netherlands)

    Lith, Pascal F. van; Betlem, Ben H.L.; Roffel, Brian

    2002-01-01

    Hybrid fuzzy-first principles models can be a good alternative if a complete physical model is difficult to derive. These hybrid models consist of a framework of dynamic mass and energy balances, supplemented by fuzzy submodels describing additional equations, such as mass transformation and transfe

  1. Fuzzy Clustering, Genetic Algorithms and Neuro-Fuzzy Methods Compared for Hybrid Fuzzy-First Principles Modeling

    NARCIS (Netherlands)

    Lith, Pascal F. van; Betlem, Ben H.L.; Roffel, Brian

    2002-01-01

    Hybrid fuzzy-first principles models can be a good alternative if a complete physical model is difficult to derive. These hybrid models consist of a framework of dynamic mass and energy balances, supplemented by fuzzy submodels describing additional equations, such as mass transformation and transfe

  2. Extended Fuzzy Clustering Algorithms

    NARCIS (Netherlands)

    U. Kaymak (Uzay); M. Setnes

    2000-01-01

    Fuzzy clustering is a widely applied method for obtaining fuzzy models from data. It has been applied successfully in various fields including finance and marketing. Despite the successful applications, there are a number of issues that must be dealt with in practical applications of fuz

  3. Parallel algorithms and cluster computing

    CERN Document Server

    Hoffmann, Karl Heinz

    2007-01-01

    This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.

  4. Determination of atomic cluster structure with cluster fusion algorithm

    DEFF Research Database (Denmark)

    Obolensky, Oleg I.; Solov'yov, Ilia; Solov'yov, Andrey V.

    2005-01-01

    We report an efficient scheme of global optimization, called cluster fusion algorithm, which has proved its reliability and high efficiency in determination of the structure of various atomic clusters.

  5. Particle identification using clustering algorithms

    CERN Document Server

    Wirth, R; Löher, B; Savran, D; Silva, J; Pol, H Álvarez; Gil, D Cortina; Pietras, B; Bloch, T; Kröll, T; Nácher, E; Perea, Á; Tengblad, O; Bendel, M; Dierigl, M; Gernhäuser, R; Bleis, T Le; Winkel, M

    2013-01-01

    A method that uses fuzzy clustering algorithms to achieve particle identification based on pulse shape analysis is presented. The fuzzy c-means clustering algorithm is used to compute mean (principal) pulse shapes induced by different particle species in an automatic and unsupervised fashion from a mixed set of data. A discrimination amplitude is proposed using these principal pulse shapes to identify the originating particle species of a detector pulse. Since this method does not make any assumptions about the specific features of the pulse shapes, it is very generic and suitable for multiple types of detectors. The method is applied to discriminate between photon- and proton-induced signals in CsI(Tl) scintillator detectors and the results are compared to the well-known integration method.

  6. An Improved Weighted Clustering Algorithm in MANET

    Institute of Scientific and Technical Information of China (English)

    WANG Jin; XU Li; ZHENG Bao-yu

    2004-01-01

    The original clustering algorithms in Mobile Ad hoc Networks (MANET) are first analyzed in this paper. On this basis, an Improved Weighted Clustering Algorithm (IWCA) is proposed. The principle and steps of our algorithm are then explained in detail, and the original algorithms are compared with our improved method in terms of average cluster number, topology stability, clusterhead load balance and network lifetime. The experimental results show that our improved algorithm has the best performance on average.

  7. Kernel method-based fuzzy clustering algorithm

    Institute of Scientific and Technical Information of China (English)

    Wu Zhongdong; Gao Xinbo; Xie Weixin; Yu Jianping

    2005-01-01

    The fuzzy C-means clustering algorithm (FCM) is extended to the fuzzy kernel C-means clustering algorithm (FKCM) to effectively perform cluster analysis on diverse structures, such as non-hyperspherical data, data with noise, data with a mixture of heterogeneous cluster prototypes, asymmetric data, etc. Based on the Mercer kernel, the FKCM clustering algorithm is derived from the FCM algorithm combined with the kernel method. The results of experiments with synthetic and real data show that, in contrast to the FCM algorithm, the FKCM clustering algorithm is universal and can effectively perform unsupervised analysis of datasets with varied structures. Kernel-based clustering algorithms can thus be expected to be one of the important research directions of fuzzy clustering analysis.

  8. A Hybrid Architecture Approach for Quantum Algorithms

    Directory of Open Access Journals (Sweden)

    Mohammad R.S. Aghaei

    2009-01-01

    Full Text Available Problem statement: In this study, a general plan of a hybrid architecture for quantum algorithms is proposed. Approach: Analysis of quantum algorithms shows that these algorithms are hybrid, with two parts. First, the relationship between the classical and quantum parts of the hybrid algorithms was extracted. Then a general plan of the hybrid structure was designed. Results: This plan illustrates the hybrid architecture and the relationship between the classical and quantum parts of the algorithms. The general plan was used to increase the implementation performance of quantum algorithms. Conclusion/Recommendations: Moreover, simulation results of quantum algorithms on the hybrid architecture showed that quantum algorithms can be implemented on the general plan as well.

  9. A new cluster algorithm for graphs

    NARCIS (Netherlands)

    Dongen, S. van

    1998-01-01

    A new cluster algorithm for graphs called the Markov Cluster algorithm (MCL algorithm) is introduced. The graphs may be both weighted (with nonnegative weight) and directed. Let G be such a graph. The MCL algorithm simulates flow in G by first identifying G in a canonical way with

  10. Cluster Tree Based Hybrid Document Similarity Measure

    Directory of Open Access Journals (Sweden)

    M. Varshana Devi

    2015-10-01

    Full Text Available A cluster-tree-based hybrid similarity measure is established to measure hybrid similarity. In the cluster tree, the hybrid similarity measure can be calculated for random data even when they do not co-occur, and it generates different views. The different views of the tree can be combined, choosing the one that is significant in cost. A method is proposed to combine the multiple views, which are represented by different distance measures, into a single cluster. Compared with traditional statistical methods, the cluster-tree-based hybrid similarity gives better feasibility for intelligent search. It also helps improve dimensionality reduction and semantic analysis.

  11. Extension of K-Means Algorithm for clustering mixed data | Onuodu ...

    African Journals Online (AJOL)

    Extension of K-Means Algorithm for clustering mixed data. ... In this work, a new hybrid method has been proposed which extends K-means algorithm to categorical domain and mixed-type ...

  12. Using Hyper Clustering Algorithms in Mobile Network Planning

    Directory of Open Access Journals (Sweden)

    Lamiaa F. Ibrahim

    2011-01-01

    Full Text Available Problem statement: As large amounts of data are stored in spatial databases, people may want to find groups of data that share similar features. Cluster analysis has thus become an important area of research in data mining. Clustering analysis has been applied in many fields, for example to construct the clusters served by base stations in a mobile network. Deciding on the optimum placement of base stations to achieve the best service while reducing cost is a complex task requiring vast computational resources. Approach: This study addresses the antenna placement problem, or cell planning problem, which involves locating and configuring infrastructure for mobile networks, by modifying the original Density-Based Spatial Clustering of Applications with Noise algorithm. The original Cluster Partitioning around Medoids algorithm was modified, and a new algorithm proposed, by the authors in a recent work. In this study, the original Density-Based Spatial Clustering of Applications with Noise algorithm is modified and combined with the earlier algorithm to produce the hybrid Clustering Density Base and Clustering with Weighted Node-Partitioning around Medoids algorithm, which is used to solve problems in mobile network planning. Results: An application of this algorithm to a real case study is presented. The results demonstrate that the proposed algorithm has minimum run time, minimum cost and a high grade of service. Conclusion: The proposed hyper algorithm has the advantage of quickly dividing the area into clusters, since the density-based algorithm has a limited number of iterations, and the advantages of accuracy (no sampling method is used) and a high grade of service, due to moving the locations of the base stations (medoids) toward the heavily loaded (weighted) nodes.

  13. Symbolic Algorithmic Analysis of Rectangular Hybrid Systems

    Institute of Scientific and Technical Information of China (English)

    Hai-Bin Zhang; Zhen-Hua Duan

    2009-01-01

    This paper investigates symbolic algorithmic analysis of rectangular hybrid systems. To deal with the symbolic reachability problem, a restricted constraint system called hybrid zone is formalized for the representation and manipulation of rectangular automata state-spaces. Hybrid zones are proved to be closed over symbolic reachability operations of rectangular hybrid systems. They are also applied to model-checking procedures for verifying some important classes of timed computation tree logic formulas. To represent hybrid zones, a data structure called difference constraint matrix is defined. These enable us to deal with the symbolic algorithmic analysis of rectangular hybrid systems in an efficient way.

  14. A HYBRID FIREFLY ALGORITHM WITH FUZZY-C MEAN ALGORITHM FOR MRI BRAIN SEGMENTATION

    Directory of Open Access Journals (Sweden)

    Mutasem K. Alsmadi

    2014-01-01

    Full Text Available Image processing is one of the essential tasks for extracting suspicious regions and robust features from Magnetic Resonance Imaging (MRI). A number of segmentation algorithms have been developed in order to increase the accuracy of brain tumor detection. In medical image processing, brain image segmentation is considered a complex and challenging task. Fuzzy c-means is an unsupervised method that has been used for clustering MRI data and for purposes such as recognition of patterns of interest and image segmentation. However, the fuzzy c-means algorithm still suffers from many drawbacks, such as a low convergence rate, getting stuck in local minima and sensitivity to initialization. The firefly algorithm is a new population-based optimization method that has been used successfully to solve many complex problems. This paper proposes a new dynamic and intelligent clustering method for brain tumor segmentation that hybridizes the Firefly Algorithm (FA) with the Fuzzy C-Means algorithm (FCM). The aim is to automatically segment MRI brain images and to improve the capability of FCM to automatically elicit the proper number and location of cluster centres and the number of pixels in each cluster in abnormal (multiple sclerosis lesion) MRI images. The experimental results demonstrate the effectiveness of the proposed FAFCM in enhancing the performance of traditional FCM clustering. Moreover, the superiority of FAFCM over other state-of-the-art segmentation methods is shown qualitatively and quantitatively. Conclusion: A novel, efficient and reliable clustering algorithm, called FAFCM, is presented in this work, based on the hybridization of the firefly algorithm with the fuzzy c-means clustering algorithm. The hybridized algorithm is able to cluster and segment MRI brain images automatically.

  15. Frequent Pattern Mining Algorithms for Data Clustering

    DEFF Research Database (Denmark)

    Zimek, Arthur; Assent, Ira; Vreeken, Jilles

    2014-01-01

    Discovering clusters in subspaces, or subspace clustering and related clustering paradigms, is a research field where we find many frequent pattern mining related influences. In fact, as the first algorithms for subspace clustering were based on frequent pattern mining algorithms, it is fair to say that frequent pattern mining was at the cradle of subspace clustering, yet it quickly developed into an independent research field. In this chapter, we discuss how frequent pattern mining algorithms have been extended and generalized towards the discovery of local clusters in high-dimensional data. In particular, we discuss several example algorithms for subspace clustering or projected clustering as well as point out recent research questions and open topics in this area relevant to researchers in either clustering or pattern mining...

  16. Introduction to Cluster Monte Carlo Algorithms

    Science.gov (United States)

    Luijten, E.

    This chapter provides an introduction to cluster Monte Carlo algorithms for classical statistical-mechanical systems. A brief review of the conventional Metropolis algorithm is given, followed by a detailed discussion of the lattice cluster algorithm developed by Swendsen and Wang and the single-cluster variant introduced by Wolff. For continuum systems, the geometric cluster algorithm of Dress and Krauth is described. It is shown how their geometric approach can be generalized to incorporate particle interactions beyond hardcore repulsions, thus forging a connection between the lattice and continuum approaches. Several illustrative examples are discussed.
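
    To make the single-cluster idea concrete, here is a minimal sketch of one Wolff update for the 2-D nearest-neighbour Ising model on a periodic square lattice. The bond-activation probability 1 - exp(-2*beta*J) is the standard one for this model; the function name, the list-of-lists lattice layout and the parameter values are assumptions for the example and are not taken from the chapter.

```python
import math
import random


def wolff_step(spins, L, beta, rng, J=1.0):
    """Grow and flip one Wolff cluster in place; return the cluster size.

    `spins` is an L x L list of lists holding +1 / -1, with periodic
    boundaries. Use L >= 3 so the four neighbours of a site are distinct.
    """
    p_add = 1.0 - math.exp(-2.0 * beta * J)   # bond activation probability
    seed = (rng.randrange(L), rng.randrange(L))
    seed_spin = spins[seed[0]][seed[1]]
    cluster = {seed}
    stack = [seed]
    while stack:
        i, j = stack.pop()
        neighbours = (((i + 1) % L, j), ((i - 1) % L, j),
                      (i, (j + 1) % L), (i, (j - 1) % L))
        for ni, nj in neighbours:
            # Only aligned neighbours outside the cluster may join it.
            if ((ni, nj) not in cluster
                    and spins[ni][nj] == seed_spin
                    and rng.random() < p_add):
                cluster.add((ni, nj))
                stack.append((ni, nj))
    for i, j in cluster:
        spins[i][j] = -seed_spin              # flip the whole cluster at once
    return len(cluster)


L = 4
rng = random.Random(0)
hot = [[1] * L for _ in range(L)]
size_hot = wolff_step(hot, L, beta=0.0, rng=rng)    # beta = 0: no bonds form
cold = [[1] * L for _ in range(L)]
size_cold = wolff_step(cold, L, beta=10.0, rng=rng)  # nearly every bond forms
```

    At infinite temperature the cluster is a single spin, so the move reduces to a single flip, while deep in the ordered phase the cluster spans almost the whole lattice.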

  17. Hybrid cluster state proposal for a quantum game

    CERN Document Server

    Paternostro, M; Kim, M S

    2005-01-01

    We propose an experimental implementation of a quantum game algorithm in a hybrid scheme combining the quantum circuit approach and the cluster state model. An economical cluster configuration is suggested to embody a quantum version of the Prisoners' Dilemma. Our proposal is shown to be within the experimental state-of-art and can be realized with existing technology. The effects of relevant experimental imperfections are also carefully examined.

  18. A Novel Research on Rough Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Tao Qu

    2014-01-01

    Full Text Available This study addresses the problem that traditional clustering algorithms are influenced by the distribution of the data space; a novel clustering algorithm that combines rough set theory with conventional clustering is proposed. The proposed rough clustering algorithm takes the condition attributes and decision attributes displayed in the information table as the consistency principle, and uses the data supercube and information entropy to reduce and discretize the data attributes. On this basis, by applying the assembled feature vector addition principle, a single scan of the information table suffices to cluster the data objects. Experiments reveal that the proposed algorithm is efficient and feasible.

  19. A Hybrid Chaotic Quantum Evolutionary Algorithm

    DEFF Research Database (Denmark)

    Cai, Y.; Zhang, M.; Cai, H.

    2010-01-01

    A hybrid chaotic quantum evolutionary algorithm is proposed to reduce the amount of computation, speed up convergence and restrain the premature phenomena of the quantum evolutionary algorithm. The proposed algorithm adopts the chaotic initialization method to generate the initial population which will form... and enhance the global search ability. A large number of tests show that the proposed algorithm has higher convergence speed and better optimizing ability than the quantum evolutionary algorithm, the real-coded quantum evolutionary algorithm and the hybrid quantum genetic algorithm. Tests also show that when chaos is introduced to the quantum evolutionary algorithm, the hybrid chaotic search strategy is superior to the carrier chaotic strategy, and has better comprehensive performance than the chaotic mutation strategy in most cases. In particular, the proposed algorithm is the only one that has a 100% convergence rate in all...

  20. Hesitant fuzzy agglomerative hierarchical clustering algorithms

    Science.gov (United States)

    Zhang, Xiaolu; Xu, Zeshui

    2015-02-01

    Recently, hesitant fuzzy sets (HFSs) have been studied by many researchers as a powerful tool to describe and deal with uncertain data, but relatively few studies focus on the clustering analysis of HFSs. In this paper, we propose a novel hesitant fuzzy agglomerative hierarchical clustering algorithm for HFSs. The algorithm considers each of the given HFSs as a unique cluster in the first stage, and then compares each pair of HFSs using the weighted Hamming distance or the weighted Euclidean distance. The two clusters with the smallest distance are joined. The procedure is then repeated until the desired number of clusters is achieved. Moreover, we extend the algorithm to cluster interval-valued hesitant fuzzy sets, and finally illustrate the effectiveness of our clustering algorithms with experimental results.
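
    The merge loop described above can be sketched as follows. To keep the example short, each item is simplified to a plain vector of membership values rather than a full hesitant fuzzy set, and a merged cluster is represented by the mean of its members; both simplifications, along with the function names and sample data, are assumptions and not the paper's definitions.

```python
def weighted_hamming(a, b, weights):
    """Weighted Hamming distance between two equal-length vectors."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def agglomerate(items, weights, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns a list of clusters, each a list of indices into `items`.
    """
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(items))]   # every item starts alone
    reps = [list(v) for v in items]               # one representative each
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = weighted_hamming(reps[p], reps[q], weights)
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = clusters[p] + clusters[q]
        # Assumption: the merged cluster is represented by the mean vector.
        rep = [sum(items[i][k] for i in merged) / len(merged)
               for k in range(len(weights))]
        del clusters[q], reps[q]                  # q > p, so p stays valid
        clusters[p], reps[p] = merged, rep
    return clusters


items = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.85, 0.90)]
result = agglomerate(items, weights=(0.5, 0.5), n_clusters=2)
```

    Swapping `weighted_hamming` for a weighted Euclidean distance changes only the comparison step; the merge loop stays the same.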

  1. The Parallel Maximal Cliques Algorithm for Protein Sequence Clustering

    Directory of Open Access Journals (Sweden)

    Khalid Jaber

    2009-01-01

    Problem statement: Protein sequence clustering is a method used to discover relations between proteins. This method groups the proteins based on their common features. It is a core process in protein sequence classification. Graph theory has been used in protein sequence clustering as a means of partitioning the data into groups, where each group constitutes a cluster. Mohseni-Zadeh introduced a maximal cliques algorithm for protein clustering. Approach: In this study we adapted the maximal cliques algorithm of Mohseni-Zadeh to find cliques in protein sequences and we then parallelized the algorithm to improve computation times and allow large protein databases to be processed. We used the N-Gram Hirschberg approach proposed by Abdul Rashid to calculate the distance between protein sequences. The task farming parallel program model was used to parallelize the enhanced cliques algorithm. Results: Our parallel maximal cliques algorithm was implemented on the stealth cluster using the C programming language and a hybrid approach that includes both the Message Passing Interface (MPI) library and POSIX threads (PThread) to accelerate protein sequence clustering. Conclusion: Our results showed a good speedup over sequential algorithms for cliques in protein sequences.

  2. Algorithm for Spatial Clustering with Obstacles

    CERN Document Server

    El-Sharkawi, Mohamed E

    2009-01-01

    In this paper, we propose an efficient clustering technique to solve the problem of clustering in the presence of obstacles. The proposed algorithm divides the spatial area into rectangular cells. Each cell is associated with statistical information that enables us to label the cell as dense or non-dense. We also label each cell as obstructed (i.e. intersects any obstacle) or non-obstructed. Then the algorithm finds the regions (clusters) of connected, dense, non-obstructed cells. Finally, the algorithm finds a center for each such region and returns those centers as centers of the relatively dense regions (clusters) in the spatial area.
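
    The cell-labelling and region-finding steps in this abstract can be sketched as below. This is only an illustration under stated assumptions: obstacles are axis-aligned rectangles, a cell is dense when it holds at least `min_points` points, cells are connected through shared edges, and a region's center is the point-count-weighted mean of its cell centers. None of these choices, nor the name `cluster_cells`, comes from the paper.

```python
from collections import deque


def cluster_cells(points, obstacles, cell_size, min_points):
    """Return (regions, centers) of connected, dense, non-obstructed cells.

    obstacles is a list of rectangles (xmin, ymin, xmax, ymax).
    """
    counts = {}
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] = counts.get(cell, 0) + 1

    def obstructed(cell):
        x0, y0 = cell[0] * cell_size, cell[1] * cell_size
        x1, y1 = x0 + cell_size, y0 + cell_size
        # Standard rectangle-overlap test against every obstacle.
        return any(x0 < ox1 and ox0 < x1 and y0 < oy1 and oy0 < y1
                   for ox0, oy0, ox1, oy1 in obstacles)

    usable = {c for c, n in counts.items()
              if n >= min_points and not obstructed(c)}

    regions, seen = [], set()
    for start in sorted(usable):
        if start in seen:
            continue
        seen.add(start)
        queue, region = deque([start]), []
        while queue:  # breadth-first flood fill over edge-adjacent cells
            cx, cy = queue.popleft()
            region.append((cx, cy))
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in usable and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        regions.append(region)

    centers = []
    for region in regions:
        total = sum(counts[c] for c in region)
        mx = sum((c[0] + 0.5) * cell_size * counts[c] for c in region) / total
        my = sum((c[1] + 0.5) * cell_size * counts[c] for c in region) / total
        centers.append((mx, my))
    return regions, centers


points = [(0.2, 0.2), (0.4, 0.6), (1.2, 0.5), (1.7, 0.5), (2.3, 0.5), (2.6, 0.4)]
wall = [(1.4, 0.1, 1.6, 0.9)]  # obstacle inside the middle cell
regions, centers = cluster_cells(points, wall, cell_size=1.0, min_points=2)
```

    In this toy input the obstacle falls inside the middle cell, so the three dense cells split into two regions instead of forming one.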

  3. A new fusion algorithm for fuzzy clustering

    Directory of Open Access Journals (Sweden)

    Ivan Vidović

    2014-12-01

    In this paper, we have considered the merging problem of two ellipsoidal clusters in order to construct a new fusion algorithm for fuzzy clustering. We have proposed a criterion for merging two ellipsoidal clusters $\Pi_1$, $\Pi_2$ with associated main Mahalanobis circles $E_j(c_j, \sigma_j)$, where $c_j$ is the centroid and $\sigma_j^2$ is the Mahalanobis variance of cluster $\Pi_j$. Based on the well-known Davies-Bouldin index, we have constructed a new fusion algorithm. The criterion has been tested on several data sets, and the performance of the fusion algorithm has been demonstrated on an illustrative example.

  4. Novel Cluster Validity Index for FCM Algorithm

    Institute of Scientific and Technical Information of China (English)

    Jian Yu; Cui-Xia Li

    2006-01-01

    How to determine an appropriate number of clusters is very important when implementing a specific clustering algorithm, such as c-means or fuzzy c-means (FCM). In the literature, most cluster validity indices originate from partition or geometrical properties of the data set. In this paper, the authors developed a novel cluster validity index for FCM, based on the optimality test of FCM. Unlike the previous cluster validity indices, this novel cluster validity index is inherent in FCM itself. Comparison experiments show that the stability index can be used as a cluster validity index for the fuzzy c-means.

  5. Multithreaded Implementation of Hybrid String Matching Algorithm

    Directory of Open Access Journals (Sweden)

    Akhtar Rasool

    2012-03-01

    After reviewing the literature and analyzing the naive algorithm, the Boyer-Moore algorithm, the Knuth-Morris-Pratt (KMP) algorithm and a variety of improved algorithms, this paper summarizes the advantages and disadvantages of these pattern matching algorithms. On this basis, a new algorithm, the Multithreaded Hybrid algorithm, is introduced. The algorithm draws on the Boyer-Moore algorithm, the KMP algorithm and the ideas behind the improved algorithms. It uses the last character of the string, the next character, and a comparison that proceeds from side to side to form a new hybrid pattern matching algorithm. It adjusts the comparison direction and the order of comparison to maximize the distance the pattern moves each time and so reduce the pattern matching time. The algorithm reduces the number of comparisons, greatly reduces the number of pattern moves, and improves matching efficiency. The multithreaded implementation of the hybrid pattern matching algorithm performs parallel string searching on different text data by executing a number of threads simultaneously. This approach has an advantage over the other string pattern matching algorithms in terms of time complexity, which again improves the overall string matching efficiency.

  6. The Rational Hybrid Monte Carlo Algorithm

    CERN Document Server

    Clark, M A

    2006-01-01

    The past few years have seen considerable progress in algorithmic development for the generation of gauge fields including the effects of dynamical fermions. The Rational Hybrid Monte Carlo (RHMC) algorithm, where Hybrid Monte Carlo is performed using a rational approximation in place of the usual inverse quark matrix kernel, is one of these developments. This algorithm has been found to be extremely beneficial in many areas of lattice QCD (chiral fermions, finite temperature, Wilson fermions etc.). We review the algorithm and some of these benefits, and we compare against other recent algorithm developments. We conclude with an update of the Berlin wall plot comparing costs of all popular fermion formulations.

  7. The Rational Hybrid Monte Carlo algorithm

    Science.gov (United States)

    Clark, Michael

    2006-12-01

    The past few years have seen considerable progress in algorithmic development for the generation of gauge fields including the effects of dynamical fermions. The Rational Hybrid Monte Carlo (RHMC) algorithm, where Hybrid Monte Carlo is performed using a rational approximation in place of the usual inverse quark matrix kernel, is one of these developments. This algorithm has been found to be extremely beneficial in many areas of lattice QCD (chiral fermions, finite temperature, Wilson fermions etc.). We review the algorithm and some of these benefits, and we compare against other recent algorithm developments. We conclude with an update of the Berlin wall plot comparing costs of all popular fermion formulations.

  8. An object-oriented cluster search algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Silin, Dmitry; Patzek, Tad

    2003-01-24

    In this work we describe two object-oriented cluster search algorithms, which can be applied to a network of an arbitrary structure. The first algorithm calculates all connected clusters, whereas the second one finds a path with the minimal number of connections. We estimate the complexity of the algorithm and infer that the number of operations has linear growth with respect to the size of the network.
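
    Both tasks named in this abstract, finding all connected clusters and finding a path with the fewest connections, can be illustrated with breadth-first search, whose cost grows linearly with the number of nodes and links. The sketch below is a plain-function illustration and not the authors' object-oriented design; the adjacency-dictionary representation and the function names are assumptions.

```python
from collections import deque


def connected_clusters(adjacency):
    """Group the nodes of an undirected network into connected clusters."""
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:
            node = queue.popleft()
            cluster.append(node)
            for nb in adjacency[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters


def fewest_connections_path(adjacency, source, target):
    """Return a path with the minimal number of links, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back through the parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in adjacency[node]:
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None


network = {
    "a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
    "d": ["b", "c", "e"], "e": ["d"],
    "x": ["y"], "y": ["x"],
}
clusters = connected_clusters(network)
path = fewest_connections_path(network, "a", "e")
```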

  9. An extended EM algorithm for subspace clustering

    Institute of Scientific and Technical Information of China (English)

    Lifei CHEN; Qingshan JIANG

    2008-01-01

    Clustering high dimensional data has become a challenge in data mining due to the curse of dimensionality. To solve this problem, subspace clustering has been defined as an extension of traditional clustering that seeks to find clusters in subspaces spanned by different combinations of dimensions within a dataset. This paper presents a new subspace clustering algorithm that calculates the local feature weights automatically in an EM-based clustering process. In the algorithm, the features are locally weighted by using a new unsupervised weighting method, as a means to minimize a proposed clustering criterion that takes into account both the average intra-clusters compactness and the average inter-clusters separation for subspace clustering. For the purposes of capturing accurate subspace information, an additional outlier detection process is presented to identify the possible local outliers of subspace clusters, and is embedded between the E-step and M-step of the algorithm. The method has been evaluated in clustering real-world gene expression data and high dimensional artificial data with outliers, and the experimental results have shown its effectiveness.

  10. Data clustering theory, algorithms, and applications

    CERN Document Server

    Gan, Guojun; Wu, Jianhong

    2007-01-01

    Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results. The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB® programming languages.

  11. Load Balancing Algorithm for Cache Cluster

    Institute of Scientific and Technical Information of China (English)

    刘美华; 古志民; 曹元大

    2003-01-01

    By the load definition of cluster, the request is regarded as granularity to compute load and implement the load balancing in cache cluster. First, the processing power of cache-node is studied from four aspects: network bandwidth, memory capacity, disk access rate and CPU usage. Then, the weighted load of cache-node is customized. Based on this, a load-balancing algorithm that can be applied to the cache cluster is proposed. Finally, Polygraph is used as a benchmarking tool to test the cache cluster possessing the load-balancing algorithm and the cache cluster with cache array routing protocol respectively. The results show the load-balancing algorithm can improve the performance of the cache cluster.
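
    The idea of a weighted node load built from the four factors in this abstract, used to dispatch each request, might look like the sketch below. The weights, the use of utilization fractions in [0, 1], and the rule of sending a request to the least-loaded node are all illustrative assumptions; the abstract does not give the actual formula.

```python
def weighted_load(node, weights):
    """Combine per-resource utilization (0.0 to 1.0) into one load value."""
    return sum(weights[key] * node[key] for key in weights)


def pick_node(nodes, weights):
    """Choose the cache node with the smallest weighted load."""
    return min(nodes, key=lambda name: weighted_load(nodes[name], weights))


# Hypothetical weights for the four aspects named in the abstract.
weights = {"bandwidth": 0.4, "memory": 0.2, "disk": 0.2, "cpu": 0.2}
nodes = {
    "cache-a": {"bandwidth": 0.9, "memory": 0.5, "disk": 0.5, "cpu": 0.5},
    "cache-b": {"bandwidth": 0.3, "memory": 0.6, "disk": 0.6, "cpu": 0.6},
}
target = pick_node(nodes, weights)
```

    Here `cache-b` is chosen even though three of its four resources are busier, because the assumed weights make bandwidth dominate.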

  12. Semantic Based Cluster Content Discovery in Description First Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    MUHAMMAD WASEEM KHAN

    2017-01-01

    In the field of data analytics, grouping similar documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. The description-first clustering algorithm belongs to the category in which a meaningful description is deduced first and the relevant documents are then assigned to that description. LINGO (Label Induction Grouping Algorithm) is a description-first clustering algorithm used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for induction of meaningful labels for clusters and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare the results obtained from the algorithm when it uses VSM and latent semantic analysis during the cluster content discovery phase.

  13. A SCALABLE HYBRID MODULAR MULTIPLICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Meng Qiang; Chen Tao; Dai Zibin; Chen Quji

    2008-01-01

    Based on the analysis of several familiar large integer modular multiplication algorithms, this paper proposes a new Scalable Hybrid modular multiplication (SHyb) algorithm which has scalable operands, and presents an RSA algorithm model with scalable key size. Theoretical analysis shows that SHyb algorithm requires m2n/2+2m iterations to complete an mn-bit modular multiplication with the application of an n-bit modular addition hardware circuit. The number of the required iterations can be reduced to a half of that of the scalable Montgomery algorithm. Consequently, the application scope of the RSA cryptosystem is expanded and its operation speed is enhanced based on SHyb algorithm.

  14. Self-organization and clustering algorithms

    Science.gov (United States)

    Bezdek, James C.

    1991-01-01

    Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.

  15. Non-convex polygons clustering algorithm

    Directory of Open Access Journals (Sweden)

    Kruglikov Alexey

    2016-01-01

    A clustering algorithm is proposed, to be used as a preliminary step in motion planning. It is tightly coupled to the applied problem statement, i.e. uses parameters meaningful only with respect to it. Use of geometrical properties for polygons clustering allows for a better calculation time as opposed to general-purpose algorithms. A special form of map optimized for quick motion planning is constructed as a result.

  16. The Georgi Algorithms of Jet Clustering

    OpenAIRE

    Ge, Shao-Feng

    2014-01-01

    We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses and the jet function is generalized to $J^{(n)}_\beta$ with a jet function index $n$ in order to achieve more degrees of freedom. Based on three basic requirements that, the result of jet clustering is p...

  17. A hybrid clustering algorithm combining K-harmonic means and simulated annealing particle swarm optimization

    Institute of Scientific and Technical Information of China (English)

    毛力; 刘兴阳; 沈明明

    2011-01-01

    In view of the advantages and disadvantages of K-harmonic means (KHM) and simulated annealing particle swarm optimization (SAPSO), a hybrid clustering algorithm combining KHM and SAPSO (KHM-SAPSO) is presented in this paper. With KHM, the particle swarm is divided into several sub-groups. Each particle iteratively updates its location based on its individual extreme value and the global extreme value of the sub-group it belongs to. With the simulated annealing technique, the algorithm prevents premature convergence and improves the calculation accuracy. Using the Iris, Zoo, Wine and Image Segmentation databases, and taking F-measure as the measure of clustering quality, this paper validates the new hybrid algorithm. Our experimental results indicate that the new algorithm significantly improves clustering effectiveness by avoiding being trapped in local optima, and enhances the global search capability while achieving a faster convergence rate. The algorithm has been adopted by an aquaculture water quality analysis system at a freshwater breeding base in Wuxi, where it is running effectively.

  18. The theory of variational hybrid quantum-classical algorithms

    CERN Document Server

    McClean, Jarrod R; Babbush, Ryan; Aspuru-Guzik, Alán

    2015-01-01

    Many quantum algorithms have daunting resource requirements when compared to what is available today. To address this discrepancy, a quantum-classical hybrid optimization scheme known as "the quantum variational eigensolver" was developed with the philosophy that even minimal quantum resources could be made useful when used in conjunction with classical routines. In this work we extend the general theory of this algorithm and suggest algorithmic improvements for practical implementations. Specifically, we develop a variational adiabatic ansatz and explore unitary coupled cluster where we establish a connection from second order unitary coupled cluster to universal gate sets through relaxation of exponential splitting. We introduce the concept of quantum variational error suppression that allows some errors to be suppressed naturally in this algorithm on a pre-threshold quantum device. Additionally, we analyze truncation and correlated sampling in Hamiltonian averaging as ways to reduce the cost of this proced...

  19. Fitting PAC spectra with a hybrid algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Alves, M. A., E-mail: mauro@sepn.org [Instituto de Aeronautica e Espaco (Brazil); Carbonari, A. W., E-mail: carbonar@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (Brazil)

    2008-01-15

    A hybrid algorithm (HA) that blends features of genetic algorithms (GA) and simulated annealing (SA) was implemented for simultaneous fits of perturbed angular correlation (PAC) spectra. The main characteristic of the HA is the incorporation of a selection criterion based on SA into the basic structure of GA. The results obtained with the HA compare favorably with fits performed with conventional methods.

  20. Impulse denoising using Hybrid Algorithm

    Directory of Open Access Journals (Sweden)

    Ms.Arumugham Rajamani

    2015-03-01

    Many real-time images are contaminated by salt and pepper noise due to poor illumination and environmental factors. Many filters and algorithms are used to remove salt and pepper noise from an image, but they also remove image information. This paper presents a new, effective algorithm for diagnosing and removing salt and pepper noise. Existing standard algorithms such as the Median Filter (MF), Weighted Median Filter (WMF) and Standard Median Filter (SMF) perform poorly, particularly at high noise density. The suggested algorithm is compared with these standard algorithms using the Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR) metrics. The proposed algorithm exhibits more competitive performance at all noise densities. The joint sorting and diagonal averaging algorithm has lower computational time, better quantitative results and improved qualitative results, with a better visual appearance at all noise densities.
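
    The two comparison metrics named in this abstract have standard definitions, which can be sketched as follows. The example assumes 8-bit grayscale images stored as nested lists, hence a peak value of 255; the function names are illustrative and the code does not reproduce the paper's denoising algorithm.

```python
import math


def mse(reference, restored):
    """Mean square error between two equally sized grayscale images."""
    diffs = [(r - s) ** 2
             for row_r, row_s in zip(reference, restored)
             for r, s in zip(row_r, row_s)]
    return sum(diffs) / len(diffs)


def psnr(reference, restored, peak=255):
    """Peak signal-to-noise ratio in decibels (infinite for identical images)."""
    error = mse(reference, restored)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


reference = [[100, 100], [100, 100]]
restored = [[100, 110], [100, 90]]
error = mse(reference, restored)
quality = psnr(reference, restored)
```

    A lower MSE and a higher PSNR both indicate a restored image closer to the reference.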

  1. EARLY EXPERIENCE WITH A HYBRID PROCESSOR: K-MEANS CLUSTERING

    Energy Technology Data Exchange (ETDEWEB)

    M. GOKHALE; ET AL

    2001-02-01

    We discuss hardware/software coprocessing on a hybrid processor for a compute- and data-intensive hyper-spectral imaging algorithm, K-Means Clustering. The experiments are performed on the Altera Excalibur board using the soft IP core 32-bit NIOS RISC processor. In our experiments, we compare performance of the sequential algorithm with two different accelerated versions. We consider granularity and synchronization issues when mapping an algorithm to a hybrid processor. Our results show that on the Excalibur NIOS, a 15% speedup can be achieved over the sequential algorithm on images with 8 spectral bands where the pixels are divided into 8 categories. Speedup is limited by the communication cost of transferring data from external memory through the NIOS processor to the customized circuits. Our results indicate that future hybrid processors must either (1) have a clock rate 10X the speed of the configurable logic circuits or (2) include dual port memories that both the processor and configurable logic can access. If either of these conditions is met, the hybrid processor will show a factor of 10 speedup over the sequential algorithm. Such systems will combine the convenience of conventional processors with the speed of configurable logic.

  2. Hybridizing Differential Evolution with a Genetic Algorithm for Color Image Segmentation

    Directory of Open Access Journals (Sweden)

    R. V. V. Krishna

    2016-10-01

    This paper proposes a hybrid of differential evolution and genetic algorithms to solve the color image segmentation problem. Clustering based color image segmentation algorithms segment an image by clustering the features of color and texture, thereby obtaining accurate prototype cluster centers. In the proposed algorithm, the color features are obtained using the homogeneity model. A new texture feature named Power Law Descriptor (PLD), which is a modification of Weber Local Descriptor (WLD), is proposed and further used as a texture feature for clustering. Genetic algorithms are competent in handling binary variables, while differential evolution on the other hand is more efficient in handling real parameters. The obtained texture feature is binary in nature and the color feature is a real value, which suits very well the hybrid cluster center optimization problem in image segmentation. Thus in the proposed algorithm, the optimum texture feature centers are evolved using genetic algorithms, whereas the optimum color feature centers are evolved using differential evolution.

  3. Optimal Hops-Based Adaptive Clustering Algorithm

    Science.gov (United States)

    Xuan, Xin; Chen, Jian; Zhen, Shanshan; Kuo, Yonghong

    This paper proposes an optimal hops-based adaptive clustering algorithm (OHACA). The algorithm sets an energy selection threshold before the cluster forms so that the nodes with less energy are more likely to go to sleep immediately. In setup phase, OHACA introduces an adaptive mechanism to adjust cluster head and load balance. And the optimal distance theory is applied to discover the practical optimal routing path to minimize the total energy for transmission. Simulation results show that OHACA prolongs the life of network, improves utilizing rate and transmits more data because of energy balance.

  4. Issues Challenges and Tools of Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    Parul Agarwal

    2011-05-01

    Clustering is an unsupervised technique of Data Mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure. This paper captures the problems faced in practice when clustering algorithms are implemented. It also considers the most extensively used tools that are readily available, along with support functions that ease the programming. Once algorithms have been implemented, they also need to be tested for validity. Several validation indexes exist for testing performance and accuracy, and these are also discussed here.

  5. A hybrid clustering approach to recognition of protein families in 114 microbial genomes

    Directory of Open Access Journals (Sweden)

    Gogarten J Peter

    2004-04-01

    Background: Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function). Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families. Results: We describe a hybrid approach to sequence-based clustering of proteins that combines the advantages of standard and Markov clustering. We have implemented this hybrid approach over a relational database environment, and describe its application to clustering a large subset of PDB, and to 328577 proteins from 114 fully sequenced microbial genomes. To demonstrate utility with difficult problems, we show that hybrid clustering allows us to constitute the paralogous family of ATP synthase F1 rotary motor subunits into a single, biologically interpretable hierarchical grouping that was not accessible using either single-linkage or Markov clustering alone. We describe validation of this method by hybrid clustering of PDB and mapping SCOP families and domains onto the resulting clusters. Conclusion: Hybrid (Markov followed by single-linkage) clustering combines the advantages of the Markov Cluster algorithm (avoidance of non-specific clusters resulting from matches to promiscuous domains) and single-linkage clustering (preservation of topological information as a function of threshold). Within the individual Markov clusters, single-linkage clustering is a more-precise instrument, discerning sub-clusters of biological relevance. Our hybrid approach thus provides a computationally efficient

  6. Blockspin Cluster Algorithms for Quantum Spin Systems

    CERN Document Server

    Wiese, U J

    1992-01-01

    Cluster algorithms are developed for simulating quantum spin systems like the one- and two-dimensional Heisenberg ferro- and anti-ferromagnets. The corresponding two- and three-dimensional classical spin models with four-spin couplings are mapped to blockspin models with two-blockspin interactions. Clusters of blockspins are updated collectively. The efficiency of the method is investigated in detail for one-dimensional spin chains. In most cases the new algorithms solve the slowing-down problems from which standard algorithms suffer.

  7. A New Clustering Algorithm for Face Classification

    Directory of Open Access Journals (Sweden)

    Shaker K. Ali

    2016-06-01

    In this paper, we propose a new clustering algorithm that builds on ideas from other clustering algorithms. The proposed algorithm first computes a distance matrix, then excludes the matrix points that are to be clustered by saving the location (row, column) of these points, determines the minimum distance of these points, which assigns them to a group (class), and keeps the remaining points that are not yet clustered. The proposed algorithm is applied to an image database of human faces captured under different conditions (direction, angles, etc.). These data are collected from different sources (the ORL site and real images collected from a random sample of the Thi_Qar city population in Iraq). Our algorithm has been implemented with three types of distance to calculate the minimum distance between points (Euclidean, correlation and Minkowski distance). The efficiency of the proposed algorithm varies with the database and threshold, and exceeds 96%. Matlab (2014) was used in this work.

  8. A Survey of Grid Based Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    MR ILANGO

    2010-08-01

    Cluster Analysis, an automatic process to find similar objects from a database, is a fundamental operation in data mining. A cluster is a collection of data objects that are similar to one another within the same cluster and are dissimilar to the objects in other clusters. Clustering techniques have been discussed extensively in Similarity Search, Segmentation, Statistics, Machine Learning, Trend Analysis, Pattern Recognition and Classification [1]. Clustering methods can be classified into (i) partitioning methods, (ii) hierarchical methods, (iii) density-based methods, (iv) grid-based methods and (v) model-based methods. Grid based methods quantize the object space into a finite number of cells (hyper-rectangles) and then perform the required operations on the quantized space. The main advantage of the grid based method is its fast processing time, which depends on the number of cells in each dimension of the quantized space. In this research paper, we present some of the grid based methods such as CLIQUE (CLustering In QUEst) [2], STING (STatistical INformation Grid) [3], MAFIA (Merging of Adaptive Intervals Approach to Spatial Data Mining) [4], Wave Cluster [5] and O-CLUSTER (Orthogonal partitioning CLUSTERing) [6], as a survey, and also compare their effectiveness in clustering data objects. We also present some of the latest developments in grid based methods such as the Axis Shifted Grid Clustering Algorithm [7] and Adaptive Mesh Refinement [Wei-Keng Liao etc] [8] to improve the processing time of objects.

  9. A Hybrid Evolutionary Algorithm for Discrete Optimization

    Directory of Open Access Journals (Sweden)

    J. Bhuvana

    2015-03-01

    Most real-world multi-objective problems require us to choose one Pareto optimal solution out of a finite set of choices. The flexible job shop scheduling problem is one such problem, whose solutions must be selected from a discrete solution space. In this study we have designed a hybrid genetic algorithm to solve this scheduling problem. Hybrid genetic algorithms combine both aspects of search: exploration and exploitation of the search space. The proposed algorithm, Hybrid GA with Discrete Local Search, performs global search through the GA and exploits locality through discrete local search. The proposed hybrid algorithm not only generates Pareto optimal solutions but also identifies them with less computation. Five different benchmark test instances are used to evaluate the performance of the proposed algorithm. The results show that the proposed algorithm produces the known Pareto optimal solutions through exploration and exploitation of the search space with fewer function evaluations.

  10. MST-BASED CLUSTERING TOPOLOGY CONTROL ALGORITHM FOR WIRELESS SENSOR NETWORKS

    Institute of Scientific and Technical Information of China (English)

    Cai Wenyu; Zhang Meiyan

    2010-01-01

    In this paper, we propose a novel clustering topology control algorithm named Minimum Spanning Tree (MST)-based Clustering Topology Control (MCTC) for Wireless Sensor Networks (WSNs), which uses a hybrid approach to adjust sensor nodes' transmission power in two-tiered hierarchical WSNs. The MCTC algorithm employs a one-hop Maximum Energy & Minimum Distance (MEMD) clustering algorithm to decide clustering status. Each cluster exchanges information between its own Cluster Members (CMs) locally and then delivers the information to the Cluster Head (CH). Moreover, CHs exchange information with one another and afterwards transmit the aggregated information to the base station. The intra-cluster topology control scheme uses MST to decide the CMs' transmission radius; similarly, the inter-cluster topology control scheme applies MST to decide the CHs' transmission radius. Since the intra-cluster topology control is a fully distributed approach and the inter-cluster topology control is a purely centralized approach performed by the base station, the MCTC algorithm is a hybrid clustering topology control algorithm and can obtain a scalable topology and strong connectivity guarantees simultaneously. As a result, the network topology is reduced by the MCTC algorithm so that network energy efficiency is improved. The simulation results verify that MCTC outperforms traditional topology control schemes such as LMST, DRNG and MEMD in terms of average node degree, average node power radius and network lifetime, respectively.
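
    One way to read "uses MST to decide the transmission radius" is that each node only needs enough power to reach its farthest neighbour in the minimum spanning tree. The sketch below illustrates that reading with Prim's algorithm over Euclidean node positions. It is an assumption-laden illustration, not the MCTC protocol itself: the radius rule, the 2-D coordinates and the function names are all hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over points."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def transmission_radii(points):
    """Assumed rule: radius = longest MST edge incident to the node."""
    radii = [0.0] * len(points)
    for u, v, d in mst_edges(points):
        radii[u] = max(radii[u], d)
        radii[v] = max(radii[v], d)
    return radii


nodes = [(0, 0), (1, 0), (3, 0), (3, 4)]
radii = transmission_radii(nodes)
```

    With these radii every MST link remains usable in both directions, so the reduced topology stays connected while most nodes transmit at less than full range.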

  11. Polyclonal clustering algorithm and its convergence

    Institute of Scientific and Technical Information of China (English)

    MA Li; JIAO Li-cheng; BAI Lin; CHEN Chang-guo

    2008-01-01

    Being characteristic of non-teacher learning, self-organization, memory, and noise resistance, the artificial immune system is a research focus in the field of intelligent information processing. Based on the basic principles of organism immune and clonal selection, this article presents a polyclonal clustering algorithm characteristic of self-adaptation. According to the core idea of the algorithm, various immune operators in the artificial immune system are employed in the clustering process; moreover, clustering numbers are adjusted in accordance with the affinity function. Introduction of the recombination operator can effectively enhance the diversity of the individual antibody in a generation population, so that the searching scope for solutions is enlarged and the premature phenomenon of the algorithm is avoided. Besides, introduction of the inconsistent mutation operator enhances the adaptability and optimizes the performance of local solution seeking. Meanwhile, the convergence of the algorithm is accelerated. In addition, the article also proves the convergence of the algorithm by employing the Markov chain. Results of the data simulation experiment show that the algorithm is capable of obtaining reasonable and effective cluster.

  12. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    By constructing a batch of differentiable entropy functions that uniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called the maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
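
    The record above does not give the entropy functions themselves. As a hedged illustration only, one standard way to smooth the hard C-means objective is the log-sum-exp surrogate below; the symbols x_i (data points), v_k (cluster centers), c (number of clusters), n (number of points) and p (smoothing parameter) are assumptions introduced here, not notation taken from the paper.

```latex
% Hard C-means objective with squared distances d_{ik} = \|x_i - v_k\|^2
J(v) = \sum_{i=1}^{n} \min_{1 \le k \le c} d_{ik}

% Smooth (maximum-entropy style) surrogate with parameter p > 0
J_p(v) = -\frac{1}{p} \sum_{i=1}^{n} \ln \sum_{k=1}^{c} e^{-p\, d_{ik}}

% Uniform approximation: the gap is bounded independently of v
J(v) - \frac{n \ln c}{p} \;\le\; J_p(v) \;\le\; J(v)

% Soft memberships obtained from the surrogate; they approach
% hard (0/1) assignments as p \to \infty
u_{ik} = \frac{e^{-p\, d_{ik}}}{\sum_{j=1}^{c} e^{-p\, d_{ij}}}
```

    Under these assumptions, letting p grow recovers the hard C-means assignment, which is consistent with the description of the method as a soft generalization of hard C-means.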

  13. Hybrid Self Organizing Map for Overlapping Clusters

    Directory of Open Access Journals (Sweden)

    M.N.M. Sap

    2008-12-01

    Full Text Available The Kohonen self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining and pattern recognition. The SOM is a popular tool that maps a high-dimensional space into a small number of dimensions by placing similar elements close together, forming clusters. Recently, researchers found that, to capture the uncertainty involved in cluster analysis, it is not necessary to have crisp boundaries in some clustering operations. In this paper, to overcome the uncertainty, a two-level clustering algorithm based on SOM which employs rough set theory is proposed. The two-level Rough SOM (first using SOM to produce the prototypes, which are then clustered in the second stage) is found to perform well, to be more accurate than the crisp clustering method (Incremental SOM), and to reduce the errors.

  14. An Adaptive Clustering Algorithm for Intrusion Detection

    Institute of Scientific and Technical Information of China (English)

    QIU Juli

    2007-01-01

    In this paper, we introduce an adaptive clustering algorithm for intrusion detection based on wavecluster, which was introduced by Gholamhosein in 1999 and used with success in image processing. Because of the non-stationary characteristic of network traffic, we extend and develop an adaptive wavecluster algorithm for intrusion detection. Using the multiresolution property of wavelet transforms, we can effectively identify arbitrarily shaped clusters at different scales and degrees of detail; moreover, applying the wavelet transform removes noise from the original feature space and yields more accurate clusters. Experimental results on the KDD-99 intrusion detection dataset show the efficiency and accuracy of this algorithm. A detection rate above 96% and a false alarm rate below 3% are achieved.

  15. Efficient Cluster Head Selection Algorithm for MANET

    Directory of Open Access Journals (Sweden)

    Khalid Hussain

    2013-01-01

    Full Text Available In mobile ad hoc networks (MANETs), cluster head selection is considered a major challenge. In wireless sensor networks the LEACH protocol can be used to select the cluster head on the basis of energy, but cluster head selection remains an open problem in mobile ad hoc networks, especially when nodes are mobile. In this paper we propose an efficient cluster head selection algorithm (ECHSA) for selecting the cluster head in mobile ad hoc networks. We evaluate the proposed algorithm through simulation in OMNeT++ as well as on a test bed; the results agree with our assumptions. For further evaluation we also compare the proposed protocol with several other protocols such as LEACH-C, and the results show an improvement.

  16. Performance Analysis of Hierarchical Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    K.Ranjini

    2011-07-01

    Full Text Available Clustering is the classification of objects into different groups or, more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait, often proximity according to some defined distance measure. Data clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. This paper explains the implementation of agglomerative and divisive clustering algorithms applied to various types of data. Details of the victims of the 2004 tsunami in Thailand were taken as the test data. Visual programming is used for the implementation, and the running times of the algorithms using different linkages (agglomerative) on different types of data are taken for analysis.

  17. Hybrid Genetic Algorithms with Fuzzy Logic Controller

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    In this paper, a new implementation of genetic algorithms (GAs) is developed for the machine scheduling problem, which is common in modern manufacturing systems. A natural performance measure is the early and tardy completion of jobs, where the aim is usually to minimize the earliness and tardiness of all jobs simultaneously. As the problem is NP-hard and no effective algorithms exist, we propose a hybrid genetic algorithm approach to deal with it. We adjust the crossover and mutation probabilities with a fuzzy logic controller, so the hybrid genetic algorithm does not require preliminary experiments to determine the probabilities for the genetic operators. The experimental results show the effectiveness of the GA method proposed in the paper.

  18. Hybrid fuzzy cluster ensemble framework for tumor clustering from biomolecular data.

    Science.gov (United States)

    Yu, Zhiwen; Chen, Hantao; You, Jane; Han, Guoqiang; Li, Le

    2013-01-01

    Cancer class discovery using biomolecular data is one of the most important tasks for cancer diagnosis and treatment. Tumor clustering from gene expression data provides a new way to perform cancer class discovery. Most existing research adopts single clustering algorithms to perform tumor clustering from biomolecular data, and these lack robustness, stability, and accuracy. To further improve the performance of tumor clustering from biomolecular data, we introduce fuzzy theory into the cluster ensemble framework for tumor clustering from biomolecular data, and propose four kinds of hybrid fuzzy cluster ensemble frameworks (HFCEF), named HFCEF-I, HFCEF-II, HFCEF-III, and HFCEF-IV, respectively, to identify samples that belong to different types of cancers. The difference between HFCEF-I and HFCEF-II is that they adopt different ensemble generator approaches to generate a set of fuzzy matrices in the ensemble. Specifically, HFCEF-I applies the affinity propagation algorithm (AP) to perform clustering on the sample dimension and generates a set of fuzzy matrices in the ensemble based on the fuzzy membership function and base samples selected by AP. HFCEF-II adopts AP to perform clustering on the attribute dimension, generates a set of subspaces, and obtains a set of fuzzy matrices in the ensemble by performing fuzzy c-means on subspaces. HFCEF-III and HFCEF-IV, in turn, build on the characteristics of HFCEF-I and HFCEF-II: HFCEF-III combines HFCEF-I and HFCEF-II in a serial way, while HFCEF-IV integrates HFCEF-I and HFCEF-II in a concurrent way. HFCEFs adopt suitable consensus functions, such as the fuzzy c-means algorithm or the normalized cut algorithm (Ncut), to summarize generated fuzzy matrices, and obtain the final results. 
The experiments on real data sets from UCI machine learning repository and cancer gene expression profiles illustrate that 1) the proposed hybrid fuzzy cluster ensemble frameworks work well on real

  19. Parallel Clustering Algorithms for Structured AMR

    Energy Technology Data Exchange (ETDEWEB)

    Gunney, B T; Wissink, A M; Hysom, D A

    2005-10-26

    We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.

  20. Hybrid Collaborative Learning for Classification and Clustering in Sensor Networks

    Science.gov (United States)

    Wagstaff, Kiri L.; Sosnowski, Scott; Lane, Terran

    2012-01-01

    Traditionally, nodes in a sensor network simply collect data and then pass it on to a centralized node that archives, distributes, and possibly analyzes the data. However, analysis at the individual nodes could enable faster detection of anomalies or other interesting events as well as faster responses, such as sending out alerts or increasing the data collection rate. There is an additional opportunity for increased performance if learners at individual nodes can communicate with their neighbors. In previous work, methods were developed by which classification algorithms deployed at sensor nodes can communicate information about event labels to each other, building on prior work with co-training, self-training, and active learning. The idea of collaborative learning was extended to function for clustering algorithms as well, similar to ideas from penta-training and consensus clustering. However, collaboration between these learner types had not been explored. A new protocol was developed by which classifiers and clusterers can share key information about their observations and conclusions as they learn. This is an active collaboration in which learners of either type can query their neighbors for information that they then use to re-train or re-learn the concept they are studying. The protocol also supports broadcasts from the classifiers and clusterers to the rest of the network to announce new discoveries. Classifiers observe an event and assign it a label (type). Clusterers instead group observations into clusters without assigning them a label, and they collaborate in terms of pairwise constraints between two events [same-cluster (must-link) or different-cluster (cannot-link)]. Fundamentally, these two learner types speak different languages. 
To bridge this gap, the new communication protocol provides four types of exchanges: hybrid queries for information, hybrid "broadcasts" of learned information, each specified for classifiers-to-clusterers, and clusterers

  1. Multicast Routing Based on Hybrid Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    CAO Yuan-da; CAI Gui

    2005-01-01

    A new multicast routing algorithm based on the hybrid genetic algorithm (HGA) is proposed. A coding pattern based on the number of routing paths is used. A fitness function that is easy to compute and makes the algorithm converge quickly is proposed. A new approach to defining the HGA's parameters is provided. The simulation shows that the approach can greatly increase the convergence ratio, and that the fitting values of the parameters of this algorithm differ from those of the original algorithms. In the experiment, the optimal mutation probability equals 0.50 for the HGA but 0.07 for the SGA. It is concluded that the population size has a significant influence on the HGA's convergence ratio when its mutation probability is larger. The algorithm with a small population size has a high average convergence rate. The population size has little influence on the HGA with the lower mutation probability.

  2. Analysis of Stemming Algorithm for Text Clustering

    Directory of Open Access Journals (Sweden)

    N.Sandhya

    2011-09-01

    Full Text Available Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In the bag-of-words representation of documents, the words that appear in documents often have many morphological variants, and in most cases morphological variants of words have similar semantic interpretations and can be considered equivalent for the purpose of clustering applications. For this reason, a number of stemming algorithms, or stemmers, have been developed, which attempt to reduce a word to its stem or root form. Thus, the key terms of a document are represented by stems rather than by the original words. In this work we study the impact of a stemming algorithm along with four popular similarity measures (Euclidean, cosine, Pearson correlation and extended Jaccard) in conjunction with different types of vector representation (boolean, term frequency, and term frequency-inverse document frequency) on cluster quality. For clustering documents we use the partitional clustering technique K-means. Performance is measured against a human-imposed classification of the Classic data set. We conducted a number of experiments and used the entropy measure to assure statistical significance of the results. Cosine, Pearson correlation and extended Jaccard similarities emerge as the best measures to capture human categorization behavior, while the Euclidean measure performs poorly. After applying the stemming algorithm, the Euclidean measure shows little improvement.
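
    The four measures named in this record can be sketched on plain term-weight vectors as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the extended Jaccard form used here (dot product divided by the sum of squared norms minus the dot product) is the common Tanimoto-style definition, assumed rather than confirmed by the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson(a, b):
    """Pearson correlation, computed as the cosine of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])


def extended_jaccard(a, b):
    """Extended (Tanimoto-style) Jaccard similarity for real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


# Hypothetical term-frequency vectors for two short documents.
doc1 = [1.0, 0.0, 2.0]
doc2 = [2.0, 0.0, 4.0]
scores = {
    "euclidean": euclidean(doc1, doc2),
    "cosine": cosine(doc1, doc2),
    "pearson": pearson(doc1, doc2),
    "extended_jaccard": extended_jaccard(doc1, doc2),
}
```

    Note that Euclidean is a distance (smaller means more similar) while the other three are similarities, so they cannot be compared on the same scale without a conversion.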

  3. High-Performance Broadcasting Algorithms on Cluster

    Institute of Scientific and Technical Information of China (English)

    舒继武; 魏英霞; 王鼎兴

    2004-01-01

    In many clusters connected by high-speed communication networks, the exact structure of the underlying communication network and the latency differences between different sending and receiving pairs may be ignored when broadcasting, as in the approach adopted by the broadcasting method in MPICH, a widely used MPI implementation. However, the underlying cluster network topologies are becoming more and more complicated, and the performance of traditional broadcasting algorithms, such as MPICH's MPI_Bcast, is far from good. This paper analyzes the impact of communication latencies and the underlying topologies on the performance of broadcasting algorithms for multilevel clusters. A multilevel model was developed for broadcasting in clusters with complicated topologies, which divides the cluster topology into many levels based on the underlying topology. The multilevel model was used to develop a new broadcast algorithm, MLM broadcast-2 (MLMB-2), that adapts to a wide range of clusters. A comparison of the performance of the counterpart MPI operation MPI_Bcast and MLMB-2 shows that MLMB-2 outperforms MPI_Bcast, decreasing the broadcast running time by 60%-90%.

  4. Cluster Algorithm Special Purpose Processor

    Science.gov (United States)

    Talapov, A. L.; Shchur, L. N.; Andreichenko, V. B.; Dotsenko, Vl. S.

    We describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.

  5. Cluster algorithm special purpose processor

    Energy Technology Data Exchange (ETDEWEB)

    Talapov, A.L.; Shchur, L.N.; Andreichenko, V.B.; Dotsenko, V.S. (Landau Inst. for Theoretical Physics, GSP-1 117940 Moscow V-334 (USSR))

    1992-08-10

    In this paper, the authors describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.

  6. An Improved Heuristic Ant-Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    Yunfei Chen; Yushu Liu; Jihai Zhao

    2004-01-01

    An improved heuristic ant-clustering algorithm (HAC) is presented in this paper. A 'memory bank' device is proposed, which provides heuristic knowledge that guides the ants' movement in the two-dimensional grid space. The algorithm is tested on real and synthetic data sets. The results demonstrate that HAC is superior to the classical algorithm in misclassification error rate and runtime.

  7. Comparision of Clustering Algorithms using Neural Network Classifier for Satellite Image Classification

    Directory of Open Access Journals (Sweden)

    S.Praveena

    2015-06-01

    Full Text Available This paper presents a hybrid clustering algorithm and a feed-forward neural network classifier for land-cover mapping of trees, shade, buildings and roads. It starts with a single-step preprocessing procedure to make the image suitable for segmentation. The pre-processed image is segmented using the hybrid genetic-Artificial Bee Colony (ABC) algorithm, which is developed by hybridizing ABC and FCM to obtain effective segmentation of satellite images, and is then classified using a neural network. The performance of the proposed hybrid algorithm is compared with algorithms such as k-means, Fuzzy C-means (FCM), Moving K-means, the Artificial Bee Colony (ABC) algorithm, the ABC-GA algorithm, Moving KFCM and the KFCM algorithm.

  8. New Hybrid Algorithm for Question Answering

    Directory of Open Access Journals (Sweden)

    Jaspreet Kaur

    2013-08-01

    Full Text Available With technical advancement, Question Answering has emerged as the main area for the researchers. User is provided with specific answers instead of large number of documents or passages in question answering. Question answering proposes the solution to acquire efficient and exact answers to user question asked in natural language rather than language query. The major goal of this paper is to develop a hybrid algorithm for question answering. For this task different question answering systems for different languages were studied. After deep study, we are able to develop an algorithm that comprises the best features from excellent systems. An algorithm developed by us performs well.

  9. A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

    Directory of Open Access Journals (Sweden)

    Keng-Hoong Ng

    Full Text Available BACKGROUND: Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt alignment-free distance measures, which tend to yield acceptable clustering accuracy with reasonable computational time. Although these clustering methods work satisfactorily on a majority of EST datasets, they have a common weakness: they are prone to deliver unsatisfactory clustering results when dealing with ESTs from genes derived from the same family. The root cause is that the distance measures applied are not sensitive enough to separate these closely related genes. METHODOLOGY/PRINCIPAL FINDINGS: We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim of addressing the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on ten EST datasets, and the clustering results are compared with two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. CONCLUSIONS/SIGNIFICANCE: The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement in clustering accuracy on the experimental datasets supports the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem.

  10. A Fast Algorithm for Support Vector Clustering

    Institute of Scientific and Technical Information of China (English)

    吕常魁; 姜澄宇; 王宁生

    2004-01-01

    Support Vector Clustering (SVC) is a kernel-based unsupervised learning clustering method. The main drawback of SVC is its high computational complexity in obtaining the adjacency matrix describing the connectivity of each pair of points. Based on the proximity graph model [3], the Euclidean distance in Hilbert space is calculated using a Gaussian kernel, which is the right criterion to generate a minimum spanning tree (MST) using Kruskal's algorithm. The connectivity estimation is then reduced by checking only the linkages between the edges that construct the main stem of the MST, for which the non-compatibility degree is defined to support edge selection during linkage estimation. This new approach is analyzed experimentally. The results show that the revised algorithm performs better than the proximity graph model, with faster speed, optimized clustering quality and strong noise suppression, which makes SVC scalable to large data sets.
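
    The MST construction step mentioned in this record can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: it covers only Kruskal's algorithm over distances induced by a Gaussian kernel, where for K(x, y) = exp(-q * ||x - y||^2) the feature-space distance is sqrt(2 - 2 * K(x, y)). The names `kernel_distance` and `kruskal_mst` and the parameter `q` are hypothetical, and the main-stem edge selection and non-compatibility degree are not reproduced.

```python
import math
from itertools import combinations


def kernel_distance(x, y, q=1.0):
    """Distance between the feature-space images of x and y under a Gaussian kernel.

    Uses ||phi(x) - phi(y)||^2 = K(x,x) - 2K(x,y) + K(y,y) = 2 - 2*exp(-q*||x-y||^2).
    """
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.sqrt(2.0 - 2.0 * math.exp(-q * sq))


def kruskal_mst(points, q=1.0):
    """Return the MST of the complete graph on `points` as (i, j, weight) edges."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by kernel-induced distance.
    edges = sorted(
        (kernel_distance(points[i], points[j], q), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so keep it
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


# Two tight pairs of points far apart: the MST links each pair, then bridges them.
sample = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
mst = kruskal_mst(sample, q=0.1)
```

    Building every pairwise edge costs O(n^2) in time and memory, so this sketch is only practical for small point sets.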

  11. Fuzzy Rules for Ant Based Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Amira Hamdi

    2016-01-01

    Full Text Available This paper provides a new intelligent technique for the semisupervised data clustering problem that combines the Ant System (AS) algorithm with the fuzzy c-means (FCM) clustering algorithm. Our proposed approach, called the F-ASClass algorithm, is a distributed algorithm inspired by the foraging behavior observed in ant colonies. The ability of ants to find the shortest path forms the basis of our proposed approach. In the first step, several colonies of cooperating entities, called artificial ants, are used to find shortest paths in a complete graph that we call graph-data. The number of colonies used in F-ASClass is equal to the number of clusters in the dataset. The partition matrix of the dataset found by the artificial ants is then given, in the second step, to the fuzzy c-means technique in order to assign the objects left unclassified in the first step. The proposed approach is tested on artificial and real datasets, and its performance is compared with those of the K-means, K-medoid, and FCM algorithms. The experimental section shows that F-ASClass performs better according to the classification error rate, accuracy, and separation index.

  12. Application of a New Fuzzy Clustering Algorithm in Intrusion Detection

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    This paper presents a new Section Set Adaptive FCM algorithm. The algorithm overcomes the shortcomings of local optimality, unsure classification, and the need to fix the number of clusters in advance. It improves on the architecture of the FCM algorithm and enhances the analysis for effective clustering. During clustering, it can adjust the number of clusters dynamically. Finally, it uses the section set method to decrease classification time. Experiments show that the algorithm can improve the dependability of clustering and the correctness of classification.

  13. ENHANCED HYBRID PSO – ACO ALGORITHM FOR GRID SCHEDULING

    Directory of Open Access Journals (Sweden)

    P. Mathiyalagan

    2010-07-01

    Full Text Available Grid computing is a high-performance computing environment for meeting larger-scale computational demands. Grid computing involves resource management, task scheduling, security problems, information management and so on. Task scheduling is a fundamental issue in achieving high performance in grid computing systems. A computational grid is typically heterogeneous in the sense that it combines clusters of varying sizes, and different clusters typically contain processing elements with different levels of performance. In this work, heuristic approaches based on particle swarm optimization and ant colony optimization algorithms are adopted for solving task scheduling problems in a grid environment. Particle Swarm Optimization (PSO) is one of the latest nature-inspired evolutionary optimization techniques. It has a good global search ability and has been successfully applied in many areas, such as neural network training. Due to the linear decrease of the inertia weight in PSO, the convergence rate becomes faster, which leads to a minimal makespan when used for scheduling. To make the convergence rate faster, the PSO algorithm is improved by modifying the inertia parameter, such that it produces better performance and gives an optimized result. The ACO algorithm is improved by modifying the pheromone updating rule. The ACO algorithm is hybridized with the PSO algorithm for efficient results and better convergence of the PSO algorithm.

  14. A Hybrid Algorithm for Optimizing Multi- Modal Functions

    Institute of Scientific and Technical Information of China (English)

    Li Qinghua; Yang Shida; Ruan Youlin

    2006-01-01

    A new genetic algorithm is presented based on the musical performance. The novelty of this algorithm is that a new genetic algorithm, mimicking the musical process of searching for a perfect state of harmony, which increases the robustness of it greatly and gives a new meaning of it in the meantime, has been developed. Combining the advantages of the new genetic algorithm, simplex algorithm and tabu search, a hybrid algorithm is proposed. In order to verify the effectiveness of the hybrid algorithm, it is applied to solving some typical numerical function optimization problems which are poorly solved by traditional genetic algorithms. The experimental results show that the hybrid algorithm is fast and reliable.

  15. CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

    Science.gov (United States)

    Ma, Li; Fan, Suohai

    2017-03-14

    The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness for avoiding overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization. We propose the CURE-SMOTE algorithm for the imbalanced data classification problem. Experiments on imbalanced UCI data reveal that the combination of Clustering Using Representatives (CURE) enhances the original synthetic minority oversampling technique (SMOTE) algorithms effectively compared with the classification results on the original data using random sampling, Borderline-SMOTE1, safe-level SMOTE, C-SMOTE, and k-means-SMOTE. Additionally, the hybrid RF (random forests) algorithm has been proposed for feature selection and parameter optimization, which uses the minimum out of bag (OOB) data error as its objective function. Simulation results on binary and higher-dimensional data indicate that the proposed hybrid RF algorithms, hybrid genetic-random forests algorithm, hybrid particle swarm-random forests algorithm and hybrid fish swarm-random forests algorithm can achieve the minimum OOB error and show the best generalization ability. The training set produced from the proposed CURE-SMOTE algorithm is closer to the original data distribution because it contains minimal noise. Thus, better classification results are produced from this feasible and effective algorithm. Moreover, the hybrid algorithm's F-value, G-mean, AUC and OOB scores demonstrate that they surpass the performance of the original RF algorithm. Hence, this hybrid algorithm provides a new way to perform feature selection and parameter optimization.

  16. Parallel FFT Algorithm on Computer Clusters

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The DFT is widely applied in signal processing and other fields. Most existing fast methods of calculation are either based on parallel computers connected by particular networks such as the butterfly network or hypercube, or based on the assumptions of instant transmission, conflict-free communication, complete connection of parallel processors and an unlimited number of usable processors. However, the communication delay in the information transmission system cannot be ignored. This paper addresses the following aspects: instant transmission, task dispatching, and the path of information through the communication links in computer cluster systems; and the layout of the dynamic FFT algorithm under different computer cluster structures.

  17. Evaluation of clustering algorithms for protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    van Helden Jacques

    2006-11-01

    Full Text Available Abstract Background: Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism). In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies). High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions. The interactome can be represented as a graph where nodes correspond to proteins and edges to pairwise interactions. In recent years clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE). Results: A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions. Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes. We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values. We also evaluated their robustness to alterations of the test graph. We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes. 
Conclusion This

  18. Comparative study of several Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    Prof. Neha Soni, Dr. Amit Ganatra

    2012-12-01

    Full Text Available Cluster analysis is the process of grouping objects, where an object can be physical, like a student, or abstract, such as the behaviour of a customer or the handwriting of a person. Cluster analysis is as old as human life and has its roots in many fields such as statistics, machine learning, biology, and artificial intelligence. It is a form of unsupervised learning and faces many challenges such as high dimensionality of the dataset, arbitrary shapes of clusters, scalability, input parameters, domain knowledge and noisy data. A large number of clustering algorithms have been proposed to date to address these challenges. No single algorithm can adequately handle all sorts of requirements. This makes it a great challenge for the user to select among the available algorithms for a specific task. The purpose of this paper is to provide a detailed analytical comparison of some of the very well known clustering algorithms, which provides guidance for the selection of a clustering algorithm for a specific application.

  19. An incremental clustering algorithm based on Mahalanobis distance

    Science.gov (United States)

    Aik, Lim Eng; Choon, Tan Wee

    2014-12-01

    The classical fuzzy c-means clustering algorithm is insufficient for clustering non-spherical or elliptically distributed datasets. The paper replaces the Euclidean distance of classical fuzzy c-means with the Mahalanobis distance and applies it to incremental learning for its merits. A Mahalanobis-distance-based fuzzy incremental clustering learning algorithm is proposed. Experimental results show that the algorithm not only remedies this defect of the fuzzy c-means algorithm but also increases training accuracy.
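
    To illustrate why the choice of distance matters for elongated clusters, the following minimal sketch compares Euclidean and Mahalanobis distance in two dimensions. It is not the authors' algorithm: the function names, the diagonal covariance and the two sample points are assumptions chosen for illustration only.

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from mean under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T * inv * (x - mean).
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

def euclidean_2d(x, mean):
    """Plain Euclidean distance, for comparison."""
    return math.hypot(x[0] - mean[0], x[1] - mean[1])

# Hypothetical elongated cluster: variance 9 along x, variance 1 along y.
mean = (0.0, 0.0)
cov = ((9.0, 0.0), (0.0, 1.0))
p_along = (3.0, 0.0)   # lies along the long axis of the cluster
p_across = (0.0, 3.0)  # lies along the short axis of the cluster

d_along = mahalanobis_2d(p_along, mean, cov)
d_across = mahalanobis_2d(p_across, mean, cov)
```

    Both points are equally far from the mean in the Euclidean sense, but the Mahalanobis distance treats the point along the long axis as much closer, which is the behaviour an elliptical cluster needs.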

  20. CABOSFV algorithm for high dimensional sparse data clustering

    Institute of Scientific and Technical Information of China (English)

    Sen Wu; Xuedong Gao

    2004-01-01

    An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for the high dimensional clustering of binary sparse data. This algorithm compresses the data effectively by using a tool 'Sparse Feature Vector', thus reduces the data scale enormously, and can get the clustering result with only one data scan. Both theoretical analysis and empirical tests showed that CABOSFV is of low computational complexity. The algorithm finds clusters in high dimensional large datasets efficiently and handles noise effectively.

  1. First Cluster Algorithm Special Purpose Processor

    Science.gov (United States)

    Talapov, A. L.; Andreichenko, V. B.; Dotsenko S., Vi.; Shchur, L. N.

    We describe the architecture of a special purpose processor built to realize the Wolff cluster algorithm in hardware; the algorithm is not hampered by critical slowing down. The processor simulates two-dimensional Ising-like spin systems. With minor changes the same very effective architecture, which can be defined as a Memory Machine, can be used to study phase transitions in a wide range of models in two or three dimensions.

  2. Dynamic exponents for Potts model cluster algorithms

    Science.gov (United States)

    Coddington, Paul D.; Baillie, Clive F.

    We have studied the Swendsen-Wang and Wolff cluster update algorithms for the Ising model in 2, 3 and 4 dimensions. The data indicate simple relations between the specific heat and the Wolff autocorrelations, and between the magnetization and the Swendsen-Wang autocorrelations. This implies that the dynamic critical exponents are related to the static exponents of the Ising model. We also investigate the possibility of similar relationships for the Q-state Potts model.
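
    For readers unfamiliar with the Wolff update mentioned above, here is a minimal sketch of a single cluster flip on a two-dimensional periodic Ising lattice. It assumes coupling J = 1 and a flat list of spins; the function name and parameters are hypothetical and do not come from the paper.

```python
import math
import random

def wolff_update(spins, size, beta, rng):
    """One Wolff cluster flip on a size x size periodic Ising lattice (J = 1).

    spins is a flat list of +1/-1 values, modified in place.
    Returns the number of spins in the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond-activation probability
    seed = rng.randrange(size * size)
    cluster_spin = spins[seed]
    spins[seed] = -cluster_spin          # flipping on entry marks the site as visited
    stack = [seed]
    flipped = 1
    while stack:
        site = stack.pop()
        row, col = divmod(site, size)
        neighbours = (
            ((row + 1) % size) * size + col,
            ((row - 1) % size) * size + col,
            row * size + (col + 1) % size,
            row * size + (col - 1) % size,
        )
        for nb in neighbours:
            # Only still-aligned neighbours can join the cluster.
            if spins[nb] == cluster_spin and rng.random() < p_add:
                spins[nb] = -cluster_spin
                stack.append(nb)
                flipped += 1
    return flipped

rng = random.Random(0)
cold = [1] * 16
frozen = wolff_update(cold, 4, beta=50.0, rng=rng)  # p_add rounds to 1.0
hot = [1] * 16
single = wolff_update(hot, 4, beta=0.0, rng=rng)    # p_add is exactly 0.0
```

    At very low temperature the whole aligned lattice flips as one cluster, while at infinite temperature only the seed spin flips; intermediate values of beta give clusters of intermediate size.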

  3. Enhanced Unequal Clustering Algorithm for Wireless Sensor Networks

    OpenAIRE

    Talbi, Said; Zaouche, Lotfi

    2015-01-01

    Clustering is considered a solution for conserving energy during communications in wireless sensor networks. Recently, a new clustering algorithm named Unequal Clustering Algorithm (UCA) was proposed to relieve the burdened cluster heads located around the sink, which carry traffic coming from nodes that are far from the base station. This paper presents an Enhanced Unequal Clustering Algorithm called EUCA. This solution reduces the control traffic during a clusteri...

  4. ITS Cluster Finding Algorithm on GPU

    CERN Document Server

    Changaival, Boonyarit

    2014-01-01

    The ITS cluster finding algorithm is one of the data reduction algorithms at ALICE. It needs to run fast because of the large amount of data read out from the detector. A variety of platforms were studied for the system design. My work is to design, implement and benchmark this algorithm on a GPU platform. The GPU is one of many platforms that promote parallel computing. A high-end GPU can contain over 2000 processing cores, compared to commodity CPUs, which have only four cores. The program is written in C with the CUDA library. The throughput (number of events per second) is used as the performance metric. With the latest implementation, the throughput was increased by a factor of 5.

  5. Hybrid Genetic Algorithms for University Course Timetabling

    Directory of Open Access Journals (Sweden)

    Meysam Shahvali Kohshori

    2012-03-01

    Full Text Available University course timetabling is one of the important and time consuming issues that each university is involved with at the beginning of each. This problem belongs to the class of NP-hard problems and is very difficult to solve with classic algorithms. Therefore optimization techniques are used to produce optimal or near optimal feasible solutions instead of exact solutions. Genetic algorithms, because of their multidirectional search property, are considered an efficient approach for solving this type of problem. In this paper three new hybrid genetic algorithms for solving the university course timetabling problem (UCTP) are proposed: FGARI, FGASA and FGATS. In the proposed algorithms, fuzzy logic is used to measure the violation of soft constraints in the fitness function, to deal with the inherent uncertainty and vagueness involved in real life data. Also, randomized iterative local search, simulated annealing and tabu search are applied, respectively, to improve exploitive search ability and prevent the genetic algorithm from being trapped in a local optimum. The experimental results indicate that the proposed algorithms are able to produce promising results for the UCTP.

  6. Parallel Genetic Algorithm / Simulated Annealing Hybrid Algorithm and its Applications

    Institute of Scientific and Technical Information of China (English)

    温平川; 徐晓东; 何先刚

    2003-01-01

    This paper presents a highly hybrid Genetic Algorithm / Simulated Annealing algorithm. This algorithm has been successfully implemented on a Beowulf PC cluster and applied to a set of standard function optimization problems. The experimental results show that the proposed algorithm is not only effective but also robust.

  7. A Hybrid Intelligent Algorithm for Optimal Birandom Portfolio Selection Problems

    Directory of Open Access Journals (Sweden)

    Qi Li

    2014-01-01

    Full Text Available Birandom portfolio selection problems have been well developed and widely applied in recent years. To solve these problems better, this paper designs a new hybrid intelligent algorithm which combines the improved LGMS-FOA algorithm with birandom simulation. Since all the existing algorithms solving these problems are based on genetic algorithm and birandom simulation, some comparisons between the new hybrid intelligent algorithm and the existing algorithms are given in terms of numerical experiments, which demonstrate that the new hybrid intelligent algorithm is more effective and precise when the numbers of the objective function computations are the same.

  8. Genetic Algorithms Applied to Multi-Class Clustering for Gene Expression Data

    Institute of Scientific and Technical Information of China (English)

    Haiyan Pan; Jun Zhu; Danfu Han

    2003-01-01

    A hybrid GA (genetic algorithm)-based clustering (HGACLUS) schema, combining merits of the Simulated Annealing, was described for finding an optimal or near-optimal set of medoids. This schema maximized the clustering success by achieving internal cluster cohesion and external cluster isolation. The performance of HGACLUS and other methods was compared by using simulated data and open microarray gene-expression datasets. HGACLUS was generally found to be more accurate and robust than other methods discussed in this paper by the exact validation strategy and the explicit cluster number.

  9. Hearing the clusters in a graph: A distributed algorithm

    CERN Document Server

    Sahai, Tuhin; Banaszuk, Andrzej

    2009-01-01

    We propose a novel distributed algorithm to decompose graphs or cluster data. The algorithm recovers the solution obtained from spectral clustering without need for expensive eigenvalue/ eigenvector computations. We demonstrate that by solving the wave equation on the graph, every node can assign itself to a cluster by performing a local fast Fourier transform. We prove the equivalence of our algorithm to spectral clustering, derive convergence rates and demonstrate it on examples.

  10. A High-Order CFS Algorithm for Clustering Big Data

    OpenAIRE

    Fanyu Bu; Zhikui Chen; Peng Li; Tong Tang; Ying Zhang

    2016-01-01

    With the development of Internet of Everything such as Internet of Things, Internet of People, and Industrial Internet, big data is being generated. Clustering is a widely used technique for big data analytics and mining. However, most of current algorithms are not effective to cluster heterogeneous data which is prevalent in big data. In this paper, we propose a high-order CFS algorithm (HOCFS) to cluster heterogeneous data by combining the CFS clustering algorithm and the dropout deep learn...

  11. Improvement and Parallelism of k-Means Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    TIAN Jinlan; ZHU Lin; ZHANG Suqin; LIU Lu

    2005-01-01

    The k-means clustering algorithm is one of the most commonly used algorithms for clustering analysis. The traditional k-means algorithm is, however, inefficient while working on large numbers of data sets and improving the algorithm efficiency remains a problem. This paper focuses on the efficiency issues of cluster algorithms. A refined initial cluster centers method is designed to reduce the number of iterative procedures in the algorithm. A parallel k-means algorithm is also studied for the problem of the operation limitation of a single processor machine when given huge data sets. The analytical results demonstrate that these improvements can greatly enhance the efficiency of the k-means algorithm, i.e., allow the grouping of a large number of data sets more accurately and more quickly. The analysis has theoretical and practical importance for work on the improvement and parallelism of cluster algorithms.
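
    The link between initial centers and iteration count can be seen in a few lines. The sketch below runs Lloyd's k-means on one-dimensional data; it is not the refined initialization method of the paper, and the data set and the two starting configurations are assumptions chosen for illustration.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (final centers, iterations used)."""
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center (ties go to the first).
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            groups[nearest].append(p)
        # Recompute means; an empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            return centers, iteration
        centers = updated
    return centers, max_iter

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
good_start = kmeans_1d(points, [2.0, 11.0])  # centers already at the cluster means
poor_start = kmeans_1d(points, [1.0, 2.0])   # both centers inside the same cluster
```

    Both runs reach the same centers here, but the poorly initialized run needs more passes over the data, which is the cost that a better choice of initial centers aims to reduce.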

  12. A Hybrid Trajectory Clustering for Predicting User Navigation

    CERN Document Server

    Munaga, Hazarath; Venkateswarlu, N B

    2011-01-01

    Wireless sensor networks (WSNs) suffer from the hot spot problem, where the sensor nodes closest to the base station need to relay more packets than the nodes farther away from the base station. Thus, the lifetime of a sensor network depends on these closest nodes. Clustering methods are used to extend the lifetime of a wireless sensor network. However, current clustering algorithms usually utilize two techniques: selecting cluster heads with more residual energy, and rotating cluster heads periodically to distribute the energy consumption among nodes in each cluster and lengthen the network lifetime. Most of the algorithms use random selection for selecting the cluster heads. Here, we propose a novel trajectory clustering technique for selecting the cluster heads in WSNs. Our algorithm selects the cluster heads based on traffic and rotates them periodically. It provides the first trajectory based clustering technique for selecting the cluster heads and to extenuate the hot spot problem by prolonging the network lif...

  13. Parallelization of Edge Detection Algorithm using MPI on Beowulf Cluster

    Science.gov (United States)

    Haron, Nazleeni; Amir, Ruzaini; Aziz, Izzatdin A.; Jung, Low Tan; Shukri, Siti Rohkmah

    In this paper, we present the design of a parallel Sobel edge detection algorithm using Foster's methodology. The parallel algorithm is implemented using the MPI message passing library and a master/slave scheme. Every processor performs the same sequential algorithm but on a different part of the image. Experimental results obtained on a Beowulf cluster are presented to demonstrate the performance of the parallel algorithm.

  14. A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA

    Institute of Scientific and Technical Information of China (English)

    Ohn Mar San; Van-Nam Huynh; Yoshiteru Nakamori

    2003-01-01

    Most of the earlier work on clustering mainly focused on numeric data whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve many datasets that also consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm which is based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is a method to update the "cluster centers" of clustering objects described by mixed numeric and categorical attributes in the clustering process to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated with two well known data sets, namely the credit approval and abalone databases.

  15. EFFICIENT ALGORITHM FOR MINING FREQUENT ITEMSETS USING CLUSTERING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    D.Kerana Hanirex

    2011-03-01

    Full Text Available Nowadays, association rules play an important role. The purchase of one product when another product is purchased represents an association rule. The Apriori algorithm is the basic algorithm for mining association rules. This paper presents an efficient Partition Algorithm for Mining Frequent Itemsets (PAFI) using clustering. This algorithm finds the frequent itemsets by partitioning the database transactions into clusters. Clusters are formed based on the similarity measures between the transactions. It then finds the frequent itemsets from the transactions in the clusters directly, using an improved Apriori algorithm, which further reduces the number of database scans and hence improves efficiency.

  16. The Georgi algorithms of jet clustering

    Science.gov (United States)

    Ge, Shao-Feng

    2015-05-01

    We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing a firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses, and the jet function is generalized to $J_\beta^{(n)}$ with a jet function index $n$ in order to achieve more degrees of freedom. We impose three basic requirements: the result of jet clustering is process-independent and hence logically consistent; for softer subjets the inclusion cone is larger, to conform with the fact that a parton shower tends to emit softer partons at an earlier stage with a larger opening angle; and the cone size cannot be too large, in order to avoid mixing up neighboring jets. From these we derive constraints on the jet function parameter $\beta$ and index $n$, which are closely related to the cone size cutoff. Finally, we discuss how jet function values can be made invariant under Lorentz boost.

  17. PROPOSED A HETEROGENEOUS CLUSTERING ALGORITHM TO IMPROVE QOS IN WSN

    Directory of Open Access Journals (Sweden)

    Mehran Mokhtari

    2016-07-01

    Full Text Available This article presents a LEACH-extended, hierarchical, 3-level clustered, heterogeneous and dynamic algorithm. In the suggested protocol (LEH3LA), auction-based selection of a cluster head and an alternative cluster head node addresses processing delay, member selection, cost and energy consumption, the number of messages sent and received inside the clusters, and the selection of cluster heads in large sensor networks. The algorithm uses a hierarchical heterogeneous network (3 levels), collective intelligence, and intra-cluster interaction for communications. It also addresses the problems of sending data in multi-BS mobile networks, expanding inter-cluster networks, overlapping clusters, the genesis of orphan nodes, dynamically changing cluster boundaries, and the use of backbone networks and cloud sensors. Using a sleep/wake scheduling algorithm or a TDMA schedule, the alternative cluster head node provides redundancy and fault tolerance. Local processing in cluster head and alternative cluster head nodes, together with intra-cluster and inter-cluster communications such as multi-hop, increases processing speed and the speed of sending data within and between clusters. Network overhead decreases and load balancing among cluster heads increases. Data encapsulation by cluster head nodes decreases energy consumption during data transmission. Finally, improving quality of service (QoS) in CBRP, LEACH and 802.15.4, and decreasing energy consumption in sensors, cluster heads and alternative cluster head nodes, increase the lifetime of sensor networks.

  18. Hybridizing Evolutionary Algorithms with Opportunistic Local Search

    DEFF Research Database (Denmark)

    Gießen, Christian

    2013-01-01

    There is empirical evidence that memetic algorithms (MAs) can outperform plain evolutionary algorithms (EAs). Recently the first runtime analyses have been presented proving the aforementioned conjecture rigorously by investigating Variable-Depth Search, VDS for short (Sudholt, 2008). Sudholt...... raised the question if there are problems where VDS performs badly. We answer this question in the affirmative in the following way. We analyze MAs with VDS, which is also known as Kernighan-Lin for the TSP, on an artificial problem and show that MAs with a simple first-improvement local search...... outperform VDS. Moreover, we show that the performance gap is exponential. We analyze the features leading to a failure of VDS and derive a new local search operator, coined Opportunistic Local Search, that can easily overcome regions of the search space where local optima are clustered. The power...

  19. Genetic Algorithms for Auto-Clustering in KDD

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    In solving the clustering problem in the context of knowledge discovery in databases (KDD), the traditional methods, for example, the K-means algorithm and its variants, usually require the users to provide the number of clusters in advance based on the pro-information. Unfortunately, the number of clusters in general is unknown to the users who are usually short of pro-information. Therefore, the clustering calculation becomes a tedious trial-and-error work, and the result is often not global optimal especially when the number of clusters is large. In this paper, a new dynamic clustering method based on genetic algorithms (GA) is proposed and applied for auto-clustering of data entities in large databases. The algorithm can automatically cluster the data according to their similarities and find the exact number of clusters. Experiment results indicate that the method is of global optimization by dynamically clustering logic.

  20. Ouroboros: A Tool for Building Generic, Hybrid, Divide & Conquer Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, J R; Foster, I

    2003-05-01

    A hybrid divide and conquer algorithm is one that switches from a divide and conquer to an iterative strategy at a specified problem size. Such algorithms can provide significant performance improvements relative to alternatives that use a single strategy. However, the identification of the optimal problem size at which to switch for a particular algorithm and platform can be challenging. We describe an automated approach to this problem that first conducts experiments to explore the performance space on a particular platform and then uses the resulting performance data to construct an optimal hybrid algorithm on that platform. We implement this technique in a tool, "Ouroboros", that automatically constructs a high-performance hybrid algorithm from a set of registered algorithms. We present results obtained with this tool for several classical divide and conquer algorithms, including matrix multiply and sorting, and report speedups of up to six times achieved over non-hybrid algorithms.
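
    As a concrete picture of the switch described above, the following sketch sorts with merge sort (divide and conquer) and falls back to insertion sort (iterative) below a cutoff size. It is not the Ouroboros tool; the function names and the default cutoff of 16 are assumptions, and in practice the cutoff would be tuned by measurement on the target platform.

```python
def insertion_sort(items):
    """Iterative strategy used at or below the switch-over size."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def hybrid_sort(items, cutoff=16):
    """Merge sort that switches to insertion sort for small inputs."""
    # max(cutoff, 1) guarantees the recursion stops even for cutoff <= 0.
    if len(items) <= max(cutoff, 1):
        return insertion_sort(items)
    mid = len(items) // 2
    return merge(hybrid_sort(items[:mid], cutoff), hybrid_sort(items[mid:], cutoff))

data = [(i * 37) % 101 for i in range(200)]
result = hybrid_sort(data)
```

    Timing `hybrid_sort` for several `cutoff` values on a given machine is a manual version of the performance-space exploration that the abstract automates.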

  1. Energy Aware Clustering Algorithms for Wireless Sensor Networks

    Science.gov (United States)

    Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian

    2011-09-01

    The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire networks is mainly considered in the design. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and also compare these clustering algorithms based on metrics such as clustering distribution, cluster's load balancing, Cluster Head's (CH) selection strategy, CH's role rotation, node mobility, clusters overlapping, intra-cluster communications, reliability, security and location awareness.

  2. A Novel Clustering Algorithm Inspired by Membrane Computing

    Directory of Open Access Journals (Sweden)

    Hong Peng

    2015-01-01

    Full Text Available P systems are a class of distributed parallel computing models; this paper presents a novel clustering algorithm, which is inspired from mechanism of a tissue-like P system with a loop structure of cells, called membrane clustering algorithm. The objects of the cells express the candidate centers of clusters and are evolved by the evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior or competitive to k-means algorithm and several evolutionary clustering algorithms recently reported in the literature.

  3. New Hybrid Genetic Algorithm for Vertex Cover Problems

    Institute of Scientific and Technical Information of China (English)

    霍红卫; 许进

    2003-01-01

    This paper presents a new hybrid genetic algorithm for the vertex cover problems in which scan-repair and local improvement techniques are used for local optimization. With the hybrid approach, genetic algorithms are used to perform global exploration in a population, while neighborhood search methods are used to perform local exploitation around the chromosomes. The experimental results indicate that hybrid genetic algorithms can obtain solutions of excellent quality to the problem instances with different sizes. The pure genetic algorithms are outperformed by the neighborhood search heuristics procedures combined with genetic algorithms.

  4. Solving the Quadratic Assignment Problem by a Hybrid Algorithm

    Directory of Open Access Journals (Sweden)

    Aldy Gunawan

    2011-01-01

    Full Text Available This paper presents a hybrid algorithm to solve the Quadratic Assignment Problem (QAP). The proposed algorithm involves using the Greedy Randomized Adaptive Search Procedure (GRASP) to obtain an initial solution, and then using a combined Simulated Annealing (SA) and Tabu Search (TS) algorithm to improve the solution. Experimental results indicate that the hybrid algorithm is able to obtain good quality solutions for QAPLIB test problems within reasonable computation time.

  5. Towards Enhancement of Performance of K-Means Clustering Using Nature-Inspired Optimization Algorithms

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2014-01-01

    Full Text Available Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.

  6. A New Cooperative Algorithm Based on PSO and K-Means for Data Clustering

    Directory of Open Access Journals (Sweden)

    Mehdi Sargolzaei

    2012-01-01

    Full Text Available Problem statement: Data clustering has been applied in multiple fields such as machine learning, data mining, wireless sensor networks and pattern recognition. One of the most famous clustering approaches is K-means, which has been used effectively in many clustering problems, but this algorithm has some drawbacks such as convergence to local optima and sensitivity to initial points. Approach: The Particle Swarm Optimization (PSO) algorithm is one of the swarm intelligence algorithms and is applied to determine the optimal cluster centers. In this study, a cooperative algorithm based on PSO and k-means is presented. Result: The proposed algorithm utilizes both the global search ability of PSO and the local search ability of k-means. The proposed algorithm, PSO, PSO with Contraction Factor (CF-PSO), k-means and the KPSO hybrid algorithm have been used for clustering six datasets, and their efficiencies are compared with each other. Conclusion: Experimental results show that the proposed algorithm has an acceptable efficiency and robustness.

  7. Towards enhancement of performance of K-means clustering using nature-inspired optimization algorithms.

    Science.gov (United States)

    Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan

    2014-01-01

    Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.

  8. An energy efficient clustering routing algorithm for wireless sensor networks

    Institute of Scientific and Technical Information of China (English)

    LI Li; DONG Shu-song; WEN Xiang-ming

    2006-01-01

    This article proposes an energy efficient clustering routing (EECR) algorithm for wireless sensor networks. The algorithm divides a sensor network into a few clusters and selects each cluster head based on a weight value, which leads to more uniform energy dissipation among all sensor nodes. Simulation results show that the algorithm can reduce overall energy consumption and extend the lifetime of the wireless sensor network.

  9. Introduction to Clustering Algorithms and Applications

    OpenAIRE

    Yang, Sibei; Tao, Liangde; Gong, Bingchen

    2014-01-01

    Data clustering is the process of identifying natural groupings or clusters within multidimensional data based on some similarity measure. Clustering is a fundamental process in many different disciplines. Hence, researchers from different fields are actively working on the clustering problem. This paper provides an overview of the different representative clustering methods. In addition, the application of clustering in different fields is briefly introduced.

  10. PHC: A Fast Partition and Hierarchy-Based Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    ZHOU HaoFeng(周皓峰); YUAN QingQing(袁晴晴); CHENG ZunPing(程尊平); SHI BaiLe(施伯乐)

    2003-01-01

    Cluster analysis is a process to classify data in a specified data set. In this field, much attention is paid to high-efficiency clustering algorithms. In this paper, the features in the current partition-based and hierarchy-based algorithms are reviewed, and a new hierarchy-based algorithm PHC is proposed by combining advantages of both algorithms, which uses the cohesion and the closeness to amalgamate the clusters. Compared with similar algorithms, the performance of PHC is improved, and the quality of clustering is guaranteed. And both the features were proved by the theoretic and experimental analyses in the paper.

  11. Counterexamples to convergence theorem of maximum-entropy clustering algorithm

    Institute of Scientific and Technical Information of China (English)

    于剑; 石洪波; 黄厚宽; 孙喜晨; 程乾生

    2003-01-01

    In this paper, we surveyed the development of maximum-entropy clustering algorithm, pointed out that the maximum-entropy clustering algorithm is not new in essence, and constructed two examples to show that the iterative sequence given by the maximum-entropy clustering algorithm may not converge to a local minimum of its objective function, but a saddle point. Based on these results, our paper shows that the convergence theorem of maximum-entropy clustering algorithm put forward by Kenneth Rose et al. does not hold in general cases.

  12. An Incremental Algorithm of Text Clustering Based on Semantic Sequences

    Institute of Scientific and Technical Information of China (English)

    FENG Zhonghui; SHEN Junyi; BAO Junpeng

    2006-01-01

    This paper proposed an incremental text clustering algorithm based on semantic sequences. Using the similarity relation of semantic sequences and calculating the cover of the set of similar semantic sequences, the algorithm selects the candidate cluster with the minimum entropy overlap value as a result cluster at each step. The comparison of experimental results shows that the precision of the algorithm is higher than that of other algorithms under the same conditions, and this is especially obvious on sets of long documents.

  13. A new efficient Cluster Algorithm for the Ising Model

    CERN Document Server

    Nyffeler, M; Wiese, U J; Nyfeler, Matthias; Pepe, Michele; Wiese, Uwe-Jens

    2005-01-01

    Using D-theory we construct a new efficient cluster algorithm for the Ising model. The construction is very different from the standard Swendsen-Wang algorithm and related to worm algorithms. With the new algorithm we have measured the correlation function with high precision over a surprisingly large number of orders of magnitude.

  14. URL Mining Using Agglomerative Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Chinmay R. Deshmukh

    2015-02-01

    The tremendous growth of the web world incorporates application of data mining techniques to the web logs. Data mining and the World Wide Web encompass an important and active area of research. Web log mining is the analysis of web log files with web page sequences. Web mining is broadly classified as web content mining, web usage mining and web structure mining. Web usage mining is a technique to discover usage patterns from Web data in order to understand and better serve the needs of Web-based applications. URL mining refers to a subclass of Web mining that helps us to investigate the details of a Uniform Resource Locator. URL mining can be advantageous in the fields of security and protection. The paper introduces a technique for mining a collection of user transactions with an Internet search engine to discover clusters of similar queries and similar URLs. The information we exploit is clickthrough data: each record consists of a user's query to a search engine along with the URL which the user selected from among the candidates offered by the search engine. By viewing this dataset as a bipartite graph, with the vertices on one side corresponding to queries and on the other side to URLs, one can apply an agglomerative clustering algorithm to the graph's vertices to identify related queries and URLs.

  15. A fingerprint identification algorithm by clustering similarity

    Institute of Scientific and Technical Information of China (English)

    TIAN Jie; HE Yuliang; CHEN Hong; YANG Xin

    2005-01-01

    This paper introduces a fingerprint identification algorithm by clustering similarity with the view to overcome the dilemmas encountered in fingerprint identification. To decrease multi-spectrum noises in a fingerprint, we first use a dyadic scale space (DSS) method for image enhancement. The second step describes the relative features among minutiae by building a minutia-simplex which contains a pair of minutiae and their local associated ridge information, with its transformation-variant and invariant relative features applied for comprehensive similarity measurement and for parameter estimation respectively. The clustering method is employed to estimate the transformation space. Finally, multi-resolution technique is used to find an optimal transformation model for getting the maximal mutual information between the input and the template features. The experimental results, including the performance evaluation by the 2nd International Verification Competition in 2002 (FVC2002) over the four fingerprint databases of FVC2002, indicate that our method is promising in an automatic fingerprint identification system (AFIS).

  16. A Hybrid Aggressive Space Mapping Algorithm for EM Optimization

    DEFF Research Database (Denmark)

    Bakr, M.; Bandler, J. W.; Georgieva, N.;

    1999-01-01

    We present a novel, Hybrid Aggressive Space Mapping (HASM) optimization algorithm. HASM is a hybrid approach exploiting both the Trust Region Aggressive Space Mapping (TRASM) algorithm and direct optimization. It does not assume that the final space-mapped design is the true optimal design and is...

  17. New MPPT algorithm based on hybrid dynamical theory

    KAUST Repository

    Elmetennani, Shahrazed

    2014-11-01

    This paper presents a new maximum power point tracking algorithm based on the hybrid dynamical theory. A multicell converter has been considered as an adaptation stage for the photovoltaic chain. The proposed algorithm is a hybrid automaton switching between eight different operating modes, which has been validated by simulation tests under different working conditions. © 2014 IEEE.

  18. Intelligent Control Scheme of Engineering Machinery of Cluster Hybrid System

    Institute of Scientific and Technical Information of China (English)

    GAO Qiang; WANG Hongli

    2005-01-01

    In a hybrid system, the subsystems with discrete dynamics play a central role. In the course of engineering machinery of cluster construction, the discrete control law is hard to obtain because the construction environment is complex and there exist many affecting factors. In this paper, hierarchically intelligent control, expert control and fuzzy control are introduced into the discrete subsystems of engineering machinery of cluster hybrid system, so as to rebuild the hybrid system and make the discrete control law easily and effectively obtained. The structures, reasoning mechanism and arithmetic of intelligent control are transplanted to the discrete dynamics, the continuous process and the interface of the hybrid system. The structures of three types of intelligent hybrid system are presented, and the human experience summarized from engineering machinery of cluster is taken into account.

  19. Local Community Detection Algorithm Based on Minimal Cluster

    Directory of Open Access Journals (Sweden)

    Yong Zhou

    2016-01-01

    In order to discover the structure of local communities more effectively, this paper puts forward a new local community detection algorithm based on a minimal cluster. Most local community detection algorithms begin from one node. The agglomeration ability of a single node must be less than that of multiple nodes, so the community extension in this paper's algorithm starts not from the initial node alone but from a node cluster that contains the initial node and whose nodes are relatively densely connected with each other. The algorithm mainly includes two phases. First it detects the minimal cluster, and then it finds the local community extended from the minimal cluster. Experimental results show that the quality of the local community detected by our algorithm is much better than that of other algorithms in both real and simulated networks.

  20. A Load Balance Routing Algorithm Based on Uneven Clustering

    Directory of Open Access Journals (Sweden)

    Liang Yuan

    2013-10-01

    Aiming at the problem of uneven load in clustered Wireless Sensor Networks (WSNs), a load-balancing routing algorithm based on uneven clustering is proposed, which performs uneven clustering and calculates the optimal number of clusters. Through even node clustering, the algorithm prevents the number of common nodes under any single cluster head from becoming too large, which would overload the cluster head until it dies. It constructs an evaluation function that better reflects the residual energy distribution of nodes, and at the same time constructs a routing evaluation function between cluster heads. MATLAB is used to simulate the performance of the algorithm. Simulation results show that the routing established by this algorithm effectively improves the network's energy balance and lengthens the life cycle of the network.

  1. A Fast Hybrid Algorithm for the Exact String Matching Problem

    Directory of Open Access Journals (Sweden)

    Abdulwahab A. Al-mazroi

    2011-01-01

    Problem statement: Due to the huge amount and complicated nature of data being generated recently, the usage of one algorithm for string searching was not sufficient to ensure faster search and matching of patterns. So there is an urgent need to integrate two or more algorithms to form a hybrid algorithm (called BRSS) to ensure speedy results. Approach: This study proposes the combination of two algorithms, namely the Berry-Ravindran and Skip Search algorithms, to form a hybrid algorithm in order to boost search performance. Results: The proposed hybrid algorithm contributes to better results by reducing the number of attempts, number of character comparisons and searching time. The performance of the hybrid was tested using different types of data: DNA, protein and English text. The percentage improvements of the hybrid algorithm over Berry-Ravindran in DNA, protein and English text are 50%, 43% and 44% respectively. The percentage improvements over the Skip Search algorithm in DNA, protein and English text are 20%, 30% and 18% respectively. The criteria applied for evaluation are number of attempts, number of character comparisons and searching time. Conclusion: The study shows how the integration of two algorithms gives better results than the original algorithms even when the same data size and pattern lengths are applied as test evaluation on each of the algorithms.

  2. Analyzing Job Aware Scheduling Algorithm in Hadoop for Heterogeneous Cluster

    Directory of Open Access Journals (Sweden)

    Mayuri A Mehta

    2015-12-01

    A scheduling algorithm is required to efficiently manage cluster resources in a Hadoop cluster, thereby increasing resource utilization and reducing response time. The job aware scheduling algorithm schedules non-local map tasks of jobs based on job execution time, earliest deadline first or workload of the job. In this paper, we present the performance evaluation of the job aware scheduling algorithm using the MapReduce WordCount benchmark. The experimental results are compared with the matchmaking scheduling algorithm. The results show that the job aware scheduling algorithm reduces average waiting time and memory wastage considerably as compared to the matchmaking algorithm.

  3. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2008-01-01

    We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  4. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2006-01-01

    We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  5. Simulated annealing spectral clustering algorithm for image segmentation

    Institute of Scientific and Technical Information of China (English)

    Yifang Yang; Yuping Wang

    2014-01-01

    The similarity measure is crucial to the performance of spectral clustering. The Gaussian kernel function based on the Euclidean distance is usually adopted as the similarity measure. However, the Euclidean distance measure cannot fully reveal the complex distribution data, and the result of spectral clustering is very sensitive to the scaling parameter. To solve these problems, a new manifold distance measure and a novel simulated annealing spectral clustering (SASC) algorithm based on the manifold distance measure are proposed. The simulated annealing based on genetic algorithm (SAGA), characterized by its rapid convergence to the global optimum, is used to cluster the sample points in the spectral mapping space. The proposed algorithm can not only reflect local and global consistency better, but also reduce the sensitivity of spectral clustering to the kernel parameter, which improves the algorithm's clustering performance. To efficiently apply the algorithm to image segmentation, the Nyström method is used to reduce the computation complexity. Experimental results show that compared with traditional clustering algorithms and those popular spectral clustering algorithms, the proposed algorithm can achieve better clustering performances on several synthetic datasets, texture images and real images.
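
    As a point of reference for the abstract above, the conventional Gaussian similarity it mentions can be sketched as follows. This is only an illustration of the Euclidean-distance baseline, not the manifold distance or the SASC algorithm proposed in the paper; the function names, the sample points and the two values of the scaling parameter sigma are assumptions chosen for the example.

```python
import math

def gaussian_similarity(x, y, sigma):
    """Gaussian kernel similarity built on the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def affinity_matrix(points, sigma):
    """Pairwise similarity matrix of the kind fed into spectral clustering."""
    n = len(points)
    return [[gaussian_similarity(points[i], points[j], sigma) for j in range(n)]
            for i in range(n)]

# Hypothetical sample: two nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
for sigma in (0.5, 5.0):
    w = affinity_matrix(points, sigma)
    # Print the near-pair and far-pair similarities for each sigma.
    print(sigma, w[0][1], w[0][2])
```

    Comparing the printed values for the two settings of sigma shows how strongly the affinity matrix depends on the scaling parameter, which is the sensitivity the abstract refers to.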

  6. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithm for real document clustering.

  7. APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

    DEFF Research Database (Denmark)

    Zong, Yu; Xu, Guandong; Jin, Pin

    2011-01-01

    ...resulting from the severe difficulty of ambiguity, redundancy and less semantic nature of tags. Clustering method is a useful tool to address the aforementioned difficulties. Most of the researches on tag clustering are directly using traditional clustering algorithms such as K-means or Hierarchical... algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone...

  8. Mercer Kernel Based Fuzzy Clustering Self-Adaptive Algorithm

    Institute of Scientific and Technical Information of China (English)

    李侃; 刘玉树

    2004-01-01

    A novel Mercer kernel based fuzzy clustering self-adaptive algorithm is presented. The Mercer kernel method is introduced to fuzzy c-means clustering. It implicitly maps the input data into a high-dimensional feature space through a nonlinear transformation. In fuzzy c-means and its variants, the number of clusters must be determined first. A self-adaptive algorithm is proposed in which the number of clusters, which is not given in advance, is obtained automatically by a validity measure function. Finally, experiments are given to show the better performance of the kernel based fuzzy c-means self-adaptive algorithm.
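
    The implicit mapping described above can be illustrated with the standard kernel identity ||phi(x) - phi(v)||^2 = K(x,x) - 2K(x,v) + K(v,v), which gives a feature-space distance without ever computing phi. The sketch below plugs that distance into the usual fuzzy c-means membership formula. It is a minimal illustration under stated assumptions: the Gaussian kernel, the fuzzifier m = 2, centers kept in the input space, and all function names are choices made here, not details taken from the paper, and the paper's self-adaptive selection of the number of clusters is not shown.

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """A Mercer kernel (Gaussian); the choice of kernel is an assumption."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_sq_distance(x, v, kernel=gaussian_kernel):
    """Squared feature-space distance via K(x,x) - 2K(x,v) + K(v,v)."""
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)

def memberships(x, centers, m=2.0, kernel=gaussian_kernel):
    """Fuzzy c-means memberships of x, using kernel-induced distances."""
    d = [kernel_sq_distance(x, c, kernel) for c in centers]
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        # x coincides with one or more centers: share full membership among them.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    inv = [di ** (-1.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [value / total for value in inv]

# Hypothetical one-dimensional example with two centers.
u = memberships((0.2,), [(0.0,), (3.0,)])
print(u)
```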

  9. APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

    DEFF Research Database (Denmark)

    Zong, Yu; Xu, Guandong; Jin, Pin

    2011-01-01

    ...resulting from the severe difficulty of ambiguity, redundancy and less semantic nature of tags. Clustering method is a useful tool to address the aforementioned difficulties. Most of the researches on tag clustering are directly using traditional clustering algorithms such as K-means or Hierarchical... algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone...

  10. Android Malware Classification Using K-Means Clustering Algorithm

    Science.gov (United States)

    Hamid, Isredza Rahmi A.; Syafiqah Khalid, Nur; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Chai Wen, Chuah

    2017-08-01

    Malware is designed to gain access to or damage a computer system without the user's notice. Besides, attackers exploit malware to commit crime or fraud. This paper proposes an Android malware classification approach based on the K-Means clustering algorithm. We evaluate the proposed model in terms of accuracy using machine learning algorithms. Two datasets, Virus Total and Malgenome, were selected to demonstrate the use of the K-Means clustering algorithm. We classify the Android malware into three clusters, which are ransomware, scareware and goodware. Nine features were considered for each type of dataset: Lock Detected, Text Detected, Text Score, Encryption Detected, Threat, Porn, Law, Copyright and Moneypak. We used IBM SPSS Statistics software for data classification and WEKA tools to evaluate the built clusters. The proposed K-Means clustering algorithm shows promising results with high accuracy when tested using the Random Forest algorithm.

  11. Functional Clustering Algorithm for High-Dimensional Proteomics Data

    Directory of Open Access Journals (Sweden)

    Halima Bensmail

    2005-01-01

    Clustering proteomics data is a challenging problem for any traditional clustering algorithm. Usually, the number of samples is much smaller than the number of protein peaks. The use of a clustering algorithm which does not take into consideration the number of features of variables (here the number of peaks) is needed. An innovative hierarchical clustering algorithm may be a good approach. We propose here a new dissimilarity measure for the hierarchical clustering combined with a functional data analysis. We present a specific application of functional data analysis (FDA) to a high-throughput proteomics study. The high performance of the proposed algorithm is compared to two popular dissimilarity measures in the clustering of normal and human T-cell leukemia virus type 1 (HTLV-1)-infected patients samples.

  12. Extension of K-Modes Algorithm for Generating Clusters Automatically

    Directory of Open Access Journals (Sweden)

    Anupama Chadha

    2016-03-01

    K-Modes is an eminent algorithm for clustering data sets with categorical attributes. This algorithm is famous for its simplicity and speed. K-Modes is an extension of the K-Means algorithm for categorical data. Since K-Modes is used for categorical data, the 'Simple Matching Dissimilarity' measure is used instead of Euclidean distance and the 'Modes' of clusters are used instead of 'Means'. However, one major limitation of this algorithm is its dependency on prior input of the number of clusters K, and sometimes it becomes practically impossible to correctly estimate the optimum number of clusters in advance. In this paper we have proposed an algorithm which will overcome this limitation while maintaining the simplicity of the K-Modes algorithm.
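
    The two substitutions the abstract describes, simple matching dissimilarity in place of Euclidean distance and modes in place of means, can be sketched as below. This is plain K-Modes with the number of clusters fixed by the initial modes, so it illustrates the baseline and its limitation rather than the paper's extension; the function names, the tie-breaking behaviour and the sample records are assumptions made for the example.

```python
from collections import Counter

def simple_matching(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def cluster_mode(records):
    """Per-attribute most frequent value (ties go to the first value seen)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))

def k_modes(records, initial_modes, max_iter=10):
    """Plain K-Modes: K is fixed by the number of initial modes supplied."""
    modes = list(initial_modes)
    clusters = [[] for _ in modes]
    for _ in range(max_iter):
        clusters = [[] for _ in modes]
        for record in records:
            nearest = min(range(len(modes)),
                          key=lambda i: simple_matching(record, modes[i]))
            clusters[nearest].append(record)
        # Keep the old mode if a cluster ends up empty.
        new_modes = [cluster_mode(c) if c else modes[i]
                     for i, c in enumerate(clusters)]
        if new_modes == modes:
            break
        modes = new_modes
    return modes, clusters

# Hypothetical categorical records: (colour, size).
records = [("red", "small"), ("red", "medium"), ("blue", "large"), ("green", "large")]
modes, clusters = k_modes(records, [records[0], records[2]])
print(modes)
print(clusters)
```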

  13. An efficient clustering algorithm for partitioning Y-short tandem repeats data

    Directory of Open Access Journals (Sweden)

    Seman Ali

    2012-10-01

    Background: Y-Short Tandem Repeats (Y-STR) data consist of many similar and almost similar objects. This characteristic of Y-STR data causes two problems with partitioning: non-unique centroids and local minima problems. As a result, the existing partitioning algorithms produce poor clustering results. Results: Our new algorithm, called k-Approximate Modal Haplotypes (k-AMH), obtains the highest clustering accuracy scores for five out of six datasets, and produces an equal performance for the remaining dataset. Furthermore, clustering accuracy scores of 100% are achieved for two of the datasets. The k-AMH algorithm records the highest mean accuracy score of 0.93 overall, compared to that of other algorithms: k-Population (0.91), k-Modes-RVF (0.81), New Fuzzy k-Modes (0.80), k-Modes (0.76), k-Modes-Hybrid 1 (0.76), k-Modes-Hybrid 2 (0.75), Fuzzy k-Modes (0.74), and k-Modes-UAVM (0.70). Conclusions: The partitioning performance of the k-AMH algorithm for Y-STR data is superior to that of other algorithms, owing to its ability to solve the non-unique centroids and local minima problems. Our algorithm is also efficient in terms of time complexity, which is recorded as O(km(n-k)) and considered to be linear.

  14. Resource Allocation in Public Cluster with Extended Optimization Algorithm

    OpenAIRE

    Akbar, Z.; Handoko, L. T.

    2007-01-01

    We introduce an optimization algorithm for resource allocation in the LIPI Public Cluster to optimize its usage according to incoming requests from users. The tool is an extended and modified genetic algorithm developed to match the specific nature of a public cluster. We present a detailed analysis of the optimization, and compare the results with the exact calculation. We show that it would be very useful and could realize an automatic decision making system for public clusters.

  15. An ACO Algorithm for Effective Cluster Head Selection

    CERN Document Server

    Sampath, Amritha; Thampi, Sabu M; 10.4304/jait.2.1.50-56

    2011-01-01

    This paper presents an effective algorithm for selecting cluster heads in mobile ad hoc networks using ant colony optimization. A cluster in an ad hoc network consists of a cluster head and cluster members which are at one hop away from the cluster head. The cluster head allocates the resources to its cluster members. Clustering in MANET is done to reduce the communication overhead and thereby increase the network performance. A MANET can have many clusters in it. This paper presents an algorithm which is a combination of the four main clustering schemes- the ID based clustering, connectivity based, probability based and the weighted approach. An Ant colony optimization based approach is used to minimize the number of clusters in MANET. This can also be considered as a minimum dominating set problem in graph theory. The algorithm considers various parameters like the number of nodes, the transmission range etc. Experimental results show that the proposed algorithm is an effective methodology for finding out t...

  16. A HYBRID THINNING ALGORITHM FOR BINARY TOPOGRAPHY MAP

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    A hybrid thinning algorithm for binary topography maps is proposed on the basis of parallel thinning templates in this paper. The algorithm has a high processing speed and the strong ability of noise immunity and preservation of connectivity and skeleton symmetry. Experimental results show that the algorithm can solve the thinning problem of binary maps effectively.

  17. Squeezer: An Efficient Algorithm for Clustering Categorical Data

    Institute of Scientific and Technical Information of China (English)

    何增有; 徐晓飞; 邓胜春

    2002-01-01

    This paper presents a new efficient algorithm for clustering categorical data, Squeezer, which can produce high quality clustering results and at the same time deserve good scalability. The Squeezer algorithm reads each tuple t in sequence, either assigning t to an existing cluster (initially none), or creating t as a new cluster, which is determined by the similarities between t and clusters. Due to its characteristics, the proposed algorithm is extremely suitable for clustering data streams, where given a sequence of points, the objective is to maintain consistently good clustering of the sequence so far, using a small amount of memory and time. Outliers can also be handled efficiently and directly in Squeezer. Experimental results on real-life and synthetic datasets verify the superiority of Squeezer.
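
    The one-pass procedure described above (read each tuple, then either assign it to the most similar existing cluster or start a new one) can be sketched as follows. The abstract does not give the similarity function or the decision rule, so both are assumptions here: similarity is taken as the sum, over attributes, of the fraction of cluster members sharing the tuple's value, and a tuple starts a new cluster when its best similarity falls below a user-chosen threshold. All names and the sample data are hypothetical.

```python
from collections import Counter

def similarity(cluster, record):
    """Assumed measure: per-attribute share of members with the same value, summed."""
    size = cluster["size"]
    return sum(counts[value] / size
               for counts, value in zip(cluster["counts"], record))

def new_cluster(record):
    """Summary of a cluster: member count plus per-attribute value counts."""
    return {"size": 1, "counts": [Counter([value]) for value in record]}

def add_to_cluster(cluster, record):
    cluster["size"] += 1
    for counts, value in zip(cluster["counts"], record):
        counts[value] += 1

def squeezer_like(records, threshold):
    """Single pass over the records; returns one cluster label per record."""
    clusters, labels = [], []
    for record in records:
        best, best_sim = None, -1.0
        for index, cluster in enumerate(clusters):
            sim = similarity(cluster, record)
            if sim > best_sim:
                best, best_sim = index, sim
        if best is None or best_sim < threshold:
            clusters.append(new_cluster(record))
            labels.append(len(clusters) - 1)
        else:
            add_to_cluster(clusters[best], record)
            labels.append(best)
    return labels

# Hypothetical stream of categorical tuples.
stream = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
labels = squeezer_like(stream, threshold=1.0)
print(labels)
```

    Because each cluster is kept only as a size and a set of value counts, the memory used grows with the number of clusters and distinct values rather than with the number of tuples read, which is in the spirit of the data-stream setting the abstract mentions.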

  18. A Hybrid Differential Invasive Weed Algorithm for Congestion Management

    Science.gov (United States)

    Basak, Aniruddha; Pal, Siddharth; Pandi, V. Ravikumar; Panigrahi, B. K.; Das, Swagatam

    This work is dedicated to solve the problem of congestion management in restructured power systems. Nowadays we have open access market which pushes the power system operation to their limits for maximum economic benefits but at the same time making the system more susceptible to congestion. In this regard congestion management is absolutely vital. In this paper we try to remove congestion by generation rescheduling where the cost involved in the rescheduling process is minimized. The proposed algorithm is a hybrid of Invasive Weed Optimization (IWO) and Differential Evolution (DE). The resultant hybrid algorithm was applied on standard IEEE 30 bus system and observed to beat existing algorithms like Simple Bacterial foraging (SBF), Genetic Algorithm (GA), Invasive Weed Optimization (IWO), Differential Evolution (DE) and hybrid algorithms like Hybrid Bacterial Foraging and Differential Evolution (HBFDE) and Adaptive Bacterial Foraging with Nelder Mead (ABFNM).

  19. Co-clustering models, algorithms and applications

    CERN Document Server

    Govaert, Gérard

    2013-01-01

    Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixture

  20. A PRODUCT HYBRID GMRES ALGORITHM FOR NONSYMMETRIC LINEAR SYSTEMS

    Institute of Scientific and Technical Information of China (English)

    Bao-jiang Zhong

    2005-01-01

    It has been observed that the residual polynomials resulting from successive restarting cycles of GMRES(m) may differ from one another meaningfully. In this paper, it is further shown that the polynomials can complement one another harmoniously in reducing the iterative residual. This characterization of GMRES(m) is exploited to formulate an efficient hybrid iterative scheme, which can be widely applied to existing hybrid algorithms for solving large nonsymmetric systems of linear equations. In particular, a variant of the hybrid GMRES algorithm of Nachtigal, Reichel and Trefethen (1992) is presented. It is described how the new algorithm may offer significant performance improvements over the original one.

  1. Hybrid Algorithm for Optimal Load Sharing in Grid Computing

    Directory of Open Access Journals (Sweden)

    A. Krishnan

    2012-01-01

    Problem statement: Grid computing is a fast growing industry, which shares the resources in the organization in an effective manner. Resource sharing requires a more optimized algorithmic structure; otherwise the waiting time and response time are increased and the resource utilization is reduced. Approach: In order to avoid such reduction in the performance of the grid system, an optimal resource sharing algorithm is required. In recent days, many load sharing techniques have been proposed which provide feasibility, but many critical issues are still present in these algorithms. Results: In this study a hybrid algorithm for optimization of load sharing is proposed. The hybrid algorithm contains two components, which are Hash Table (HT) and Distributed Hash Table (DHT). Conclusion: The results of the proposed study show that the hybrid algorithm will optimize the task better than existing systems.

  2. A hybrid genetic algorithm to optimize simple distillation column sequences

    Institute of Scientific and Technical Information of China (English)

    GAN YongSheng; Andreas Linninger

    2004-01-01

    Based on the principles of Genetic Algorithms (GAs), a hybrid genetic algorithm used to optimize simple distillation column sequences was established. A new data structure, a novel arithmetic crossover operator and a dynamic mutation operator were proposed. Together with the feasibility test of distillation columns, they are capable of obtaining the optimum simple column sequence at one time without limitation on the number of mixture components, ideal or non-ideal mixtures and sloppy or sharp splits. Compared with conventional algorithms, this hybrid genetic algorithm avoids solving complicated nonlinear equations and demands less derivative information and computation time. Comparison of the results of this genetic algorithm with those of the Underwood method and the Doherty method shows that this hybrid genetic algorithm is reliable.

  3. MAKHA—A New Hybrid Swarm Intelligence Global Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Ahmed M.E. Khalil

    2015-06-01

    The search for efficient and reliable bio-inspired optimization methods continues to be an active topic of research due to the wide application of the developed methods. In this study, we developed a reliable and efficient optimization method via the hybridization of two bio-inspired swarm intelligence optimization algorithms, namely, the Monkey Algorithm (MA) and the Krill Herd Algorithm (KHA). The hybridization made use of the efficient steps in each of the two original algorithms and provided a better balance between the exploration/diversification steps and the exploitation/intensification steps. The new hybrid algorithm, MAKHA, was rigorously tested with 27 benchmark problems and its results were compared with the results of the two original algorithms. MAKHA proved to be considerably more reliable and more efficient in tested problems.

  4. A Hybrid Algorithm for Satellite Data Transmission Schedule Based on Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    LI Yun-feng; WU Xiao-yue

    2008-01-01

    A hybrid scheduling algorithm based on genetic algorithm is proposed in this paper for reconnaissance satellite data transmission. First, based on a description of satellite data transmission requests, a satellite data transmission task model and a satellite data transmission scheduling problem model are established. Second, the conflicts in scheduling are discussed. According to the meaning of possible conflict, a method to divide the possible conflict task set is given. Third, a hybrid algorithm which consists of genetic algorithm and heuristic information is presented. The heuristic information comes from two concepts, conflict degree and conflict number. Finally, an example shows the algorithm's feasibility and that it performs better than other traditional algorithms.

  5. Constructing Product Ontologies with an Improved Conceptual Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    曹大军; 徐良贤

    2002-01-01

    In a distributed eMarketplace, recommended product ontologies are required for trading between buyers and sellers. Conceptual clustering can be employed to build dynamic recommended product ontologies. Traditional methods of conceptual clustering (e.g. COBWEB or Cluster/2) do not take heterogeneous attributes of a concept into account. Moreover, the result of these methods is clusters rather than recommended concepts. A center recommendation clustering algorithm is provided. According to the values of heterogeneous attributes, recommended product names can be selected from the clusters produced by this algorithm. This algorithm can also create the hierarchical relations between product names. The definitions of product names given by all participants are collected in a distributed eMarketplace. Recommended product ontologies are built. These ontologies include relations and definitions of product names, which come from different participants in the distributed eMarketplace. Finally a case is given to illustrate this method. The result shows that this method is feasible.

  6. An Energy Consumption Optimized Clustering Algorithm for Radar Sensor Networks Based on an Ant Colony Algorithm

    Directory of Open Access Journals (Sweden)

    Jiang Ting

    2010-01-01

    Full Text Available We optimize the cluster structure to solve problems such as the uneven energy consumption of the radar sensor nodes and random cluster head selection in the traditional clustering routing algorithm. According to the defined cost function for clusters, we present the clustering algorithm which is based on radio-free space path loss. In addition, we propose the energy and distance pheromones based on the residual energy and aggregation of the radar sensor nodes. According to bionic heuristic algorithm, a new ant colony-based clustering algorithm for radar sensor networks is also proposed. Simulation results show that this algorithm can get a better balance of the energy consumption and then remarkably prolong the lifetime of the radar sensor network.

  7. Cosine-Based Clustering Algorithm Approach

    Directory of Open Access Journals (Sweden)

    Mohammed A. H. Lubbad

    2012-02-01

    Full Text Available Because many applications require the management of spatial data, clustering large spatial databases is an important problem; it tries to find the densely populated regions in the feature space to be used in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shapes. It must be insensitive to outliers (noise) and to the order of input data. In this paper, Cosine Cluster is proposed based on cosine transformation, which satisfies all the above requirements. Using the multi-resolution property of cosine transforms, arbitrary shape clusters can be effectively identified at different degrees of accuracy. Cosine Cluster is also shown to be highly efficient in terms of time complexity. Experimental results on very large data sets are presented, which show the efficiency and effectiveness of the proposed approach compared to other recent clustering methods.

  8. A functional clustering algorithm for the analysis of neural relationships

    CERN Document Server

    Feldt, S; Hetrick, V L; Berke, J D; Zochowski, M

    2008-01-01

    We formulate a novel technique for the detection of functional clusters in neural data. In contrast to prior network clustering algorithms, our procedure progressively combines spike trains and derives the optimal clustering cutoff in a simple and intuitive manner. To demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. We observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  9. Pixel Intensity Clustering Algorithm for Multilevel Image Segmentation

    Directory of Open Access Journals (Sweden)

    Oludayo O. Olugbara

    2015-01-01

    Full Text Available Image segmentation is an important problem that has received significant attention in the literature. Over the last few decades, many algorithms were developed to solve the image segmentation problem; prominent amongst these are the thresholding algorithms. However, the computational time complexity of thresholding increases exponentially with the number of desired thresholds. A wealth of alternative algorithms, notably those based on particle swarm optimization and evolutionary metaheuristics, were proposed to tackle the intrinsic challenges of thresholding. In addition, clustering based algorithms were developed as multidimensional extensions of thresholding. While these algorithms have demonstrated successful results for fewer thresholds, their computational costs for a large number of thresholds are still a limiting factor. We propose a new clustering algorithm based on linear partitioning of the pixel intensity set and between-cluster variance criterion function for multilevel image segmentation. The results of testing the proposed algorithm on real images from Berkeley Segmentation Dataset and Benchmark show that the algorithm is comparable with state-of-the-art multilevel segmentation algorithms and consistently produces high quality results. The attractive properties of the algorithm are its simplicity, generalization to a large number of clusters, and computational cost effectiveness.
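
    The between-cluster variance criterion mentioned in this abstract can be illustrated with a small sketch. This is not the authors' linear partitioning algorithm; it only evaluates the standard between-cluster (between-class) variance of an intensity histogram for a given set of cut points. The function name and the cut-point convention are assumptions made for illustration.

```python
from typing import Sequence


def between_cluster_variance(hist: Sequence[float], cuts: Sequence[int]) -> float:
    """Between-cluster variance of an intensity histogram for given cut points.

    hist[i] is the number of pixels with intensity i. cuts is a sorted list of
    indices; cluster k covers the intensities in [cuts[k-1], cuts[k]).
    (Hypothetical helper, not taken from the cited paper.)
    """
    total = float(sum(hist))
    if total == 0:
        return 0.0
    global_mean = sum(i * h for i, h in enumerate(hist)) / total
    bounds = [0, *cuts, len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        mass = sum(hist[lo:hi])
        if mass == 0:
            continue  # an empty cluster contributes nothing
        mean = sum(i * hist[i] for i in range(lo, hi)) / mass
        variance += (mass / total) * (mean - global_mean) ** 2
    return variance
```

    A larger value indicates better separated clusters, so a search over cut points would keep the partition that maximizes this quantity.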

  10. A High-Order CFS Algorithm for Clustering Big Data

    Directory of Open Access Journals (Sweden)

    Fanyu Bu

    2016-01-01

    Full Text Available With the development of Internet of Everything such as Internet of Things, Internet of People, and Industrial Internet, big data is being generated. Clustering is a widely used technique for big data analytics and mining. However, most current algorithms are not effective at clustering heterogeneous data, which is prevalent in big data. In this paper, we propose a high-order CFS algorithm (HOCFS) to cluster heterogeneous data by combining the CFS clustering algorithm and the dropout deep learning model, whose functionality rests on three pillars: (i) an adaptive dropout deep learning model to learn features from each type of data, (ii) a feature tensor model to capture the correlations of heterogeneous data, and (iii) a tensor distance-based high-order CFS algorithm to cluster heterogeneous data. Furthermore, we verify our proposed algorithm on different datasets, by comparison with two other clustering schemes, that is, HOPCM and CFS. Results confirm the effectiveness of the proposed algorithm in clustering heterogeneous data.

  11. Meaningful Clustered Forest: an Automatic and Robust Clustering Algorithm

    CERN Document Server

    Tepper, Mariano; Almansa, Andrés

    2011-01-01

    We propose a new clustering method that can be regarded as a numerical method to compute the proximity gestalt. The method analyzes edge length statistics in the MST of the dataset and provides an a contrario cluster detection criterion. The approach is fully parametric on the chosen distance and can detect arbitrarily shaped clusters. The method is also automatic, in the sense that only a single parameter is left to the user. This parameter has an intuitive interpretation as it controls the expected number of false detections. We show that the iterative application of our method can (1) provide robustness to noise and (2) solve a masking phenomenon in which a highly populated and salient cluster dominates the scene and inhibits the detection of less-populated, but still salient, clusters.
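
    The idea of clustering through edge lengths in the minimum spanning tree (MST) can be sketched as follows. This is a deliberately simplified stand-in: it cuts MST edges longer than a fixed threshold, whereas the abstract describes an a contrario detection criterion that is not reproduced here. The names `mst_edges`, `cluster_by_cutting` and the `max_edge` parameter are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def mst_edges(points: Sequence[Point]) -> List[Tuple[float, int, int]]:
    """Prim's algorithm on the complete Euclidean graph; returns (length, i, j)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cluster_by_cutting(points: Sequence[Point], max_edge: float) -> List[int]:
    """Label points by MST connected components after removing long edges."""
    root = list(range(len(points)))  # union-find parent array

    def find(i: int) -> int:
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= max_edge:  # keep short edges, cut long ones
            root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]
```

    Because the components follow chains of short edges, clusters found this way can have arbitrary shapes, which is the property the abstract emphasizes.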

  12. THE USE OF GENETIC ALGORITHM IN DIMENSIONING HYBRID AUTONOMOUS SYSTEMS

    Directory of Open Access Journals (Sweden)

    RUS T.

    2016-03-01

    Full Text Available This paper presents the working principle of genetic algorithms used to dimension autonomous hybrid systems. A case study is presented in which an autonomous hybrid system for a residential house located in Cluj-Napoca is dimensioned and optimized. After the optimization of the autonomous hybrid system, reductions are achieved in the total cost of system investment, in the excess energy produced, and in CO2 emissions.

  13. A hybrid algorithm for unrelated parallel machines scheduling

    Directory of Open Access Journals (Sweden)

    Mohsen Shafiei Nikabadi

    2016-09-01

    Full Text Available In this paper, a new hybrid algorithm based on multi-objective genetic algorithm (MOGA) using simulated annealing (SA) is proposed for scheduling unrelated parallel machines with sequence-dependent setup times, varying due dates, ready times and precedence relations among jobs. Our objective is to minimize makespan (maximum completion time of all machines), number of tardy jobs, total tardiness and total earliness at the same time, which can be more advantageous in a real environment than considering each of the objectives separately. For obtaining an optimal solution, a hybrid algorithm based on MOGA and SA has been proposed in order to gain both good global and local search abilities. Simulation results and four well-known multi-objective performance metrics indicate that the proposed hybrid algorithm outperforms the genetic algorithm (GA) and SA in terms of each objective and significantly in minimizing the total cost of the weighted function.

  14. A novel hybrid algorithm of GSA with Kepler algorithm for numerical optimization

    Directory of Open Access Journals (Sweden)

    Soroor Sarafrazi

    2015-07-01

    Full Text Available It is now well recognized that pure algorithms can be promisingly improved by hybridization with other techniques. One of the relatively new metaheuristic algorithms is the Gravitational Search Algorithm (GSA), which is based on Newton's laws. In this paper, to enhance the performance of GSA, a novel algorithm called “Kepler”, inspired by astrophysics, is introduced. The Kepler algorithm is based on the principle of Kepler's first law. The hybridization of GSA and the Kepler algorithm is an efficient approach to provide much stronger specialization in intensification and/or diversification. The performance of GSA–Kepler is evaluated by applying it to 14 benchmark functions with 20–1000 dimensions and the optimal approximation of a linear system as a practical optimization problem. The results obtained reveal that the proposed hybrid algorithm is robust enough to optimize the benchmark functions and practical optimization problems.

  15. The Refinement Algorithm Consideration in Text Clustering Scheme Based on Multilevel Graph

    Institute of Scientific and Technical Information of China (English)

    CHEN Jian-bin; DONG Xiang-jun; SONG Han-tao

    2004-01-01

    To construct a highly efficient text clustering algorithm, the multilevel graph model and the refinement algorithm used in the uncoarsening phase are discussed. The model is applied to text clustering. The performance of the clustering algorithm is improved by applying the refinement algorithm. The experimental results demonstrate that the multilevel graph text clustering algorithm is usable.

  16. Improved hybrid optimization algorithm for 3D protein structure prediction.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm - PGATS algorithm, which is based on toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve the search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, the protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but the problem can be attributed to a global optimization problem of multi-extremum and multi-parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcoming of a single algorithm, giving full play to the advantage of each algorithm. In the current universal standard sequences, Fibonacci sequences and real protein sequences are certified. Experiments show that the proposed new method outperforms single algorithms on the accuracy of calculating the protein sequence energy value, which is proved to be an effective way to predict the structure of proteins.

  17. A Scalable Clustering Algorithm in Dense Mobile Sensor Networks

    Directory of Open Access Journals (Sweden)

    Jianbo Li

    2011-03-01

    Full Text Available Clustering offers a kind of hierarchical organization to provide scalability and basic performance guarantee by partitioning the network into disjoint groups of nodes. In this paper a scalable and energy efficient clustering algorithm is proposed for dense mobile sensor network scenarios. In the initial cluster formation phase, our proposed scheme features a simple execution process with polynomial time complexity, and eliminates the “frozen time” requirement by introducing some GPS-capable mobile nodes to act as cluster heads. In the following cluster maintenance stage, the maintenance of clusters is asynchronous and event-driven so as to thoroughly eliminate the “ripple effect” brought by node mobility. As a result, local changes in a cluster need not be seen and updated by the entire network, which greatly reduces communication overheads and suits high-mobility environments. Extensive simulations have been conducted and the simulation results reveal that our proposed algorithm successfully achieves its target of incurring much less clustering overhead as well as maintaining a much more stable cluster structure, as compared to the HCC (High Connectivity Clustering) algorithm.

  18. Color Image Segmentation Method Based on Improved Spectral Clustering Algorithm

    OpenAIRE

    Dong Qin

    2014-01-01

    Addressing the high sparsity of image data and the problem of determining the number of clusters, we put forward a color image segmentation algorithm that combines semi-supervised machine learning technology and spectral graph theory. Drawing on related theories and methods of spectral clustering algorithms, we introduce the concept of information entropy to design a method which can automatically optimize the scale parameter value. So it avoids the unstab...

  19. A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics

    Directory of Open Access Journals (Sweden)

    Tonny J. Oyana

    2010-01-01

    Full Text Available The central purpose of this study is to further evaluate the quality of the performance of a new algorithm. The study provides additional evidence on this algorithm that was designed to increase the overall efficiency of the original k-means clustering technique—the Fast, Efficient, and Scalable k-means algorithm (FES-k-means. The FES-k-means algorithm uses a hybrid approach that comprises the k-d tree data structure that enhances the nearest neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. This algorithm was tested using two real datasets and one synthetic dataset. It was employed twice on all three datasets: once on data trained by the innovative MIL-SOM method and then on the actual untrained data in order to evaluate its competence. This two-step approach of data training prior to clustering provides a solid foundation for knowledge discovery and data mining, otherwise unclaimed by clustering methods alone. The benefits of this method are that it produces clusters similar to the original k-means method at a much faster rate as shown by runtime comparison data; and it provides efficient analysis of large geospatial data with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines.

  20. Hybrid ant colony algorithm for traveling salesman problem

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    A hybrid approach based on ant colony algorithm for the traveling salesman problem is proposed, which is an improved algorithm characterized by adding a local search mechanism, a cross-removing strategy and candidate lists. Experimental results show that it is competitive in terms of solution quality and computation time.
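
    The local search step added to the ant colony algorithm can be pictured with a generic 2-opt pass, which removes crossing edges from a tour by reversing segments. Whether the paper's "cross-removing strategy" is exactly 2-opt is an assumption; the sketch below shows only the standard 2-opt move, and the ant colony construction and candidate lists are not included. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

City = Tuple[float, float]


def tour_length(tour: Sequence[int], cities: Sequence[City]) -> float:
    """Length of the closed tour that visits the cities in the given order."""
    n = len(tour)
    return sum(math.dist(cities[tour[k]], cities[tour[(k + 1) % n]]) for k in range(n))


def two_opt(tour: Sequence[int], cities: Sequence[City]) -> List[int]:
    """Repeat improving 2-opt moves until no segment reversal shortens the tour."""
    best = list(tour)
    n = len(best)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = cities[best[i]], cities[best[i + 1]]
                c, d = cities[best[j]], cities[best[(j + 1) % n]]
                # Replace edges (a, b) and (c, d) with (a, c) and (b, d).
                delta = math.dist(a, c) + math.dist(b, d) - math.dist(a, b) - math.dist(c, d)
                if delta < -1e-12:
                    best[i + 1:j + 1] = reversed(best[i + 1:j + 1])
                    improved = True
    return best
```

    In a hybrid scheme of this kind, such a pass would typically be applied to each tour an ant constructs before pheromone is deposited.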

  1. A New Method for Medical Image Clustering Using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Akbar Shahrzad Khashandarag

    2013-01-01

    Full Text Available Segmentation is applied to medical images when their brightness becomes so weak that recognizing tissue borders is difficult. Thus, the exact segmentation of medical images is an essential process in recognizing and curing an illness, and the purpose of clustering in medical images is the recognition of damaged areas in tissues. Different techniques have been introduced for clustering in different fields such as engineering, medicine, data mining and so on. However, there is no standard clustering technique that gives ideal results for all imaging applications. In this paper, a new method combining genetic algorithm and k-means algorithm is presented for clustering medical images. In this combined technique, a variable string length genetic algorithm (VGA) is used for the determination of the optimal cluster centers. The proposed algorithm has been compared with the k-means clustering algorithm. The advantage of the proposed method is the accuracy in selecting the optimal cluster centers compared with the above mentioned technique.

  2. Centronit: Initial Centroid Designation Algorithm for K-Means Clustering

    Directory of Open Access Journals (Sweden)

    Ali Ridho Barakbah

    2014-06-01

    Full Text Available Clustering performance of K-means highly depends on the correctness of initial centroids. Usually initial centroids for K-means clustering are determined randomly, so the determined initial centers may lead to the nearest local minimum, not the global optimum. In this paper, we propose an algorithm, called Centronit, for designation of initial centroids to optimize K-means clustering. The proposed algorithm is based on the calculation of the average distance of the nearest data inside the region of the minimum distance. The initial centroids can be designated by the lowest average distance of each data point. The minimum distance is set by calculating the average distance between the data. This method is also robust to outliers in the data. The experimental results show the effectiveness of the proposed method in improving the clustering results of K-means clustering. Keywords: K-means clustering, initial centroids, K-means optimization.

  3. New clustering algorithm for interconnection of MANET and internet

    Institute of Scientific and Technical Information of China (English)

    万象; 姚尹雄; 王豪行

    2004-01-01

    This paper presents core-agent based clustering (CBC) algorithm, a novel heuristic clustering scheme for interconnection of MANET and Internet using power, movement probability and hop length as constraints. CBC includes two phases as cluster initialization and cluster maintenance. In phase one, the selection of clusterheads obeys the first two constraints, whereas the father node of each clustering node is chosen according to above three ones. Phase two concerns the case of node insertion or removal. Easy access and little alteration of conventional mobile IP are some characters of this algorithm. Simulation results demonstrate that CBC has many advantages as less average hop length, good robustness and less overheads, and the clustered network architecture behaves stably when topology changes.

  4. The Effective Clustering Partition Algorithm Based on the Genetic Evolution

    Institute of Scientific and Technical Information of China (English)

    LIAO Qin; LI Xi-wen

    2006-01-01

    To the problem that it is hard to determine the clustering number and the abnormal points by using the clustering validity function, an effective clustering partition model based on the genetic algorithm is built in this paper. The solution to the problem is formed by the combination of the clustering partition and the encoding samples, and the fitness function is defined by the distances among and within clusters. The clustering number and the samples in each cluster are determined and the abnormal points are distinguished by implementing the triple random crossover operator and the mutation. Based on the known sample data, the results of the novel method and the clustering validity function are compared. Numerical experiments are given and the results show that the novel method is more effective.

  5. An Extended Clustering Algorithm for Statistical Language Models

    CERN Document Server

    Ueberla, J P

    1994-01-01

    Statistical language models frequently suffer from a lack of training data. This problem can be alleviated by clustering, because it reduces the number of free parameters that need to be trained. However, clustered models have the following drawback: if there is "enough" data to train an unclustered model, then the clustered variant may perform worse. On currently used language modeling corpora, e.g. the Wall Street Journal corpus, how do the performances of a clustered and an unclustered model compare? While trying to address this question, we develop the following two ideas. First, to get a clustering algorithm with potentially high performance, an existing algorithm is extended to deal with higher order N-grams. Second, to make it possible to cluster large amounts of training data more efficiently, a heuristic to speed up the algorithm is presented. The resulting clustering algorithm can be used to cluster trigrams on the Wall Street Journal corpus and the language models it produces can compete with exi...

  6. Cost Optimization Using Hybrid Evolutionary Algorithm in Cloud Computing

    Directory of Open Access Journals (Sweden)

    B. Kavitha

    2015-07-01

    Full Text Available The main aim of this research is to design a hybrid evolutionary algorithm for minimizing multiple problems of dynamic resource allocation in cloud computing. Resource allocation is one of the big problems in distributed systems when the client wants to decrease the cost of allocating resources for their task. In order to assign a resource to the task, the client must consider the monetary cost and the computational cost. Allocating resources while considering both costs is difficult. To solve this problem in this study, we divide the client's main task into many subtasks and allocate resources for each subtask instead of selecting a single resource for the main task. The allocation of resources for each subtask is completed through our proposed hybrid optimization algorithm. Here, we hybridize the Binary Particle Swarm Optimization (BPSO) and Binary Cuckoo Search (BCSO) algorithms by considering monetary cost and computational cost, which helps to minimize the cost to the client. Finally, the experimentation is carried out and our proposed hybrid algorithm is compared with the BPSO and BCSO algorithms. We also demonstrate the efficiency of our proposed hybrid optimization algorithm.

  7. Hybrid pre training algorithm of Deep Neural Networks

    Directory of Open Access Journals (Sweden)

    Drokin I. S.

    2016-01-01

    Full Text Available This paper proposes a hybrid algorithm of pre training deep networks, using both marked and unmarked data. The algorithm combines and extends the ideas of Self-Taught learning and pre training of neural networks approaches on the one hand, as well as supervised learning and transfer learning on the other. Thus, the algorithm tries to integrate in itself the advantages of each approach. The article gives some examples of applying of the algorithm, as well as its comparison with the classical approach to pre training of neural networks. These examples show the effectiveness of the proposed algorithm.

  8. Critical dynamics of cluster algorithms in the dilute Ising model

    Science.gov (United States)

    Hennecke, M.; Heyken, U.

    1993-08-01

    Autocorrelation times for thermodynamic quantities at T_C are calculated from Monte Carlo simulations of the site-diluted simple cubic Ising model, using the Swendsen-Wang and Wolff cluster algorithms. Our results show that for these algorithms the autocorrelation times decrease when reducing the concentration of magnetic sites from 100% down to 40%. This is of crucial importance when estimating static properties of the model, since the variances of these estimators increase with autocorrelation time. The dynamical critical exponents are calculated for both algorithms, observing pronounced finite-size effects in the energy autocorrelation data for the algorithm of Wolff. We conclude that, when applied to the dilute Ising model, cluster algorithms become even more effective than local algorithms, for which increasing autocorrelation times are expected.
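
    For readers unfamiliar with the Wolff algorithm named in this abstract, a single cluster update can be sketched as below. The sketch assumes a pure (undiluted) two-dimensional square lattice with periodic boundaries and coupling J = 1, not the site-diluted simple cubic model studied in the paper; the function name and lattice layout are illustrative choices.

```python
import math
import random
from typing import List


def wolff_update(spins: List[List[int]], beta: float, rng: random.Random) -> int:
    """Grow and flip one Wolff cluster in place; returns the cluster size.

    spins is an L x L list of +1/-1 values with periodic boundaries (J = 1).
    """
    size = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta)  # probability of adding an aligned neighbour
    x, y = rng.randrange(size), rng.randrange(size)
    seed_spin = spins[x][y]
    spins[x][y] = -seed_spin  # flipping immediately also marks the site as visited
    stack = [(x, y)]
    flipped = 1
    while stack:
        cx, cy = stack.pop()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = (cx + dx) % size, (cy + dy) % size
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                flipped += 1
    return flipped
```

    Near the critical temperature the clusters span many sites, which is why such updates decorrelate configurations faster than single spin flips.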

  9. Segmentation of Medical Image using Clustering and Watershed Algorithms

    OpenAIRE

    M. C.J. Christ; R.M.S Parvathi

    2011-01-01

    Problem statement: Segmentation plays an important role in medical imaging. Segmentation of an image is the division or separation of the image into dissimilar regions of similar attribute. In this study we proposed a methodology that integrates clustering algorithm and marker controlled watershed segmentation algorithm for medical image segmentation. The use of the conservative watershed algorithm for medical image analysis is pervasive because of its advantages, such as always being able to...

  10. Efficient Cluster Algorithm for CP(N-1) Models

    CERN Document Server

    Beard, B B; Riederer, S; Wiese, U J

    2006-01-01

    Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z = 0.

  11. Efficient cluster algorithm for CP(N-1) models

    Science.gov (United States)

    Beard, B. B.; Pepe, M.; Riederer, S.; Wiese, U.-J.

    2006-11-01

    Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z=0.

  12. Measuring Constraint-Set Utility for Partitional Clustering Algorithms

    Science.gov (United States)

    Davidson, Ian; Wagstaff, Kiri L.; Basu, Sugato

    2006-01-01

    Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.

  13. A dynamic fuzzy clustering method based on genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    ZHENG Yan; ZHOU Chunguang; LIANG Yanchun; GUO Dongwei

    2003-01-01

    A dynamic fuzzy clustering method is presented based on the genetic algorithm. By calculating the fuzzy dissimilarity between samples the essential associations among samples are modeled factually. The fuzzy dissimilarity between two samples is mapped into their Euclidean distance, that is, the high dimensional samples are mapped into the two-dimensional plane. The mapping is optimized globally by the genetic algorithm, which adjusts the coordinates of each sample, and thus the Euclidean distance, to approximate to the fuzzy dissimilarity between samples gradually. A key advantage of the proposed method is that the clustering is independent of the space distribution of input samples, which improves the flexibility and visualization. This method possesses characteristics of a faster convergence rate and more exact clustering than some typical clustering algorithms. Simulated experiments show the feasibility and availability of the proposed method.

  14. SURVEY ON CLUSTERING ALGORITHM AND SIMILARITY MEASURE FOR CATEGORICAL DATA

    Directory of Open Access Journals (Sweden)

    S. Anitha Elavarasi

    2014-01-01

    Full Text Available Learning is the process of generating useful information from a huge volume of data. Learning can be either supervised learning (e.g. classification) or unsupervised learning (e.g. clustering). Clustering is the process of grouping a set of physical objects into classes of similar objects. Objects in the real world consist of both numerical and categorical data. Categorical data are not analyzed as numerical data because of the absence of inherent ordering. This paper describes ten different clustering algorithms, their methodology and the factors influencing their performance. Each algorithm is evaluated using real world datasets and its pros and cons are specified. The various similarity / dissimilarity measures applied to categorical data and their performance are also discussed. The time complexity defines the amount of time taken by an algorithm to perform the elementary operation. The time complexity of various algorithms is discussed and their performance on real world data such as mushroom, zoo, soya bean, cancer, vote, car and iris is measured. In this survey, cluster accuracy and error rate are evaluated for four different clustering algorithms (K-modes, fuzzy K-modes, ROCK and Squeezer), two different similarity measures (DISC and Overlap), and DILCA applied to hierarchical and partitional algorithms.

  15. A Geometric Clustering Algorithm with Applications to Structural Data

    Science.gov (United States)

    Xu, Shutan; Zou, Shuxue

    2015-01-01

    An important feature of structural data, especially those from structural determination and protein-ligand docking programs, is that their distribution could be mostly uniform. Traditional clustering algorithms developed specifically for nonuniformly distributed data may not be adequate for their classification. Here we present a geometric partitional algorithm that could be applied to both uniformly and nonuniformly distributed data. The algorithm is a top-down approach that recursively selects the outliers as the seeds to form new clusters until all the structures within a cluster satisfy a classification criterion. The algorithm has been evaluated on a diverse set of real structural data and six sets of test data. The results show that it is superior to the previous algorithms for the clustering of structural data and is similar to or better than them for the classification of the test data. The algorithm should be especially useful for the identification of the best but minor clusters and for speeding up an iterative process widely used in NMR structure determination. PMID:25517067

  16. Research on retailer data clustering algorithm based on Spark

    Science.gov (United States)

    Huang, Qiuman; Zhou, Feng

    2017-03-01

    Big data analysis is currently a hot topic in the IT field. Spark is a high-reliability, high-performance distributed parallel computing framework for big data sets. The k-means algorithm is one of the classical partitioning methods in clustering. In this paper, we study the k-means clustering algorithm on Spark. First, the principle of the algorithm is analyzed; then a clustering analysis of supermarket customers is carried out experimentally to find their different shopping patterns. The paper also proposes a parallelization of the k-means algorithm on the Spark distributed computing framework and gives a concrete design and implementation scheme. Two years of sales data from a supermarket are used to validate the proposed clustering algorithm and to segment customers; the clustering results are then analyzed to help enterprises adopt different marketing strategies for different customer groups and improve sales performance.
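
    The abstract above does not give the authors' design, so the following is only a minimal sketch of how one k-means iteration can be split into a map step (per-partition partial sums) and a reduce step (merged sums and new centers). It uses plain Python lists in place of Spark partitions; the function names and the toy data are assumptions made for illustration.

```python
def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])))

def map_partition(points, centers):
    """Map step: per-cluster (vector sum, count) for one data partition."""
    partial = {}
    for pt in points:
        j = nearest(pt, centers)
        s, n = partial.get(j, ([0.0] * len(pt), 0))
        partial[j] = ([a + b for a, b in zip(s, pt)], n + 1)
    return partial

def reduce_partials(partials, centers):
    """Reduce step: merge the partial sums and recompute every center."""
    total = {}
    for partial in partials:
        for j, (s, n) in partial.items():
            ts, tn = total.get(j, ([0.0] * len(s), 0))
            total[j] = ([a + b for a, b in zip(ts, s)], tn + n)
    # A cluster that received no points keeps its previous center.
    return [[v / total[j][1] for v in total[j][0]] if j in total else list(c)
            for j, c in enumerate(centers)]

def kmeans(partitions, centers, iterations=10):
    """Run a fixed number of map/reduce rounds over the partitions."""
    for _ in range(iterations):
        partials = [map_partition(p, centers) for p in partitions]
        centers = reduce_partials(partials, centers)
    return centers

# Hypothetical toy data: two partitions holding points from two groups.
partitions = [
    [(1.0, 1.0), (1.2, 0.8), (9.0, 9.0)],
    [(0.8, 1.2), (9.2, 8.8), (8.8, 9.2)],
]
centers = kmeans(partitions, [[0.0, 0.0], [10.0, 10.0]])
```

    Only the small per-cluster sums and counts cross partition boundaries, which is the property that makes this formulation suitable for a distributed framework.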

  17. Multiscale modeling for classification of SAR imagery using hybrid EM algorithm and genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    Xianbin Wen; Hua Zhang; Jianguang Zhang; Xu Jiao; Lei Wang

    2009-01-01

    A novel method that hybridizes the genetic algorithm (GA) and the expectation maximization (EM) algorithm for the classification of synthetic aperture radar (SAR) imagery is proposed, based on the finite Gaussian mixture model (GMM) and the multiscale autoregressive (MAR) model. This algorithm is capable of improving the global optimality and consistency of the classification performance. The experiments on the SAR images show that the proposed algorithm significantly outperforms the standard EM method in classification accuracy.

  18. Big Data Clustering Using Genetic Algorithm On Hadoop Mapreduce

    Directory of Open Access Journals (Sweden)

    Nivranshu Hans

    2015-04-01

    Full Text Available Cluster analysis is used to classify similar objects under the same group. It is one of the most important data mining methods. However, it fails to perform well for big data due to its huge time complexity. For such scenarios, parallelization is a better approach. MapReduce is a popular programming model that enables parallel processing in a distributed environment. However, most clustering algorithms, for instance genetic algorithms, are not naturally parallelizable, owing to their sequential nature. This paper introduces a technique to parallelize GA-based clustering by extending Hadoop MapReduce. An analysis of the proposed approach, evaluating performance gains with respect to a sequential algorithm, is presented. The analysis is based on a real-life large data set.

  19. Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.

    Science.gov (United States)

    He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej

    2011-12-01

    Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
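
    As an illustration of what a multiplicative update for symmetric NMF can look like, the sketch below minimizes the squared Euclidean error between a symmetric nonnegative matrix A and H Hᵀ with a commonly used damped update, H ← H ∘ (1 − β + β (A H) / (H Hᵀ H)). This is not claimed to be the exact α-SNMF or β-SNMF rule of the paper; the damping value β = 0.5, the helper names, and the toy matrix are assumptions.

```python
import random

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frob_error(A, H):
    """Squared Frobenius norm of A - H H^T."""
    R = matmul(H, transpose(H))
    return sum((a - r) ** 2 for ra, rr in zip(A, R) for a, r in zip(ra, rr))

def snmf(A, k, iterations=200, beta=0.5, seed=0, eps=1e-12):
    """Damped multiplicative updates for A ~ H H^T with H >= 0 (assumed rule)."""
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    for _ in range(iterations):
        AH = matmul(A, H)                               # n x k
        HHtH = matmul(H, matmul(transpose(H), H))       # n x k
        H = [[H[i][j] * (1 - beta + beta * AH[i][j] / (HHtH[i][j] + eps))
              for j in range(k)] for i in range(n)]
    return H

# Hypothetical similarity matrix with two obvious blocks.
A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
H = snmf(A, k=2)
```

    For clustering, row i of H can be read as soft memberships of item i in the k groups, which is the sense in which such factorizations support probabilistic clustering.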

  20. A hybrid algorithm for speckle noise reduction of ultrasound images.

    Science.gov (United States)

    Singh, Karamjeet; Ranade, Sukhjeet Kaur; Singh, Chandan

    2017-09-01

    Medical images are contaminated by multiplicative speckle noise, which significantly reduces the contrast of ultrasound images and negatively affects various image interpretation tasks. In this paper, we propose a hybrid denoising approach that combines local and nonlocal information in an efficient manner. The proposed hybrid algorithm consists of three stages. In the first stage, local statistics in the form of a guided filter are used to reduce the effect of speckle noise initially. Then, an improved speckle reducing bilateral filter (SRBF) is developed to further reduce the speckle noise in the medical images. Finally, to reconstruct the diffused edges, we use an efficient post-processing technique that jointly exploits the advantages of the bilateral and nonlocal mean (NLM) filters to attenuate speckle noise. The performance of the proposed hybrid algorithm is evaluated on synthetic, simulated and real ultrasound images. The experiments conducted on various test images demonstrate that the proposed hybrid approach outperforms various traditional speckle reduction approaches, including the recently proposed NLM and optimized Bayesian-based NLM. The results of various quantitative and qualitative measures, and visual inspection of denoised synthetic and real ultrasound images, demonstrate that the proposed hybrid algorithm has strong denoising capability and preserves fine image details, such as the edge of a lesion, better than previously developed methods for speckle noise reduction. The denoising and edge-preserving capability of the hybrid algorithm is far better than that of existing traditional and recently proposed speckle reduction (SR) filters. The success of the proposed algorithm would help lay the foundation for inventing hybrid algorithms for denoising ultrasound images. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. A Hybrid Algorithm for Strip Packing Problem with Rotation Constraint

    Directory of Open Access Journals (Sweden)

    Chen Huan

    2016-01-01

    Full Text Available Strip packing is a well-known NP-hard problem that has been widely applied in engineering fields. This paper considers a two-dimensional orthogonal strip packing problem. Until now, some exact algorithms and, mainly, heuristics have been proposed for the two-dimensional orthogonal strip packing problem. This paper proposes a two-stage hybrid algorithm for it. In the first stage, a heuristic algorithm based on a layering idea is developed to construct a solution. In the second stage, a great deluge algorithm is used to search further for a better solution. Computational results on several classes of benchmark problems reveal that the hybrid algorithm improves on the results of the layer heuristic and can compete with other heuristics from the literature.
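
    The second stage relies on the great deluge idea: a candidate is accepted when its cost does not exceed a "water level" that is lowered steadily. The sketch below shows that acceptance rule on a toy one-dimensional problem rather than on strip packing; the variant that also accepts any non-worsening move, the function signature, and the toy cost function are assumptions made for illustration.

```python
import random

def great_deluge(initial, cost, neighbor, decay, iterations, seed=0):
    """Minimize `cost` with a great deluge acceptance rule (illustrative variant)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    level = current_cost  # initial water level
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        c = cost(candidate)
        # Accept non-worsening moves, or any move not above the water level.
        if c <= current_cost or c <= level:
            current, current_cost = candidate, c
            if c < best_cost:
                best, best_cost = candidate, c
        level -= decay  # lower the level at a fixed rate
    return best, best_cost

# Hypothetical toy problem: find the integer closest to 7 by unit steps.
best, best_cost = great_deluge(
    initial=0,
    cost=lambda x: (x - 7) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    decay=0.05,
    iterations=2000,
)
```

    In a packing setting, `initial` would be the solution built by the first-stage heuristic and `neighbor` a small change to it, such as swapping two items in the placement order.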

  2. Hybrid Ant Algorithm and Applications for Vehicle Routing Problem

    Science.gov (United States)

    Xiao, Zhang; Jiang-qing, Wang

    Ant colony optimization (ACO) is a metaheuristic method inspired by the behavior of real ant colonies. ACO has been successfully applied to several combinatorial optimization problems, but it has some shortcomings, such as slow computing speed and local convergence. For solving the Vehicle Routing Problem (VRP), we propose a Hybrid Ant Algorithm (HAA) to improve both the performance of the algorithm and the quality of solutions. The proposed algorithm combines the advantages of the Nearest Neighbor (NN) heuristic and ACO for solving the VRP; it also expands the scope of the solution space and improves the global search ability of the algorithm by introducing a mutation operation, incorporating 2-opt heuristics, and adjusting the configuration of parameters dynamically. Computational results indicate that the hybrid ant algorithm can effectively obtain optimal solutions of the VRP.
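
    Of the components named above, the 2-opt heuristic is the easiest to show in isolation: it repeatedly reverses a segment of a route whenever doing so shortens it. The sketch below applies it to a single closed tour; how the paper embeds 2-opt in the ant algorithm is not stated in the abstract, and the function names and the four-point example are assumptions.

```python
import math

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    """Reverse segments while any reversal shortens the tour (start node fixed)."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, pts) < tour_length(tour, pts) - 1e-12:
                    tour, improved = candidate, True
    return tour

# Hypothetical example: corners of a unit square visited in a crossing order.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
crossing = [0, 2, 1, 3]
result = two_opt(crossing, pts)
```

    In a vehicle routing setting this would typically be applied to each vehicle's route separately, with the depot as the fixed start node.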

  3. An improved algorithm for clustering gene expression data.

    Science.gov (United States)

    Bandyopadhyay, Sanghamitra; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2007-11-01

    Recent advancements in microarray technology allow simultaneous monitoring of the expression levels of a large number of genes over different time points. Clustering is an important tool for analyzing such microarray data, typical properties of which are its inherent uncertainty, noise and imprecision. In this article, a two-stage clustering algorithm, which employs a recently proposed variable string length genetic scheme and a multiobjective genetic clustering algorithm, is proposed. It is based on the novel concept of points having significant membership to multiple classes. An iterated version of the well-known Fuzzy C-Means is also utilized for clustering. The significant superiority of the proposed two-stage clustering algorithm as compared to the average linkage method, Self Organizing Map (SOM) and a recently developed weighted Chinese restaurant-based clustering method (CRC), widely used methods for clustering gene expression data, is established on a variety of artificial and publicly available real life data sets. The biological relevance of the clustering solutions is also analyzed.

  4. Improved insensitive to input parameters trajectory clustering algorithm

    Institute of Scientific and Technical Information of China (English)

    Jiashun Chen; Dechang Pi

    2013-01-01

    The existing trajectory clustering algorithm (TRACLUS) is sensitive to the input parameters ε and MinLns: a small change in a parameter value can produce entirely different clustering results. To address this vulnerability, a shielding parameters sensitivity trajectory cluster (SPSTC) algorithm, which is insensitive to the input parameters, is proposed. First, definitions of the core distance and reachable distance of a line segment are presented, and the algorithm generates a cluster sorting according to the core distance and reachable distance. Second, the reachable plots of the line segment sets are constructed according to the cluster sorting and reachable distance. Third, a parameterized sequence is extracted from the reachable plot, and the final trajectory clusters are obtained from the parameterized sequence. The parameterized sequence represents the inner cluster structure of the trajectory data. Experiments on real data sets and test data sets show that the SPSTC algorithm effectively reduces the sensitivity to the input parameters while obtaining trajectory clusters of better quality.

  5. Multilayer Traffic Network Optimized by Multiobjective Genetic Clustering Algorithm

    Science.gov (United States)

    Wen, Feng; Gen, Mitsuo; Yu, Xinjie

    This paper introduces a multilayer traffic network model and a traffic network clustering method for solving the route selection problem (RSP) in car navigation systems (CNS). The purpose of the proposed method is to substantially reduce the computation time of route selection, with an acceptable loss of accuracy, by preprocessing the large traffic network into a new network form. By using clustering, the proposed approach preprocesses the traffic network further than the traditional hierarchical network method does. The traffic network clustering considers two criteria. We specify a genetic clustering algorithm for traffic network clustering and use NSGA-II to calculate the multiobjective Pareto optimal set. The proposed method can overcome the size limitations when solving route selection in CNS. Solutions provided by the proposed algorithm are compared with the optimal solutions to analyze and quantify the loss of accuracy.

  6. AN APPLICATION OF HYBRID CLUSTERING AND NEURAL BASED PREDICTION MODELLING FOR DELINEATION OF MANAGEMENT ZONES

    Directory of Open Access Journals (Sweden)

    Babankumar S. Bansod

    2011-02-01

    Full Text Available Starting from descriptive data on crop yield and various other properties, the aim of this study is to reveal trends in soil behaviour, such as crop yield. The study has been carried out by developing a web application that uses a well-known technique, cluster analysis. The cluster analysis revealed linkages between soil classes for the same field as well as between different fields, which can be partly attributed to crop rotation and the determination of variable soil input rates. A hybrid clustering algorithm has been developed that takes into account the traits of two clustering technologies: (i) hierarchical clustering and (ii) K-means clustering. This hybrid clustering algorithm is applied to sensor-gathered data about soil, and the data are analysed, resulting in the formation of well-delineated management zones based on various properties of soil, such as ECa, crop yield, etc. One of the purposes of the study was to identify the main factors affecting the crop yield, and the results obtained were validated with existing techniques. To accomplish this purpose, geo-referenced soil information has been examined. Also, based on this data, a statistical method has been used to classify and characterize the soil behaviour. This is done using a prediction model developed to predict the unknown behaviour of clusters based on the known behaviour of other clusters. In predictive modeling, data has been collected for the relevant predictors, a statistical model has been formulated, predictions were made, and the model can be validated (or revised) as additional data becomes available. The model used in the web application has been formed taking into account a neural network based minimum Hamming distance criterion.

  7. Morphology of Open Clusters NGC 1857 and Czernik 20 using Clustering Algorithms

    CERN Document Server

    Bhattacharya, Souradeep; Pandaokar, Samay; Singh, Parikshit Kishor

    2016-01-01

    The morphology and cluster membership of the Galactic open clusters - Czernik 20 and NGC 1857 were analyzed using two different clustering algorithms. We present the maiden use of density-based spatial clustering of applications with noise (DBSCAN) to determine open cluster morphology from spatial distribution. The region of analysis has also been spatially classified using a statistical membership determination algorithm. We utilized near infrared (NIR) data for a suitably large region around the clusters from the United Kingdom Infrared Deep Sky Survey Galactic Plane Survey star catalogue database, and also from the Two Micron All Sky Survey star catalogue database. The densest regions of the cluster morphologies (1 for Czernik 20 and 2 for NGC 1857) thus identified were analyzed with a K-band extinction map and color-magnitude diagrams (CMDs). To address significant discrepancy in known distance and reddening parameters, we carried out field decontamination of these CMDs and subsequent isochrone fitting of...
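
    For readers unfamiliar with DBSCAN, the following minimal sketch shows the core idea on two-dimensional positions: points with at least `min_pts` neighbours within radius `eps` seed clusters, clusters grow through such dense points, and everything else is labelled noise. It is a generic textbook-style version, not the authors' pipeline; the parameter values and the toy coordinates are assumptions.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Neighbourhood includes the point itself, as in the usual definition.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical positions: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

    Because no number of clusters is fixed in advance and sparse points are set aside as noise, this kind of method suits spatial data in which a dense group sits in a field of unrelated points.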

  8. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    CERN Document Server

    Emmons, Scott; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie network clustering. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms -- Blondel, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 o...
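
    To make the three stand-alone metrics concrete, the sketch below computes modularity, coverage, and per-cluster conductance for an undirected, unweighted graph without self-loops, using the standard textbook definitions. The exact variants used in the paper may differ, and the function name and the two-triangle example graph are assumptions.

```python
def cluster_metrics(edges, membership):
    """Modularity, coverage and per-cluster conductance of a node partition."""
    m = len(edges)
    degree, internal = {}, {}
    intra = 0
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if membership[u] == membership[v]:
            intra += 1
            internal[membership[u]] = internal.get(membership[u], 0) + 1
    # Volume of a cluster = sum of the degrees of its nodes.
    volume = {}
    for node, d in degree.items():
        volume[membership[node]] = volume.get(membership[node], 0) + d
    modularity = sum(internal.get(c, 0) / m - (vol / (2 * m)) ** 2
                     for c, vol in volume.items())
    coverage = intra / m  # fraction of edges that fall inside clusters
    conductance = {}
    for c, vol in volume.items():
        cut = vol - 2 * internal.get(c, 0)   # edges leaving the cluster
        denom = min(vol, 2 * m - vol)
        conductance[c] = cut / denom if denom else 0.0
    return modularity, coverage, conductance

# Hypothetical graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
modularity, coverage, conductance = cluster_metrics(edges, membership)
```

    Higher modularity and coverage, and lower conductance, indicate a partition whose clusters are more internally connected than connected to each other.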

  9. Sampling Within k-Means Algorithm to Cluster Large Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Bejarano, Jeremy [Brigham Young University; Bose, Koushiki [Brown University; Brannan, Tyler [North Carolina State University; Thomas, Anita [Illinois Institute of Technology; Adragni, Kofi [University of Maryland; Neerchal, Nagaraj [University of Maryland; Ostrouchov, George [ORNL

    2011-08-01

    Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the amount of data analyzed. We perform a simulation study to compare our sampling-based k-means to the standard k-means algorithm by analyzing both the speed and accuracy of the two methods. Results show that our algorithm is significantly more efficient than the existing algorithm with comparable accuracy. Further work on this project might include a more comprehensive study both on more varied test datasets and on real weather datasets. This is especially important considering that this preliminary study was performed on rather tame datasets. Such a study should also analyze the performance of the algorithm for varied values of k. Lastly, this paper showed that the algorithm was accurate for relatively low sample sizes. We would like to analyze this further to see how accurate the algorithm is for even lower sample sizes. We could find the lowest sample sizes, by manipulating width and confidence level, for which the algorithm would be acceptably accurate. In order for our algorithm to be a success, it needs to meet two benchmarks: match the accuracy of the standard k-means algorithm and significantly reduce runtime. Both goals are accomplished for all six datasets analyzed. However, on datasets of three and four dimensions, as the data becomes more difficult to cluster, both algorithms fail to obtain the correct classifications on some trials. Nevertheless, our algorithm consistently matches the performance of the standard algorithm while being remarkably more efficient in runtime. Therefore, we conclude that analysts can use our algorithm, expecting accurate results in considerably less time.

  10. GDCluster: A General Decentralized Clustering Algorithm

    NARCIS (Netherlands)

    Mashayekhi, Hoda; Habibi, Jafar; Khalafbeigi, Tania; Voulgaris, Spyros; van Steen, Martinus Richardus

    In many popular applications like peer-to-peer systems, large amounts of data are distributed among multiple sources. Analysis of this data and identifying clusters is challenging due to processing, storage, and transmission costs. In this paper, we propose GDCluster, a general fully decentralized

  11. A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering

    Science.gov (United States)

    Chahine, Firas Safwan

    2012-01-01

    Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…

  13. A Novel Hybrid Algorithm for Task Graph Scheduling

    Directory of Open Access Journals (Sweden)

    Vahid Majid Nezhad

    2011-03-01

    Full Text Available One of the important problems in multiprocessor systems is Task Graph Scheduling. Task Graph Scheduling is an NP-Hard problem. Both learning automata and genetic algorithms are search tools which are used for solving many NP-Hard problems. In this paper a new hybrid method based on Genetic Algorithm and Learning Automata is proposed. The proposed algorithm begins with an initial population of randomly generated chromosomes and after some stages, each chromosome maps to an automaton. Experimental results show the superiority of the proposed algorithm over the current approaches.

  14. An Effective Hybrid Optimization Algorithm for Capacitated Vehicle Routing Problem

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Capacitated vehicle routing problem (CVRP) is an important combinatorial optimization problem. However, it is quite difficult to achieve an optimal solution with the traditional optimization methods owing to the high computational complexity. A hybrid algorithm was developed to solve the problem, in which an artificial immune clonal algorithm (AICA) makes use of the global search ability to search the optimal results and simulated annealing (SA) algorithm employs certain probability to avoid becoming trapped in a local optimum. The results obtained from the computational study show that the proposed algorithm is a feasible and effective method for capacitated vehicle routing problem.

  15. A Parallel Genetic Simulated Annealing Hybrid Algorithm for Task Scheduling

    Institute of Scientific and Technical Information of China (English)

    SHU Wanneng; ZHENG Shijue

    2006-01-01

    This paper combines the advantages of the genetic algorithm and simulated annealing to bring forward a parallel genetic simulated annealing hybrid algorithm (PGSAHA), which is applied to the task scheduling problem in grid computing. It first generates a new group of individuals through genetic operations such as reproduction, crossover and mutation, and then applies simulated annealing independently to each of the generated individuals. When the temperature in the cooling process no longer falls, the result is the overall optimal solution. From the analysis and experimental results, it is concluded that this algorithm is superior to the genetic algorithm and simulated annealing.

  16. An Efficient Hybrid Algorithm for Mining Web Frequent Access Patterns

    Institute of Scientific and Technical Information of China (English)

    ZHAN Li-qiang; LIU Da-xin

    2004-01-01

    In this paper, we propose an efficient hybrid algorithm, WDHP, for mining frequent access patterns. WDHP adopts the techniques of DHP to optimize its performance, namely using a hash table to filter the candidate set and trimming the database. Whenever the database is trimmed to a size less than a specified threshold, the algorithm puts the database into main memory by constructing a tree and finds frequent patterns on the tree. The experiment shows that WDHP outperforms the DHP algorithm and the main-memory-based WAP algorithm in execution efficiency.

  17. A Novel Hybrid Algorithm for Task Graph Scheduling

    CERN Document Server

    Nezhad, Vahid Majid; Efimov, Evgueni

    2011-01-01

    One of the important problems in multiprocessor systems is Task Graph Scheduling. Task Graph Scheduling is an NP-Hard problem. Both learning automata and genetic algorithms are search tools which are used for solving many NP-Hard problems. In this paper a new hybrid method based on Genetic Algorithm and Learning Automata is proposed. The proposed algorithm begins with an initial population of randomly generated chromosomes and after some stages, each chromosome maps to an automaton. Experimental results show the superiority of the proposed algorithm over the current approaches.

  18. A New Class of Hybrid Particle Swarm Optimization Algorithm

    Institute of Scientific and Technical Information of China (English)

    Da-Qing Guo; Yong-Jin Zhao; Hui Xiong; Xiao Li

    2007-01-01

    A new class of hybrid particle swarm optimization (PSO) algorithm is developed to address the premature convergence that occurs when some particles in standard PSO fall into stagnation. In this algorithm, the linearly decreasing inertia weight (LDIW) technique and the mutative scale chaos optimization algorithm (MSCOA) are combined with standard PSO; they are used to balance the global and local exploration abilities and to enhance the local searching ability, respectively. In order to evaluate the performance of the new method, three benchmark functions are used. The simulation results confirm that the proposed algorithm can greatly enhance the searching ability and effectively mitigate premature convergence.
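
    The LDIW component can be illustrated on its own: the inertia weight w falls linearly from a large value (favouring global exploration) to a small one (favouring local refinement) over the run. The sketch below is a plain PSO with that schedule and omits the chaos-optimization (MSCOA) part entirely; the weight range 0.9 to 0.4, the acceleration constants, and the sphere test function are assumptions, not values taken from the paper.

```python
import random

def pso_ldiw(f, dim, bounds, particles=20, iterations=200,
             w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, seed=0):
    """Minimize f with PSO using a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vs = [[0.0] * dim for _ in range(particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [f(x) for x in xs]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(iterations):
        # Inertia weight decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(1, iterations - 1)
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                # Keep the particle inside the search box.
                xs[i][d] = min(hi, max(lo, xs[i][d] + vs[i][d]))
            val = f(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Hypothetical benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_val = pso_ldiw(sphere, dim=2, bounds=(-5.0, 5.0))
```

    A hybrid of the kind described would additionally perturb stagnating particles with a chaotic local search, which this sketch leaves out.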

  19. Effective FCM noise clustering algorithms in medical images.

    Science.gov (United States)

    Kannan, S R; Devi, R; Ramathilagam, S; Takezawa, K

    2013-02-01

    The main motivation of this paper is to introduce a class of robust non-Euclidean distance measures for the original data space in order to derive new objective functions and thus cluster the non-Euclidean structures in data, enhancing the robustness of the original clustering algorithms to noise and outliers. The new objective functions of the proposed algorithms are realized by incorporating the noise clustering concept into the entropy-based fuzzy C-means algorithm, with a suitable noise distance that captures information about noisy data in the clustering process. The paper obtains initial cluster prototypes using a prototype initialization method, so that the final result is reached in fewer iterations. To evaluate the performance of the proposed methods in reducing the noise level, experimental work has been carried out with a synthetic image corrupted by Gaussian noise. The superiority of the proposed methods has been examined through an experimental study on medical images. The experimental results show that the proposed algorithms perform significantly better than the standard existing algorithms. The accurate classification percentage of the proposed fuzzy C-means segmentation method is obtained using the silhouette validity index.

  20. Robustness of the ATLAS pixel clustering neural network algorithm

    CERN Document Server

    The ATLAS collaboration

    2016-01-01

    Proton-proton collisions at the energy frontier put strong constraints on track reconstruction algorithms. In the ATLAS track reconstruction algorithm, an artificial neural network is utilised to identify and split clusters of neighbouring read-out elements in the ATLAS pixel detector created by multiple charged particles. The robustness of the neural network algorithm is presented, probing its sensitivity to uncertainties in the detector conditions. The robustness is studied by evaluating the stability of the algorithm's performance under a range of variations in the inputs to the neural networks. Within reasonable variation magnitudes, the neural networks prove to be robust to most variation types.

  1. Silver cluster-biomolecule hybrids: from basics towards sensors.

    Science.gov (United States)

    Bonačić-Koutecký, Vlasta; Kulesza, Alexander; Gell, Lars; Mitrić, Roland; Antoine, Rodolphe; Bertorelle, Franck; Hamouda, Ramzi; Rayane, Driss; Broyer, Michel; Tabarin, Thibault; Dugourd, Philippe

    2012-07-14

    We focus on the functional role of small silver clusters in model hybrid systems involving peptides in the context of a new generation of nanostructured materials for biosensing. The optical properties of hybrids in the gas phase and on a support will be addressed with the aim to bridge fundamental and application aspects. We show that extension and enhancement of absorption of peptides can be achieved by small silver clusters due to the interaction of intense intracluster excitations with the π-π* excitations of chromophoric amino acids. Moreover, we demonstrate that the binding of a peptide to a supported silver cluster can be detected by the optical fingerprint. This illustrates that supported silver clusters can serve as building blocks for biosensing materials. Moreover, the clusters can be used simultaneously to immobilize biomolecules and to increase the sensitivity of detection, thus replacing the standard use of organic dyes and providing label-free detection. Complementary to that, we show that protected silver clusters containing a cluster core and a shell liganded by thiolates exhibit absorption properties with intense transitions in the visible regime which are also suitable for biosensing applications.

  2. World Wide Web Metasearch Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Adina LIPAI

    2008-01-01

    Full Text Available As the storage capacity and processing speed of search engines grow to keep up with the constant expansion of the World Wide Web, the user faces an ever longer list of results for a given query. A simple query composed of common words sometimes has hundreds or even thousands of results, making it practically impossible for the user to check all of them in order to identify a particular site. Even when the list of results is presented to the user ordered by rank, most of the time this is not sufficient to help the user identify the sites most relevant to the query. The concept of search result clustering was introduced as a solution to this situation. The process of clustering search results consists of building thematically homogeneous groups from the initial list of results provided by classic search tools, using characteristics present within the initial results, without any kind of predefined categories.

  3. Efficient Clustering of Web Search Results Using Enhanced Lingo Algorithm

    Directory of Open Access Journals (Sweden)

    M. Manikantan

    2015-02-01

    Full Text Available Web query optimization is the focus of recent research and development efforts. To fetch the required information, users rely on search engines and sometimes on website interfaces. One approach is search engine optimization, which website developers use to popularize their websites through search engine results. Clustering is a main task of the explorative data mining process and a common technique for grouping web search results into different categories based on specific web contents. A clustering search engine called Lingo uses only snippets to cluster the documents. Although this method takes less time to cluster the documents, it cannot produce clusters of good quality. This study focuses on clustering all documents by first applying semantic similarity between words and then applying a modified Lingo algorithm, so as to produce clusters of good quality in less time.

  4. AN IMPROVED FUZZY CLUSTERING ALGORITHM FOR MICROARRAY IMAGE SPOTS SEGMENTATION

    Directory of Open Access Journals (Sweden)

    V.G. Biju

    2015-11-01

    Full Text Available An automatic cDNA microarray image processing method using an improved fuzzy clustering algorithm is presented in this paper. The proposed spot segmentation algorithm uses the gridding technique developed earlier by the authors to find the co-ordinates of each spot in an image. Automatic cropping of spots from the microarray image is done using these co-ordinates. The present paper proposes an improved fuzzy clustering algorithm, Possibility fuzzy local information c means (PFLICM), to segment the spot foreground (FG) from the background (BG). PFLICM improves the fuzzy local information c means (FLICM) algorithm by incorporating the typicality of a pixel along with gray level information and local spatial information. The performance of the algorithm is validated using a set of simulated cDNA microarray images with different levels of added AWGN noise. The strength of the algorithm is tested by computing parameters such as the Segmentation matching factor (SMF), Probability of error (pe), Discrepancy distance (D) and Normal mean square error (NMSE). The SMF value obtained for the PFLICM algorithm shows an improvement of 0.9 % and 0.7 % for high noise and low noise microarray images respectively compared to the FLICM algorithm. The PFLICM algorithm is also applied to real microarray images and gene expression values are computed.

  5. G/SPLINES: A hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's genetic algorithm

    Science.gov (United States)

    Rogers, David

    1991-01-01

    G/SPLINES are a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.

  6. Functional clustering algorithm for the analysis of dynamic network data

    Science.gov (United States)

    Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.

    2009-05-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  7. Application of genetic algorithms to hydrogenated silicon clusters

    Indian Academy of Sciences (India)

    N Chakraborti; R Prasad

    2003-01-01

    We discuss the application of biologically inspired genetic algorithms to determine the ground state structures of a number of Si–H clusters. The total energy of a given configuration of a cluster has been obtained by using a non-orthogonal tight-binding model, and the energy minimization has been carried out by using genetic algorithms and their recent variant, differential evolution. Our results for ground state structures and cohesive energies for Si–H clusters are in good agreement with the earlier work conducted using the simulated annealing technique. We find that the results obtained by genetic algorithms are comparable to, and often better than, the results obtained by the simulated annealing technique.

  8. Spin chain simulations with a meron cluster algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Boyer, T. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Ecole Normale Superieure de Cachan (France); Bietenholz, W. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC; Wuilloud, J. [Humboldt-Universitaet, Berlin (Germany). Inst. fuer Physik]|[Geneve Univ. (Switzerland). Dept. de Physique Theorique

    2007-01-15

    We apply a meron cluster algorithm to the XY spin chain, which describes a quantum rotor. This is a multi-cluster simulation supplemented by an improved estimator, which deals with objects of half-integer topological charge. This method is powerful enough to provide precise results for the model with a θ-term; it is therefore one of the rare examples where a system with a complex action can be solved numerically. In particular we measure the correlation length, as well as the topological and magnetic susceptibility. We discuss the algorithmic efficiency in view of the critical slowing down. The excellent performance that we observe strongly motivates work on new applications of meron cluster algorithms in higher dimensions. (orig.)

  9. Adaptive Weighted Clustering Algorithm for Mobile Ad-hoc Networks

    Directory of Open Access Journals (Sweden)

    Adwan Yasin

    2016-04-01

    Full Text Available In this paper we present a new algorithm for clustering MANETs that considers several parameters. It is a new adaptive load balancing technique for clustering Mobile Ad-hoc Networks (MANET). A MANET is a special kind of wireless network in which no central management exists and the nodes cooperatively manage themselves and maintain connectivity. The algorithm takes into account the local capabilities of each node, the remaining battery power, the degree of connectivity and finally the power consumption based on the average distance between nodes and the candidate cluster head. The proposed algorithm efficiently decreases the overhead in the network, which enhances the overall MANET performance. Reducing the maintenance time of broken routes makes the network more stable and reliable. Saving the power of the nodes also helps guarantee a consistent and reliable network.
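
    The parameters listed in this abstract can be combined into a single score per node. The sketch below is a hypothetical illustration only: the `Node` fields, the equal default weights, and the linear scoring rule are assumptions made here and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    capability: float    # local capability, assumed normalized to [0, 1]
    battery: float       # remaining battery power, assumed normalized to [0, 1]
    degree: int          # number of one-hop neighbours
    avg_distance: float  # mean distance to neighbours (proxy for transmit power)


def head_score(node, max_degree, max_distance, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four criteria; higher means a better head candidate."""
    w_cap, w_bat, w_deg, w_dist = weights
    degree_term = node.degree / max_degree if max_degree else 0.0
    # Shorter average distance means lower power consumption, so invert it.
    distance_term = 1.0 - (node.avg_distance / max_distance if max_distance else 0.0)
    return (w_cap * node.capability + w_bat * node.battery
            + w_deg * degree_term + w_dist * distance_term)


def elect_cluster_head(nodes, weights=(0.25, 0.25, 0.25, 0.25)):
    """Return the node with the highest weighted score."""
    max_degree = max(n.degree for n in nodes)
    max_distance = max(n.avg_distance for n in nodes)
    return max(nodes, key=lambda n: head_score(n, max_degree, max_distance, weights))
```

    An adaptive scheme would presumably recompute the scores as battery levels and neighbourhoods change; that part is not sketched here.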

  10. A Hybrid Backtracking Search Optimization Algorithm with Differential Evolution

    Directory of Open Access Journals (Sweden)

    Lijin Wang

    2015-01-01

    Full Text Available The backtracking search optimization algorithm (BSA) is a new nature-inspired method which possesses a memory to take advantage of experiences gained from the previous generation to guide the population to the global optimum. BSA is capable of solving multimodal problems, but it converges slowly and exploits solutions poorly. The differential evolution (DE) algorithm is a robust evolutionary algorithm and has a fast convergence speed in the case of exploitive mutation strategies that utilize the information of the best solution found so far. In this paper, we propose a hybrid backtracking search optimization algorithm with differential evolution, called HBD. In HBD, DE with an exploitive strategy is used to accelerate convergence by optimizing one worse individual, chosen according to its probability, at each iteration. A suite of 28 benchmark functions is employed to verify the performance of HBD, and the results show the improvement in effectiveness and efficiency of the hybridization of BSA and DE.

  11. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    Full Text Available In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited by their computational complexity, noise resistance and robustness. Furthermore, traditional methods focus more on the adjacent spatial context, which makes it hard to apply them to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  12. Energy Efficient Homogenous Clustering and Cluster Head Selection Algorithm for WSN

    Directory of Open Access Journals (Sweden)

    Ganeshayya I. Shidaganti

    2013-02-01

    Full Text Available Wireless sensor networks (WSNs) are energy and resource constrained networks made up of small electronic devices called sensor nodes. Each sensor node is capable of sensing, computing and transmitting data from one node to another until the data reach the base station. Each node monitors physical or environmental conditions, depending on the application, and communicates with nearby nodes via radio broadcast. Radio transmission and reception consume a lot of energy in a wireless sensor network (WSN); thus, one of the important issues in wireless sensor networks is the inherently limited battery power of the sensor nodes. Battery power is therefore a crucial parameter in algorithm design for maximizing the lifespan of sensor nodes. Much research has been done in recent years in the area of low power routing protocols, but many design options remain open for improvement, and further research targeted at specific applications is needed. In this paper, we propose a new approach to energy-efficient homogeneous clustering and cluster head selection for wireless sensor networks in which the lifespan of the network is increased by ensuring a homogeneous distribution of nodes in the clusters. In this clustering algorithm, energy efficiency is distributed and network performance is improved by selecting cluster heads on the basis of the residual energy of existing cluster heads, holdback value, and nearest hop distance of the node. In the proposed clustering algorithm, the cluster members are uniformly distributed and the life of the network is further extended.

  13. SOLUTION OF THE SATELLITE TRANSFER PROBLEM WITH HYBRID MEMETIC ALGORITHM

    Directory of Open Access Journals (Sweden)

    A. V. Panteleyev

    2014-01-01

    Full Text Available This paper presents a hybrid memetic algorithm (MA) for finding the optimal program control of nonlinear continuous deterministic systems. It is based on the concept of the meme, which is one of the promising solutions obtained in the course of the search for extrema. A software package implementing the proposed algorithm is written in C#. The solution of the satellite transfer problem is presented.

  14. A New Hybrid Watermarking Algorithm for Images in Frequency Domain

    Directory of Open Access Journals (Sweden)

    AhmadReza Naghsh-Nilchi

    2008-03-01

    Full Text Available In recent years, digital watermarking has become a popular technique for protecting the copyright of digital images by hiding secret information in them. The goal of this paper is to develop a hybrid watermarking algorithm. The algorithm uses DCT coefficients and DWT coefficients to embed the watermark, and the extraction procedure is blind. The proposed approach is robust to a variety of signal distortions, such as JPEG compression, image cropping and scaling.

  15. A Hybrid Genetic Algorithm for the Job Shop Scheduling Problem

    OpenAIRE

    Gonçalves, José Fernando; Mendes, J. J. M.; Resende, Maurício G. C.

    2005-01-01

    This paper presents a hybrid genetic algorithm for the Job Shop Scheduling problem. The chromosome representation of the problem is based on random keys. The schedules are constructed using a priority rule in which the priorities are defined by the genetic algorithm. Schedules are constructed using a procedure that generates parameterized active schedules. After a schedule is obtained a local search heuristic is applied to improve the solution. The approach is tested on a set o...
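
    To illustrate how a random-key chromosome can drive a priority rule, here is a minimal sketch. It is a simplified assumption, not the paper's procedure: it builds a plain semi-active schedule by always dispatching the ready operation with the largest key, whereas the paper constructs parameterized active schedules and adds local search. The function name and the data layout are hypothetical.

```python
def decode_random_keys(jobs, keys):
    """Build a schedule from random keys.

    jobs[j] is a list of (machine, duration) pairs in processing order.
    keys[j][k] is the random key (priority) of operation k of job j.
    Returns (schedule, makespan); schedule rows are (job, op, machine, start, end).
    """
    next_op = [0] * len(jobs)    # index of the next unscheduled operation per job
    job_ready = [0] * len(jobs)  # time at which each job's previous operation ends
    machine_ready = {}           # time at which each machine becomes free
    schedule = []
    remaining = sum(len(ops) for ops in jobs)
    while remaining:
        # Candidates: the next operation of every unfinished job.
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        # Priority rule: dispatch the candidate whose key is largest.
        j = max(candidates, key=lambda c: keys[c][next_op[c]])
        k = next_op[j]
        machine, duration = jobs[j][k]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        end = start + duration
        schedule.append((j, k, machine, start, end))
        job_ready[j] = end
        machine_ready[machine] = end
        next_op[j] += 1
        remaining -= 1
    makespan = max(row[4] for row in schedule)
    return schedule, makespan
```

    Because any vector of keys decodes to a feasible schedule, standard crossover on the keys never needs a repair step, which is the usual motivation for random-key encodings.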

  16. NCUBE - A clustering algorithm based on a discretized data space

    Science.gov (United States)

    Eigen, D. J.; Northouse, R. A.

    1974-01-01

    Cluster analysis involves the unsupervised grouping of data. The process provides an automatic procedure for generating known training samples for pattern classification. NCUBE, the clustering algorithm presented, is based upon the concept of imposing a gridwork on the data space. The NCUBE computer implementation of this concept provides an easily derived form of piecewise linear discrimination. This piecewise linear discrimination permits the separation of some types of data groups that are not linearly separable.
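
    The idea of imposing a gridwork on the data space can be sketched as follows. This is an illustrative assumption rather than the NCUBE implementation: points are binned into hypercubes of side `cell_size`, and occupied cells that touch (including diagonally) are merged into one cluster by breadth-first search. The function name and the merging rule are hypothetical.

```python
from collections import deque
from itertools import product


def grid_cluster(points, cell_size):
    """Label points by connected components of occupied grid cells."""
    # Map each occupied cell (integer coordinates) to the indices of its points.
    cells = {}
    for idx, point in enumerate(points):
        key = tuple(int(coord // cell_size) for coord in point)
        cells.setdefault(key, []).append(idx)

    labels = [-1] * len(points)
    seen = set()
    cluster_id = 0
    for start in cells:
        if start in seen:
            continue
        # Breadth-first search over neighbouring occupied cells.
        queue = deque([start])
        seen.add(start)
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = cluster_id
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in cells and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        cluster_id += 1
    return labels
```

    Since cluster boundaries follow cell faces, the resulting decision regions are piecewise linear, which is consistent with the abstract's remark about separating groups that are not linearly separable.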

  17. A Rough Set based Gene Expression Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    J. J. Emilyn

    2011-01-01

    Full Text Available Problem statement: Microarray technology helps in monitoring the expression levels of thousands of genes across collections of related samples. Approach: The main goal in the analysis of large and heterogeneous gene expression datasets was to identify groups of genes that get expressed in a set of experimental conditions. Results: Several clustering techniques have been proposed for identifying gene signatures and understanding their role, and many of them have been applied to gene expression data, but with partial success. The main aim of this work was to develop a clustering algorithm that would successfully identify gene patterns. The proposed novel clustering technique (RCGED) provides an efficient way of finding the hidden and unique gene expression patterns. It overcomes the restriction of one object being placed in only one cluster. Conclusion/Recommendations: The proposed algorithm is termed intelligent because it automatically determines the optimum number of clusters. The proposed algorithm was tested on a colon cancer dataset and the results were compared with the Rough Fuzzy K Means algorithm.

  18. Core Business Selection Based on Ant Colony Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Yu Lan

    2014-01-01

    Full Text Available The core business is the most important business of an enterprise with diversified operations. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current state of the company's development, its core business is identified by the ant colony clustering algorithm. The results indicate that the proposed method is an effective way to determine the core business of a company.

  19. Research on Scheduling Algorithms in Web Cluster Servers

    Institute of Scientific and Technical Information of China (English)

    LEI YingChun (雷迎春); GONG YiLi (龚奕利); ZHANG Song (张松); LI GuoJie (李国杰)

    2003-01-01

    This paper analyzes quantitatively the impact of the load balance scheduling algorithms and the locality scheduling algorithms on the performance of Web cluster servers, and brings forward the Adaptive_LARD algorithm. Compared with the representative LARD algorithm, the advantages of the Adaptive_LARD are that: (1) it adjusts load distribution among the back-ends through the idea of load balancing to avoid learning steps in the LARD algorithm and reinforce its adaptability; (2) by distinguishing between TCP connections accessing disks and those accessing cache memory, it can estimate the impact of different connections on the back-ends' load more precisely. Performance evaluations suggest that the proposed method outperforms the LARD algorithm by up to 14.7%.

  20. Identifying multiple influential spreaders by a heuristic clustering algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Bao, Zhong-Kui [School of Mathematical Science, Anhui University, Hefei 230601 (China); Liu, Jian-Guo [Data Science and Cloud Service Research Center, Shanghai University of Finance and Economics, Shanghai, 200133 (China); Zhang, Hai-Feng, E-mail: haifengzhang1978@gmail.com [School of Mathematical Science, Anhui University, Hefei 230601 (China); Department of Communication Engineering, North University of China, Taiyuan, Shan' xi 030051 (China)

    2017-03-18

    The problem of influence maximization in social networks has attracted much attention. However, traditional centrality indices are suited to the case where a single spreader is chosen as the spreading source. Often, the spreading process is initiated by simultaneously choosing multiple nodes as the spreading sources. In this situation, choosing the top ranked nodes as multiple spreaders is not an optimal strategy, since the chosen nodes are not sufficiently scattered in the network. Therefore, the ideal situation for the multiple-spreader case is that the spreaders are not only influential but also dispersed across the network, but it is difficult to meet the two conditions together. In this paper, we propose a heuristic clustering (HC) algorithm based on the similarity index to classify nodes into different clusters, and finally the center nodes of the clusters are chosen as the multiple spreaders. The HC algorithm not only ensures that the multiple spreaders are dispersed across the network but also keeps the selected nodes from being “negligible”. Compared with the traditional methods, our experimental results on synthetic and real networks indicate that the HC method performs better on influence maximization. - Highlights: • A heuristic clustering algorithm is proposed to identify the multiple influential spreaders in complex networks. • The algorithm not only guarantees that the selected spreaders are sufficiently scattered but also keeps them from being “insignificant”. • The performance of our algorithm is generally better than that of other methods on both real and synthetic networks.

  1. Limited Random Walk Algorithm for Big Graph Data Clustering

    CERN Document Server

    Zhang, Honglei; Kiranyaz, Serkan; Gabbouj, Moncef

    2016-01-01

    Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attracting vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrass...

  2. A hybrid SPH/N-body method for star cluster simulations

    CERN Document Server

    Hubber, D A; Smith, R; Goodwin, S P

    2013-01-01

    We present a new hybrid Smoothed Particle Hydrodynamics (SPH)/N-body method for modelling the collisional stellar dynamics of young clusters in a live gas background. By deriving the equations of motion from Lagrangian mechanics we obtain a formally conservative combined SPH/N-body scheme. The SPH gas particles are integrated with a 2nd order Leapfrog, and the stars with a 4th order Hermite scheme. Our new approach is intended to bridge the divide between the detailed, but expensive, full hydrodynamical simulations of star formation, and pure N-body simulations of gas-free star clusters. We have implemented this hybrid approach in the SPH code SEREN (Hubber et al. 2011) and perform a series of simple tests to demonstrate the fidelity of the algorithm and its conservation properties. We investigate and present resolution criteria to adequately resolve the density field and to prevent strong numerical scattering effects. Future developments will include a more sophisticated treatment of binaries.

  3. Hybrid Genetic Algorithm with PSO Effect for Combinatorial Optimisation Problems

    Directory of Open Access Journals (Sweden)

    M. H. Mehta

    2012-12-01

    Full Text Available In the engineering field, many problems are hard to solve within a definite interval of time. These problems, known as “combinatorial optimisation problems”, are of the category NP. They are easy to solve in polynomial time when the input size is small, but as the input size grows they become very hard to solve within a definite interval of time. Long-known conventional methods are not able to solve these problems, and thus proper heuristics are necessary. Evolutionary algorithms based on the behaviours of different animals and species have been invented and studied for this purpose. The genetic algorithm is considered a powerful algorithm for solving combinatorial optimisation problems. Genetic algorithms work on these problems by mimicking human genetics and follow a “survival of the fittest” strategy. Particle swarm optimisation is a new evolutionary approach that copies the behaviour of swarms in nature. However, neither traditional genetic algorithms nor particle swarm optimisation alone has been completely successful in solving combinatorial optimisation problems. Here a hybrid algorithm is proposed in which the strengths of both algorithms are merged, and the performance of the proposed algorithm is compared with that of a simple genetic algorithm. Results show that the proposed algorithm works better than the simple genetic algorithm.

  4. A Genetic Clustering Algorithm for Mean-Residual Vector Quantization

    Institute of Scientific and Technical Information of China (English)

    CHU Shuchuan; John F. Roddick; CHEN Tsongyi

    2004-01-01

    Vector quantization (VQ) is a useful tool for data compression and can be applied to compress the data vectors in a database. The quality of the recovered data vector depends on a good codebook. Mean-residual vector quantization (M/R VQ) has been shown to be efficient in encoding time and needs only a little storage. In this paper, genetic algorithms in combination with the generalized Lloyd algorithm (GLA) are applied to the codebook design of M/R VQ. The mean codebook and residual codebook are trained separately using the GLA, then genetic algorithms (GA) are used to evaluate and evolve the combined mean codebook and residual codebook. The parameters used in the proposed GA-based clustering algorithm for M/R VQ are chosen based on experiments and are robust. Experimental results demonstrate that the proposed genetic clustering algorithm applied to M/R VQ may improve the peak signal to noise ratio of the recovered data vector compared with the GLA.

  5. A Task-parallel Clustering Algorithm for Structured AMR

    Energy Technology Data Exchange (ETDEWEB)

    Gunney, B N; Wissink, A M

    2004-11-02

    A new parallel algorithm, based on the Berger-Rigoutsos algorithm for clustering grid points into logically rectangular regions, is presented. The clustering operation is frequently performed in the dynamic gridding steps of structured adaptive mesh refinement (SAMR) calculations. A previous study revealed that although the cost of clustering is generally insignificant for smaller problems run on relatively few processors, the algorithm scaled inefficiently in parallel and its cost grows with problem size. Hence, it can become significant for large scale problems run on very large parallel machines, such as the new BlueGene system (which has O(10^4) processors). We propose a new task-parallel algorithm designed to reduce communication wait times. Performance was assessed using dynamic SAMR re-gridding operations on up to 16K processors of currently available computers at Lawrence Livermore National Laboratory. The new algorithm was shown to be up to an order of magnitude faster than the baseline algorithm and had better scaling trends.

  6. Hybrid Architectures for Evolutionary Computing Algorithms

    Science.gov (United States)

    2008-01-01

    Clarkson Univ., at AFRL, summer 2005 (yellow) Genetic Algorithm FPGA Core. Burns P1026/MAPLD 2005. GA Core Datapath – Top-level Module: EA parameters and...Statistics are read from I/O ports. GA Core Datapath – Population Module: array of individuals; population size register...Permutation generator; current permutation element register; current index register. GA Core Datapath – PRNG Module: When

  7. A Hybrid Immigrants Scheme for Genetic Algorithms in Dynamic Environments

    Institute of Scientific and Technical Information of China (English)

    Shengxiang Yang; Renato Tinós

    2007-01-01

    Dynamic optimization problems are a kind of optimization problems that involve changes over time. They pose a serious challenge to traditional optimization methods as well as conventional genetic algorithms since the goal is no longer to search for the optimal solution(s) of a fixed problem but to track the moving optimum over time. Dynamic optimization problems have attracted a growing interest from the genetic algorithm community in recent years. Several approaches have been developed to enhance the performance of genetic algorithms in dynamic environments. One approach is to maintain the diversity of the population via random immigrants. This paper proposes a hybrid immigrants scheme that combines the concepts of elitism, dualism and random immigrants for genetic algorithms to address dynamic optimization problems. In this hybrid scheme, the best individual, i.e., the elite, from the previous generation and its dual individual are retrieved as the bases to create immigrants via traditional mutation scheme. These elitism-based and dualism-based immigrants together with some random immigrants are substituted into the current population, replacing the worst individuals in the population. These three kinds of immigrants aim to address environmental changes of slight, medium and significant degrees respectively and hence efficiently adapt genetic algorithms to dynamic environments that are subject to different severities of changes. Based on a series of systematically constructed dynamic test problems, experiments are carried out to investigate the performance of genetic algorithms with the hybrid immigrants scheme and traditional random immigrants scheme. Experimental results validate the efficiency of the proposed hybrid immigrants scheme for improving the performance of genetic algorithms in dynamic environments.
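
    The immigrant step described in this abstract can be sketched for a binary-encoded population. Several details below are assumptions for illustration: the dual of an individual is taken to be its bitwise complement, the elite is drawn from the population passed in rather than stored from the previous generation, and the function names, `n_each` and `rate` are hypothetical.

```python
import random


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def hybrid_immigrants(population, fitness, n_each=2, rate=0.01, rng=None):
    """Replace the worst individuals with three kinds of immigrants."""
    rng = rng or random.Random()
    length = len(population[0])
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    elite = ranked[0]
    dual = [b ^ 1 for b in elite]  # assumed dual: bitwise complement of the elite

    immigrants = (
        [mutate(elite, rate, rng) for _ in range(n_each)]    # elitism-based
        + [mutate(dual, rate, rng) for _ in range(n_each)]   # dualism-based
        + [[rng.randint(0, 1) for _ in range(length)]        # random
           for _ in range(n_each)]
    )
    keep = len(population) - len(immigrants)
    if keep < 1:
        raise ValueError("population too small for the requested immigrants")
    # The worst individuals are dropped and the immigrants take their place.
    return ranked[:keep] + immigrants
```

    Intuitively, mutated copies of the elite cover slight changes, mutated copies of the dual cover larger ones, and random immigrants cover the rest, matching the three severities named in the abstract.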

  8. Performance Assessment of Hybrid Data Fusion and Tracking Algorithms

    DEFF Research Database (Denmark)

    Sand, Stephan; Mensing, Christian; Laaraiedh, Mohamed

    2009-01-01

    This paper presents an overview on the performance of hybrid data fusion and tracking algorithms evaluated in the WHERE consortium. The focus is on three scenarios. For the small scale indoor scenario with ultra wideband (UWB) complementing cellular communication systems, the accuracy can vary in...

  9. Hybrid Probabilistic Logics: Theoretical Aspects, Algorithms and Experiments

    NARCIS (Netherlands)

    Michels, S.

    2016-01-01

    Probabilistic logics aim at combining the properties of logic, that is they provide a structured way of expressing knowledge and a mechanical way of reasoning about such knowledge, with the ability of prob

  10. A Hybrid Aggressive Space Mapping Algorithm for EM Optimization

    DEFF Research Database (Denmark)

    Bakr, Mohamed H.; Bandler, John W.; Georgieva, N.;

    1999-01-01

    We propose a novel hybrid aggressive space-mapping (HASM) optimization algorithm. HASM exploits both the trust-region aggressive space-mapping (TRASM) strategy and direct optimization. Severe differences between the coarse and fine models and nonuniqueness of the parameter extraction procedure ma...

  11. Hybrid Bee Ant Colony Algorithm for Effective Load Balancing And ...

    African Journals Online (AJOL)

    PROF. OLIVER OSUAGWA

    Genetic Algorithm (MO-GA) for dynamic job scheduling ... selection of a data centre. 2.2 Load ... An artificial ant colony, that was capable of .... Scheduling in Hybrid Cloud,” International Journal of Engineering and Technology Volume 2. No.

  12. A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.

    Science.gov (United States)

    Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing

    2017-01-01

    An important objective of wireless sensor networks is to prolong the network life cycle, and topology control is of great significance for extending it. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, a fuzzy clustering algorithm is first used for the initial clustering of sensor nodes according to geographical location, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and distance factors of the wireless sensor network. Finally, the cluster head nodes in the hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method reduces the mortality rate of nodes and extends the network life cycle.

  13. Dynamic Head Cluster Election Algorithm for Clustered Ad-Hoc Networks

    Directory of Open Access Journals (Sweden)

    Arwa Zabian

    2008-01-01

    Full Text Available In distributed systems, clustering consists of dividing the geographical area covered by a set of nodes into small zones. In mobile networks, the clustering changes because nodes can move at any time in any direction, which causes the network to partition or nodes to join. Several centralized or global algorithms have been proposed for clustering such that no node becomes isolated and no cluster becomes overloaded. A particular node, called the head cluster or leader, is elected and has the role of organizing the distribution of nodes in clusters. We propose a distributed clustering and leader election mechanism for ad-hoc mobile networks in which the leader is a mobile node. Our results show that, in the case of leader mobility, the time needed to elect a new leader is smaller than the time needed for a significant topological change in the network to happen.

  14. Hybrid Active Noise Control using Adjoint LMS Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Nam, Hyun Do; Hong, Sik Ki [Dankook University (Korea, Republic of)

    1998-07-01

    A multi-channel hybrid active noise control (MCHANC) algorithm is derived by combining hybrid active noise control techniques and adjoint LMS algorithms, and this algorithm is applied to an active noise control system in a three-dimensional enclosure. An MCHANC system uses feedforward and feedback filters simultaneously to cancel noise in an enclosure. The adjoint LMS algorithm, in which the error is filtered through an adjoint filter of the secondary channel, is also used to reduce the computational burden of the adaptive filters. The overall attenuation performance and convergence characteristics of the MCHANC algorithm are better than those of both multiple-channel feedforward algorithms and multiple-channel feedback algorithms. In a large enclosure, the acoustic reverberation can be very long, which means a very high-order feedforward filter must be used to cancel the reverberation noise. Strong reverberation noise is generally narrow-band and low-frequency, so it can be effectively predicted and canceled by feedback adaptive filters. Lower-order feedforward filters can therefore be used in the MCHANC algorithm, which combines the advantages of fast convergence and small excess mean square error. In this paper, computer simulations and real-time implementations are carried out on a TMS320C31 processor to evaluate the performance of the MCHANC systems. (author). 11 refs., 11 figs., 1 tab.

  15. Clustered Self Organising Migrating Algorithm for the Quadratic Assignment Problem

    Science.gov (United States)

    Davendra, Donald; Zelinka, Ivan; Senkerik, Roman

    2009-08-01

    An approach of population dynamics and clustering for permutative problems is presented in this paper. Diversity indicators are created from solution ordering and its mapping is shown as an advantage for population control in metaheuristics. Self Organising Migrating Algorithm (SOMA) is modified using this approach and vetted with the Quadratic Assignment Problem (QAP). Extensive experimentation is conducted on benchmark problems in this area.

  16. Blockspin Scheme and Cluster Algorithm for Quantum Spin Systems

    CERN Document Server

    Ying, H P; Ying, He-Ping; Wiese, Uwe-Jens

    1992-01-01

    We present a numerical study using a cluster algorithm for the 1-d $S=1/2$ quantum Heisenberg models. The dynamical critical exponent for anti-ferromagnetic chains is $z=0.0(1)$ such that critical slowing down is eliminated.

  17. Clustering algorithms for Stokes space modulation format recognition

    DEFF Research Database (Denmark)

    Boada, Ricard; Borkowski, Robert; Tafur Monroy, Idelfonso

    2015-01-01

    Stokes space modulation format recognition (Stokes MFR) is a blind method enabling digital coherent receivers to infer modulation format information directly from a received polarization-division-multiplexed signal. A crucial part of the Stokes MFR is a clustering algorithm, which largely...

  18. A hybrid evolutionary algorithm for distribution feeder reconfiguration

    Indian Academy of Sciences (India)

    Taher Niknam; Reza Khorshidi; Bahman Bahmani Firouzi

    2010-04-01

    Distribution feeder reconfiguration (DFR) is formulated as a multiobjective optimization problem which minimizes real power losses, deviation of the node voltages and the number of switching operations and also balances the loads on the feeders. In the proposed method, the distance ($\lambda_2$ norm) between the vector-valued objective function and the worst-case vector-valued objective function in the feasible set is maximized. In the algorithm, the statuses of tie and sectionalizing switches are considered as the control variables. The proposed DFR problem is a non-differentiable optimization problem. Therefore, a new hybrid evolutionary algorithm based on combination of fuzzy adaptive particle swarm optimization (FAPSO) and ant colony optimization (ACO), called HFAPSO, is proposed to solve it. The performance of HFAPSO is evaluated and compared with other methods such as genetic algorithm (GA), ACO, the original PSO, Hybrid PSO and ACO (HPSO) considering different distribution test systems.

  19. A New Hybrid Algorithm for Association Rule Mining

    Institute of Scientific and Technical Information of China (English)

    ZHANG Min-cong; YAN Cun-liang; ZHU Kai-yu

    2007-01-01

    HA (hashing array), a new algorithm for mining frequent itemsets in large databases, is proposed. It employs a hash array structure, ItemArray( ), to store the information of the database and then uses it instead of the database in later iterations. With this improvement, only two scans of the whole database are necessary, so the computational cost can be reduced significantly. To overcome the performance bottleneck of frequent 2-itemset mining, a modified version of HA, DHA (direct-addressing hashing and array), is proposed, which combines HA with a direct-addressing hashing technique. The new hybrid algorithm, DHA, not only overcomes the performance bottleneck but also inherits the advantages of HA. Extensive simulations are conducted in this paper to evaluate the performance of the proposed algorithm, and the results show that it is more efficient and reasonable.
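
    The abstract does not describe the internals of HA or DHA, so the sketch below only illustrates the general idea of direct addressing for 2-itemset counting: every item pair maps to a fixed slot in a flat array, so no hash collisions or candidate lookups are needed. The function names and the assumption that items are integers in `0..n_items-1` are illustrative, not taken from the paper.

```python
from itertools import combinations

def pair_slot(i, j, n_items):
    """Slot of the pair (i, j), i < j, in a flat upper-triangular array."""
    return i * (n_items - 1) - i * (i - 1) // 2 + (j - i - 1)

def frequent_pairs(transactions, n_items, min_support):
    """Count all 2-itemsets in one pass using direct addressing."""
    counts = [0] * (n_items * (n_items - 1) // 2)
    for t in transactions:
        # Deduplicate and sort so each pair is visited once with i < j.
        for i, j in combinations(sorted(set(t)), 2):
            counts[pair_slot(i, j, n_items)] += 1
    # Keep only the pairs that reach the minimum support count.
    return {
        (i, j): counts[pair_slot(i, j, n_items)]
        for i in range(n_items)
        for j in range(i + 1, n_items)
        if counts[pair_slot(i, j, n_items)] >= min_support
    }
```

    The array holds n(n-1)/2 counters, so this trade of memory for lookup speed is only practical when the number of distinct items is moderate.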

  20. ANOMALY DETECTION IN NETWORKING USING HYBRID ARTIFICIAL IMMUNE ALGORITHM

    Directory of Open Access Journals (Sweden)

    D. Amutha Guka

    2012-01-01

    Full Text Available In today's network scenario, where computers are interconnected through the internet, the security of an information system is a very important issue. Because no system can be absolutely secure, the timely and accurate detection of anomalies is necessary. The main aim of this research paper is to improve anomaly detection by using the Hybrid Artificial Immune Algorithm (HAIA), which is based on Artificial Immune Systems (AIS) and the Genetic Algorithm (GA). In this research work, the HAIA approach is used to develop a Network Anomaly Detection System (NADS). The detector set is generated by using GA, and the anomalies are identified using the Negative Selection Algorithm (NSA), which is based on AIS. The HAIA algorithm is tested with the KDD Cup 99 benchmark dataset. The detection rate is used to measure the effectiveness of the NADS. The results and consistency of the HAIA are compared with earlier approaches and the results are presented. The proposed algorithm gives the best results when compared to the earlier approaches.

  1. Hybrid Weighted-based Clustering Routing Protocol for Railway Communications

    Directory of Open Access Journals (Sweden)

    Jianli Xie

    2013-12-01

    Full Text Available In this paper, a hybrid clustering routing strategy is proposed for railway emergency ad hoc networks, for use when GSM-R base stations are destroyed or some terminals (or nodes) are far from the signal coverage. In this case, the cluster-head (CH) election procedure is invoked on demand, taking into consideration the degree difference from the ideal degree, relative clustering stability, the sum of distances between the node and its one-hop neighbors, consumed power, node type, and node mobility. For cluster formation, the weights for the CH election parameters are allocated rationally by rough set theory. The hybrid weighted-based clustering routing (HWBCR) strategy is designed for railway emergency communication scenarios and aims to achieve a good trade-off between computation cost and performance. A simulation platform is constructed to evaluate the performance of our strategy in terms of the average end-to-end delay, packet loss ratio, routing overhead, and average throughput. The results, compared with the railway communication QoS index, reveal that our strategy is suitable for transmitting dispatching voice and data between train and ground when the train speed is less than 220 km/h.

  2. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
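
    To make two of the stand-alone metrics named above concrete, here is a minimal sketch of modularity and conductance using their standard textbook definitions. It assumes an undirected, unweighted graph given as an edge list without self-loops; the function names are illustrative and this is not the evaluation code used in the study.

```python
def modularity(edges, communities):
    """Newman modularity: sum over communities of e_c/m - (d_c/(2m))^2."""
    m = len(edges)
    label = {v: k for k, c in enumerate(communities) for v in c}
    intra = [0] * len(communities)   # edges with both endpoints inside c
    degree = [0] * len(communities)  # total degree of the nodes in c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(e / m - (d / (2 * m)) ** 2 for e, d in zip(intra, degree))

def conductance(edges, subset):
    """Cut size divided by the smaller of the two volumes (degree sums)."""
    subset = set(subset)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for w in (u, v):
            if w in subset:
                vol_in += 1
            else:
                vol_out += 1
        if (u in subset) != (v in subset):
            cut += 1
    # Assumes both sides of the cut contain at least one edge endpoint.
    return cut / min(vol_in, vol_out)
```

    For two triangles joined by a single edge, splitting at the bridge gives a modularity of 5/14 and a conductance of 1/7 for either triangle.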

  3. The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey

    Energy Technology Data Exchange (ETDEWEB)

    Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer,; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U.,

    2005-03-01

    We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers $\approx 2600$ square degrees of sky and ranges in redshift from $z = 0.02$ to $z = 0.17$. The mean cluster membership is 36 galaxies (with redshifts) brighter than $r = 17.7$, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity ($L_r$), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the $\Lambda$CDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is $\simeq 90\%$ complete and 95% pure above $M_{200} = 1 \times 10^{14}\, h^{-1} M_{\odot}$ and within $0.03 \le z \le 0.12$. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within $0.03 \le z \le 0.12$. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the $L_r$ of a cluster is a more robust estimator of the halo mass ($M_{200}$) than the galaxy line-of-sight velocity dispersion or the richness of the cluster

  4. Hybrid Parallel Bidirectional Sieve based on SMP Cluster

    CERN Document Server

    Liao, Gang; Liu, Lei

    2012-01-01

    In this article, a hybrid parallel bidirectional sieve method is implemented on an SMP cluster, in which the individual computational units, joined together by the communication network, are usually shared-memory systems with one or more multicore processors. For high-efficiency optimization, we propose dividing the data evenly among nodes and generating double-ended queues (deques) for the sieve method, so that dual cores can simultaneously start sifting out primes from the head and the tail. Each node also creates a FIFO queue as a dynamic data buffer to cache temporary data sent from other nodes. The approach obtains a large speedup and high efficiency on the SMP cluster.

  5. EFFECT OF CLUSTERING IN DESIGNING A FUZZY BASED HYBRID INTRUSION DETECTION SYSTEM FOR MOBILE AD HOC NETWORKS

    Directory of Open Access Journals (Sweden)

    D. Vydeki

    2013-01-01

    Full Text Available An Intrusion Detection System (IDS) provides additional security for highly vulnerable Mobile Ad hoc Networks (MANETs). The use of a Fuzzy Inference System (FIS) in the design of an IDS has proven efficient in detecting routing attacks in MANETs. Clustering is a vital step in the detection process of an FIS-based hybrid IDS. This study describes the design of such a system to detect black hole attacks in a MANET that uses the Ad hoc On-Demand Distance Vector (AODV) routing protocol. It analyses the effect of two clustering algorithms and identifies the more suitable one for this IDS. MANETs with various traffic scenarios were simulated, and the data set required for the IDS was extracted. A hybrid IDS is designed using a Sugeno type-2 FIS to detect black hole attacks. The experimental results show that the subtractive clustering algorithm yields 97% detection efficiency, while Fuzzy C-Means (FCM) clustering offers 91%. The subtractive clustering algorithm is thus found to be a better fit and more efficient than FCM for the FIS-based detection system.

  6. A Survey on Clustering Algorithms for Heterogeneous Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Vivek Katiyar

    2011-01-01

    Full Text Available In the last few years, potential uses of wireless sensor networks (WSNs) have emerged in fields such as disaster management, battlefield surveillance, and border security surveillance. In such applications, a large number of sensor nodes are deployed, which are often unattended and work autonomously. Clustering is a key technique used to extend the lifetime of a sensor network by reducing energy consumption. It can also increase network scalability. Sensor nodes have generally been assumed to be homogeneous as research in the field of WSNs has evolved, but some nodes may be given different energy levels to prolong the lifetime of a WSN and improve its reliability. In this paper, we study the impact of node heterogeneity on the performance of WSNs. The paper surveys different clustering algorithms for heterogeneous WSNs, classifying them according to various clustering attributes.

  7. An Efficient Cluster Algorithm for CP(N-1) Models

    CERN Document Server

    Beard, B B; Riederer, S; Wiese, U J

    2005-01-01

    We construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a new regularization for CP(N-1) models in the framework of D-theory, which is an alternative non-perturbative approach to quantum field theory formulated in terms of discrete quantum variables instead of classical fields. Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard formulation of lattice field theory. In fact, there is even a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. We present various simulations for different correlation lengths, couplings and lattice sizes. We have simulated correlation lengths up to 250 lattice spacings on lattices as large as 640x640 and we detect no evidence for critical slowing down.

  8. Morphology of open clusters NGC 1857 and Czernik 20 using clustering algorithms

    Science.gov (United States)

    Bhattacharya, S.; Mahulkar, V.; Pandaokar, S.; Singh, P. K.

    2017-01-01

    The morphology and cluster membership of the Galactic open clusters Czernik 20 and NGC 1857 were analyzed using two different clustering algorithms. We present the maiden use of density-based spatial clustering of applications with noise (DBSCAN) to determine open cluster morphology from spatial distribution. The region of analysis has also been spatially classified using a statistical membership determination algorithm. We utilized near infrared (NIR) data for a suitably large region around the clusters from the United Kingdom Infrared Deep Sky Survey Galactic Plane Survey star catalogue database, and also from the Two Micron All Sky Survey star catalogue database. The densest regions of the cluster morphologies (1 for Czernik 20 and 2 for NGC 1857) thus identified were analyzed with a K-band extinction map and color-magnitude diagrams (CMDs). To address significant discrepancy in known distance and reddening parameters, we carried out field decontamination of these CMDs and subsequent isochrone fitting of the cleaned CMDs to obtain reliable distance and reddening parameters for the clusters (Czernik 20: D = 2900 pc; E(J-K) = 0.33; NGC 1857: D = 2400 pc; E(J-K) = 0.18-0.19). The isochrones were also used to convert the luminosity functions for the densest regions of Czernik 20 and NGC 1857 into mass functions, to derive their slopes. Additionally, a previously unknown over-density consistent with that of a star cluster is identified in the region of analysis.
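
    For readers unfamiliar with DBSCAN, the following is a minimal brute-force sketch of the textbook algorithm on 2-D points. It is not the authors' pipeline; the parameter names `eps` and `min_pts` follow common usage, and the neighbor search is deliberately naive (quadratic time).

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point: keep expanding
    return labels
```

    Because clusters grow outward from dense cores, the method can trace irregular morphologies and leaves isolated field points labelled as noise.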

  9. Combining ptychographical algorithms with the Hybrid Input-Output (HIO) algorithm.

    Science.gov (United States)

    Konijnenberg, A P; Coene, W M J; Pereira, S F; Urbach, H P

    2016-12-01

    In this article we combine the well-known Ptychographical Iterative Engine (PIE) with the Hybrid Input-Output (HIO) algorithm. The important insight is that the HIO feedback function should be kept strictly separate from the reconstructed object, which is done by introducing a separate feedback function per probe position. We have also combined HIO with floating PIE (fPIE) and extended PIE (ePIE). Simulations indicate that the combined algorithm performs significantly better in many situations. Although we have limited our research to a combination with HIO, the same insight can be used to combine ptychographical algorithms with any phase retrieval algorithm that uses a feedback function.
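
    For context, the standard HIO update (in Fienup's formulation, not the per-probe variant proposed in the article) can be written as follows. The symbols $g_k$, $g'_k$, $\gamma$ and $\beta$ are the conventional ones and are not taken from the abstract.

```latex
% g_k   : input ("feedback") function at iteration k
% g'_k  : output after enforcing the measured Fourier modulus |F|
% gamma : set of points where g'_k violates the object-domain constraints
% beta  : feedback parameter, typically between 0.5 and 1
\[
  g'_k = \mathcal{F}^{-1}\!\left[\, |F| \,
         \frac{\mathcal{F}[g_k]}{\left|\mathcal{F}[g_k]\right|} \right],
  \qquad
  g_{k+1}(x) =
  \begin{cases}
    g'_k(x), & x \notin \gamma,\\[2pt]
    g_k(x) - \beta\, g'_k(x), & x \in \gamma.
  \end{cases}
\]
```

    In this form the input $g_k$ acts only as a feedback signal, while $g'_k$ is the object estimate, which is consistent with the article's point that the two should be kept separate.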

  10. GPU accelerated Hybrid Tree Algorithm for Collision-less N-body Simulations

    CERN Document Server

    Watanabe, Tsuyoshi

    2014-01-01

    We propose a hybrid tree algorithm for reducing calculation and communication cost of collision-less N-body simulations. The concept of our algorithm is that we split interaction force into two parts: hard-force from neighbor particles and soft-force from distant particles, and apply different time integration for the forces. For hard-force calculation, we can efficiently reduce the calculation and communication cost of the parallel tree code because we only need data of neighbor particles for this part. We implement the algorithm on GPU clusters to accelerate force calculation for both hard and soft force. As the result of implementing the algorithm on GPU clusters, we were able to reduce the communication cost and the total execution time to 40% and 80% of that of a normal tree algorithm, respectively. In addition, the reduction factor relative to the normal tree algorithm is smaller for large number of processes, and we expect that the execution time can be ultimately reduced down to about 70% of the norma...

  11. Adaptive and Reliable Control Algorithm for Hybrid System Architecture

    Directory of Open Access Journals (Sweden)

    Osama Abdel Hakeem Abdel Sattar

    2012-01-01

    Full Text Available A stand-alone system is defined as an autonomous system that supplies electricity without being connected to the electric grid. Hybrid systems combine renewable energy sources that are never depleted (such as solar photovoltaic (PV), wind, hydroelectric, etc.) with other sources of energy, like diesel. If these hybrid systems are optimally designed, they can be more cost-effective and reliable than single-source systems. However, the design of hybrid systems is complex because of the uncertain renewable energy supplies, load demands, and the non-linear characteristics of some components, so the design problem cannot be solved easily by classical optimisation methods. Heuristic techniques, such as genetic algorithms, can give better results than classical methods. This paper presents a hybrid system control algorithm and a dispatch strategy design in which wind is the primary energy resource, together with photovoltaic cells. The design is dimensioned for a maximum load of 2000 kW, and the sources are sized as follows: 1500 kW from wind, 500 kW from solar, and 2000 kW from diesel. The main task of the proposed algorithm is to take full advantage of wind and solar energy when they are available and to minimize diesel fuel consumption.

  12. Detecting Clusters of Galaxies in the Sloan Digital Sky Survey I Monte Carlo Comparison of Cluster Detection Algorithms

    CERN Document Server

    Kim, R S J; Postman, M; Strauss, M A; Bahcall, Neta A; Gunn, J E; Lupton, R H; Annis, J; Nichol, R C; Castander, F J; Brinkmann, J; Brunner, R J; Connolly, A; Csabai, I; Hindsley, R B; Ivezic, Z; Vogeley, M S; York, D G; Kim, Rita S. J.; Kepner, Jeremy V.; Postman, Marc; Strauss, Michael A.; Bahcall, Neta A.; Gunn, James E.; Lupton, Robert H.; Annis, James; Nichol, Robert C.; Castander, Francisco J.; Brunner, Robert J.; Connolly, Andrew; Csabai, Istvan; Hindsley, Robert B.; Ivezic, Zeljko; Vogeley, Michael S.; York, Donald G.

    2002-01-01

    We present a comparison of three cluster finding algorithms from imaging data using Monte Carlo simulations of clusters embedded in a 25 deg^2 region of Sloan Digital Sky Survey (SDSS) imaging data: the Matched Filter (MF; Postman et al. 1996), the Adaptive Matched Filter (AMF; Kepner et al. 1999) and a color-magnitude filtered Voronoi Tessellation Technique (VTT). Among the two matched filters, we find that the MF is more efficient in detecting faint clusters, whereas the AMF evaluates the redshifts and richnesses more accurately, therefore suggesting a hybrid method (HMF) that combines the two. The HMF outperforms the VTT when using a background that is uniform, but it is more sensitive to the presence of a non-uniform galaxy background than is the VTT; this is due to the assumption of a uniform background in the HMF model. We thus find that for the detection thresholds we determine to be appropriate for the SDSS data, the performances of both algorithms are similar; we present the selection function for eac...

  13. SAR Image Segmentation Based On Hybrid PSOGSA Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur

    2014-09-01

    Full Text Available Image segmentation is useful in many applications. It can identify the regions of interest in a scene or annotate the data. Existing segmentation algorithms are categorized into region-based segmentation, data clustering, and edge-based segmentation. Region-based segmentation includes the seeded and unseeded region growing algorithms, JSEG, and the fast scanning algorithm. Due to the presence of speckle noise, segmentation of Synthetic Aperture Radar (SAR) images is still a challenging problem. We propose a fast SAR image segmentation method based on the Particle Swarm Optimization-Gravitational Search Algorithm (PSO-GSA). In this method, threshold estimation is regarded as a search procedure that looks for an appropriate value in a continuous grayscale interval. Hence, the PSO-GSA algorithm is introduced to search for the optimal threshold. Experimental results indicate that our method is superior to GA-based, AFS-based, and ABC-based methods in terms of segmentation accuracy, segmentation time, and thresholding.
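
    The idea of treating threshold estimation as a search over a continuous grayscale interval can be sketched with a plain particle swarm optimizer. This is not the paper's PSO-GSA hybrid: the gravitational search component is omitted, and the objective (Otsu's between-class variance), the coefficients 0.7 and 1.5, and all names are assumptions made for illustration.

```python
import random

def between_class_variance(pixels, t):
    """Otsu-style objective: larger means threshold t separates classes better."""
    low = [p for p in pixels if p <= t]
    high = [p for p in pixels if p > t]
    if not low or not high:
        return 0.0
    w0, w1 = len(low) / len(pixels), len(high) / len(pixels)
    m0, m1 = sum(low) / len(low), sum(high) / len(high)
    return w0 * w1 * (m0 - m1) ** 2

def pso_threshold(pixels, n_particles=10, iters=30, lo=0.0, hi=255.0, seed=0):
    """Plain PSO over [lo, hi]; returns the best threshold found."""
    rng = random.Random(seed)
    # Evenly spaced start positions (an assumption) cover the whole interval.
    pos = [lo + (hi - lo) * i / (n_particles - 1) for i in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_val = [between_class_variance(pixels, x) for x in pos]
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g], best_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            # Inertia plus pulls toward the personal and global bests.
            vel[i] = (0.7 * vel[i]
                      + 1.5 * rng.random() * (best_pos[i] - pos[i])
                      + 1.5 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = between_class_variance(pixels, pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val > g_val:
                    g_pos, g_val = pos[i], val
    return g_pos
```

    On a two-level test image the returned threshold falls between the two gray levels; real SAR data with speckle would need a more robust objective than this one.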

  14. A heuristic approach to possibilistic clustering algorithms and applications

    CERN Document Server

    Viattchenin, Dmitri A

    2013-01-01

    The present book outlines a new approach to possibilistic clustering in which the sought clustering structure of the set of objects is based directly on the formal definition of fuzzy cluster and the possibilistic memberships are determined directly from the values of the pairwise similarity of objects. The proposed approach can be used for solving different classification problems. Here, some techniques that might be useful for this purpose are outlined, including a methodology for constructing a set of labeled objects for a semi-supervised clustering algorithm, a methodology for reducing analyzed attribute space dimensionality and a method for asymmetric data processing. Moreover, a technique for constructing a subset of the most appropriate alternatives for a set of weak fuzzy preference relations, which are defined on a universe of alternatives, is described in detail, and a method for rapidly prototyping Mamdani's fuzzy inference systems is introduced. This book addresses engineers, scientist...

  15. A comparison of clustering algorithms in article recommendation system

    Science.gov (United States)

    Tantanasiriwong, Supaporn

    2012-01-01

    A recommendation system is a tool that can recommend to researchers resources suited to their research interests by using content-based filtering. In this paper, clustering, as an unsupervised learning technique, is introduced for grouping objects based on their selected features and similarities. Publication information from the Science Citation Index is used as the dataset for clustering, with feature extraction performed as dimensionality reduction of these articles, comparing Latent Dirichlet Allocation (LDA), Principal Component Analysis (PCA), and K-Means to determine the best algorithm. In the experiment, the selected database consists of 2625 documents extracted from the SCI corpus from 2001 to 2009. Clustering into 50, 100, 200, and 250 ranks is considered, and the F-measure is used to evaluate the three algorithms. The results show that the LDA technique gives an accuracy of up to 95.5%, the highest among the clustering techniques compared.
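
    The abstract does not state which F-measure variant was used. As one common choice for comparing a clustering against reference labels, the pair-counting F-measure can be sketched as below; the function name and the choice of variant are assumptions.

```python
from itertools import combinations

def pairwise_f_measure(true_labels, pred_labels):
    """F1 over item pairs: a pair is 'positive' if both items share a cluster."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1          # pair correctly placed together
        elif same_pred:
            fp += 1          # pair wrongly placed together
        elif same_true:
            fn += 1          # pair wrongly split apart
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

    A perfect clustering scores 1.0 regardless of how the cluster identifiers are numbered, since only co-membership of pairs is compared.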

  16. Study of Magnesium Diboride Clusters Using Hybrid Density Functional Theory

    Directory of Open Access Journals (Sweden)

    D. Rodríguez

    2007-12-01

    Full Text Available Using hybrid density functional theory and a relatively large basis set, the lowest energy equilibrium structure, vibrational spectrum, and natural orbital analysis were obtained for magnesium diboride clusters [(MgB2)x for x = 1, 2, and 3]. For comparison, boron clusters [Bx for x = 2, 4, and 6] were also considered. The MgB2 and (MgB2)2 clusters showed equilibrium structures with the boron atoms in arrangements similar to what was obtained for pure boron atoms, whereas for (MgB2)3 a different arrangement of boron was obtained. From the population analysis, large electron density in the boron atoms forming the clusters was observed.

  17. A Clustering Genetic Algorithm for Cylinder Drag Optimization

    Science.gov (United States)

    Milano, Michele; Koumoutsakos, Petros

    2002-01-01

    A real coded genetic algorithm is implemented for the optimization of actuator parameters for cylinder drag minimization. We consider two types of idealized actuators that are allowed either to move steadily and tangentially to the cylinder surface (“belts”) or to steadily blow/suck with a zero net mass constraint. The genetic algorithm we implement has the property of identifying minima basins, rather than single optimum points. The knowledge of the shape of the minimum basin enables further insights into the system properties and provides a sensitivity analysis in a fully automated way. The drag minimization problem is formulated as an optimal regulation problem. By means of the clustering property of the present genetic algorithm, a set of solutions producing drag reduction of up to 50% is identified. A comparison between the two types of actuators, based on the clustering property of the algorithm, indicates that blowing/suction actuation parameters are associated with larger tolerances when compared to optimal parameters for the belt actuators. The possibility of using a few strategically placed actuators to obtain a significant drag reduction is explored using the clustering diagnostics of this method. The optimal belt-actuator parameters obtained by optimizing the two-dimensional case are employed in three-dimensional simulations, by extending the actuators across the span of the cylinder surface. The three-dimensional controlled flow exhibits a strong two-dimensional character near the cylinder surface, resulting in significant drag reduction.

  18. Hybrid Algorithm for the Optimization of Training Convolutional Neural Network

    Directory of Open Access Journals (Sweden)

    Hayder M. Albeahdili

    2015-10-01

    Full Text Available Training optimization and efficient, fast classification are vital elements in the development of a convolutional neural network (CNN). Although stochastic gradient descent (SGD) is a prevalent algorithm used by many researchers for optimizing CNN training, it has significant limitations. This paper endeavors to diminish and tackle the drawbacks inherited from SGD by proposing an alternative algorithm for CNN training optimization. A hybrid of genetic algorithm (GA) and particle swarm optimization (PSO) is deployed in this work. In addition to SGD, PSO and the genetic algorithm (PSO-GA) are incorporated as a combined and efficient mechanism for achieving non-trivial solutions. The proposed unified method achieves state-of-the-art classification results on challenging benchmark datasets such as MNIST, CIFAR-10, and SVHN. Experimental results show that the method outperforms most contemporary approaches.

  19. Hybrid Collision Detection Algorithm based on Image Space

    Directory of Open Access Journals (Sweden)

    XueLi Shen

    2013-07-01

    Full Text Available Collision detection is an important application in the field of virtual reality, and completing collision detection efficiently has become a research focus. To address the poor real-time performance of collision detection, this paper presents a hybrid collision detection algorithm that quickly finds the sets of potentially colliding objects with a mixed bounding volume hierarchy tree and then uses a streaming-pattern collision detection algorithm for accurate detection. These methods balance the load between the CPU and GPU and speed up detection. The experimental results show that, compared with the classic Rapid algorithm, this algorithm effectively improves the efficiency of collision detection.
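
    The two-stage idea (cheap filtering of potentially colliding objects, then exact testing) can be illustrated with a minimal broad phase based on axis-aligned bounding boxes. This sketch uses a flat all-pairs scan instead of the paper's mixed bounding volume hierarchy, and the box representation `(min_corner, max_corner)` is an assumption.

```python
def aabb_overlap(a, b):
    """True if two axis-aligned boxes overlap on every axis."""
    (amin, amax), (bmin, bmax) = a, b
    return all(
        amin[k] <= bmax[k] and bmin[k] <= amax[k] for k in range(len(amin))
    )

def candidate_pairs(boxes):
    """Broad phase: index pairs whose boxes overlap and need an exact test."""
    return [
        (i, j)
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if aabb_overlap(boxes[i], boxes[j])
    ]
```

    Only the pairs returned here would be handed to the expensive narrow-phase test; a hierarchy of bounding volumes serves the same purpose while avoiding the quadratic scan.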

  20. Robustness of the ATLAS pixel clustering neural network algorithm

    CERN Document Server

    Sidebo, Per Edvin; The ATLAS collaboration

    2016-01-01

    Proton-proton collisions at the energy frontier put strong constraints on track reconstruction algorithms. The algorithms depend heavily on accurate estimation of the position of particles as they traverse the inner detector elements. An artificial neural network algorithm is utilised to identify and split clusters of neighbouring read-out elements in the ATLAS pixel detector created by multiple charged particles. The method recovers otherwise lost tracks in dense environments where particles are separated by distances comparable to the size of the detector read-out elements. Such environments are highly relevant for LHC run 2, e.g. in searches for heavy resonances. Within the scope of run 2 track reconstruction performance and upgrades, the robustness of the neural network algorithm will be presented. The robustness has been studied by evaluating the stability of the algorithm’s performance under a range of variations in the pixel detector conditions.

  1. Comparative Study of Clustering Algorithms in Text Mining Context

    Directory of Open Access Journals (Sweden)

    Abdennour Mohamed Jalil

    2016-06-01

    Full Text Available The spectacular increase in data is due to the emergence of networks and smartphones. With about 42% of the world population using the internet [1], the volume of exchanged data is rising exponentially and must be processed automatically. This paper presents a classical knowledge discovery in databases process for treating textual data. The process is divided into three parts: preprocessing, processing and post-processing. In the processing step, we present a comparative study of several clustering algorithms, such as KMeans, Global KMeans, Fast Global KMeans, Two Level KMeans and FWKmeans. The comparison is made on real textual data from the web, collected using RSS feeds. Experimental results identified two problems: the first concerns the quality of the results for the algorithms that converge rapidly; the second is the execution time, which needs to be reduced for some algorithms.

  2. DYNAMIC REQUEST DISPATCHING ALGORITHM FOR WEB SERVER CLUSTER

    Institute of Scientific and Technical Information of China (English)

    Yang Zhenjiang; Zhang Deyun; Sun Qindong; Sun Qing

    2006-01-01

    Distributed architectures support increased load on popular web sites by dispatching client requests transparently among multiple servers in a cluster. After analyzing the Packet Single-Rewriting technology and the client address hashing algorithm of the ONE-IP technology, which can preserve application sessions, this paper proposes an improved request dispatching algorithm that is simple and effective and supports dynamic load balancing. In this algorithm, the dispatcher decides which server node will process a request by applying a hash function to the client IP address and comparing the result with the node's assigned identifier subset; it adjusts the size of each subset according to the performance and current load of the corresponding server, so as to utilize all servers' resources effectively. Simulation shows that the improved algorithm performs better than the original one.
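
    The dispatching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 256-bucket identifier space, the SHA-256 hash, the contiguous subsets and the names `bucket_of`, `assign_subsets` and `dispatch` are all assumptions made for illustration. Each server owns a slice of the bucket range sized in proportion to a weight that stands in for its performance and current load.

```python
import hashlib
from bisect import bisect_right

NUM_BUCKETS = 256  # size of the identifier space (an assumption of this sketch)


def bucket_of(client_ip):
    """Hash a client IP address to a stable bucket identifier."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def assign_subsets(weights):
    """Split the bucket range into contiguous subsets proportional to each weight.

    `weights` maps a server name to a positive number (hypothetical stand-in for
    measured capacity and load). Returns (exclusive_upper_bound, server) pairs.
    """
    total = sum(weights.values())
    servers = sorted(weights)
    bounds, cumulative = [], 0.0
    for name in servers:
        cumulative += weights[name]
        bounds.append((round(NUM_BUCKETS * cumulative / total), name))
    # Guard against rounding so the last subset always ends at NUM_BUCKETS.
    bounds[-1] = (NUM_BUCKETS, servers[-1])
    return bounds


def dispatch(client_ip, bounds):
    """Return the server whose identifier subset contains the client's bucket."""
    uppers = [upper for upper, _ in bounds]
    return bounds[bisect_right(uppers, bucket_of(client_ip))][1]


bounds = assign_subsets({"a": 1, "b": 1})   # equal shares
server = dispatch("10.0.0.1", bounds)       # same IP always maps to the same server
bounds = assign_subsets({"a": 3, "b": 1})   # "a" reports more spare capacity
```

    Because the bucket of a client depends only on its IP address, a client keeps reaching the same server until a rebalance moves its bucket, which is the trade-off between session persistence and load balance that the abstract describes.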

  3. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Science.gov (United States)

    2011-01-01

    Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. 
Conclusions The Cytoscape plugin clusterMaker provides a number of clustering

  4. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Directory of Open Access Journals (Sweden)

    Morris John H

    2011-11-01

    Full Text Available Abstract Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions The Cytoscape plugin cluster

  5. Exploring New Clustering Algorithms for the CMS Tracker FED

    CERN Document Server

    Gamboa Alvarado, Jose Leandro

    2013-01-01

    In the current Front End (FE) firmware, clusters of hits within the APV frames are found using a simple threshold comparison (which is made between the data and a 3 or 5 sigma strip noise cut) on reordered pedestal and Common Mode (CM) noise subtracted data. In addition, the CM noise subtraction requires the baseline of each APV frame to be approximately uniform. Therefore, the current algorithm will fail if the APV baseline exhibits large-scale non-uniform behavior. Under very high luminosity conditions the assumption of a uniform APV baseline breaks down and the FED is unable to maintain a high efficiency of cluster finding.
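
    A small sketch may help make the threshold comparison concrete. It is an illustration only, not the FED firmware: the function name `find_clusters`, the use of the frame median as the common-mode estimate, and the grouping of adjacent strips into (first, last) index pairs are assumptions of this sketch.

```python
from statistics import median


def find_clusters(raw, pedestals, noise, n_sigma=3.0):
    """Return (first_strip, last_strip) pairs for runs of adjacent strips above threshold."""
    # Pedestal subtraction, strip by strip.
    signal = [r - p for r, p in zip(raw, pedestals)]
    # Common-mode estimate: a single offset for the whole frame.
    # Taking the median is an assumption of this sketch.
    common_mode = median(signal)
    signal = [s - common_mode for s in signal]

    clusters, start = [], None
    for i, (s, n) in enumerate(zip(signal, noise)):
        hit = s > n_sigma * n  # the 3 (or 5) sigma strip noise cut
        if hit and start is None:
            start = i                        # a new cluster opens here
        elif not hit and start is not None:
            clusters.append((start, i - 1))  # the cluster closed on the previous strip
            start = None
    if start is not None:
        clusters.append((start, len(signal) - 1))
    return clusters


raw = [102, 101, 130, 140, 100, 99, 125, 101]
clusters = find_clusters(raw, pedestals=[100] * 8, noise=[2] * 8)
```

    Subtracting one offset per frame can only remove a flat baseline shift. A sloped or otherwise non-uniform baseline leaves residual structure that this comparison cannot tell apart from signal, which is consistent with the failure mode the abstract describes.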

  6. Multiphase Return Trajectory Optimization Based on Hybrid Algorithm

    Directory of Open Access Journals (Sweden)

    Yi Yang

    2016-01-01

    Full Text Available A hybrid trajectory optimization method consisting of the Gauss pseudospectral method (GPM) and a natural computation algorithm has been developed and used to solve the multiphase return trajectory optimization problem, where a phase is defined as a subinterval in which the right-hand side of the differential equation is continuous. GPM converts the optimal control problem to a nonlinear programming problem (NLP), which helps to improve the calculation accuracy and speed of the natural computation algorithm. Numerical simulations show that the multiphase optimal control problem can be solved accurately.

  7. RH+: A Hybrid Localization Algorithm for Wireless Sensor Networks

    Science.gov (United States)

    Basaran, Can; Baydere, Sebnem; Kucuk, Gurhan

    Today, localization of nodes in Wireless Sensor Networks (WSNs) is a challenging problem. In particular, it is almost impossible to guarantee that an algorithm giving optimal results for one topology will give optimal results for any other random topology. In this study, we propose a centralized, range- and anchor-based, hybrid algorithm called RH+ that aims to combine the powerful features of two orthogonal techniques: Classical Multi-Dimensional Scaling (CMDS) and Particle Spring Optimization (PSO). As a result, we find that our hybrid approach gives a fast-converging solution which is resilient to range errors and very robust to topology changes. Across all topologies we studied, the average estimation error is less than 0.5 m when the average node density is 10 and only 2.5% of the nodes are beacons.

  8. A hybrid variational-perturbational nuclear motion algorithm

    Science.gov (United States)

    Fábri, Csaba; Furtenbacher, Tibor; Császár, Attila G.

    2014-09-01

    A hybrid variational-perturbational nuclear motion algorithm based on the perturbative treatment of the Coriolis coupling terms of the Eckart-Watson kinetic energy operator following a variational treatment of the rest of the operator is described. The algorithm has been implemented in the quantum chemical code DEWE. Performance of the hybrid treatment is assessed by comparing selected numerically exact variational vibration-only and rovibrational energy levels of the C2H4, C2D4, and CH4 molecules with their perturbatively corrected counterparts. For many of the rotational-vibrational states examined, numerical tests reveal excellent agreement between the variational and even the first-order perturbative energy levels, whilst the perturbative approach is able to reduce the computational cost of the matrix-vector product evaluations, needed by the iterative Lanczos eigensolver, by almost an order of magnitude.

  9. Solving Timetabling Problems by Hybridizing Genetic Algorithms and Taboo Search

    OpenAIRE

    Rahoual, Malek; Saad, Rachid

    2006-01-01

    As demand for Education increases and diversifies, so does the difficulty of designing workable timetables for schools and academic institutions. Besides the intractability of the basic problem, there is an increasing variety of constraints that come into play. In this paper we present a hybrid of two metaheuristics (genetic algorithm and tabu search) to tackle the problem in its most general setting. Promising experimental results are shown.

  10. Hybrid Clustering And Boundary Value Refinement for Tumor Segmentation using Brain MRI

    Science.gov (United States)

    Gupta, Anjali; Pahuja, Gunjan

    2017-08-01

    Brain tumor segmentation is the separation of the tumor area from brain magnetic resonance (MR) images. A number of methods already exist for segmenting brain tumors efficiently; however, identifying the tumor in MR images remains a tedious task. The segmentation process extracts the different tumor tissues, such as active tumor, necrosis, and edema, from the normal brain tissues, such as gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). According to the survey study, brain tumors are most often detected easily from brain MR images using region-based approaches, but the required level of accuracy and the classification of abnormalities are not predictable. Brain tumor segmentation consists of many stages. Manually segmenting the tumor from brain MR images is very time consuming, and manual segmentation therefore poses many challenges. In this paper, our main goal is to present a hybrid clustering approach that combines Fuzzy C-Means clustering (for accurate tumor detection) with the level set method (for handling complex shapes) to detect the exact shape of the tumor in minimal computational time. Using this approach, we observe that for a certain set of images the tumor is detected in 0.9412 sec, which is much less than the time taken by a recent existing algorithm, i.e. hybrid clustering (Fuzzy C-Means and K-Means clustering).

  11. FCM Clustering Algorithms for Segmentation of Brain MR Images

    Directory of Open Access Journals (Sweden)

    Yogita K. Dubey

    2016-01-01

    Full Text Available The study of brain disorders requires accurate tissue segmentation of magnetic resonance (MR) brain images, which is very important for detecting tumors, edema, and necrotic tissues. Segmentation of brain images, especially into three main tissue types, Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM), has an important role in computer-aided neurosurgery and diagnosis. Brain images mostly contain noise, intensity inhomogeneity, and weak boundaries. Therefore, accurate segmentation of brain images is still a challenging area of research. This paper presents a review of fuzzy c-means (FCM) clustering algorithms for the segmentation of brain MR images. The review covers the detailed analysis of FCM-based algorithms with intensity inhomogeneity correction and noise robustness. Different methods for modifying the standard fuzzy objective function, with updating of the membership and cluster centroid, are also discussed.
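
    For orientation, the membership and centroid updates of the standard FCM algorithm can be sketched on scalar intensities as follows. This is the textbook formulation only, not any of the modified variants the review covers (there is no inhomogeneity correction and no spatial noise term). The function names, the fuzzifier default m = 2 and the small `eps` that avoids division by zero are choices made for this sketch.

```python
def memberships_for(values, centers, m=2.0, eps=1e-9):
    """Standard FCM membership update: u[k][i] = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps keeps distances strictly positive when a value coincides with a center.
        dists = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists])
    return rows


def fcm_1d(values, centers, m=2.0, iterations=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships_for(values, centers, m)
        for i in range(len(centers)):
            # Centroid update: mean of the data weighted by u ** m.
            weights = [row[i] ** m for row in u]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships_for(values, centers, m)


# Toy intensities with a dark group and a bright group (made-up values).
intensities = [10, 12, 11, 200, 205, 198]
centers, u = fcm_1d(intensities, centers=[0, 255])
```

    Each row of `u` sums to one, so a pixel near a tissue boundary can belong partly to both clusters; that soft assignment is the property the modified objective functions mentioned in the abstract build on.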

  12. Mapping cultivable land from satellite imagery with clustering algorithms

    Science.gov (United States)

    Arango, R. B.; Campos, A. M.; Combarro, E. F.; Canas, E. R.; Díaz, I.

    2016-07-01

    Open data satellite imagery provides valuable data for the planning and decision-making processes related with environmental domains. Specifically, agriculture uses remote sensing in a wide range of services, ranging from monitoring the health of the crops to forecasting the spread of crop diseases. In particular, this paper focuses on a methodology for the automatic delimitation of cultivable land by means of machine learning algorithms and satellite data. The method uses a partition clustering algorithm called Partitioning Around Medoids and considers the quality of the clusters obtained for each satellite band in order to evaluate which one better identifies cultivable land. The proposed method was tested with vineyards using as input the spectral and thermal bands of the Landsat 8 satellite. The experimental results show the great potential of this method for cultivable land monitoring from remote-sensed multispectral imagery.

  13. Brain tumor segmentation based on a hybrid clustering technique

    Directory of Open Access Journals (Sweden)

    Eman Abdel-Maksoud

    2015-03-01

    This paper presents an efficient image segmentation approach using the K-means clustering technique integrated with the Fuzzy C-means algorithm. It is followed by thresholding and level set segmentation stages to provide accurate brain tumor detection. The proposed technique benefits from the minimal computation time of K-means clustering and from the accuracy of Fuzzy C-means. The performance of the proposed approach was evaluated by comparing it with some state-of-the-art segmentation algorithms in terms of accuracy, processing time, and performance. The accuracy was evaluated by comparing the results with the ground truth of each processed image. The experimental results demonstrate the effectiveness of the proposed approach in dealing with a larger number of segmentation problems by improving segmentation quality and accuracy in minimal execution time.

  14. Optimization of Antennas using a Hybrid Genetic-Algorithm Space-Mapping Algorithm

    DEFF Research Database (Denmark)

    Pantoja, M.F.; Bretones, A.R.; Meincke, Peter;

    2006-01-01

    A hybrid global-local optimization technique for the design of antennas is presented. It consists of the subsequent application of a Genetic Algorithm (GA) that employs coarse models in the simulations and a space mapping (SM) that refines the solution found in the previous stage. The technique...

  15. Advanced defect detection algorithm using clustering in ultrasonic NDE

    Science.gov (United States)

    Gongzhang, Rui; Gachagan, Anthony

    2016-02-01

    A range of materials used in industry exhibit scattering properties that limit ultrasonic NDE. Many algorithms have been proposed to enhance defect detection ability, such as the well-known Split Spectrum Processing (SSP) technique. Scattering noise usually cannot be fully removed and the remaining noise can be easily confused with real feature signals, hence becoming artefacts during the image interpretation stage. This paper presents an advanced algorithm to further reduce the influence of artefacts remaining in A-scan data after processing using a conventional defect detection algorithm. The raw A-scan data can be acquired from either traditional single transducer or phased array configurations. The proposed algorithm uses the concept of unsupervised machine learning to cluster segmental defect signals from pre-processed A-scans into different classes. The distinction and similarity between each class and the ensemble of randomly selected noise segments can be observed by applying a classification algorithm. Each class will then be labelled as 'legitimate reflector' or 'artefacts' based on this observation and the expected probability of detection (PoD) and probability of false alarm (PFA) determined. To facilitate data collection and validate the proposed algorithm, a 5 MHz linear array transducer is used to collect A-scans from both austenitic steel and Inconel samples. Each pulse-echo A-scan is pre-processed using SSP and the subsequent application of the proposed clustering algorithm has provided an additional reduction to PFA while maintaining PoD for both samples compared with SSP results alone.

  16. Core Business Selection Based on Ant Colony Clustering Algorithm

    OpenAIRE

    Yu Lan; Yan Bo; Yao Baozhen

    2014-01-01

    Core business is the most important business to the enterprise in diversified business. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current situation of the development of the company, the core business of the company can be acquired by ant c...

  17. Hybrid cloud and cluster computing paradigms for life science applications.

    Science.gov (United States)

    Qiu, Judy; Ekanayake, Jaliya; Gunarathne, Thilina; Choi, Jong Youl; Bae, Seung-Hee; Li, Hui; Zhang, Bingjing; Wu, Tak-Lon; Ruan, Yang; Ekanayake, Saliya; Hughes, Adam; Fox, Geoffrey

    2010-12-21

    Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing especially for parallel data intensive applications. However they have limited applicability to some areas such as data mining because MapReduce has poor performance on problems with an iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system Twister. Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability comparisons in several important non iterative cases. These are linked to MPI applications for final stages of the data analysis. Further we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment while Twister promises a uniform programming environment for many Life Sciences applications. We used commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments.

  18. Comparison of cluster expansion fitting algorithms for interactions at surfaces

    Science.gov (United States)

    Herder, Laura M.; Bray, Jason M.; Schneider, William F.

    2015-10-01

    Cluster expansions (CEs) are Ising-type interaction models that are increasingly used to model interaction and ordering phenomena at surfaces, such as the adsorbate-adsorbate interactions that control coverage-dependent adsorption or surface-vacancy interactions that control surface reconstructions. CEs are typically fit to a limited set of data derived from density functional theory (DFT) calculations. The CE fitting process involves iterative selection of DFT data points to include in a fit set and selection of interaction clusters to include in the CE. Here we compare the performance of three CE fitting algorithms, namely the MIT Ab-initio Phase Stability code (MAPS, the default in ATAT software), a genetic algorithm (GA), and a steepest descent (SD) algorithm, against synthetic data. The synthetic data is encoded in model Hamiltonians of varying complexity motivated by the observed behavior of atomic adsorbates on a face-centered-cubic transition metal close-packed (111) surface. We compare the performance of the leave-one-out cross-validation score against the true fitting error available from knowledge of the hidden CEs. For these systems, SD achieves the lowest overall fitting and prediction error independent of the underlying system complexity. SD also most accurately predicts cluster interaction energies without ignoring or introducing extra interactions into the CE. MAPS achieves good results in fewer iterations, while the GA performs least well for these particular problems.

  19. Optimized algorithm for balancing clusters in wireless sensor networks

    Institute of Scientific and Technical Information of China (English)

    Mucheol KIM; Sun-hong KIM; Hyungjin BYUN; Sang-yong HAN

    2009-01-01

    Wireless sensor networks consist of hundreds or thousands of sensor nodes that involve numerous restrictions including computation capability and battery capacity. Topology control is an important issue for achieving a balanced placement of sensor nodes. The clustering scheme is a widely known and efficient means of topology control for transmitting information to the base station in two hops. The automatic routing scheme of the self-organizing technique is another critical element of wireless sensor networks. In this paper we propose an optimal algorithm with cluster balance taken into consideration, and compare it with three well-known and widely used approaches, i.e., LEACH, MEER, and VAP-E, in performance evaluation. Experimental results show that the proposed approach increases the overall network lifetime, indicating that the amount of energy required for communication to the base station will be reduced for locating an optimal cluster.

  20. Hybrid Clustering-GWO-NARX neural network technique in predicting stock price

    Science.gov (United States)

    Das, Debashish; Safa Sadiq, Ali; Mirjalili, Seyedali; Noraziah, A.

    2017-09-01

    Prediction of stock price is one of the most challenging tasks due to the nonlinear nature of stock data. Although numerous attempts have been made to predict stock prices by applying various techniques, the predicted price is not always accurate and the error rate is high to some extent. Consequently, this paper endeavours to determine an efficient stock prediction strategy by implementing a combined method of Grey Wolf Optimizer (GWO), clustering and the Non Linear Autoregressive Exogenous (NARX) technique. The study uses stock data from prominent stock markets, i.e. the New York Stock Exchange (NYSE) and NASDAQ, and from emerging stock markets, i.e. the Malaysian Stock Market (Bursa Malaysia) and the Dhaka Stock Exchange (DSE). It applies the K-means clustering algorithm to determine the most promising cluster, then MGWO is used to determine the classification rate, and finally the stock price is predicted by applying the NARX neural network algorithm. The prediction performance obtained through experimentation is compared and assessed to guide investors in making investment decisions. The results of this technique are promising, showing almost precise prediction and an improved error rate. We have applied the hybrid Clustering-GWO-NARX neural network technique in predicting stock price. We intend to work on the effect of various factors on stock price movement and on the selection of parameters. We will further investigate the influence of company news, either positive or negative, on stock price movement. We would also be interested in predicting stock indices.

  1. An energy-efficient and secure hybrid algorithm for wireless sensor networks using a mobile data collector

    Science.gov (United States)

    Dayananda, Karanam Ravichandran; Straub, Jeremy

    2017-05-01

    This paper proposes a new hybrid algorithm for security, which incorporates both distributed and hierarchical approaches. It uses a mobile data collector (MDC) to collect information in order to save the energy of the sensor nodes in a wireless sensor network (WSN) as, in most networks, these sensor nodes have limited energy. Wireless sensor networks are prone to security problems because, among other things, it is possible to use a rogue sensor node to eavesdrop on or alter the information being transmitted. To prevent this, this paper introduces a security algorithm for MDC-based WSNs. A key use of this algorithm is to protect the confidentiality of the information sent by the sensor nodes. The sensor nodes are deployed in a random fashion and form group structures called clusters. Each cluster has a cluster head. The cluster head collects data from the other nodes using the time-division multiple access protocol. The sensor nodes send their data to the cluster head for transmission to the base station node for further processing. The MDC acts as an intermediate node between the cluster head and base station. The MDC, using its dynamic acyclic graph path, collects the data from the cluster head and sends it to the base station. This approach is useful for applications including warfighting, intelligent building and medicine. To assess the proposed system, the paper presents a comparison of its performance with other approaches and algorithms that can be used for similar purposes.

  2. A Hybrid Demon Algorithm for the Two-Dimensional Orthogonal Strip Packing Problem

    Directory of Open Access Journals (Sweden)

    Bili Chen

    2015-01-01

    Full Text Available This paper develops a hybrid demon algorithm for a two-dimensional orthogonal strip packing problem. This algorithm combines a placement procedure based on an improved heuristic, local search, and demon algorithm involved in setting one parameter. The hybrid algorithm is tested on a wide set of benchmark instances taken from the literature and compared with other well-known algorithms. The computation results validate the quality of the solutions and the effectiveness of the proposed algorithm.

  3. A cluster analysis on road traffic accidents using genetic algorithms

    Science.gov (United States)

    Saharan, Sabariah; Baragona, Roberto

    2017-04-01

    The analysis of road traffic accidents is increasingly important because of the cost of accidents and the implications for public road safety. The availability of large data sets makes it viable to study the factors that affect the frequency and severity of accidents. However, the data are often highly unbalanced and overlapped. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009, with a total of 26440 accidents. The data are binary, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because the data set is large and unbalanced, and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km/h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and the independent component analysis is performed to validate the results.

  4. A Hybrid Graph Representation for Recursive Backtracking Algorithms

    Science.gov (United States)

    Abu-Khzam, Faisal N.; Langston, Michael A.; Mouawad, Amer E.; Nolan, Clinton P.

    Many exact algorithms for NP-hard graph problems adopt the old Davis-Putnam branch-and-reduce paradigm. The performance of these algorithms often suffers from the increasing number of graph modifications, such as deletions, that reduce the problem instance and have to be "taken back" frequently during the search process. The use of efficient data structures is necessary for fast graph modification modules as well as fast take-back procedures. In this paper, we investigate practical implementation-based aspects of exact algorithms by providing a hybrid graph representation that addresses the take-back challenge and combines the advantage of O(1) adjacency queries in adjacency matrices with the advantage of efficient neighborhood traversal in adjacency lists.
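
    One way to picture such a representation is the sketch below. It is an illustration under stated assumptions, not the data structure from the paper: the class name `HybridGraph`, the `alive` flags, and the undo stack with `checkpoint` and `take_back` are all hypothetical. The matrix answers adjacency queries in constant time, the lists drive neighborhood traversal, and a deletion only flips a flag so it can be reverted cheaply.

```python
class HybridGraph:
    """Adjacency matrix for O(1) queries plus adjacency lists for traversal (sketch)."""

    def __init__(self, n, edges):
        self.matrix = [[False] * n for _ in range(n)]
        self.adj = [[] for _ in range(n)]
        self.alive = [True] * n  # deleted vertices are only flagged, never unlinked
        self.undo = []           # stack of deleted vertices, newest last
        for u, v in edges:
            if u != v and not self.matrix[u][v]:
                self.matrix[u][v] = self.matrix[v][u] = True
                self.adj[u].append(v)
                self.adj[v].append(u)

    def adjacent(self, u, v):
        """Constant-time adjacency query on the current (reduced) graph."""
        return self.alive[u] and self.alive[v] and self.matrix[u][v]

    def neighbors(self, u):
        """Traverse the stored list of u, skipping vertices that are currently deleted."""
        return [v for v in self.adj[u] if self.alive[v]]

    def delete_vertex(self, u):
        if self.alive[u]:
            self.alive[u] = False
            self.undo.append(u)

    def checkpoint(self):
        """Remember the current depth of the undo stack before branching."""
        return len(self.undo)

    def take_back(self, mark):
        """Revert every deletion made since the matching checkpoint."""
        while len(self.undo) > mark:
            self.alive[self.undo.pop()] = True


g = HybridGraph(4, [(0, 1), (1, 2), (2, 3)])
mark = g.checkpoint()
g.delete_vertex(1)   # reduce the instance inside a branch
g.take_back(mark)    # backtrack: vertex 1 and its edges are visible again
```

    The trade-off in this simplified version is that traversal still walks over flagged neighbors, so its cost depends on the original degree rather than the current one.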

  5. Community Clustering Algorithm in Complex Networks Based on Microcommunity Fusion

    Directory of Open Access Journals (Sweden)

    Jin Qi

    2015-01-01

    Full Text Available With the further research on physical meaning and digital features of the community structure in complex networks in recent years, the improvement of effectiveness and efficiency of the community mining algorithms in complex networks has become an important subject in this area. This paper puts forward a concept of the microcommunity and gets final mining results of communities through fusing different microcommunities. This paper starts with the basic definition of the network community and applies Expansion to the microcommunity clustering which provides prerequisites for the microcommunity fusion. The proposed algorithm is more efficient and has higher solution quality compared with other similar algorithms through the analysis of test results based on network data set.

  6. Multi Population Hybrid Genetic Algorithms for University Course Timetabling Problem

    Directory of Open Access Journals (Sweden)

    Leila Jadidi

    2012-06-01

    Full Text Available University course timetabling is one of the important and time-consuming issues that each university is involved with at the beginning of each university year. This problem is in the class of NP-hard problems and is very difficult to solve by classic algorithms. Therefore, optimization techniques are used to solve them and produce optimal or near-optimal feasible solutions instead of exact solutions. Genetic algorithms, because of their multidirectional search property, are considered an efficient approach for solving this type of problem. In this paper three new hybrid genetic algorithms for solving the university course timetabling problem (UCTP) are proposed: FGARI, FGASA and FGATS. In the proposed algorithms, fuzzy logic is used to measure violation of soft constraints in the fitness function to deal with the inherent uncertainty and vagueness involved in real-life data. Also, randomized iterative local search, simulated annealing and tabu search are applied, respectively, to improve exploitive search ability and prevent the genetic algorithm from being trapped in a local optimum. The experimental results indicate that the proposed algorithms are able to produce promising results for the UCTP.

  7. Multi Population Hybrid Genetic Algorithms for University Course Timetabling

    Directory of Open Access Journals (Sweden)

    Mehrnaz Shirani LIRI

    2012-08-01

    Full Text Available University course timetabling is one of the important and time-consuming issues that each university is involved with at the beginning of each university year. This problem is in the class of NP-hard problems and is very difficult to solve by classic algorithms. Therefore, optimization techniques are used to solve them and produce optimal or almost optimal feasible solutions instead of exact solutions. Genetic algorithms, because of their multidirectional search property, are considered an efficient approach for solving this type of problem. In this paper three new hybrid genetic algorithms for solving the university course timetabling problem (UCTP) are proposed: FGARI, FGASA and FGATS. In the proposed algorithms, fuzzy logic is used to measure violation of soft constraints in the fitness function to deal with the inherent uncertainty and vagueness involved in real-life data. Also, randomized iterative local search, simulated annealing and tabu search are applied, respectively, to improve exploitive search ability and prevent the genetic algorithm from being trapped in a local optimum. The experimental results indicate that the proposed algorithms are able to produce promising results for the UCTP.

  8. Novel Hybrid Intrusion Detection System For Clustered Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    Hichem Sedjelmaci

    2011-08-01

    Full Text Available Wireless sensor network (WSN) is regularly deployed in unattended and hostile environments. The WSN is vulnerable to security threats and susceptible to physical capture. Thus, it is necessary to use effective mechanisms to protect the network. It is widely known that intrusion detection is one of the most efficient security mechanisms to protect the network against malicious attacks or unauthorized access. In this paper, we propose a hybrid intrusion detection system for clustered WSN. Our intrusion framework uses a combination of Anomaly Detection based on support vector machine (SVM) and Misuse Detection. Experimental results show that most routing attacks can be detected with a low false alarm rate.

  9. Development of hybrid artificial intelligent based handover decision algorithm

    Directory of Open Access Journals (Sweden)

    A.M. Aibinu

    2017-04-01

    Full Text Available The possibility of seamless handover remains a mirage despite the plethora of existing handover algorithms. The underlying factor responsible for this has been traced to the handover decision module in the handover process. Hence, in this paper, a novel hybrid artificial intelligent handover decision algorithm has been developed. The developed model is a hybrid of an Artificial Neural Network (ANN) based prediction model and Fuzzy Logic. On accessing the network, the Received Signal Strength (RSS) was acquired over a period of time to form time series data. The data were then fed to the newly proposed k-step ahead ANN-based RSS prediction system for estimation of prediction model coefficients. The synaptic weights and adaptive coefficients of the trained ANN were then used to compute the k-step ahead ANN-based RSS prediction model coefficients. The predicted RSS value was later codified as fuzzy sets and, in conjunction with other measured network parameters, fed into the Fuzzy Logic controller in order to finalize the handover decision process. The performance of the newly developed k-step ahead ANN-based RSS prediction algorithm was evaluated using simulated and real data acquired from available mobile communication networks. Results obtained in both cases show that the proposed algorithm is capable of predicting the RSS value ahead to within about ±0.0002 dB. The cascaded effect of the complete handover decision module was also evaluated. Results obtained show that the newly proposed hybrid approach was able to reduce the ping-pong effect associated with other handover techniques.

  10. Clustering Algorithms: Their Application to Gene Expression Data

    Science.gov (United States)

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpretation of the resulting mass of data, which consists of millions of measurements; these data also inhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure. PMID:27932867

  11. Identifying multiple influential spreaders by a heuristic clustering algorithm

    Science.gov (United States)

    Bao, Zhong-Kui; Liu, Jian-Guo; Zhang, Hai-Feng

    2017-03-01

    The problem of influence maximization in social networks has attracted much attention. However, traditional centrality indices are suitable for the case where a single spreader is chosen as the spreading source. Many times, spreading process is initiated by simultaneously choosing multiple nodes as the spreading sources. In this situation, choosing the top ranked nodes as multiple spreaders is not an optimal strategy, since the chosen nodes are not sufficiently scattered in networks. Therefore, one ideal situation for multiple spreaders case is that the spreaders themselves are not only influential but also they are dispersively distributed in networks, but it is difficult to meet the two conditions together. In this paper, we propose a heuristic clustering (HC) algorithm based on the similarity index to classify nodes into different clusters, and finally the center nodes in clusters are chosen as the multiple spreaders. HC algorithm not only ensures that the multiple spreaders are dispersively distributed in networks but also avoids the selected nodes to be very "negligible". Compared with the traditional methods, our experimental results on synthetic and real networks indicate that the performance of HC method on influence maximization is more significant.

  12. Gravitation field algorithm and its application in gene cluster

    Directory of Open Access Journals (Sweden)

    Zheng Ming

    2010-09-01

    Full Text Available Abstract. Background: Searching optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel searching optimization algorithm called Gravitation Field Algorithm (GFA), which is derived from the famous astronomy theory Solar Nebular Disk Model (SNDM) of planetary formation. GFA simulates the gravitation field and outperforms GA and SA in some multimodal function optimization problems. GFA can also be used for unimodal functions. GFA clusters the dataset from the Gene Expression Omnibus well. Conclusions: The mathematical proof demonstrates that GFA could converge to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.

  13. Local rewiring algorithms to increase clustering and grow a small world

    CERN Document Server

    Alstott, Jeff; Pizza, Pamela B; Radcliffe, Mary

    2016-01-01

    Many real-world networks have high clustering among vertices: vertices that share neighbors are often also directly connected to each other. A network's clustering can be a useful indicator of its connectedness and community structure. Algorithms for generating networks with high clustering have been developed, but typically rely on adding or removing edges and nodes, sometimes from a completely empty network. Here, we introduce algorithms that create a highly clustered network by starting with an existing network and rearranging edges, without adding or removing them; these algorithms can preserve other network properties even as the clustering increases. These algorithms rely on local rewiring rules, in which a single edge changes one of its vertices in a way that is guaranteed to increase clustering. This greedy algorithm can be applied iteratively to transform a random network into a form with much higher clustering. Additionally, these algorithms grow the network's clustering faster than they increase it...

  14. Combined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K.

    Science.gov (United States)

    Sweeney, Timothy E; Chen, Albert C; Gevaert, Olivier

    2015-11-19

    In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.

  15. ROBUST-HYBRID GENETIC ALGORITHM FOR A FLOW-SHOP SCHEDULING PROBLEM (A Case Study at PT FSCM Manufacturing Indonesia)

    Directory of Open Access Journals (Sweden)

    Johan Soewanda

    2007-01-01

    Full Text Available This paper discusses the application of Robust Hybrid Genetic Algorithm to solve a flow-shop scheduling problem. The proposed algorithm attempted to reach minimum makespan. PT. FSCM Manufacturing Indonesia Plant 4's case was used as a test case to evaluate the performance of the proposed algorithm. The proposed algorithm was compared to Ant Colony, Genetic-Tabu, Hybrid Genetic Algorithm, and the company's algorithm. We found that Robust Hybrid Genetic produces statistically better result than the company's, but the same as Ant Colony, Genetic-Tabu, and Hybrid Genetic. In addition, Robust Hybrid Genetic Algorithm required less computational time than Hybrid Genetic Algorithm
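
    The makespan objective mentioned above can be evaluated for a given job order with the standard permutation flow-shop recurrence. The following is a minimal sketch, assuming every job visits the machines in the same order and each machine handles one job at a time; the function name and the toy processing times are illustrative and do not come from the paper. A genetic algorithm could use such a function as its fitness measure.

```python
def makespan(order, proc_times):
    """Completion time of the last job on the last machine.

    order      -- sequence of job indices (the permutation being evaluated)
    proc_times -- proc_times[j][m] is the time job j needs on machine m
    """
    n_machines = len(proc_times[0])
    finish = [0] * n_machines  # finish[m]: when machine m last became free
    for job in order:
        prev_stage_done = 0    # when this job left the previous machine
        for m in range(n_machines):
            start = max(finish[m], prev_stage_done)
            finish[m] = start + proc_times[job][m]
            prev_stage_done = finish[m]
    return finish[-1]


# Toy instance: 3 jobs, 2 machines (hypothetical numbers).
times = [[3, 2], [1, 4], [2, 1]]
print(makespan([0, 1, 2], times))  # 10
print(makespan([1, 0, 2], times))  # 8
```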

  16. A Flow-Partitioned Unequal Clustering Routing Algorithm for Wireless Sensor Networks

    OpenAIRE

    Jian Peng; Xiaohai Chen; Tang Liu

    2014-01-01

    Energy efficiency and energy balance are two important issues for wireless sensor networks. In previous clustering routing algorithms, multihop transmission, sleep scheduling, and unequal clustering are always used to improve energy efficiency and energy balance. In these algorithms, only the cluster heads share the burden of data forwarding in each round. In this paper, we propose a flow-partitioned unequal clustering routing (FPUC) algorithm to achieve better energy efficiency and energy ba...

  17. A Hybrid Genetic Algorithm for the Multiple Crossdocks Problem

    Directory of Open Access Journals (Sweden)

    Zhaowei Miao

    2012-01-01

    Full Text Available We study a multiple crossdocks problem with supplier and customer time windows, where any violation of time windows will incur a penalty cost and the flows through the crossdock are constrained by fixed transportation schedules and crossdock capacities. We prove this problem to be NP-hard in the strong sense and therefore focus on developing efficient heuristics. Based on the problem structure, we propose a hybrid genetic algorithm (HGA) integrating greedy technique and variable neighborhood search method to solve the problem. Extensive experiments under different scenarios were conducted, and results show that HGA outperforms CPLEX solver, providing solutions in realistic timescales.

  18. Aligning multiple protein sequences by parallel hybrid genetic algorithm.

    Science.gov (United States)

    Nguyen, Hung Dinh; Yoshihara, Ikuo; Yamamori, Kunihito; Yasunaga, Moritoshi

    2002-01-01

    This paper presents a parallel hybrid genetic algorithm (GA) for solving the sum-of-pairs multiple protein sequence alignment. A new chromosome representation and its corresponding genetic operators are proposed. A multi-population GENITOR-type GA is combined with local search heuristics. It is then extended to run in parallel on a multiprocessor system for speeding up. Experimental results of benchmarks from the BAliBASE show that the proposed method is superior to MSA, OMA, and SAGA methods with regard to quality of solution and running time. It can be used for finding multiple sequence alignment as well as testing cost functions.
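
    The sum-of-pairs objective named above adds up pairwise scores over every column of an alignment. The sketch below uses a toy scoring scheme (match, mismatch and gap values chosen only for illustration); the scoring actually used in the paper is not stated in the abstract, and the function name is hypothetical.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score aligned sequences (equal-length strings, '-' marks a gap)."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for column in zip(*alignment):
        # Every unordered pair of residues in the column contributes once.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue            # gap against gap scores nothing here
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


print(sum_of_pairs(["AC-T", "A-GT", "ACGT"]))  # 0
```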

  19. Development of Automatic Cluster Algorithm for Microcalcification in Digital Mammography

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Seok Yoon [Dept. of Medical Engineering, Korea University, Seoul (Korea, Republic of); Kim, Chang Soo [Dept. of Radiological Science, College of Health Sciences, Catholic University of Pusan, Pusan (Korea, Republic of)

    2009-03-15

    Digital mammography is an efficient imaging technique for the detection and diagnosis of breast pathological disorders. Six mammographic criteria, namely number of clusters, number, size, extent and morphologic shape of microcalcifications, and presence of mass, were reviewed, and their correlation with pathologic diagnosis was evaluated. It is very important to find breast cancer early, when treatment can reduce deaths from breast cancer and breast incision. In screening for breast cancer, mammography is typically used to view the internal organization. Clustered microcalcifications on mammography represent an important feature of breast mass, especially that of intraductal carcinoma. Because microcalcification has a high correlation with breast cancer, a cluster of microcalcifications can be very helpful for the clinical doctor in predicting breast cancer. For this study, three steps of quantitative evaluation are proposed: DoG filter, adaptive thresholding, and expectation maximization. Through the proposed algorithm, the number of calcifications and the length of each cluster in the distribution of microcalcification could be measured; these can also be used as indicators in the primary diagnosis to automatically diagnose breast cancer.

  20. Effective pathfinding for four-wheeled robot based on combining Theta* and hybrid A* algorithms

    Directory of Open Access Journals (Sweden)

    Віталій Геннадійович Михалько

    2016-07-01

    Full Text Available An effective pathfinding algorithm based on the Theta* and Hybrid A* algorithms was developed for a four-wheeled robot. Pseudocode for the algorithm is shown and explained. The algorithm and a simulator for the four-wheeled robot were implemented in the Java programming language. The algorithm was tested on U-obstacles, complex maps, and the parking problem.

  1. Clustering of User Behaviour based on Web Log data using Improved K-Means Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    S.Padmaja

    2016-02-01

    Full Text Available The proposed work presents an improved K-means clustering algorithm for identifying internet user behaviour. Web data analysis includes the transformation and interpretation of web log data to find information, patterns, and knowledge. The efficiency of the algorithm is analyzed by considering certain parameters: date, time, S_id, CS_method, C_IP, User_agent, and time taken. The research uses more than 2 years of real data collected from the web servers of two different groups of institutions. This data set provides a better analysis of log data to identify internet user behaviour.

  2. Evolving Quantum Oracles with Hybrid Quantum-inspired Evolutionary Algorithm

    CERN Document Server

    Ding, S; Yang, Q; Ding, Shengchao; Jin, Zhi; Yang, Qing

    2006-01-01

    Quantum oracles play key roles in the studies of quantum computation and quantum information. But implementing quantum oracles efficiently with universal quantum gates is a hard work. Motivated by genetic programming, this paper proposes a novel approach to evolve quantum oracles with a hybrid quantum-inspired evolutionary algorithm. The approach codes quantum circuits with numerical values and combines the cost and correctness of quantum circuits into the fitness function. To speed up the calculation of matrix multiplication in the evaluation of individuals, a fast algorithm of matrix multiplication with Kronecker product is also presented. The experiments show the validity and the effects of some parameters of the presented approach. And some characteristics of the novel approach are discussed too.
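
    The abstract does not describe its fast Kronecker-product multiplication, so the sketch below shows only a standard identity that serves a similar purpose: (A ⊗ B) x can be computed as A X Bᵀ, where X is x reshaped row by row, without ever building A ⊗ B. All function names are hypothetical, and plain Python lists stand in for a real matrix library.

```python
def matmul(a, b):
    """Dense product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def kron(a, b):
    """Explicit Kronecker product, used here only for comparison."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_apply(a, b, x):
    """Return (a ⊗ b) x without forming a ⊗ b."""
    m, n = len(a[0]), len(b[0])
    # Reshape x row by row into an m-by-n matrix.
    x_mat = [x[i * n:(i + 1) * n] for i in range(m)]
    y_mat = matmul(matmul(a, x_mat), transpose(b))  # A X B^T
    return [value for row in y_mat for value in row]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [1, 2, 3, 4]
print(kron_apply(A, B, x))  # [10, 7, 22, 15]
```

    For square factors of sizes p and q, the explicit product costs on the order of (pq)² multiplications per vector, while the factored form needs roughly pq(p + q), which is where the saving comes from.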

  3. Genetic Algorithm Based Hybrid Fuzzy System for Assessing Morningness

    Directory of Open Access Journals (Sweden)

    Animesh Biswas

    2014-01-01

    Full Text Available This paper describes a real life case example on the assessment process of morningness of individuals using genetic algorithm based hybrid fuzzy system. It is observed that physical and mental performance of human beings in different time slots of a day are majorly influenced by morningness orientation of those individuals. To measure the morningness of people various self-reported questionnaires were developed by different researchers in the past. Among them reduced version of Morningness-Eveningness Questionnaire is mostly accepted. Almost all of the linguistic terms used in questionnaires are fuzzily defined. So, assessing them in crisp environments with their responses does not seem to be justifiable. Fuzzy approach based research works for assessing morningness of people are very few in the literature. In this paper, genetic algorithm is used to tune the parameters of a Mamdani fuzzy inference model to minimize error with their predicted outputs for assessing morningness of people.

  4. Healing Temperature of Hybrid Structures Based on Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    赵中伟; 陈志华; 刘红波

    2016-01-01

    The healing temperature of suspen-dome with stacked arches(SDSA)and arch-supported single-layer lattice shell structures was investigated based on the genetic algorithm. The temperature field of arch under solar radiation was derived by FLUENT to investigate the influence of solar radiation on the determination of the healing temperature. Moreover, a multi-scale model was established to apply the complex temperature field under solar radiation. The change in the mechanical response of these two kinds of structures with the healing temperature was discussed. It can be concluded that solar radiation has great influence on the healing temperature, and the genetic algorithm can be effectively used in the optimization of the healing temperature for hybrid structures.

  5. Clustering Algorithms for Heterogeneous Wireless Sensor Networks - A Brief Survey

    Directory of Open Access Journals (Sweden)

    A.MeenaKowshalya

    2011-09-01

    Full Text Available Wireless sensor networks (WSN) are emerging in various fields like disaster management, battle field surveillance and border security surveillance. A large number of sensors in these applications are unattended and work autonomously. Clustering is a key technique to improve the network lifetime, reduce the energy consumption and increase the scalability of the sensor network. In this paper, we study the impact of heterogeneity of the nodes on the performance of WSN. This paper surveys the different clustering algorithms for heterogeneous WSN.

  6. Classification of posture maintenance data with fuzzy clustering algorithms

    Science.gov (United States)

    Bezdek, James C.

    1992-01-01

    Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
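
    For readers unfamiliar with fuzzy c-means, the sketch below shows the textbook alternating update: memberships u_ij = 1 / Σ_k (d_ij / d_kj)^(2/(m-1)), followed by centers computed as means weighted by u_ij^m. It is a minimal illustration with a fixed iteration count and caller-supplied starting centers, not the procedure used in the study; the function name, parameters and toy data are assumptions.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Return (centers, memberships) after a fixed number of FCM updates.

    memberships[j][i] is the degree to which point j belongs to cluster i,
    as computed in the final iteration before the last center update.
    """
    centers = [tuple(c) for c in centers]
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update.
        u = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if 0.0 in d:
                # Point sits on a center: share membership among such centers.
                hits = [1.0 if di == 0.0 else 0.0 for di in d]
                u.append([h / sum(hits) for h in hits])
            else:
                u.append([1.0 / sum((di / dk) ** power for dk in d) for di in d])
        # Center update: mean of the points weighted by u^m.
        new_centers = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                new_centers.append(old)  # no support: keep the old center
                continue
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(len(old))))
        centers = new_centers
    return centers, u


data = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]
centers, memberships = fuzzy_c_means(data, [(1.0,), (4.0,)])
```

    Unlike k-means, each point keeps a graded membership in every cluster, which is what makes the method attractive for data describing gradual transitions between states.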

  7. Cluster-Based Distributed Algorithms for Very Large Linear Equations

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    In many applications, such as computational fluid dynamics and weather prediction, as well as image processing and the state of Markov chains, the order n of the matrix is often very large, and no serial algorithm can solve the problems. A distributed cluster-based solution for very large linear equations is discussed; it includes the definitions of notations, partition of the matrix, communication mechanism, a master-slave algorithm, etc. The computing cost is O(n^3/N), the memory cost is O(n^2/N), the I/O cost is O(n^2/N), and the communication cost is O(Nn), where N is the number of computing nodes or processes. Some tests show that the solution could effectively solve double-type matrices of size under 10^6×10^6.
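
    The per-node memory figure quoted above is what one would expect if each of the N workers holds a contiguous block of about n/N rows. The abstract does not say how the matrix is actually partitioned, so the sketch below is only an assumed row-block scheme with a hypothetical function name.

```python
def row_blocks(n, workers):
    """Split row indices 0..n-1 into contiguous half-open ranges, one per worker."""
    base, extra = divmod(n, workers)
    blocks, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more row
        blocks.append((start, start + size))
        start += size
    return blocks


print(row_blocks(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

    Each worker then stores roughly n/N rows of n entries, that is, on the order of n^2/N values.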

  8. Dynamic and static properties of the invaded cluster algorithm

    Science.gov (United States)

    Moriarty, K.; Machta, J.; Chayes, L. Y.

    1999-02-01

    Simulations of the two-dimensional Ising and three-state Potts models at their critical points are performed using the invaded cluster (IC) algorithm. It is argued that observables measured on a sublattice of size l should exhibit a crossover to Swendsen-Wang (SW) behavior for l sufficiently less than the lattice size L, and a scaling form is proposed to describe the crossover phenomenon. It is found that the energy autocorrelation time τ_ε(l,L) for an l×l sublattice attains a maximum in the crossover region, and a dynamic exponent z_IC for the IC algorithm is defined according to τ_ε,max ~ L^(z_IC). Simulation results for the three-state model yield z_IC = 0.346 ± 0.002, which is smaller than values of the dynamic exponent found for the SW and Wolff algorithms and also less than the Li-Sokal bound. The results are less conclusive for the Ising model, but it appears that z_IC [...] Wolff algorithms.

  9. VLSI Implementation of Hybrid Algorithm Architecture for Speech Enhancement

    Directory of Open Access Journals (Sweden)

    Jigar Shah

    2012-07-01

    Full Text Available The speech enhancement techniques are required to improve the speech signal quality without causing any offshoot in many applications. Recently the growing use of cellular and mobile phones, hands free systems, VoIP phones, voice messaging service, call service centers etc. require efficient real time speech enhancement and detection strategies to make them superior over conventional speech communication systems. The speech enhancement algorithms are required to deal with additive noise and convolutive distortion that occur in any wireless communication system. Also the single channel (one microphone signal is available in real environments. Hence a single channel hybrid algorithm is used which combines minimum mean square error-log spectral amplitude (MMSE-LSA algorithm for additive noise removal and the relative spectral amplitude (RASTA algorithm for reverberation cancellation. The real time and embedded implementation on directly available DSP platforms like TMS320C6713 shows some defects. Hence the VLSI implementation using semi-custom (e.g. FPGA or full-custom approach is required. One such architecture is proposed in this paper.

  10. A Novel Dynamic Clustering Algorithm Based on Immune Network and Tabu Search

    Institute of Scientific and Technical Information of China (English)

    ZHONG Jiang; WU Zhongfu; WU Kaigui; YANG Qiang

    2005-01-01

    It is usually difficult to specify a rational number of partitions in a data set before clustering. The problem cannot be solved by traditional clustering algorithms such as k-means or its variations. This paper proposes a novel dynamic clustering algorithm based on the artificial immune network and tabu search (DCBIT). It optimizes the number and the location of the clusters at the same time. The algorithm includes two phases: it begins by running the immune network algorithm to find a clustering feasible solution (CFS), then it employs tabu search to get the optimum cluster number and cluster centers on the CFS. The probability of acquiring the CFS through the immune network algorithm is also discussed in this paper. Experimental results show that the new algorithm has satisfactory convergence probability and convergence speed.

  11. Image Transformation using Modified Kmeans clustering algorithm for Parallel saliency map

    Directory of Open Access Journals (Sweden)

    Aman Sharma

    2013-08-01

    Full Text Available The aim is to design an image transformation system. Depending on the transform chosen, the input and output images may appear entirely different and have different interpretations. Image transformation is carried out with the help of certain modules, such as input image, image cluster index, object in cluster, and color index transformation of the image. The K-means clustering algorithm is used to cluster the image for better segmentation. In the proposed method, a parallel saliency algorithm with K-means clustering is used to avoid local minima and to find the saliency map. The reason for using the parallel saliency algorithm is that it is proved to be better than the existing saliency algorithm.

  12. A clustering method of Chinese medicine prescriptions based on modified firefly algorithm.

    Science.gov (United States)

    Yuan, Feng; Liu, Hong; Chen, Shou-Qiang; Xu, Liang

    2016-12-01

    This paper aims to study the clustering method for Chinese medicine (CM) medical cases. The traditional K-means clustering algorithm has shortcomings, such as dependence of results on the selection of the initial value and trapping in local optima, when processing prescriptions from CM medical cases. Therefore, a new clustering method based on the collaboration of the firefly algorithm and the simulated annealing algorithm was proposed. This algorithm dynamically determined the iteration of the firefly algorithm and the simulated sampling of the annealing algorithm by fitness changes, and increased the diversity of the swarm through expansion of the scope of the sudden jump, thereby effectively avoiding the premature convergence problem. The results from confirmatory experiments for CM medical cases suggested that, compared with traditional K-means clustering algorithms, this method greatly improved the individual diversity and the obtained clustering results. The computing results from this method have a certain reference value for cluster analysis on CM prescriptions.

  13. Active Semisupervised Clustering Algorithm with Label Propagation for Imbalanced and Multidensity Datasets

    Directory of Open Access Journals (Sweden)

    Mingwei Leng

    2013-01-01

    Full Text Available The accuracy of most of the existing semisupervised clustering algorithms based on small size of labeled dataset is low when dealing with multidensity and imbalanced datasets, and labeling data is quite expensive and time consuming in many real-world applications. This paper focuses on active data selection and semisupervised clustering algorithm in multidensity and imbalanced datasets and proposes an active semisupervised clustering algorithm. The proposed algorithm uses an active mechanism for data selection to minimize the amount of labeled data, and it utilizes multithreshold to expand labeled datasets on multidensity and imbalanced datasets. Three standard datasets and one synthetic dataset are used to demonstrate the proposed algorithm, and the experimental results show that the proposed semisupervised clustering algorithm has a higher accuracy and a more stable performance in comparison to other clustering and semisupervised clustering algorithms, especially when the datasets are multidensity and imbalanced.

  14. A study of image reconstruction algorithms for hybrid intensity interferometers

    Science.gov (United States)

    Crabtree, Peter N.; Murray-Krezan, Jeremy; Picard, Richard H.

    2011-09-01

    Phase retrieval is explored for image reconstruction using outputs from both a simulated intensity interferometer (II) and a hybrid system that combines the II outputs with partially resolved imagery from a traditional imaging telescope. Partially resolved imagery provides an additional constraint for the iterative phase retrieval process, as well as an improved starting point. The benefits of this additional a priori information are explored and include lower residual phase error for SNR values above 0.01, increased sensitivity, and improved image quality. Results are also presented for image reconstruction from II measurements alone, via current state-of-the-art phase retrieval techniques. These results are based on the standard hybrid input-output (HIO) algorithm, as well as a recent enhancement to HIO that optimizes step lengths in addition to step directions. The additional step length optimization yields a reduction in residual phase error, but only for SNR values greater than about 10. Image quality for all algorithms studied is quite good for SNR>=10, but it should be noted that the studied phase-recovery techniques yield useful information even for SNRs that are much lower.

  15. Classification of ETM+ Remote Sensing Image Based on Hybrid Algorithm of Genetic Algorithm and Back Propagation Neural Network

    Directory of Open Access Journals (Sweden)

    Haisheng Song

    2013-01-01

    Full Text Available The back propagation neural network (BPNN) algorithm can be used for supervised classification of remote sensing images. But its defects are obvious: it falls into local minima easily, converges slowly, and makes it difficult to determine the number of hidden layer nodes. The genetic algorithm (GA) has the advantage of global optimization and does not easily fall into local minima, but it has poor local search capability. This paper uses GA to generate the initial structure of the BPNN. A stable, efficient, and fast BP classification network is then obtained by fine-tuning with the improved BP algorithm. Finally, we use the hybrid algorithm to classify remote sensing images and compare it with the improved BP algorithm and the traditional maximum likelihood classification (MLC) algorithm. Experimental results show that the hybrid algorithm outperforms the improved BP algorithm and the MLC algorithm.

  16. Clustering Algorithm Based on Crowding Niche

    Institute of Scientific and Technical Information of China (English)

    业宁; 董逸生

    2003-01-01

    A new clustering algorithm based on crowding niche is proposed in this paper. As organisms evolve, homogeneous individuals spontaneously withstand heterogeneous ones. At the same time, individuals in the same class compete with each other for limited resources, and individuals with poor fitness are eliminated. We propose a clustering algorithm based on this idea. Experimental evaluation has demonstrated its efficiency.

  17. Cluster-cluster clustering

    Science.gov (United States)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C. S.

    1985-01-01

    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales.

  18. Cluster-cluster clustering

    Energy Technology Data Exchange (ETDEWEB)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S.

    1985-08-01

    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references.

  19. A Heuristic Task Scheduling Algorithm for Heterogeneous Virtual Clusters

    Directory of Open Access Journals (Sweden)

    Weiwei Lin

    2016-01-01

    Full Text Available Cloud computing provides on-demand computing and storage services with high performance and high scalability. However, the rising energy consumption of cloud data centers has become a prominent problem. In this paper, we first introduce an energy-aware framework for task scheduling in virtual clusters. The framework consists of a task resource requirements prediction module, an energy estimate module, and a scheduler with a task buffer. Secondly, based on this framework, we propose a virtual machine power efficiency-aware greedy scheduling algorithm (VPEGS). As a heuristic algorithm, VPEGS estimates task energy by considering factors including task resource demands, VM power efficiency, and server workload before scheduling tasks in a greedy manner. We simulated a heterogeneous VM cluster and conducted experiments to evaluate the effectiveness of VPEGS. Simulation results show that VPEGS effectively reduced total energy consumption by more than 20% without producing large scheduling overheads. Among algorithms with a similar heuristic approach, it outperformed Min-Min and RASA in energy saving by about 29% and 28%, respectively.
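
    The greedy placement step described above can be sketched as follows. This is only an illustration under stated assumptions: the energy model (demand scaled by existing load and divided by VM power efficiency), the `VM` fields, and the function names are hypothetical and are not the VPEGS formulas from the paper.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    efficiency: float   # hypothetical: work units per joule
    load: float = 0.0   # hypothetical: work already queued on this VM


def estimate_energy(demand: float, vm: VM) -> float:
    # Assumed toy model: energy grows with task demand and current load,
    # and shrinks with the VM's power efficiency.
    return demand * (1.0 + vm.load) / vm.efficiency


def greedy_schedule(demands, vms):
    """Assign each task, in arrival order, to the VM with the lowest estimated energy."""
    plan = []
    for demand in demands:
        best = min(vms, key=lambda vm: estimate_energy(demand, vm))
        best.load += demand          # the chosen VM becomes busier
        plan.append(best.name)
    return plan
```

    With two VMs of efficiency 2.5 and 1.0, the efficient VM takes tasks until its queued load makes the other VM cheaper, which is the behaviour a greedy energy-aware policy is expected to show.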

  20. Ternary alloy material prediction using genetic algorithm and cluster expansion

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Chong [Iowa State Univ., Ames, IA (United States)

    2015-12-01

    This thesis summarizes our study of crystal structure prediction for the Fe-V-Si system using a genetic algorithm and cluster expansion. Our goal is to explore and look for new stable compounds. We started from the ten currently known experimental phases and calculated the formation energies of those compounds using the density functional theory (DFT) package VASP. The convex hull was generated from the DFT calculations of the experimentally known phases. We then ran a random search on some metal-rich (Fe and V) compositions and found that the lowest-energy structures had a body-centered cubic (bcc) underlying lattice, on which we carried out systematic computational searches using the genetic algorithm and cluster expansion. Among hundreds of searched compositions, thirteen were selected and their DFT formation energies were obtained with VASP. The stability of those thirteen compounds was checked against the experimental convex hull. We found that the composition 24-8-16, i.e., Fe3VSi2, is a new stable phase, which may motivate future experiments.

  1. Thermodynamic Casimir effect in films: the exchange cluster algorithm.

    Science.gov (United States)

    Hasenbusch, Martin

    2015-02-01

    We study the thermodynamic Casimir force for films with various types of boundary conditions and the bulk universality class of the three-dimensional Ising model. To this end, we perform Monte Carlo simulations of the improved Blume-Capel model on the simple cubic lattice. In particular, we employ the exchange or geometric cluster algorithm [Heringa and Blöte, Phys. Rev. E 57, 4976 (1998)]. In a previous work, we demonstrated that this algorithm allows us to compute the thermodynamic Casimir force for the plate-sphere geometry efficiently. It turns out that also for the film geometry a substantial reduction of the statistical error can be achieved. Concerning physics, we focus on (O,O) boundary conditions, where O denotes the ordinary surface transition. These are implemented by free boundary conditions on both sides of the film. Films with such boundary conditions undergo a phase transition in the universality class of the two-dimensional Ising model. We determine the inverse transition temperature for a large range of thicknesses L(0) of the film and study the scaling of this temperature with L(0). In the neighborhood of the transition, the thermodynamic Casimir force is affected by finite size effects, where finite size refers to a finite transversal extension L of the film. We demonstrate that these finite size effects can be computed by using the universal finite size scaling function of the free energy of the two-dimensional Ising model.

  2. jClustering, an Open Framework for the Development of 4D Clustering Algorithms

    Science.gov (United States)

    Mateos-Pérez, José María; García-Villalba, Carmen; Pascau, Javier; Desco, Manuel; Vaquero, Juan J.

    2013-01-01

    We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary. PMID:23990913

  3. jClustering, an open framework for the development of 4D clustering algorithms.

    Directory of Open Access Journals (Sweden)

    José María Mateos-Pérez

    Full Text Available We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary.

  4. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)

    ZHANG; Zhihua

    2001-01-01


  5. Hybrid Algorithms for Solving Variational Inequalities, Variational Inclusions, Mixed Equilibria, and Fixed Point Problems

    Directory of Open Access Journals (Sweden)

    Lu-Chuan Ceng

    2014-01-01

    Full Text Available We present a hybrid iterative algorithm for finding a common element of the set of solutions of a finite family of generalized mixed equilibrium problems, the set of solutions of a finite family of variational inequalities for inverse strong monotone mappings, the set of fixed points of an infinite family of nonexpansive mappings, and the set of solutions of a variational inclusion in a real Hilbert space. Furthermore, we prove that the proposed hybrid iterative algorithm has strong convergence under some mild conditions imposed on algorithm parameters. Here, our hybrid algorithm is based on Korpelevič’s extragradient method, hybrid steepest-descent method, and viscosity approximation method.

  6. A Request Distribution Algorithm for Web Server Cluster

    Directory of Open Access Journals (Sweden)

    Wei Zhang

    2011-12-01

    Full Text Available With the explosive growth of web-based application workloads, web server clusters face challenges in request response time. Request distribution among the servers in a web server cluster is the key to addressing this challenge, especially under heavy workloads. In this paper, we propose a new request distribution algorithm named llac (least load active cache) for the load balancing switch in a web server cluster. The goal of llac is to improve the cache hit rate and reduce response time. Packets are parsed at the IP level, and back-end servers are notified to cache hot files using link change technology, without changing URL information or modifying the service program. This avoids the overhead of switching between user mode and kernel mode. The load balancing switch creates a connection directly with the selected server, avoiding connection migration overhead. The policy estimates the current composite load of each server and selects the server with the least load to serve the request. It also improves the resource utilization of the web servers. Experimental results show that llac achieves better performance for web applications than wrr (weight round robin), a popular request distribution algorithm.
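
    The least-load selection step mentioned in this abstract can be illustrated with a short sketch. The composite load formula, its weights, and the field names (`cpu`, `mem`, `conns`, `max_conns`) are assumptions made for illustration; the abstract does not say how llac computes its composite load, and the cache-notification part is not modelled here.

```python
def composite_load(cpu: float, mem: float, conns: int, max_conns: int,
                   weights=(0.5, 0.2, 0.3)) -> float:
    """Hypothetical composite load: weighted sum of CPU, memory and connection usage."""
    w_cpu, w_mem, w_conn = weights
    return w_cpu * cpu + w_mem * mem + w_conn * (conns / max_conns)


def pick_server(servers: dict) -> str:
    """Return the name of the server whose composite load is lowest."""
    return min(servers, key=lambda name: composite_load(**servers[name]))
```

    Unlike weighted round robin, which cycles through servers in a fixed proportion, this selection reacts to the measured state of each server at the time the request arrives.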

  7. The application of mixed recommendation algorithm with user clustering in the microblog advertisements promotion

    Science.gov (United States)

    Gong, Lina; Xu, Tao; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen

    2017-03-01

    In the era of big data, traditional microblog recommendation algorithms suffer from low efficiency and modest effectiveness. To address these issues, this paper proposes a mixed recommendation algorithm with user clustering. The paper first introduces the state of the microblog marketing industry. It then elaborates the user interest modeling process and the advertisement recommendation methods in detail. Finally, it compares the mixed recommendation algorithm with the traditional classification algorithm and with the mixed recommendation algorithm without user clustering. The results show that the mixed recommendation algorithm with user clustering achieves good accuracy and recall in microblog advertisement promotion.

  8. A Hybrid Technique Based on Combining Fuzzy K-means Clustering and Region Growing for Improving Gray Matter and White Matter Segmentation

    Directory of Open Access Journals (Sweden)

    Ashraf Afifi

    2012-07-01

    Full Text Available In this paper we present a hybrid approach that combines fuzzy k-means clustering, seed region growing, and sensitivity and specificity algorithms to measure gray matter (GM) and white matter (WM) tissue. The proposed algorithm uses intensity and anatomic information to segment MRIs into different tissue classes, especially GM and WM. It starts by partitioning the image into clusters using fuzzy k-means clustering. The centers of these clusters are the input to the seed region growing (SRG) method, which creates the closed regions. The outputs of the SRG technique are fed to the sensitivity and specificity algorithm, which merges similar regions into one segment. The proposed algorithm is applied to a challenging application: gray matter/white matter segmentation in magnetic resonance image (MRI) datasets. The experimental results show that the proposed technique produces accurate and stable results.
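
    For readers unfamiliar with the fuzzy clustering step, the sketch below shows the standard fuzzy c-means membership rule for a single scalar intensity. It assumes the usual fuzzifier `m > 1` and one-dimensional intensities; the function name is hypothetical, and the paper's own variant of fuzzy k-means may differ.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Membership of scalar intensity x in each cluster (standard fuzzy c-means rule)."""
    dists = [abs(x - c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # x coincides with one or more centers: share full membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
```

    Memberships always sum to one, so a pixel halfway between two tissue intensities is split between both clusters instead of being forced into one, which is what makes the cluster centers usable as seeds for a later region-growing stage.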

  9. Operation management of daily economic dispatch using novel hybrid particle swarm optimization and gravitational search algorithm with hybrid mutation strategy

    Science.gov (United States)

    Wang, Yan; Huang, Song; Ji, Zhicheng

    2017-07-01

    This paper presents a hybrid particle swarm optimization and gravitational search algorithm based on a hybrid mutation strategy (HGSAPSO-M) to optimize economic dispatch (ED) including distributed generations (DGs) under market-based energy pricing. A daily ED model was formulated and a hybrid mutation strategy was adopted in HGSAPSO-M. The hybrid mutation strategy includes two mutation operators: chaotic mutation and Gaussian mutation. The proposed algorithm was tested on the IEEE-33 bus system, and the results show that the approach is effective for this problem.
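
    A hybrid of the two mutation operators named above might look like the following sketch. The abstract does not define the operators, so everything here is an assumption: the chaotic operator uses the logistic map, the Gaussian step is scaled to the variable range, and the switching probability `p_chaotic` is a hypothetical parameter.

```python
import random


def gaussian_mutation(x, lo, hi, sigma=0.1, rng=random):
    """Add zero-mean Gaussian noise scaled to the variable range, then clamp."""
    step = rng.gauss(0.0, sigma * (hi - lo))
    return min(hi, max(lo, x + step))


def chaotic_mutation(x, lo, hi, iterations=10):
    """Assumed chaotic operator: iterate the logistic map z <- 4 z (1 - z)."""
    z = (x - lo) / (hi - lo)
    z = min(max(z, 1e-6), 1.0 - 1e-6)   # avoid the fixed points 0 and 1
    for _ in range(iterations):
        z = 4.0 * z * (1.0 - z)
    return min(hi, max(lo, lo + z * (hi - lo)))


def hybrid_mutation(x, lo, hi, p_chaotic=0.5, rng=random):
    """Pick one of the two operators at random for this variable."""
    if rng.random() < p_chaotic:
        return chaotic_mutation(x, lo, hi)
    return gaussian_mutation(x, lo, hi, rng=rng)
```

    The intent of pairing the operators is that the Gaussian step refines a solution locally while the chaotic step can move it far across the feasible range; both results are clamped to the bounds `[lo, hi]`.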

  10. Textural defect detect using a revised ant colony clustering algorithm

    Science.gov (United States)

    Zou, Chao; Xiao, Li; Wang, Bingwen

    2007-11-01

    We propose a novel method based on a revised ant colony clustering algorithm (ACCA) for textural defect detection. Our efforts focus on the definition of a local irregularity measurement and the implementation of the revised ACCA. The local irregularity measurement evaluates the local textural inconsistency of each pixel against its mini-environment. In our revised ACCA, the behavior of each ant is divided into two steps: release pheromone and act. The quantity of pheromone released is proportional to the irregularity measurement; the next action of each ant is chosen independently of the others, in a stochastic way, according to evaluated heuristic knowledge. The independence of the ants implies the inherently parallel computation architecture of this algorithm. We apply the proposed method to some typical textural images with defects. The series of pheromone distribution maps (PDM) clearly shows that the pheromone distribution gradually approaches the textural defects. After some post-processing, the final pheromone distribution demonstrates the shape and area of the defects well.

  11. Self-Expanded Clustering Algorithm Based on Density Units with Evaluation Feedback Section

    Institute of Scientific and Technical Information of China (English)

    YU Yongqian; ZHAO Xiangguo; CHEN Hengyue; WANG Bin; YU Ge; WANG Guoren

    2006-01-01

    This paper presents an effective clustering mode and a novel mode for evaluating clustering results. The clustering mode has two bounded integer parameters. The evaluating mode scores each clustering result: the higher the mark, the higher the quality of the result. By organizing the two modes in different ways, we build two clustering algorithms: SECDU (Self-Expanded Clustering Algorithm Based on Density Units) and SECDUF (Self-Expanded Clustering Algorithm Based on Density Units with Evaluation Feedback Section). SECDU enumerates all value pairs of the two clustering-mode parameters, processes the data set with each pair, and evaluates every clustering result with the evaluating mode. SECDU then outputs the clustering result with the highest evaluation mark. By applying a hill-climbing algorithm, SECDUF greatly improves clustering efficiency. Both algorithms adapt well to data sets with different distribution features and output high-quality clustering results. SECDUF tunes the parameters of the clustering mode automatically, with no human intervention during the whole process. In addition, SECDUF has high clustering performance.
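
    The difference between exhaustive enumeration and hill climbing over two integer parameters can be sketched as below. The `score` callback stands in for the paper's evaluating mode, and the four-neighbour move set is an assumption; neither is taken from the paper.

```python
def hill_climb(score, start, bounds, max_steps=1000):
    """Greedy ascent over two integer parameters, moving one unit at a time."""
    (lo1, hi1), (lo2, hi2) = bounds
    current = start
    best = score(*current)
    for _ in range(max_steps):
        p, q = current
        neighbours = [(p + dp, q + dq)
                      for dp, dq in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if lo1 <= p + dp <= hi1 and lo2 <= q + dq <= hi2]
        if not neighbours:
            break
        candidate = max(neighbours, key=lambda n: score(*n))
        cand_score = score(*candidate)
        if cand_score <= best:
            break                    # no neighbour improves: local optimum
        current, best = candidate, cand_score
    return current, best
```

    Hill climbing evaluates only the parameter pairs along its path instead of every pair in the grid, which is where the efficiency gain comes from; the trade-off is that it can stop at a local optimum when the score surface has several peaks.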

  12. A Fast Hybrid Algorithm Approach for the Exact String Matching Problem Via Berry Ravindran and Alpha Skip Search Algorithms

    Directory of Open Access Journals (Sweden)

    A. A. Almazroi

    2011-01-01

    Full Text Available Problem statement: String matching algorithms have been an essential means of searching biological sequence databases. With the constant expansion of scientific data such as DNA and protein sequences, the development of enhanced algorithms has become even more critical, as the major concern has always been how to raise the performance of these search algorithms to meet the challenges of scientific information. Approach: A new hybrid algorithm comprising Berry Ravindran (BR) and Alpha Skip Search (ASS) is therefore presented. The concept is based on the BR shift function, combined with ASS to improve performance. Results: The proposed hybrid algorithm required fewer attempts and fewer character comparisons than the original algorithms when DNA, protein, and English text data were used to appraise its performance. The improvement over Berry-Ravindran was 71%, 60%, and 63% for DNA, protein, and English text, respectively. The improvement over Alpha Skip Search for DNA, protein, and English text was 48%, 28%, and 36%, respectively. Conclusion: The proposed hybrid algorithm is relevant for searching biological sequence databases and other string search systems.

  13. A fast hybrid algorithm for exoplanetary transit searches

    CERN Document Server

    Cameron, A C; Street, R A; Lister, T A; West, R G; Wilson, D M; Pont, F; Christian, D J; Clarkson, W I; Enoch, B; Evans, A; Fitzsimmons, A; Haswell, C A; Hellier, C; Hodgkin, S T; Horne, K; Irwin, J; Kane, S R; Keenan, F P; Norton, A J; Parley, N R; Osborne, J; Ryans, R; Skillen, I; Wheatley, P J

    2006-01-01

    We present a fast and efficient hybrid algorithm for selecting exoplanetary candidates from wide-field transit surveys. Our method is based on the widely-used SysRem and Box Least-Squares (BLS) algorithms. Patterns of systematic error that are common to all stars on the frame are mapped and eliminated using the SysRem algorithm. The remaining systematic errors caused by spatially localised flat-fielding and other errors are quantified using a boxcar-smoothing method. We show that the dimensions of the search-parameter space can be reduced greatly by carrying out an initial BLS search on a coarse grid of reduced dimensions, followed by Newton-Raphson refinement of the transit parameters in the vicinity of the most significant solutions. We illustrate the method's operation by applying it to data from one field of the SuperWASP survey, comprising 2300 observations of 7840 stars brighter than V=13.0. We identify 11 likely transit candidates. We reject stars that exhibit significant ellipsoidal variations indicat...
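
    The coarse-grid-then-refine strategy described above can be illustrated on a one-dimensional toy problem. This sketch is not the BLS transit search: the objective `f`, its derivatives `df` and `d2f`, and all parameter names are hypothetical stand-ins chosen so the example is self-contained.

```python
def coarse_then_newton(f, df, d2f, lo, hi, n_grid=20, tol=1e-10, max_iter=50):
    """Find a minimum of f: coarse grid search first, then Newton-Raphson on f'(x) = 0."""
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    x = min(grid, key=f)             # best point on the coarse grid
    for _ in range(max_iter):
        curvature = d2f(x)
        if curvature <= 0:
            break                    # Newton step is not a descent step here
        delta = df(x) / curvature
        x -= delta
        if abs(delta) < tol:
            break
    return x
```

    The coarse grid only has to land inside the basin of the best solution; the Newton-Raphson iterations then recover the precision that a fine grid would otherwise have to pay for with many more evaluations.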

  14. Evaluation of hybrids algorithms for mass detection in digitalized mammograms

    Energy Technology Data Exchange (ETDEWEB)

    Cordero, Jose; Garzon Reyes, Johnson, E-mail: josecorderog@hotmail.com [Grupo de Optica y Espectroscopia GOE, Centro de Ciencia Basica, Universidad Pontifica Bolivariana de Medellin (Colombia)

    2011-01-01

    Breast cancer remains a significant public health problem, and early detection of lesions can increase the chances of success of medical treatment. Mammography is an effective imaging modality for the early diagnosis of abnormalities: an image of the mammary gland is obtained with low-dose X-rays, which allows a tumor or circumscribed mass to be detected two to three years before it becomes clinically palpable, and it is the only method that has so far achieved a reduction in breast cancer mortality. In this paper, three hybrid algorithms for circumscribed mass detection in digitized mammograms are evaluated. The first stage is a review of the enhancement and segmentation techniques used in processing mammographic images. A shape filter was then applied to the resulting regions. The surviving regions were processed with a Bayesian filter, for which the feature vector of the classifier was constructed from a small number of measurements. The implemented algorithms were then evaluated with ROC curves on 40 test images: 20 normal images and 20 images with circumscribed lesions. Finally, the advantages and disadvantages of each algorithm in correctly detecting a lesion are discussed.

  15. Quality Assured Optimal Resource Provisioning and Scheduling Technique Based on Improved Hierarchical Agglomerative Clustering Algorithm (IHAC)

    Directory of Open Access Journals (Sweden)

    A. Meenakshi

    2016-08-01

    Full Text Available Resource allocation is the task of assigning available resources to different uses. In the context of an entire economy, resources can be allocated by different means, such as markets or central planning. Cloud computing has become a new-age technology with huge potential in enterprises and markets. Clouds make it possible to access applications and associated data from anywhere. The fundamental aim of resource allocation is to allot the available resources in the most effective manner. In the initial phase, a representative resource usage distribution for a group of nodes with identical resource usage patterns is evaluated as a resource bundle, which can easily be used to locate a group of nodes fulfilling a standard criterion. In this paper, a clustering-based resource aggregation method, the Improved Hierarchical Agglomerative Clustering Algorithm (IHAC), is introduced to obtain a compact representation of a set of identically behaving nodes for scalability. In the subsequent phase, concerned with the energetic resource allocation procedure, a hybrid optimization technique is introduced. The technique schedules functions to cloud resources while considering both financial and evaluation expenses. The efficiency of the resource allocation system is assessed by means of several parameters, such as reliability, reusability, and certain other metrics. The optimal path choice is the outcome of the hybrid optimization approach, and the technique allocates the available resources based on the optimal path.

  16. An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets

    Directory of Open Access Journals (Sweden)

    Kang Zhang

    2014-01-01

    Full Text Available Clustering has been widely used in different fields of science, technology, social science, and so forth. In the real world, numeric as well as categorical features are usually used to describe data objects. Accordingly, many clustering methods can process datasets that are either numeric or categorical. Recently, algorithms that can handle mixed data clustering problems have been developed. The affinity propagation (AP) algorithm is an exemplar-based clustering method which has demonstrated good performance on a wide variety of datasets. However, it has limitations in processing mixed datasets. In this paper, we propose a novel similarity measure for mixed-type datasets and an adaptive AP clustering algorithm to cluster them. Several real-world datasets are studied to evaluate the performance of the proposed algorithm. Comparisons with other clustering algorithms demonstrate that the proposed method works well not only on mixed datasets but also on pure numeric and categorical datasets.
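
    To make the idea of a mixed-type measure concrete, the sketch below uses a Gower-style distance, which is a well-known general-purpose choice and is not the similarity measure proposed in this paper. The function name and the `numeric_ranges` argument (feature index mapped to value range) are assumptions for illustration.

```python
def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance between two records with numeric and categorical features.

    numeric_ranges maps the index of each numeric feature to its value range;
    every other feature is treated as categorical (0 if equal, 1 otherwise).
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            total += abs(x - y) / span if span > 0 else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)
```

    Affinity propagation takes pairwise similarities as input, so a distance like this would typically be negated (for example `-mixed_distance(a, b, ranges)`) before being passed to the clustering step.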

  17. Modified Structural and Attribute Clustering Algorithm for Improving Cluster Quality in Data Mining: A Quality Oriented Approach

    Directory of Open Access Journals (Sweden)

    G. Abel Thangaraja

    2014-11-01

    Full Text Available Data mining is needed because of the explosive growth of data from terabytes to petabytes. Data mining preprocessing aims to produce quality mining results in descriptive and predictive analysis. The quality of a clustering result depends on both the similarity measure used by the method and its implementation. A straightforward way to combine structural and attribute similarities is to use a weighted distance function. Clustering results are obtained based on attribute similarities, and the clusters balance attribute and structural similarities. The existing structural and attribute clustering algorithm is analyzed and a new algorithm is proposed. The two algorithms are compared and the results are analyzed. The modified algorithm is found to give better-quality clusters.
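
    The weighted distance function mentioned above can be sketched as a convex combination of a structural term and an attribute term. The specific choices here are assumptions: structural distance is taken as one minus the Jaccard overlap of neighbour sets, attribute distance as the fraction of mismatched attributes, and `alpha` is a hypothetical weight.

```python
def structural_distance(neigh_u: set, neigh_v: set) -> float:
    """1 - Jaccard overlap of the two vertices' neighbour sets."""
    union = neigh_u | neigh_v
    if not union:
        return 0.0
    return 1.0 - len(neigh_u & neigh_v) / len(union)


def attribute_distance(attrs_u, attrs_v) -> float:
    """Fraction of attribute positions on which the two vertices differ."""
    mismatches = sum(1 for a, b in zip(attrs_u, attrs_v) if a != b)
    return mismatches / len(attrs_u)


def combined_distance(neigh_u, neigh_v, attrs_u, attrs_v, alpha=0.5):
    """Weighted sum: alpha weights structure, (1 - alpha) weights attributes."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (alpha * structural_distance(neigh_u, neigh_v)
            + (1.0 - alpha) * attribute_distance(attrs_u, attrs_v))
```

    Setting `alpha` to 1 clusters on graph structure alone and setting it to 0 clusters on attributes alone, so the weight is the single knob that balances the two kinds of similarity.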

  18. Validation and incremental value of the hybrid algorithm for CTO PCI.

    Science.gov (United States)

    Pershad, Ashish; Eddin, Moneer; Girotra, Sudhakar; Cotugno, Richard; Daniels, David; Lombardi, William

    2014-10-01

    To evaluate the outcomes and benefits of using the hybrid algorithm for chronic total occlusion (CTO) percutaneous coronary intervention (PCI). The hybrid algorithm harmonizes antegrade and retrograde techniques for performing CTO PCI. It has the potential to increase success rates and improve efficiency for CTO PCI. No previous data have analyzed the impact of this algorithm on CTO PCI success rates and procedural efficiency. Retrospective analysis of contemporary CTO PCI performed at two high-volume centers with adoption of the hybrid technique was compared to previously published CTO outcomes in a well matched group of patients and lesion subsets. After adoption of the hybrid algorithm, technical success was significantly higher in the post hybrid algorithm group 189/198 (95.4%) vs the pre-algorithm group 367/462 (79.4%) (P CTO PCI. © 2014 Wiley Periodicals, Inc.

  19. Combined Density-based and Constraint-based Algorithm for Clustering

    Institute of Scientific and Technical Information of China (English)

    CHEN Tung-shou; CHEN Rong-chang; LIN Chih-chiang; CHIU Yung-hsing

    2006-01-01

    We propose a new clustering algorithm that helps researchers analyze data quickly and accurately. We call this algorithm the Combined Density-based and Constraint-based Algorithm (CDC). CDC consists of two phases. In the first phase, CDC employs the idea of density-based clustering to split the original data into a number of fragmented clusters. At the same time, CDC cuts off noise and outliers. In the second phase, CDC employs the concept of the K-means clustering algorithm to select a greater cluster to be the center. Then, the greater cluster merges some smaller clusters which satisfy some constraint rules. Due to the merged clusters around the center cluster, the clustering results show high accuracy. Moreover, CDC reduces the calculations and speeds up the clustering process. In this paper, the accuracy of CDC is evaluated and compared with those of K-means, hierarchical clustering, and the genetic clustering algorithm (GCA) proposed in 2004. Experimental results show that CDC has better performance.

  20. Robust K-Median and K-Means Clustering Algorithms for Incomplete Data

    Directory of Open Access Journals (Sweden)

    Jinhua Li

    2016-01-01

    Full Text Available Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
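
    The idea of representing missing values by intervals and guarding against the worst case can be illustrated with a small sketch. The abstract does not give the paper's objective function, so the code below is an assumption: it scores a cluster center by the largest squared Euclidean distance to any point inside the interval box, and the names `worst_case_sq_distance` and `robust_assign` are hypothetical.

```python
def worst_case_sq_distance(center, lower, upper):
    """Largest squared Euclidean distance from `center` to any point in the box [lower, upper].

    The squared distance separates by coordinate and is convex in each one,
    so the maximum over an interval is reached at one of its two endpoints.
    """
    total = 0.0
    for c, lo, hi in zip(center, lower, upper):
        total += max((c - lo) ** 2, (c - hi) ** 2)
    return total

def robust_assign(point_lower, point_upper, centers):
    """Index of the center with the smallest worst-case squared distance."""
    costs = [worst_case_sq_distance(c, point_lower, point_upper) for c in centers]
    return costs.index(min(costs))

# A point whose first feature is only known to lie in [1, 3]; the second is observed as 2.
lower, upper = (1.0, 2.0), (3.0, 2.0)
centers = [(0.0, 0.0), (2.0, 2.0)]
chosen = robust_assign(lower, upper, centers)
```

    A fully observed feature is simply an interval whose two endpoints coincide, so complete and incomplete records can share the same code path.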

  1. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering

    Directory of Open Access Journals (Sweden)

    Suraj

    2015-01-01

    Full Text Available Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal leads us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted, and the feature vector is formed relying on the concept of event related synchronization (ERS) and desynchronization (ERD).

  2. Ant Colony Clustering Algorithm and Improved Markov Random Fusion Algorithm in Image Segmentation of Brain Images

    Directory of Open Access Journals (Sweden)

    Guohua Zou

    2016-12-01

    Full Text Available New medical imaging technologies, such as Computed Tomography and Magnetic Resonance Imaging (MRI), have been widely used in all aspects of medical diagnosis. The purpose of these imaging techniques is to obtain various qualitative and quantitative data of the patient comprehensively and accurately, and provide correct digital information for diagnosis, treatment planning and evaluation after surgery. MR has a good imaging diagnostic advantage for brain diseases. However, as the requirements of the brain image definition and quantitative analysis are always increasing, it is necessary to have better segmentation of MR brain images. The FCM (Fuzzy C-means) algorithm is widely applied in image segmentation, but it has some shortcomings, such as long computation time and poor anti-noise capability. In this paper, firstly, the Ant Colony algorithm is used to determine the cluster centers and the number of clusters for the FCM algorithm so as to improve its running speed. Then an improved Markov random field model is used to improve the algorithm's anti-noise ability. Experimental results show that the algorithm put forward in this paper has obvious advantages in image segmentation speed and segmentation effect.

  3. Training Artificial Neural Networks by a Hybrid PSO-CS Algorithm

    Directory of Open Access Journals (Sweden)

    Jeng-Fung Chen

    2015-06-01

    Full Text Available Presenting a satisfactory and efficient training algorithm for artificial neural networks (ANN) has been a challenging task in the supervised learning area. Particle swarm optimization (PSO) is one of the most widely used algorithms due to its simplicity of implementation and fast convergence speed. On the other hand, the Cuckoo Search (CS) algorithm has been proven to have a good ability for finding the global optimum; however, it has a slow convergence rate. In this study, a hybrid algorithm based on PSO and CS is proposed to make use of the advantages of both PSO and CS algorithms. The proposed hybrid algorithm is employed as a new training method for feedforward neural networks (FNNs). To investigate the performance of the proposed algorithm, two benchmark problems are used and the results are compared with those obtained from FNNs trained by original PSO and CS algorithms. The experimental results show that the proposed hybrid algorithm outperforms both PSO and CS in training FNNs.

  4. Resizing Technique-Based Hybrid Genetic Algorithm for Optimal Drift Design of Multistory Steel Frame Buildings

    Directory of Open Access Journals (Sweden)

    Hyo Seon Park

    2014-01-01

    Full Text Available Since genetic algorithm-based optimization methods are computationally expensive for practical use in the field of structural optimization, a resizing technique-based hybrid genetic algorithm for the drift design of multistory steel frame buildings is proposed to increase the convergence speed of genetic algorithms. To reduce the number of structural analyses required for the convergence, a genetic algorithm is combined with a resizing technique that is an efficient optimal technique to control the drift of buildings without the repetitive structural analysis. The resizing technique-based hybrid genetic algorithm proposed in this paper is applied to the minimum weight design of three steel frame buildings. To evaluate the performance of the algorithm, optimum weights, computational times, and generation numbers from the proposed algorithm are compared with those from a genetic algorithm. Based on the comparisons, it is concluded that the hybrid genetic algorithm shows clear improvements in convergence properties.

  5. Clustering Algorithm for Unsupervised Monaural Musical Sound Separation Based on Non-negative Matrix Factorization

    Science.gov (United States)

    Park, Sang Ha; Lee, Seokjin; Sung, Koeng-Mo

    Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.

  6. Energy Efficient Backoff Hierarchical Clustering Algorithms for Multi-Hop Wireless Sensor Networks

    Institute of Scientific and Technical Information of China (English)

    Jun Wang; Yong-Tao Cao; Jun-Yuan Xie; Shi-Fu Chen

    2011-01-01

    Compared with flat routing protocols, clustering is a fundamental performance improvement technique in wireless sensor networks, which can increase network scalability and lifetime. In this paper, we integrate the multi-hop technique with a backoff-based clustering algorithm to organize sensors. By using an adaptive backoff strategy, the algorithm not only balances the load among sensor nodes, but also achieves a fairly uniform cluster head distribution across the network. Simulation results also demonstrate that our algorithm is more energy-efficient than classical ones. Our algorithm is also easily extended to generate a hierarchy of cluster heads to obtain better network management and energy efficiency.

  7. Hybrid Monte Carlo algorithm with fat link fermion actions

    CERN Document Server

    Kamleh, Waseem; Williams, Anthony G; 10.1103/PhysRevD.70.014502

    2004-01-01

    The use of APE smearing or other blocking techniques in lattice fermion actions can provide many advantages. There are many variants of these fat link actions in lattice QCD currently, such as fat link irrelevant clover (FLIC) fermions. The FLIC fermion formalism makes use of the APE blocking technique in combination with a projection of the blocked links back into the special unitary group. This reunitarization is often performed using an iterative maximization of a gauge invariant measure. This technique is not differentiable with respect to the gauge field and thus prevents the use of standard Hybrid Monte Carlo simulation algorithms. The use of an alternative projection technique circumvents this difficulty and allows the simulation of dynamical fat link fermions with standard HMC and its variants. The necessary equations of motion for FLIC fermions are derived, and some initial simulation results are presented. The technique is more general however, and is straightforwardly applicable to other smearing ...

  8. A hybrid algorithm for parallel molecular dynamics simulations

    CERN Document Server

    Mangiardi, Chris M

    2016-01-01

    This article describes an algorithm for hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-ranged forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with AVX and AVX-2 processors as well as Xeon-Phi co-processors.

  9. A hybrid nested partitions algorithm for banking facility location problems

    KAUST Repository

    Xia, Li

    2010-07-01

    The facility location problem has been studied in many industries including banking network, chain stores, and wireless network. Maximal covering location problem (MCLP) is a general model for this type of problems. Motivated by a real-world banking facility optimization project, we propose an enhanced MCLP model which captures the important features of this practical problem, namely, varied costs and revenues, multitype facilities, and flexible coverage functions. To solve this practical problem, we apply an existing hybrid nested partitions algorithm to the large-scale situation. We further use heuristic-based extensions to generate feasible solutions more efficiently. In addition, the upper bound of this problem is introduced to study the quality of solutions. Numerical results demonstrate the effectiveness and efficiency of our approach. © 2010 IEEE.

  10. A NEW HYBRID ALGORITHM FOR BUSINESS INTELLIGENCE RECOMMENDER SYSTEM

    Directory of Open Access Journals (Sweden)

    P.Prabhu

    2014-03-01

    Full Text Available Business Intelligence is a set of methods, processes and technologies that transform raw data into meaningful and useful information. A recommender system is a business intelligence system that is used to provide knowledge to the active user for better decision making. Recommender systems apply data mining techniques to the problem of making personalized recommendations for information. The growth in the amount of information and the number of users in recent years poses challenges for recommender systems. Collaborative, content, demographic and knowledge-based are four different types of recommender systems. In this paper, a new hybrid algorithm is proposed for recommender systems which combines knowledge-based techniques, user profiles and most frequent item mining to obtain intelligence.

  11. A study of speech emotion recognition based on hybrid algorithm

    Science.gov (United States)

    Zhu, Ju-xia; Zhang, Chao; Lv, Zhao; Rao, Yao-quan; Wu, Xiao-pei

    2011-10-01

    To effectively improve the recognition accuracy of the speech emotion recognition system, a hybrid algorithm which combines Continuous Hidden Markov Model (CHMM), All-Class-in-One Neural Network (ACON) and Support Vector Machine (SVM) is proposed. In the SVM and ACON methods, some global statistics are used as emotional features, while in the CHMM method, instantaneous features are employed. The recognition rate of the proposed method is 92.25%, with a rejection rate of 0.78%. Furthermore, it obtains relative increases of 8.53%, 4.69% and 0.78% compared with the ACON, CHMM and SVM methods respectively. The experimental results confirm its effectiveness in distinguishing the anger, happiness, neutral and sadness emotional states.

  12. A hybrid algorithm for parallel molecular dynamics simulations

    Science.gov (United States)

    Mangiardi, Chris M.; Meyer, R.

    2017-10-01

    This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.

  13. Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

    DEFF Research Database (Denmark)

    Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte

    2006-01-01

    Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods...... analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset....... The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering...
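
    The co-occurrence matrix described in this abstract, which collects re-occurring clustering patterns over many runs, can be sketched in a few lines. The abstract is truncated, so this is only an illustration under assumptions: the function name `co_occurrence` and the toy label vectors are hypothetical, and the label runs could come from any clustering method.

```python
def co_occurrence(label_runs):
    """Fraction of runs in which each pair of items is placed in the same cluster."""
    n = len(label_runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in label_runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(label_runs)
    return [[c / runs for c in row] for row in counts]

# Three hypothetical clustering runs over four items.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first run, different label names
    [0, 0, 0, 1],
]
matrix = co_occurrence(runs)
```

    Because only label equality within a run is compared, the matrix is unaffected by the arbitrary naming of clusters from run to run; the matrix itself can then be clustered to obtain consensus groups.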

  14. Symbolic-numerical Algorithm for Generating Cluster Eigenfunctions: Tunneling of Clusters Through Repulsive Barriers

    CERN Document Server

    Vinitsky, Sergue; Chuluunbaatar, Ochbadrakh; Rostovtsev, Vitaly; Hai, Luong Le; Derbov, Vladimir; Krassovitskiy, Pavel

    2013-01-01

    A model for quantum tunnelling of a cluster comprising A identical particles, coupled by oscillator-type potential, through short-range repulsive potential barriers is introduced for the first time in the new symmetrized-coordinate representation and studied within the s-wave approximation. The symbolic-numerical algorithms for calculating the effective potentials of the close-coupling equations in terms of the cluster wave functions and the energy of the barrier quasistationary states are formulated and implemented using the Maple computer algebra system. The effect of quantum transparency, manifesting itself in nonmonotonic resonance-type dependence of the transmission coefficient upon the energy of the particles, the number of the particles A=2,3,4, and their symmetry type, is analyzed. It is shown that the resonance behavior of the total transmission coefficient is due to the existence of barrier quasistationary states imbedded in the continuum.

  15. Development of hybrid genetic algorithms for product line designs.

    Science.gov (United States)

    Balakrishnan, P V Sundar; Gupta, Rakesh; Jacob, Varghese S

    2004-02-01

    In this paper, we investigate the efficacy of artificial intelligence (AI) based meta-heuristic techniques, namely genetic algorithms (GAs), for the product line design problem. This work extends previously developed methods for the single product design problem. We conduct a large scale simulation study to determine the effectiveness of such an AI based technique for providing good solutions and benchmark the performance of this against the current dominant approach of beam search (BS). We investigate the potential advantages of pursuing the avenue of developing hybrid models and then implement and study such hybrid models using two very distinct approaches: namely, seeding the initial GA population with the BS solution, and employing the BS solution as part of the GA operator's process. We go on to examine the impact of two alternate string representation formats on the quality of the solutions obtained by the above proposed techniques. We also explicitly investigate a critical managerial factor of attribute importance in terms of its impact on the solutions obtained by the alternate modeling procedures. The alternate techniques are then evaluated, using statistical analysis of variance, on a fairly large number of data sets, as to the quality of the solutions obtained with respect to the state-of-the-art benchmark and in terms of their ability to provide multiple, unique product line options.

  16. Identifying prototypical components in behaviour using clustering algorithms.

    Directory of Open Access Journals (Sweden)

    Elke Braun

    Full Text Available Quantitative analysis of animal behaviour is a requirement to understand the task solving strategies of animals and the underlying control mechanisms. The identification of repeatedly occurring behavioural components is thereby a key element of a structured quantitative description. However, the complexity of most behaviours makes the identification of such behavioural components a challenging problem. We propose an automatic and objective approach for determining and evaluating prototypical behavioural components. Behavioural prototypes are identified using clustering algorithms and finally evaluated with respect to their ability to represent the whole behavioural data set. The prototypes allow for a meaningful segmentation of behavioural sequences. We applied our clustering approach to identify prototypical movements of the head of blowflies during cruising flight. The results confirm the previously established saccadic gaze strategy by the set of prototypes being divided into either predominantly translational or rotational movements, respectively. The prototypes reveal additional details about the saccadic and intersaccadic flight sections that could not be unravelled so far. Successful application of the proposed approach to behavioural data shows its ability to automatically identify prototypical behavioural components within a large and noisy database and to evaluate these with respect to their quality and stability. Hence, this approach might be applied to a broad range of behavioural and neural data obtained from different animals and in different contexts.

  17. Number of Clusters and the Quality of Hybrid Predictive Models in Analytical CRM

    Directory of Open Access Journals (Sweden)

    Łapczyński Mariusz

    2014-08-01

    Full Text Available Making more accurate marketing decisions by managers requires building effective predictive models. Typically, these models specify the probability of a customer belonging to a particular category, group or segment. The analytical CRM categories refer to customers interested in starting cooperation with the company (acquisition models), customers who purchase additional products (cross- and up-sell models) or customers intending to resign from the cooperation (churn models). When building predictive models, researchers use analytical tools from various disciplines with an emphasis on their best performance. This article attempts to build a hybrid predictive model combining decision trees (C&RT algorithm) and cluster analysis (k-means). During experiments five different cluster validity indices and eight datasets were used. The performance of models was evaluated by using popular measures such as: accuracy, precision, recall, G-mean, F-measure and lift in the first and in the second decile. The authors tried to find a connection between the number of clusters and models' quality.
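
    Several of the evaluation measures listed in this abstract follow directly from a binary confusion matrix. The sketch below is illustrative only; the function name `binary_metrics` and the counts are hypothetical, and G-mean is taken here as the geometric mean of recall and specificity, which is one common definition and may differ from the one the authors used.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Common classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)  # assumed definition of G-mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }

# Hypothetical churn model: 30 churners found, 10 missed, 10 false alarms, 50 correct rejections.
scores = binary_metrics(tp=30, fp=10, tn=50, fn=10)
```

    The function assumes every denominator is non-zero; a model that predicts no positives at all would need explicit handling of the undefined precision.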

  18. A SAA-based Novel Hybrid Intelligent Evolutionary Algorithm for Job Shop Scheduling Problem

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Through systematic analysis and comparison of the common features of SAA, ES and the traditional LS (local search) algorithm, a new hybrid strategy mixing SA and ES with LS, namely HIEA (Hybrid Intelligent Evolutionary Algorithm), is proposed in this paper. Viewed as a whole, the hybrid strategy is also an intelligent heuristic search procedure. It has characteristics such as generality and robustness because it combines the advantages of SA, ES and LS while overcoming the shortcomings of the three methods. This paper applies Markov chain theory to describe the hybrid strategy mathematically, proves that the algorithm possesses global asymptotic convergence, and analyzes the performance of HIEA.

  19. Parallel Genetic Algorithms with Dynamic Topology using Cluster Computing

    Directory of Open Access Journals (Sweden)

    ADAR, N.

    2016-08-01

    Full Text Available A parallel genetic algorithm (PGA) conducts a distributed meta-heuristic search by employing genetic algorithms on more than one subpopulation simultaneously. PGAs migrate a number of individuals between subpopulations over generations. The layout that facilitates the interactions of the subpopulations is called the topology. Static migration topologies have been widely incorporated into PGAs. In this article, a PGA with a dynamic migration topology (D-PGA) is proposed. D-PGA generates a new migration topology in every epoch based on the average fitness values of the subpopulations. The D-PGA has been tested against ring and fully connected migration topologies in a Beowulf Cluster. The D-PGA has outperformed the ring migration topology with comparable communication cost and has provided competitive or better results than a fully connected migration topology with significantly lower communication cost. PGA convergence behaviors have been analyzed in terms of the diversities within and between subpopulations. Conventional diversity can be considered as the diversity within a subpopulation. A new concept of permeability has been introduced to measure the diversity between subpopulations. It is shown that the success of the proposed D-PGA can be attributed to maintaining a high level of permeability while preserving diversity within subpopulations.

  20. A Heuristic Clustering Algorithm for Mining Communities in Signed Networks

    Institute of Scientific and Technical Information of China (English)

    Bo Yang; Da-You Liu

    2007-01-01

    A signed network is an important kind of complex network, which includes both positive relations and negative relations. Communities of a signed network are defined as the groups of vertices, within which positive relations are dense and between which negative relations are also dense. Being able to identify communities of signed networks is helpful for analysis of such networks. Hitherto many algorithms for detecting network communities have been developed. However, most of them are designed exclusively for networks including only positive relations and are not suitable for signed networks. So the problem of mining communities of signed networks quickly and correctly has not been solved satisfactorily. In this paper, we propose a heuristic algorithm to address this issue. Compared with major existing methods, our approach has three distinct features. First, it is very fast, with a roughly linear time with respect to network size. Second, it exhibits a good clustering capability and especially can work well with complex networks without well-defined community structures. Finally, it is insensitive to its built-in parameters and requires no prior knowledge.
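
    The definition of a signed-network community given in this abstract suggests a simple way to check how well a partition fits: count the edges that contradict it. The sketch below is not the authors' heuristic; it only illustrates the definition, and the function name `frustration`, the edge list and both partitions are hypothetical.

```python
def frustration(edges, community):
    """Count edges that contradict a partition of a signed network.

    An edge is counted when it is negative inside a community
    or positive between two different communities.
    """
    count = 0
    for u, v, sign in edges:
        same = community[u] == community[v]
        if (same and sign < 0) or (not same and sign > 0):
            count += 1
    return count

# Hypothetical signed network: +1 is a positive relation, -1 a negative one.
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", -1), ("d", "e", 1), ("a", "e", -1)]
good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
bad = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0}

good_score = frustration(edges, good)
bad_score = frustration(edges, bad)
```

    A lower count means the partition agrees better with the definition; a single pass over the edge list is enough, which is in keeping with the roughly linear running time the abstract reports for the method itself.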

  1. An Efficient Multi-path Routing Algorithm Based on Hybrid Firefly Algorithm for Wireless Mesh Networks

    Directory of Open Access Journals (Sweden)

    K. Kumaravel

    2015-05-01

    Full Text Available A Wireless Mesh Network (WMN) uses recent technology to provide end users with high-quality service over the Internet’s “last mile”. Multicast communication is one of the most important technologies employed in WMNs. Among the several open issues, routing is a significant one that every WMN technology must address during data transmission. The IEEE 802.11s standard sets out the procedures to be followed to facilitate interconnection and thus to devise an appropriate WMN. Many authors have introduced protocols based mainly on machine learning and artificial intelligence. Multi-path routing is one such routing method: it transmits data over several paths and has proven to be a useful strategy for achieving reliability in WMNs. However, multi-path routing cannot guarantee deterministic transmission, because multiple paths are available for data transmission from the source to the destination node. The algorithms employed in earlier studies did not take into consideration routing metrics, including energy-aware metrics, that are used for path selection during data transfer. This study proposes a hybrid multipath routing algorithm that takes into consideration routing metrics, including energy and minimal loss, for efficient path selection and data transfer. The proposed algorithm has two phases. In the first phase, Prim’s algorithm is used for route discovery in the network. In the second, the hybrid firefly algorithm, which is based on harmony search, is employed to select the most suitable path through analysis of metrics, including energy awareness and minimal loss, for every path that has

  2. Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

    Science.gov (United States)

    Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania

    2015-01-01

    This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.
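
    The row-wise computation of the alignment matrix mentioned in this abstract can be illustrated on a single CPU core. The sketch below is the textbook Smith-Waterman recurrence with a linear gap penalty, keeping only the previous row in memory; it is not the authors' MPI-CUDA implementation, and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, computed row by row with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the alignment matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best

score = smith_waterman_score("ACGT", "ACT")
```

    Each evaluation of the inner `max` is one cell update, so aligning sequences of lengths m and n costs m × n cell updates; dividing the total number of updates by the running time gives the cell-updates-per-second figure quoted in the abstract.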

  3. IMPROVING THE CLUSTER PERFORMANCE BY COMBINING PSO AND K-MEANS ALGORITHM

    Directory of Open Access Journals (Sweden)

    G. Komarasamy

    2011-04-01

    Full Text Available Clustering is a technique that divides data objects into groups based on information found in the data that describes the objects and their relationships. This paper describes how to improve clustering performance by combining Particle Swarm Optimization (PSO) and the K-means algorithm. The PSO algorithm converges successfully during the initial stages of a global search, but around the global optimum the search process becomes very slow. In contrast, the K-means algorithm can achieve faster convergence to an optimum solution. Unlike the K-means method, the new algorithm does not require a specific number of clusters to be given before clustering, and it is able to find the locally optimal number of clusters during the clustering process. In each iteration, the inertia weight is changed based on the current iteration and the best fitness. The experimental results on different data sets show that the new algorithm performs better.

  4. A new-style clustering algorithm based on swarm intelligent theory

    Institute of Scientific and Technical Information of China (English)

    CHEN Zhuo; LIU Xiang-shuang

    2007-01-01

    Traditional clustering algorithms generally have some problems, such as sensitivity to initial parameters, difficulty in finding the optimal clustering result, and the validity of clustering. In this paper, an FSM and a mathematical model of a new-style clustering algorithm based on swarm intelligence are provided. In this algorithm, the clustering main body moves in a three-dimensional space and has the abilities of memory, communication, analysis, judgment and coordinating information. Experimental results confirm that this algorithm has many merits, such as being insensitive to the order of the data and capable of dealing with exceptional, high-dimensional or complicated data. The algorithm can be used in the fields of Web mining, incremental clustering, economic analysis, pattern recognition, document classification and so on.

  5. A hybrid multiview stereo algorithm for modeling urban scenes.

    Science.gov (United States)

    Lafarge, Florent; Keriven, Renaud; Brédif, Mathieu; Vu, Hoang-Hiep

    2013-01-01

    We present an original multiview stereo reconstruction algorithm which allows the 3D-modeling of urban scenes as a combination of meshes and geometric primitives. The method provides a compact model while preserving details: irregular elements such as statues and ornaments are described by meshes, whereas regular structures such as columns and walls are described by primitives (planes, spheres, cylinders, cones, and tori). We adopt a two-step strategy consisting first of segmenting the initial mesh-based surface using a multilabel Markov Random Field-based model and second of sampling primitive and mesh components simultaneously on the obtained partition by a Jump-Diffusion process. The quality of a reconstruction is measured by a multi-object energy model which takes into account both photo-consistency and semantic considerations (i.e., geometry and shape layout). The segmentation and sampling steps are embedded into an iterative refinement procedure which provides an increasingly accurate hybrid representation. Experimental results on complex urban structures and large scenes are presented and compared to state-of-the-art multiview stereo meshing algorithms.

  6. Beam Pattern Synthesis Based on Hybrid Optimization Algorithm

    Institute of Scientific and Technical Information of China (English)

    YU Yan-li; WANG Ying-min; LI Lei

    2010-01-01

    Because conventional methods for beam pattern synthesis cannot always obtain the desired optimum pattern for arbitrary underwater acoustic sensor arrays, a hybrid numerical synthesis method based on the adaptive principle and a genetic algorithm is presented in this paper. First, based on adaptive theory, a given array is treated as an adaptive array and its sidelobes are reduced by assigning a number of interference signals in the sidelobe region. An initial beam pattern is obtained after several iterations and adjustments of the interference intensity, and a desired pattern is created based on its parameters. Then, an objective function based on the difference between the designed and desired patterns is constructed. The pattern is optimized by using the genetic algorithm to minimize the objective function. A design example for a double-circular array demonstrates the effectiveness of this method. Compared with existing approaches, the proposed method can reduce the sidelobes effectively and achieve a smaller synthesis magnitude error in the mainlobe. The method can search for the optimum attainable pattern for the specific elements if the desired pattern cannot be found.

  7. ROBUST-HYBRID GENETIC ALGORITHM FOR A FLOW-SHOP SCHEDULING PROBLEM (A Case Study at PT FSCM Manufacturing Indonesia)

    OpenAIRE

    Johan Soewanda; Tanti Octavia; Iwan Halim Sahputra

    2007-01-01

    This paper discusses the application of Robust Hybrid Genetic Algorithm to solve a flow-shop scheduling problem. The proposed algorithm attempted to reach minimum makespan. PT. FSCM Manufacturing Indonesia Plant 4's case was used as a test case to evaluate the performance of the proposed algorithm. The proposed algorithm was compared to Ant Colony, Genetic-Tabu, Hybrid Genetic Algorithm, and the company's algorithm. We found that Robust Hybrid Genetic produces statistically better result than...

  8. A hybrid model for bankruptcy prediction using genetic algorithm, fuzzy c-means and mars

    CERN Document Server

    Martin, A; Saranya, G; Gayathri, P; Venkatesan, Prasanna

    2011-01-01

    Bankruptcy prediction is very important for all organizations, since bankruptcy affects the economy and raises many social problems with high costs. A large number of techniques have been developed to predict bankruptcy, which helps decision makers such as investors and financial analysts. One of the bankruptcy prediction models is the hybrid model using Fuzzy C-means clustering and MARS, which uses static ratios taken from bank financial statements for prediction and has its own theoretical advantages. The performance of the existing bankruptcy model can be improved by selecting the best features dynamically, depending on the nature of the firm. This dynamic selection can be accomplished by a Genetic Algorithm, and it improves the performance of the prediction model.

  9. A HYBRID MODEL FOR BANKRUPTCY PREDICTION USING GENETIC ALGORITHM, FUZZY C-MEANS AND MARS

    Directory of Open Access Journals (Sweden)

    A.Martin

    2011-05-01

    Full Text Available Bankruptcy prediction is very important for all organizations, since bankruptcy affects the economy and raises many social problems with high costs. A large number of techniques have been developed to predict bankruptcy, which helps decision makers such as investors and financial analysts. One of the bankruptcy prediction models is the hybrid model using Fuzzy C-means clustering and MARS, which uses static ratios taken from bank financial statements for prediction and has its own theoretical advantages. The performance of the existing bankruptcy model can be improved by selecting the best features dynamically, depending on the nature of the firm. This dynamic selection can be accomplished by a Genetic Algorithm, and it improves the performance of the prediction model.

  10. Double Motor Coordinated Control Based on Hybrid Genetic Algorithm and CMAC

    Science.gov (United States)

    Cao, Shaozhong; Tu, Ji

    A novel hybrid cerebellar model articulation controller (CMAC) and online adaptive genetic algorithm (GA) controller is introduced to control two brushless DC motors (BLDCM) applied in a biped robot. The genetic algorithm simulates random learning among the individuals of a group, and CMAC simulates the self-learning of an individual. To validate the ability and superiority of the novel algorithm, experiments have been carried out in MATLAB/SIMULINK. A comparison of GA, hybrid GA-CMAC and CMAC feed-forward control is also given. The results indicate that the torque ripple of the coordinated control system is eliminated by using the hybrid GA-CMAC algorithm.

  11. Optimum Performance-Based Seismic Design Using a Hybrid Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    S. Talatahari

    2014-01-01

    Full Text Available A hybrid optimization method is presented for the optimum seismic design of steel frames considering four performance levels. These performance levels are used to determine the optimum design of structures so as to reduce the structural cost. A pushover analysis of steel building frameworks subject to equivalent-static earthquake loading is utilized. The algorithm is based on the concepts of the charged system search, in which each agent is affected by local and global best positions stored in the charged memory, considering the governing laws of electrical physics. Comparison of the results of the hybrid algorithm with those of other metaheuristic algorithms shows the efficiency of the hybrid algorithm.

  12. Clustering Web Documents based on Efficient Multi-Tire Hashing Algorithm for Mining Frequent Termsets

    Directory of Open Access Journals (Sweden)

    Noha Negm

    2013-06-01

    Full Text Available Document clustering is one of the main themes in text mining. It refers to the process of grouping documents with similar contents or topics into clusters to improve both the availability and reliability of text mining applications. Some recent algorithms address the problem of the high dimensionality of text by using frequent termsets for clustering. Despite its drawbacks, the Apriori algorithm is still the basic algorithm for mining frequent termsets. This paper presents an approach for Clustering Web Documents based on a Hashing algorithm for mining Frequent Termsets (CWDHFT). It introduces an efficient Multi-Tire Hashing algorithm for mining Frequent Termsets (MTHFT) instead of the Apriori algorithm. The algorithm uses a new methodology for generating frequent termsets by building the multi-tire hash table while scanning the documents only once. To avoid hash collisions, the Multi-Tire technique is utilized in the proposed hashing algorithm. Based on the generated frequent termsets, the documents are partitioned, and clustering occurs by grouping the partitions through their descriptive keywords. By using the MTHFT algorithm, the scanning cost and computational cost are reduced; moreover, performance is considerably increased and the clustering process is sped up. The CWDHFT approach improved accuracy, scalability and efficiency when compared with existing clustering algorithms such as Bisecting K-means and FIHC.

  13. Financial Time Series Modelling with Hybrid Model Based on Customized RBF Neural Network Combined With Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Lukas Falat

    2014-01-01

    Full Text Available In this paper, the authors apply a feed-forward artificial neural network (ANN) of RBF type to modelling and forecasting the future value of the USD/CAD time series. The authors test a customized version of the RBF and add an evolutionary approach to it. They also combine the standard algorithm for adapting weights in a neural network with an unsupervised clustering algorithm called K-means. Finally, the authors suggest a new hybrid model as a combination of a standard ANN and a moving average for error modeling, which is used to enhance the outputs of the network using the error part of the original RBF. Using high-frequency data, they examine the ability to forecast exchange rate values for a horizon of one day. To determine the forecasting efficiency, the authors perform a comparative out-of-sample analysis of the suggested hybrid model against statistical models and the standard neural network.

  14. Cell Assignment in Hybrid CMOS/Nanodevices Architecture Using a PSO/SA Hybrid Algorithm

    Directory of Open Access Journals (Sweden)

    Sadiq M. Sait

    2013-10-01

    anowire\\MOLecular Hybrid, higher circuit densities are possible. In CMOL there is an additional layer of nanofabric on top of the CMOS stack. Nanodevices that lie between overlapping nanowires are programmable and can implement any combinational logic using a netlist of NOR gates. The limitation on the length of nanowires puts a constraint on the connectivity domain of a circuit. Gates connected to each other must be within a connectivity radius; otherwise an extra buffer is inserted to connect them. Particle swarm optimization (PSO) has been used in a variety of problems that are NP-hard. Compared to other iterative heuristic techniques, PSO is simpler to implement and delivers comparable results. In this paper, a hybrid of PSO and simulated annealing (SA) is proposed for solving cell assignment in CMOL, an NP-hard problem. The proposed method takes advantage of the exploration and exploitation factors of PSO and the intrinsic hill-climbing feature of SA to reduce the number of buffers to be inserted. Experiments conducted on ISCAS'89 benchmark circuits and a comparison with other heuristic techniques are presented. Results showed that the proposed hybrid algorithm achieved a better solution in terms of buffer count in reasonable time.

  15. Fast hybrid CPU- and GPU-based CT reconstruction algorithm using air skipping technique.

    Science.gov (United States)

    Lee, Byeonghun; Lee, Ho; Shin, Yeong Gil

    2010-01-01

    This paper presents a fast hybrid CPU- and GPU-based CT reconstruction algorithm to reduce the amount of back-projection operation using air skipping involving polygon clipping. The algorithm easily and rapidly selects air areas that have significantly higher contrast in each projection image by applying K-means clustering method on CPU, and then generates boundary tables for verifying valid region using segmented air areas. Based on these boundary tables of each projection image, clipped polygon that indicates active region when back-projection operation is performed on GPU is determined on each volume slice. This polygon clipping process makes it possible to use smaller number of voxels to be back-projected, which leads to a faster GPU-based reconstruction method. This approach has been applied to a clinical data set and Shepp-Logan phantom data sets having various ratio of air region for quantitative and qualitative comparison and analysis of our and conventional GPU-based reconstruction methods. The algorithm has been proved to reduce computational time to half without losing any diagnostic information, compared to conventional GPU-based approaches.

  16. A Hybrid Quantum Search Engine: A Fast Quantum Algorithm for Multiple Matches

    CERN Document Server

    Younes, A; Miller, J; Younes, Ahmed; Rowe, Jon; Miller, Julian

    2003-01-01

    In this paper we present a quantum algorithm which works very efficiently in the case of multiple matches within the search space; in the case of few matches, the algorithm performs classically. This allows us to propose a hybrid quantum search engine that integrates Grover's algorithm and the algorithm proposed here to obtain general performance better than that of any pure classical or quantum search algorithm.
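
    To see why a single search algorithm may not suit every number of matches, it helps to look at the textbook success probability of Grover's algorithm: with M matches among N items and theta = arcsin(sqrt(M/N)), the probability of measuring a match after k iterations is sin^2((2k+1)·theta). The sketch below only evaluates that standard formula; it does not implement the abstract's proposed algorithm, and the function names are hypothetical.

```python
import math

def grover_success_probability(n_items, n_matches, iterations):
    """P(measuring a match) after the given number of Grover iterations."""
    theta = math.asin(math.sqrt(n_matches / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

def optimal_iterations(n_items, n_matches):
    """Iteration count that brings the state closest to the marked subspace."""
    theta = math.asin(math.sqrt(n_matches / n_items))
    return max(0, round(math.pi / (4 * theta) - 0.5))
```

    For one match in four items a single iteration succeeds with certainty, but when three of four items match, one iteration drives the success probability to essentially zero, which is worse than a random guess. This sensitivity to the match count is the kind of gap a hybrid search engine is meant to cover.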

  17. A robust cluster-based dynamic-super-node scheme for hybrid peer-to-peer network

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Hybrid peer-to-peer (P2P) systems can improve the performance of the entire system by using super-peers, but it is difficult to measure a peer's capability exactly and to ensure high reliability of the network. This paper proposes a scheme to solve these problems. Firstly, we present a hybrid P2P network in which the upper layer is a Chord network and the lower layer is a cluster. Then we provide a strategy to measure a peer's capability so that a cluster can be organized as a sorting network in which peers are classified into three types: dynamic-super-node (DSN), backup-node (BN) and ordinary-node (ON). In a cluster, the DSN and BNs are strongly connected. Based on this, we present an algorithm, DSN flood min (DSNFM), to select the DSN and BNs and maintain consensus in the cluster. Furthermore, we perform a reliability analysis of the cluster based on the churn rate of the network and gather three rules of thumb from our simulations.

  18. A Novel Distributed Clustering Algorithm for Mobile Ad-hoc Networks

    Directory of Open Access Journals (Sweden)

    Sahar Adabi

    2008-01-01

    Full Text Available This paper proposes a new Distributed Score Based Clustering Algorithm (DSBCA) for Mobile Ad-hoc Networks (MANETs). In MANETs, selecting suitable nodes in clusters as cluster heads is very important. The proposed clustering algorithm considers the Battery Remaining, Number of Neighbors, Number of Members, and Stability in order to calculate a node's score with a linear algorithm. After each node calculates its score independently, the neighbors of the node must be notified of it. Each node then selects the neighbor with the highest score as its cluster head, so the selection of cluster heads is performed in a distributed manner with the most recent information about the current status of neighbor nodes. The proposed algorithm was compared with the Weighted Clustering Algorithm and the Distributed Weighted Clustering Algorithm in terms of number of clusters, number of re-affiliations, lifespan of nodes in the system, end-to-end throughput and overhead. The simulation results indicate that the proposed algorithm achieved its goals.
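
    The score-then-elect procedure described above can be sketched roughly as follows. The abstract names the four inputs and says they are combined linearly, but it gives no coefficients, so the weights, the normalisation by a maximum degree, the tie-break on lower ID and the choice to let a node pick itself are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    battery: float      # remaining battery, normalised to [0, 1]
    neighbors: int      # number of one-hop neighbours
    members: int        # number of current cluster members
    stability: float    # stability estimate, normalised to [0, 1]

# Hypothetical weights; the abstract does not state the coefficients.
WEIGHTS = {"battery": 0.4, "neighbors": 0.2, "members": 0.1, "stability": 0.3}

def score(n, max_degree=10):
    """Linear combination of the four metrics named in the abstract."""
    return (WEIGHTS["battery"] * n.battery
            + WEIGHTS["neighbors"] * min(n.neighbors, max_degree) / max_degree
            + WEIGHTS["members"] * min(n.members, max_degree) / max_degree
            + WEIGHTS["stability"] * n.stability)

def elect_cluster_heads(nodes, adjacency):
    """Each node picks the best-scoring node among itself and its neighbours."""
    scores = {n.node_id: score(n) for n in nodes}
    heads = {}
    for n in nodes:
        candidates = [n.node_id] + list(adjacency.get(n.node_id, []))
        # Assumed tie-break: prefer the lower node ID when scores are equal.
        heads[n.node_id] = max(candidates, key=lambda i: (scores[i], -i))
    return heads
```

    Because every node needs only its own score and those advertised by its one-hop neighbours, the election can run locally on each node without a central coordinator.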

  19. User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.

    Science.gov (United States)

    Gordon, Michael D.

    1991-01-01

    Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…

  20. Contributions to "k"-Means Clustering and Regression via Classification Algorithms

    Science.gov (United States)

    Salman, Raied

    2012-01-01

    The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold; first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…

  1. A Cluster Algorithm for the 2-D SU(3) × SU(3) Chiral Model

    Science.gov (United States)

    Ji, Da-ren; Zhang, Jian-bo

    1996-07-01

    To extend the cluster algorithm to SU(N) × SU(N) chiral models, a variant version of Wolff's cluster algorithm is proposed and tested for the 2-dimensional SU(3) × SU(3) chiral model. The results show that the new method can reduce the critical slowing down in SU(3) × SU(3) chiral model.

  2. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    Science.gov (United States)

    2014-01-01

    Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738

  3. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Chi-Jie Lu

    2014-01-01

    Full Text Available Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.

  4. Multiuser Detection in MIMO-OFDM Wireless Communication System Using Hybrid Firefly Algorithm

    Directory of Open Access Journals (Sweden)

    B. Sathish Kumar

    2014-05-01

    Full Text Available In recent years, future-generation wireless communication technologies have become one of the most prominent fields, in which many innovative techniques are used for effective communication. Orthogonal frequency-division multiplexing (OFDM) is one of the important technologies used for communication in future-generation systems. Although it gives efficient results, it has some problems in real-time implementation. MIMO and OFDM are integrated to obtain the benefits of both, but noise and interference are the major issues in MIMO-OFDM systems. To overcome these issues, multiuser detection is used in MIMO-OFDM. Several algorithms and mathematical formulations have been presented for solving the multiuser detection problem in MIMO-OFDM systems. Algorithms such as the genetic simulated annealing algorithm and the hybrid ant colony optimization algorithm were used for the multiuser detection problem in previous studies, but due to the limitations of those optimization algorithms, the results obtained are not significant. In this research, a hybrid firefly optimization algorithm based on the evolutionary algorithm is proposed to overcome the noise and interference problems. The proposed algorithm is compared with existing multiuser detection algorithms such as particle swarm optimization, CEFM-GADA [complementary error function mutation (CEFM) and a differential algorithm (DA) genetic algorithm (GA)] and the hybrid firefly optimization algorithm based on the evolutionary algorithm. The simulation results show that the proposed algorithm performs better than the existing algorithms and provides a satisfactory trade-off between computational complexity and detection performance.

  5. Lowest-ID with Adaptive ID Reassignment: A Novel Mobile Ad-Hoc Networks Clustering Algorithm

    CERN Document Server

    Gavalas, Damianos; Konstantopoulos, Charalampos; Mamalis, Basilis

    2011-01-01

    Clustering is a promising approach for building hierarchies and simplifying the routing process in mobile ad-hoc network environments. The main objective of clustering is to identify suitable node representatives, i.e. cluster heads (CHs), to store routing and topology information and maximize clusters stability. Traditional clustering algorithms suggest CH election exclusively based on node IDs or location information and involve frequent broadcasting of control packets, even when network topology remains unchanged. More recent works take into account additional metrics (such as energy and mobility) and optimize initial clustering. However, in many situations (e.g. in relatively static topologies) re-clustering procedure is hardly ever invoked; hence initially elected CHs soon reach battery exhaustion. Herein, we introduce an efficient distributed clustering algorithm that uses both mobility and energy metrics to provide stable cluster formations. CHs are initially elected based on the time and cost-efficien...
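
    For context, the traditional ID-based election that this abstract contrasts itself with can be sketched in a few lines. This is only the baseline Lowest-ID rule in a centralised, sequential form, not the adaptive ID reassignment scheme the paper introduces; the adjacency-dictionary input and the function name are assumptions.

```python
def lowest_id_clustering(adjacency):
    """Baseline Lowest-ID election: returns a dict mapping node -> cluster head."""
    head_of = {}
    for node in sorted(adjacency):
        if node in head_of:
            continue                      # already joined a lower-ID head
        head_of[node] = node              # lowest undecided ID becomes a head
        for nb in adjacency[node]:
            head_of.setdefault(nb, node)  # undecided neighbours join it
    return head_of
```

    On a four-node chain 1-2-3-4 this elects nodes 1 and 3 as cluster heads. Since the IDs never change, the same nodes keep the role as long as the topology is static, which is the battery-drain problem the abstract points out.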

  6. A HYBRID GRANULARITY PARALLEL ALGORITHM FOR PRECISE INTEGRATION OF STRUCTURAL DYNAMIC RESPONSES

    Institute of Scientific and Technical Information of China (English)

    Yuanyin Li; Xianlong Jin; Genguo Li

    2008-01-01

    Precise integration methods for solving structural dynamic responses and the corresponding time integration formula are composed of two parts: the multiplication of an exponential matrix with a vector and the integration term. The second term can be solved by the series solution. Two hybrid granularity parallel algorithms are designed, that is, the exponential matrix and the first term are computed by the fine-grained parallel algorithm and the second term is computed by the coarse-grained parallel algorithm. Numerical examples show that these two hybrid granularity parallel algorithms obtain higher speedup and parallel efficiency than two existing parallel algorithms.

  7. A ROBUST PHASE-ONLY DIRECT DATA DOMAIN ALGORITHM BASED ON GENERALIZED RAYLEIGH QUOTIENT OPTIMIZATION USING HYBRID GENETIC ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Shao Wei; Qian Zuping; Yuan Feng

    2007-01-01

    A robust phase-only Direct Data Domain Least Squares (D3LS) algorithm based on generalized Rayleigh quotient optimization using a hybrid Genetic Algorithm (GA) is presented in this letter. The optimization efficiency and computational speed are improved via the hybrid GA composed of standard GA and Nelder-Mead simplex algorithms. First, the objective function, with a form of generalized Rayleigh quotient, is derived via the standard D3LS algorithm. It is then taken as a fitness function and the unknown phases of all adaptive weights are taken as decision variables. Then, the nonlinear optimization is performed via the hybrid GA to obtain the optimized solution of phase-only adaptive weights. As a phase-only adaptive algorithm, the proposed algorithm is simpler than conventional algorithms when it comes to hardware implementation. Moreover, it processes only a single snapshot of data as opposed to forming a sample covariance matrix and performing matrix inversion. Simulation results show that the proposed algorithm has good signal recovery and interference nulling performance, which is superior to that of the phase-only D3LS algorithm based on standard GA.

  8. Enhanced hybrid search algorithm for protein structure prediction using the 3D-HP lattice model.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Zhang, Qiang; Wei, Xiaopeng

    2013-09-01

    The problem of protein structure prediction in the hydrophobic-polar (HP) lattice model is the prediction of protein tertiary structure. This problem is usually referred to as the protein folding problem. This paper presents a method for applying an enhanced hybrid search algorithm to the problem of protein folding prediction, using the three-dimensional (3D) HP lattice model. The enhanced hybrid search algorithm is a combination of the particle swarm optimizer (PSO) and tabu search (TS) algorithms. Since the PSO algorithm is easily trapped in local minima in later stages of evolution, we combined PSO with the TS algorithm, which has global optimization properties. Since crossover and mutation are applied many times in the PSO and TS algorithms, the enhanced hybrid search algorithm is called the MCMPSO-TS (multiple crossover and mutation PSO-TS) algorithm. Experimental results show that the MCMPSO-TS algorithm can find the best solutions so far for the listed benchmarks, which will help comparison with future approaches. Moreover, real protein sequences and Fibonacci sequences are verified in the 3D HP lattice model for the first time. Compared with previous evolutionary algorithms, the new hybrid search algorithm is novel and can be used effectively to predict 3D protein folding structure. With continuous development and changes in amino acid sequences, the new algorithm will also contribute to the study of new protein sequences.
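
    Any search method on the HP lattice, hybrid or not, needs an objective to minimise. In the standard HP model that objective is the conformation energy: minus one for every pair of hydrophobic (H) residues that are lattice neighbours without being adjacent in the chain. The sketch below computes that energy for a 3D cubic lattice; the function name and the input format (a string over "H"/"P" plus a list of integer coordinate triples) are assumptions, and the PSO and tabu search steps themselves are not shown.

```python
def hp_energy(sequence, coords):
    """Energy of a 3D HP-lattice conformation: -1 per non-bonded H-H contact."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if sum(abs(x - y) for x, y in zip(a, b)) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")
    index = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only in the positive directions so each contact is counted once.
        for nb in ((x + 1, y, z), (x, y + 1, z), (x, y, z + 1)):
            j = index.get(nb)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

    For example, the chain "HPPH" folded into a unit square has energy -1 because its two end residues touch, while the same chain laid out in a straight line has energy 0.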

  9. Comparative Analysis of Packet Scheduling Performance Between the Homogeneous Algorithm and the Hybrid Algorithm on a Point-to-Multipoint WiMAX Network

    Directory of Open Access Journals (Sweden)

    Dadiek Pranindito

    2014-11-01

    Full Text Available In today's telecommunications world, Worldwide Interoperability for Microwave Access (WiMAX) is a wireless technology that provides broadband connections over long distances, offers high access speed and wide coverage, and supports many types of services. An interesting and challenging problem in WiMAX is providing quality of service (QoS) guarantees for different service types with varying QoS requirements. Meeting these QoS requirements calls for a scheduling algorithm. This study simulates a WiMAX network that applies scheduling algorithms using the homogeneous algorithm and hybrid algorithm methods. The homogeneous method is represented by the Weighted Fair Queuing (WFQ) and Deficit Round Robin (DRR) scheduling algorithms, while the hybrid method uses a combination of the DRR and WFQ scheduling algorithms. The performance of the scheduling algorithms is tested by comparing them across the five WiMAX QoS classes: UGS, rtPS, nrtPS, ertPS, and Best Effort. The test results show that the hybrid algorithm delivers better QoS values than the homogeneous algorithm. The hybrid algorithm is well suited to networks whose traffic carries varied data packets, because it yields high throughput as well as low delay and jitter.
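
    Deficit Round Robin, one of the two schedulers named in this abstract, is compact enough to sketch. The version below is the textbook algorithm only: each flow receives a quantum of byte credit per round and may send packets while its credit covers the head-of-line packet. The flow names, quantum values and the fixed number of rounds are illustrative assumptions; WFQ and the way the study combines the two schedulers are not shown.

```python
from collections import deque

def deficit_round_robin(queues, quantum, rounds):
    """Serve per-flow packet queues (packet sizes in bytes) with DRR.

    queues:  dict flow_name -> list of packet sizes, head of line first
    quantum: dict flow_name -> bytes of credit added per round
    Returns the list of (flow_name, packet_size) in service order.
    """
    pending = {f: deque(p) for f, p in queues.items()}
    deficit = {f: 0 for f in queues}
    served = []
    for _ in range(rounds):
        for flow in pending:
            if not pending[flow]:
                deficit[flow] = 0          # idle flows do not bank credit
                continue
            deficit[flow] += quantum[flow]
            # Send packets while the head-of-line packet fits in the credit.
            while pending[flow] and pending[flow][0] <= deficit[flow]:
                size = pending[flow].popleft()
                deficit[flow] -= size
                served.append((flow, size))
            if not pending[flow]:
                deficit[flow] = 0
    return served
```

    A flow with a larger quantum drains faster, which is how per-class quanta could give classes such as UGS or rtPS a larger bandwidth share than Best Effort; the actual mapping used in the study is not stated in the abstract.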

  10. Combinatorial Clustering Algorithm of Quantum-Behaved Particle Swarm Optimization and Cloud Model

    Directory of Open Access Journals (Sweden)

    Mi-Yuan Shan

    2013-01-01

    Full Text Available We propose a combinatorial clustering algorithm of cloud model and quantum-behaved particle swarm optimization (COCQPSO) to solve the stochastic problem. The algorithm employs a novel probability model as well as a permutation-based local search method. We set the parameters of COCQPSO based on the design of experiments. In a comprehensive computational study, we scrutinize the performance of COCQPSO on a set of widely used benchmark instances. Benchmarking the combinatorial clustering algorithm against state-of-the-art algorithms shows that its performance compares very favorably. The fuzzy combinatorial optimization algorithm of cloud model and quantum-behaved particle swarm optimization (FCOCQPSO) in vague sets (IVSs) is more expressive than the other fuzzy sets. Finally, numerical examples show the remarkable clustering effectiveness of the COCQPSO and FCOCQPSO clustering algorithms.

  11. A Heuristic Clustering Algorithm for Intrusion Detection Based on Information Entropy

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    This paper studies the clustering problem for intrusion detection using the theory of information entropy. It argues that the clustering problem for exact intrusion detection based on information entropy is NP-complete; therefore, a heuristic algorithm for solving the clustering problem for intrusion detection is designed. The algorithm is incremental and can deal with databases holding large numbers of connection records from the internet.

  12. A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters

    Science.gov (United States)

    Wang, Zhihao; Yi, Jing

    2016-01-01

    To address the shortcoming that the fuzzy c-means algorithm (FCM) needs to know the number of clusters in advance, this paper proposes a new self-adaptive method to determine the optimal number of clusters. Firstly, a density-based algorithm is put forward. According to the characteristics of the dataset, the algorithm automatically determines the possible maximum number of clusters instead of using the empirical rule √n, and obtains the optimal initial cluster centroids, overcoming the limitation of FCM that randomly selected cluster centroids lead the result to converge to a local minimum. Secondly, by introducing a penalty function, the paper proposes a new fuzzy clustering validity index based on fuzzy compactness and separation. This ensures that, when the number of clusters approaches the number of objects in the dataset, the value of the clustering validity index does not monotonically decrease toward zero, which would otherwise make the choice of the optimal number of clusters lose its robustness and decision function. Then, based on these studies, a self-adaptive FCM algorithm is put forward to estimate the optimal number of clusters by an iterative trial-and-error process. Finally, experiments on the UCI, KDD Cup 1999, and synthetic datasets show that the method not only effectively determines the optimal number of clusters, but also reduces the number of FCM iterations while giving a stable clustering result. PMID:28042291
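
    The trial-and-error process in this abstract repeatedly runs FCM for candidate cluster counts, so a sketch of the standard FCM inner loop may help. This is plain textbook FCM, not the self-adaptive method: the density-based initialisation and the penalised validity index are not reproduced, and the parameter names (m for the fuzzifier, tol, init) are assumptions.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0, init=None):
    """Standard FCM on points given as tuples; returns (centroids, memberships)."""
    rng = random.Random(seed)
    start = init if init is not None else rng.sample(points, c)
    centroids = [tuple(p) for p in start]
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        u = []
        for p in points:
            d = [max(sum((a - b) ** 2 for a, b in zip(p, ce)) ** 0.5, 1e-12)
                 for ce in centroids]
            u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                      for j in range(c)])
        # Centroid update: mean of the points weighted by u_ij^m.
        new_centroids = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centroids.append(tuple(
                sum(wi * p[axis] for wi, p in zip(w, points)) / total
                for axis in range(len(points[0]))))
        shift = max(abs(a - b) for old, new in zip(centroids, new_centroids)
                    for a, b in zip(old, new))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, u
```

    A self-adaptive wrapper along the lines described would call a routine like this for each candidate number of clusters and keep the count that optimises a validity index; the `init` argument is where better starting centroids, such as density-based ones, could be supplied instead of a random sample.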

  13. A Cluster Maintenance Algorithm Based on Relative Mobility for Mobile Ad Hoc Network Management

    Institute of Scientific and Technical Information of China (English)

    SHEN Zhong; CHANG Yilin; ZHANG Xin

    2005-01-01

    The dynamic topology of mobile ad hoc networks makes network management significantly more challenging than in wireline networks. The traditional Client/Server (Manager/Agent) management paradigm does not work well in such a dynamic environment, whereas a hierarchical network management architecture based on clustering is more feasible. Although node movement makes the cluster structure changeable and introduces new challenges for network management, mobility is a relative concept. A node with high relative mobility is more prone to unstable behavior than a node with lower relative mobility, so the relative mobility of a node can be used to predict its future behavior. This paper presents cluster availability, which provides a quantitative measure of cluster stability. Furthermore, a cluster maintenance algorithm based on cluster availability is proposed. The simulation results show that, compared to the Minimum ID clustering algorithm, our algorithm successfully alleviates the influence of node mobility and makes network management more efficient.

  14. A Comparative Study of Several Hybrid Particle Swarm Algorithms for Function Optimization

    Directory of Open Access Journals (Sweden)

    Yanhua Zhong

    2012-11-01

    Researchers have proposed many hybrid particle swarm algorithms to address the tendency of particle swarm algorithms to converge to local extrema, and these algorithms are claimed to outperform the standard particle swarm. This study selects three representative hybrid particle swarm optimizations (differential evolution particle swarm optimization, GA particle swarm optimization, and quantum particle swarm optimization) and the standard particle swarm optimization, and tests them on three objective functions. We compare the algorithms by convergence speed and accuracy under a fixed number of iterations, and by the number of iterations needed under a fixed convergence precision, and we analyze the results and practical performance of these hybrid particle swarm optimizations. Test results show that the hybrid particle swarm algorithms improve performance significantly.

  15. A Comparative Study of Several Hybrid Particle Swarm Algorithms for Function Optimization

    Directory of Open Access Journals (Sweden)

    Yanhua Zhong

    2013-01-01

    Researchers have proposed many hybrid particle swarm algorithms to address the tendency of particle swarm algorithms to converge to local extrema, and these algorithms are claimed to outperform the standard particle swarm. This study selects three representative hybrid particle swarm optimizations (differential evolution particle swarm optimization, GA particle swarm optimization, and quantum particle swarm optimization) and the standard particle swarm optimization, and tests them on three objective functions. We compare the algorithms by convergence speed and accuracy under a fixed number of iterations, and by the number of iterations needed under a fixed convergence precision, and we analyze the results and practical performance of these hybrid particle swarm optimizations. Test results show that the hybrid particle swarm algorithms improve performance significantly.

  16. Parallelization of the Wolff single-cluster algorithm

    Science.gov (United States)

    Kaupužs, J.; Rimšāns, J.; Melnik, R. V. N.

    2010-02-01

    A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models, and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of applications where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n≥2. Furthermore, the developed OpenMP code allows us to simulate larger lattices because more operative (shared) memory is available.
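
    The reported speedups can be turned into parallel efficiencies with a few lines of arithmetic. The sketch below also inverts Amdahl's law to estimate an implied serial fraction; Amdahl's law is an assumption added here for illustration and is not a model the abstract itself uses. The function names are hypothetical.

```python
def efficiency(speedup, procs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup / procs


def amdahl_serial_fraction(speedup, procs):
    """Serial fraction f implied by Amdahl's law S = 1 / (f + (1 - f) / p).

    Rearranging gives f = (p / S - 1) / (p - 1).
    """
    return (procs / speedup - 1.0) / (procs - 1.0)


# Speedups quoted in the abstract for the L=1024 lattice.
reported = {2: 1.79, 4: 2.67}
for procs, speedup in reported.items():
    print(procs, efficiency(speedup, procs), amdahl_serial_fraction(speedup, procs))
```

    With the quoted figures the efficiency is 0.895 on two processors and about 0.67 on four, so the per-processor benefit drops as processors are added.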

  17. Using Clustering Algorithms to Identify Brown Dwarf Characteristics

    Science.gov (United States)

    Choban, Caleb

    2016-06-01

    Brown dwarfs are stars that are not massive enough to sustain core hydrogen fusion, and thus fade and cool over time. The molecular composition of brown dwarf atmospheres can be determined by observing absorption features in their infrared spectrum, which can be quantified using spectral indices. Comparing these indices to one another, we can determine what kind of brown dwarf it is, and if it is young or metal-poor. We explored a new method for identifying these subgroups through the expectation-maximization machine learning clustering algorithm, which provides a quantitative and statistical way of identifying index pairs which separate rare populations. We specifically quantified two statistics, completeness and concentration, to identify the best index pairs. Starting with a training set, we defined selection regions for young, metal-poor and binary brown dwarfs, and tested these on a large sample of L dwarfs. We present the results of this analysis, and demonstrate that new objects in these classes can be found through these methods.

  18. A multi-sequential number-theoretic optimization algorithm using clustering methods

    Institute of Scientific and Technical Information of China (English)

    XU Qing-song; LIANG Yi-zeng; HOU Zhen-ting

    2005-01-01

    A multi-sequential number-theoretic optimization method based on clustering was developed and applied to the optimization of functions with many local extrema. Details of the procedure to generate the clusters and the sequential schedules are given. The algorithm was assessed by comparing its performance with that of the generalized simulated annealing algorithm on a difficult instructive example and a D-optimum experimental design problem. Based on the two examples, the presented algorithm is shown to be more effective and reliable.

  19. Comparison and evaluation of network clustering algorithms applied to genetic interaction networks.

    Science.gov (United States)

    Hou, Lin; Wang, Lin; Berg, Arthur; Qian, Minping; Zhu, Yunping; Li, Fangting; Deng, Minghua

    2012-01-01

    The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards the understanding of large-scale biological networks. With numerous recent advances in biotechnologies, large-scale genetic interactions are widely available, but there is a limited understanding of which clustering algorithms may be most effective. In order to address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms in analyzing genetic interaction networks, and investigated influencing factors in choosing algorithms. The algorithms considered in this comparison include hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms is measured by the Jaccard index in comparing predicted gene modules with benchmark gene sets. The results suggest that the choice differs according to the network topology and evaluation criteria. Hierarchical clustering proved best at predicting protein complexes; Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets; the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
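
    The Jaccard index used as the accuracy measure is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the gene names are made up, and scoring each predicted module by its best-matching benchmark set is an assumption, since the abstract does not say how modules and benchmark sets are paired.

```python
def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def best_match_scores(predicted_modules, benchmark_sets):
    """Score each predicted module by its best Jaccard match (assumed pairing)."""
    return [
        max(jaccard(module, ref) for ref in benchmark_sets)
        for module in predicted_modules
    ]


# Hypothetical gene modules, for illustration only.
predicted = [{"g1", "g2", "g3"}, {"g7", "g8"}]
benchmark = [{"g2", "g3", "g4"}, {"g7", "g8"}]
print(best_match_scores(predicted, benchmark))
```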

  20. Sonar Image Detection Algorithm Based on Two-Phase Manifold Partner Clustering

    Institute of Scientific and Technical Information of China (English)

    Xingmei Wang; Zhipeng Liu; Jianchuang Sun; Shu Liu

    2015-01-01

    Based on the manifold characteristics of sonar image data, a sonar image detection method based on a two-phase manifold partner clustering algorithm is proposed. First, K-means block clustering based on Euclidean distance is proposed to reduce the data set. Mean value, standard deviation, and gray minimum value are taken as three features based on the relationship between the clustering model and the data structure. Then a K-means clustering algorithm based on manifold distance is used to cluster the reduced data set again to improve the detection efficiency. In the K-means clustering algorithm based on manifold distance, the line segment length on the manifold is analyzed, and a new power-function line segment length is proposed to decrease the computational complexity. In order to calculate the manifold distance quickly, a new all-source shortest path algorithm is proposed as an efficient preprocessing step. On this basis, the spatial feature of the image block is added to the three features to obtain the final precise partner clustering algorithm. Comparison with other typical clustering algorithms demonstrates that the proposed algorithm achieves good detection results, and experiments on different real sonar images show that it has better adaptability.

  1. PERFORMANCE OF K-MEANS CLUSTERING AND BIRD FLOCKING ALGORITHM FOR GROUPING THE WEB LOG FILES

    Directory of Open Access Journals (Sweden)

    R. SUGUNA

    2012-10-01

    Data mining is the process of analyzing interesting patterns and knowledge from different perspectives and summarizing them into useful information from large amounts of data. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Vast amounts of unlabeled data can be grouped using clustering or classification algorithms. Cluster analysis, or clustering, is the task of assigning a set of objects to groups called clusters, so that the objects in the same cluster are more similar to each other than to those in other clusters. Many researchers have evaluated the performance of the familiar K-means clustering algorithm and attempted to improve its efficiency. This paper compares the performance of the K-means clustering algorithm with a biologically based algorithm called the Bird flocking algorithm for grouping web logs. Web logs are unformatted text files that contain information about the user's browser details. The proposed system takes web log files as input and groups the web sites based on the users' rate of interest. The performance is evaluated in terms of number of clusters, CPU utilization time, and accuracy.
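
    For reference, the K-means baseline discussed in this record can be sketched as Lloyd's algorithm. This is a generic illustration with made-up toy data, not the paper's web-log pipeline; the initial centers are passed in explicitly so that the effect of initialization is visible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centers, iters=100):
    """Lloyd's algorithm; returns (centers, labels)."""
    centers = [list(c) for c in init_centers]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def inertia(points, centers, labels):
    """Within-cluster sum of squared distances."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


# Toy data (hypothetical): the four corners of a 10 x 2 rectangle.
pts = [[0, 0], [0, 2], [10, 0], [10, 2]]
good = kmeans(pts, [[0, 0], [10, 0]])  # one seed on each short side
bad = kmeans(pts, [[0, 0], [0, 2]])    # both seeds on the same short side
print(inertia(pts, *good))  # 4.0
print(inertia(pts, *bad))   # 100.0
```

    The two runs converge to different partitions of the same data, which is the sensitivity to the initial solution that motivates hybrid and bio-inspired alternatives.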

  2. Hybrid genetic algorithm approach for selective harmonic control

    Energy Technology Data Exchange (ETDEWEB)

    Dahidah, Mohamed S.A. [Faculty of Engineering, Multimedia University, 63100, Jalan Multimedia-Cyberjaya, Selangor (Malaysia); Agelidis, Vassilios G. [School of Electrical and Information Engineering, The University of Sydney, NSW (Australia); Rao, Machavaram V. [Faculty of Engineering and Technology, Multimedia University, 75450, Jalan Ayer Keroh Lama-Melaka (Malaysia)

    2008-02-15

    The paper presents an optimal solution for a selective harmonic elimination pulse width modulated (SHE-PWM) technique suitable for a high power inverter used in constant frequency utility applications. The main challenge of solving the associated non-linear equations, which are transcendental in nature and, therefore, have multiple solutions, is the convergence, and therefore, an initial point selected considerably close to the exact solution is required. The paper discusses an efficient hybrid real coded genetic algorithm (HRCGA) that reduces significantly the computational burden, resulting in fast convergence. An objective function describing a measure of the effectiveness of eliminating selected orders of harmonics while controlling the fundamental, namely a weighted total harmonic distortion (WTHD) is derived, and a comparison of different operating points is reported. It is observed that the method was able to find the optimal solution for a modulation index that is higher than unity. The theoretical considerations reported in this paper are verified through simulation and experimentally on a low power laboratory prototype. (author)

  3. An Effective Hybrid Artificial Bee Colony Algorithm for Nonnegative Linear Least Squares Problems

    Directory of Open Access Journals (Sweden)

    Xiangyu Kong

    2014-07-01

    An effective hybrid artificial bee colony algorithm is proposed in this paper for nonnegative linear least squares problems. To further improve the performance of the algorithm, an orthogonal initialization method is employed to generate the initial swarm. Furthermore, to balance the exploration and exploitation abilities, a new search mechanism is designed. The performance of this algorithm is verified using 27 benchmark functions and 5 nonnegative linear least squares test problems, and the proposed algorithm is compared with other swarm intelligence algorithms. Numerical results demonstrate that the proposed algorithm performs well compared with other algorithms on global optimization problems and nonnegative linear least squares problems.

  4. The novel generating algorithm and properties of hybrid-P-ary generalized bridge functions

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    In this paper, we develop novel non-sine functions, named hybrid-P-ary generalized bridge functions, based on the copy and shift methods. The generating algorithm of hybrid-P-ary generalized bridge functions is introduced based on the copy algorithm of the hybrid-P-ary generalized Walsh function. The main property, the product property, is also discussed. These functions may be viewed as a generalization of the theory of bridge functions, and many non-sine orthogonal functions are special subsets of them. The hybrid-P-ary generalized bridge functions can be used to search for many unknown non-sine functions by defining different parameters.

  5. Composite Hybrid Cluster Built from the Integration of Polyoxometalate and a Metal Halide Cluster: Synthetic Strategy, Structure, and Properties.

    Science.gov (United States)

    Li, Xin-Xiong; Ma, Xiang; Zheng, Wen-Xu; Qi, Yan-Jie; Zheng, Shou-Tian; Yang, Guo-Yu

    2016-09-06

    A step-by-step synthetic strategy, setting up a bridge between the polyoxometalate (POM) and metal halide cluster (MHC) systems, is demonstrated to construct an unprecedented composite hybrid cluster built up from one high-nuclearity cationic MHC [Cu8I6](2+) and eight Anderson-type anionic POMs [HCrMo6O18(OH)6](2-) cross-linked by a tripodal alcohol derivative.

  6. Improved FIFO Scheduling Algorithm Based on Fuzzy Clustering in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Jian Li

    2017-02-01

    In cloud computing, under the First-In-First-Out (FIFO) scheduling algorithm, some large tasks may occupy too many resources while some small tasks wait for a long time. To reduce tasks' waiting time, we propose a task scheduling algorithm based on fuzzy clustering algorithms. We construct a task model and a resource model, analyze tasks' preferences, and then classify resources with fuzzy clustering algorithms. Based on the parameters of cloud tasks, the algorithm calculates resource expectation and assigns tasks to different resource clusters, which decreases the complexity of resource selection. As a result, the algorithm reduces tasks' waiting time and improves resource utilization. The experimental results show that the proposed algorithm shortens the execution time of tasks and increases resource utilization.

  7. A hybrid approach using chaotic dynamics and global search algorithms for combinatorial optimization problems

    Science.gov (United States)

    Igeta, Hideki; Hasegawa, Mikio

    Chaotic dynamics have been effectively applied to improve various heuristic algorithms for combinatorial optimization problems in many studies. Currently, the most used chaotic optimization scheme is to drive heuristic solution search algorithms applicable to large-scale problems by chaotic neurodynamics including the tabu effect of the tabu search. Alternatively, meta-heuristic algorithms are used for combinatorial optimization by combining a neighboring solution search algorithm, such as tabu, gradient, or another search method, with a global search algorithm, such as genetic algorithms (GA), ant colony optimization (ACO), or others. Among these hybrid approaches, ACO has effectively optimized the solutions of many benchmark problems in the quadratic assignment problem library. In this paper, we propose a novel hybrid method that combines the effective chaotic search algorithm, which performs better than the tabu search, with global search algorithms such as ACO and GA. Our results show that the proposed chaotic hybrid algorithm performs better than the conventional chaotic search and conventional hybrid algorithms. In addition, we show that the chaotic search algorithm performs better when combined with ACO than when combined with GA.

  8. Application of Hybrid Optimization Algorithm in the Synthesis of Linear Antenna Array

    Directory of Open Access Journals (Sweden)

    Ezgi Deniz Ülker

    2014-01-01

    The use of hybrid algorithms for solving real-world optimization problems has become popular because, by combining the desirable features of their component algorithms, they can achieve better solution quality than those components. The newly proposed hybrid method, called the Hybrid Differential, Particle, and Harmony (HDPH) algorithm, differs from other hybrid forms in that it uses all features of the merged algorithms in order to perform efficiently on a wide variety of problems. In the proposed algorithm the control parameters are randomized, which makes its implementation easy and provides a fast response. This paper describes the application of the HDPH algorithm to linear antenna array synthesis. The results obtained with the HDPH algorithm are compared with those of the three optimization techniques that are merged in HDPH. The comparison shows that the proposed algorithm performs comparatively better in both solution quality and robustness. The proposed hybrid algorithm HDPH can be an efficient candidate for real-time optimization problems since it yields reliable performance every time it is executed.

  9. Soft learning vector quantization and clustering algorithms based on non-Euclidean norms: single-norm algorithms.

    Science.gov (United States)

    Karayiannis, Nicolaos B; Randolph-Gips, Mary M

    2005-03-01

    This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on a weighted norm to measure the distance between the feature vectors and their prototypes. The development of LVQ and clustering algorithms is based on the minimization of a reformulation function under the constraint that the generalized mean of the norm weights be constant. According to the proposed formulation, the norm weights can be computed from the data in an iterative fashion together with the prototypes. An error analysis provides some guidelines for selecting the parameter involved in the definition of the generalized mean in terms of the feature variances. The algorithms produced from this formulation are easy to implement and are almost as fast as clustering algorithms relying on the Euclidean norm. An experimental evaluation on four data sets indicates that the proposed algorithms consistently outperform clustering algorithms relying on the Euclidean norm and are strong competitors to non-Euclidean algorithms that are computationally more demanding.

  10. HYBRID OF FUZZY CLUSTERING NEURAL NETWORK OVER NSL DATASET FOR INTRUSION DETECTION SYSTEM

    Directory of Open Access Journals (Sweden)

    Dahlia Asyiqin Ahmad Zainaddin

    2013-01-01

    An Intrusion Detection System (IDS) is one of the components of system defence, used to identify abnormal activities happening in a computer system. Nowadays, IDSs face complex demands in preventing modern attack activities from damaging computer systems. An anomaly-based IDS examines ongoing traffic, activity, transactions, and behavior in order to identify intrusions by detecting anomalies. This technique identifies activities that deviate from normal behaviour. In recent years, data mining approaches for intrusion detection have been proposed and used. Approaches such as Genetic Algorithms, Support Vector Machines, Neural Networks, and clustering have resulted in high accuracy and good detection rates, but with a moderate false alarm rate on novel attacks. Many researchers have also proposed hybrid data mining techniques. Previous researchers introduced the combination of Fuzzy Clustering and Artificial Neural Network; however, it was tested only on a random selection of the KDDCup 1999 dataset. In this study, the experimental framework is applied to the NSL dataset to test the stability and reliability of the technique. The resulting precision, recall, and f-value rates are compared with the previous experiment. Both datasets cover four main types of attack: Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L), and Probe. The results confirm that the hybrid approach achieved better detection, especially for low-frequency attacks, on the NSL dataset than on the original KDD dataset, owing to the removal of redundant and incomplete elements from the original dataset.

  11. Data Accuracy Model for Distributed Clustering Algorithm based on Spatial Data Correlation in Wireless Sensor Networks

    CERN Document Server

    Karjee, Jyotirmoy

    2011-01-01

    Objective: The main objective of this paper is to construct a distributed clustering algorithm based upon spatial data correlation among sensor nodes and to evaluate data accuracy for each distributed cluster at its respective cluster head node. Design Procedure/Approach: We observe that, because sensor nodes are deployed at high density in the sensor field, spatial data are highly correlated among sensor nodes in the spatial domain. Based on this high data correlation among sensor nodes, we propose a non-overlapping irregular distributed clustering algorithm with clusters of different sizes to collect the most accurate or precise data at the cluster head node of each distributed cluster. To collect the most accurate data at the cluster head node of each distributed cluster in the sensor field, we propose a Data accuracy model and compare the results with the Information accuracy model. Finding: Simulation results show that our proposed Data accuracy model collects more accurate data and gives better performance than Informati...

  12. Wireless Meter Reading Based Energy-Balanced Steady Clustering Routing Algorithm for Sensor Networks

    Directory of Open Access Journals (Sweden)

    TANG, Z.

    2011-05-01

    Based on the characteristics of wireless meter reading systems, an energy-balanced and energy-efficient steady clustering routing algorithm (EBSC, Energy-Balanced Steady Clustering) is proposed. In the clustering mechanism, the current cluster head nodes determine the cluster head nodes for the next round according to the residual energy of the cluster members. In the next round, each non-cluster head node decides which cluster it will belong to according to an energy-distance function. The cluster head nodes send data to the base station by single-hop or multi-hop communication, chosen according to the criterion of minimum energy consumption. In the EBSC algorithm, the number of cluster head nodes generated in each round is very steady, and EBSC combines the advantages of both distributed and centralized clustering algorithms. Experimental results show that the proposed routing algorithm not only uses the limited energy of network nodes efficiently, but also balances the energy consumption of all nodes well and significantly prolongs network lifetime.

  13. A hybrid intelligent algorithm for portfolio selection problem with fuzzy returns

    Science.gov (United States)

    Li, Xiang; Zhang, Yang; Wong, Hau-San; Qin, Zhongfeng

    2009-11-01

    Portfolio selection theory with fuzzy returns has been well developed and widely applied. Within the framework of credibility theory, several fuzzy portfolio selection models have been proposed, such as the mean-variance model, the entropy optimization model, the chance constrained programming model, and so on. In order to solve these nonlinear optimization models, a hybrid intelligent algorithm is designed by integrating the simulated annealing algorithm, neural network, and fuzzy simulation techniques, where the neural network is used to approximate the expected value and variance of fuzzy returns and the fuzzy simulation is used to generate the training data for the neural network. Since these models have usually been solved by genetic algorithms, comparisons between the hybrid intelligent algorithm and the genetic algorithm are given on numerical examples, which imply that the hybrid intelligent algorithm is robust and more effective. In particular, it reduces the running time significantly for large problems.

  14. Near-infrared silver cluster optically signaling oligonucleotide hybridization and assembling two DNA hosts.

    Science.gov (United States)

    Petty, Jeffrey T; Nicholson, David A; Sergev, Orlin O; Graham, Stuart K

    2014-09-16

    Silver clusters with ~10 atoms form within DNA strands, and the conjugates are chemical sensors. The DNA host hybridizes with short oligonucleotides, and the cluster moieties optically respond to these analytes. Our studies focus on how the cluster adducts perturb the structure of their DNA hosts. Our sensor is comprised of an oligonucleotide with two components: a 5'-cluster domain that complexes silver clusters and a 3'-recognition site that hybridizes with a target oligonucleotide. The single-stranded sensor encapsulates an ~11 silver atom cluster with violet absorption at 400 nm and with minimal emission. The recognition site hybridizes with complementary oligonucleotides, and the violet cluster converts to an emissive near-infrared cluster with absorption at 730 nm. Our key finding is that the near-infrared cluster coordinates two of its hybridized hosts. The resulting tertiary structure was investigated using intermolecular and intramolecular variants of the same dimer. The intermolecular dimer assembles in concentrated (~5 μM) DNA solutions. Strand stoichiometries and orientations were chromatographically determined using thymine-modified complements that increase the overall conjugate size. The intramolecular dimer develops within a DNA scaffold that is founded on three linked duplexes. The high local cluster concentrations and relative strand arrangements again favor the antiparallel dimer for the near-infrared cluster. When the two monomeric DNA/violet cluster conjugates transform to one dimeric DNA/near-infrared conjugate, the DNA strands accumulate silver. We propose that these correlated changes in DNA structure and silver stoichiometry underlie the violet to near-infrared cluster transformation.

  15. The Research of an Incremental Conceptive Clustering Algorithm and Its Application in Detecting Money Laundering

    Institute of Scientific and Technical Information of China (English)

    CHEN Yunkai; LU Zhengding; LI Ruixuan; LI Yuhua; SUN Xiaolin

    2006-01-01

    As the data in large databases such as wire transfer databases keep growing, incremental clustering algorithms play an increasingly important role in Data Mining (DM). However, few traditional clustering algorithms can both handle categorical data and explain their output clearly. Based on the idea of dynamic clustering, an incremental conceptive clustering algorithm is proposed in this paper. It introduces the Semantic Core Tree (SCT) to deal with large volumes of categorical wire transfer data for detecting money laundering. In addition, a rule generation algorithm is presented to express the clustering result in the form of knowledge. When this idea is applied in financial data mining, the efficiency of searching for the characteristics of money laundering data is improved.

  16. Fuzzy Activation and Clustering of Nodes in a Hybrid Fibre Network Roll-out

    NARCIS (Netherlands)

    Kraak, J.J.; Phillipson, F.

    2015-01-01

    To design a Hybrid Fibre network, a selection of nodes is provided with active equipment and connected with fibre. If there is a need for a ring structure for high reliability, the activated nodes need to be clustered. In this paper a fuzzy method is proposed for this activation and clustering probl

  17. A novel artificial immune algorithm for spatial clustering with obstacle constraint and its applications.

    Science.gov (United States)

    Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji

    2014-01-01

    An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of AICOE algorithm and the classical clustering algorithms. Our clustering model based on artificial immune system is also applied to the case of public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and better clustering effect.
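
    To make the idea of an obstacle distance concrete, the sketch below measures the shortest path between two cells of a grid when some cells are blocked. Breadth-first search on a 4-connected grid is a stand-in chosen for illustration; the abstract does not describe the authors' path searching algorithm, and facilitators are not modeled here. The function name and the grid are hypothetical.

```python
from collections import deque


def obstacle_distance(grid, start, goal):
    """Shortest 4-connected path length that avoids blocked cells.

    grid[r][c] is truthy for an obstacle cell. Returns None when the goal
    cannot be reached or when either endpoint lies on an obstacle.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] or grid[goal[0]][goal[1]]:
        return None
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not grid[nr][nc] and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


# A wall in the middle column forces a detour around its open bottom cell.
walled = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(obstacle_distance(walled, (0, 0), (0, 2)))  # 6
```

    Without the wall the same two cells are two steps apart, so a clustering method that uses the obstacle distance would treat them as far less similar than a Euclidean measure suggests.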

  18. A Novel Artificial Immune Algorithm for Spatial Clustering with Obstacle Constraint and Its Applications

    Directory of Open Access Journals (Sweden)

    Liping Sun

    2014-01-01

    An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with an innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as the similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of the AICOE algorithm and classical clustering algorithms. Our clustering model based on an artificial immune system is also applied to a public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and a better clustering effect.

  19. Implementation of Clustering Algorithms for real datasets in Medical Diagnostics using MATLAB

    Directory of Open Access Journals (Sweden)

    B. Venkataramana

    2017-03-01

    Full Text Available In medical diagnostics, each disease requires samples to be taken and analyzed by a doctor or a pharmacist. As the number of patients grows, so does the number of samples, and more time is needed to analyze them and decide the stage of the disease; each analysis also requires a skilled person. The samples can instead be classified by applying clustering algorithms to them. Data clustering has been considered the most important raw data analysis method used in data mining. Most clustering techniques have proved their efficiency in applications such as decision-making systems, medical sciences, and earth sciences. Partition-based clustering is one of the main approaches in clustering. There are various data clustering algorithms, each with its own advantages and disadvantages. This work reports the classification performance of three widely used algorithms: K-means (KM), Fuzzy c-means (FCM), and Fuzzy Possibilistic c-Means (FPCM). To analyze these algorithms, three known data sets from the UCI machine learning repository are used: thyroid, liver, and wine. The clustering output is evaluated by classification performance, measured as percentage of correctness. The experimental results show that K-means and FCM give the same performance on the liver data, and that FCM and FPCM give the same performance on the thyroid and wine data. FPCM shows the most efficient classification performance on all the given data sets.
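
    The abstract above compares K-means with fuzzy c-means but does not state the update rules. As a rough illustration only, the sketch below computes the textbook fuzzy c-means membership u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)) and then hardens it to K-means-style labels. The function names, the fuzzifier m = 2, and the toy points and centers are assumptions made for this example; they are not taken from the paper, which used MATLAB.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the fuzzy membership of point i in cluster j.

    Uses the standard fuzzy c-means formula with fuzzifier m > 1.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on a center: share full membership
            # among the zero-distance centers to avoid dividing by zero.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
            continue
        row = [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def hard_labels(memberships):
    """Harden fuzzy memberships: pick the cluster with the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Hypothetical toy data: two centers and four points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (2.5, 2.5)]
centers = [(0.0, 0.5), (5.0, 5.0)]
u = fcm_memberships(points, centers)
labels = hard_labels(u)
```

    The point (2.5, 2.5) lies between the two centers, so its membership is split between them rather than forced to one side; this graded assignment is what distinguishes the fuzzy variants from plain K-means.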

  20. Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm

    Science.gov (United States)

    Frisca, Bustamam, Alhadi; Siswantining, Titin

    2017-03-01

    Clustering is a data analysis method that aims to place data with similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, the spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering needs a partitioning algorithm; partitioning methods include PAM, SOM, Fuzzy c-means, and k-means. Based on the research done by Capital and Choudhury in 2013, the k-means algorithm provides better accuracy than the PAM algorithm when using Euclidean distance, so in this paper we use k-means as our partitioning algorithm. The major advantage of spectral clustering is in reducing data dimension, in this case reducing the dimension of a large microarray dataset. A microarray is a small chip made of a glass plate containing thousands and even tens of thousands of kinds of genes in DNA fragments derived from doubling cDNA. Microarray data are widely used to detect cancer, for example carcinoma, in which cancer cells express abnormalities in their genes. The purpose of this research is to place data with high similarity in the same group and data with low similarity in different groups. In this research, the carcinoma microarray data use 7457 genes. The result of partitioning using the k-means algorithm is two clusters.

  1. Comparing the biological coherence of network clusters identified by different detection algorithms

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Protein-protein interaction networks serve to carry out basic molecular activity in the cell. Detecting the modular structures from the protein-protein interaction network is important for understanding the organization, function and dynamics of a biological system. In order to identify functional neighborhoods based on network topology, many network cluster identification algorithms have been developed. However, each algorithm might dissect a network from a different aspect and may provide different insight on the network partition. In order to objectively evaluate the performance of four commonly used cluster detection algorithms: molecular complex detection (MCODE), NetworkBlast, shortest-distance clustering (SDC) and Girvan-Newman (G-N) algorithm, we compared the biological coherence of the network clusters found by these algorithms through a uniform evaluation framework. Each algorithm was utilized to find network clusters in two different protein-protein interaction networks with various parameters. Comparison of the resulting network clusters indicates that clusters found by MCODE and SDC are of higher biological coherence than those by NetworkBlast and G-N algorithm.

  2. Genetic algorithm based two-mode clustering of metabolomics data

    NARCIS (Netherlands)

    Hageman, J.A.; Berg, R.A. van den; Westerhuis, J.A.; Werf, M.J. van der; Smilde, A.K.

    2008-01-01

    Metabolomics and other omics tools are generally characterized by large data sets with many variables obtained under different environmental conditions. Clustering methods and more specifically two-mode clustering methods are excellent tools for analyzing this type of data. Two-mode clustering metho

  3. Flexible Transmission Network Expansion Planning Considering Uncertain Renewable Generation and Load Demand Based on Hybrid Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Yun-Hao Li

    2015-12-01

    Full Text Available This paper presents a flexible transmission network expansion planning (TNEP) approach considering uncertainty. A novel hybrid clustering technique, which integrates the graph partitioning method and rough fuzzy clustering, is proposed to cope with uncertain renewable generation and load demand. The proposed clustering method is capable of recognizing the actual cluster distribution of complex datasets and providing high-quality clustering results. By clustering the hourly data for renewable generation and load demand, a multi-scenario model is proposed to consider the corresponding uncertainties in TNEP. Furthermore, due to the peak distribution characteristics of renewable generation and heavy investment in transmission, the traditional TNEP, which caters to rated renewable power output, is usually uneconomic. To improve the economic efficiency, multi-objective optimization is incorporated into the multi-scenario TNEP model, while the curtailment of renewable generation is considered as one of the optimization objectives. The solution framework applies a modified NSGA-II algorithm to obtain a set of Pareto optimal planning schemes with different levels of investment costs and renewable generation curtailments. Numerical results on the IEEE RTS-24 system demonstrated the robustness and effectiveness of the proposed approach.

  4. Text clustering based on fusion of ant colony and genetic algorithms

    Institute of Scientific and Technical Information of China (English)

    Yun ZHANG; Boqin FENG; Shouqiang MA; Lianmeng LIU

    2009-01-01

    Focusing on the problem that the ant colony algorithm gets into stagnation easily and cannot fully search the solution space, a text clustering approach based on the fusion of the ant colony and genetic algorithms is proposed. The four parameters that influence the performance of the ant colony algorithm are encoded as chromosomes; the fitness function and the selection, crossover, and mutation operators are then designed to find the optimal combination of parameters through a number of iterations, and the result is applied to text clustering. The simulation results show that, compared with the classical k-means clustering and the basic ant colony clustering algorithm, the proposed algorithm has better performance, and the value of F-Measure is enhanced by 5.69%, 48.60% and 69.60%, respectively, in 3 test datasets. Therefore, it is more suitable for processing a larger dataset.

  5. A highly efficient multi-core algorithm for clustering extremely large datasets

    Directory of Open Access Journals (Sweden)

    Kraus Johann M

    2010-04-01

    Full Text Available Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer.

  6. Titanium oxo-clusters: precursors for a Lego-like construction of nanostructured hybrid materials.

    Science.gov (United States)

    Rozes, Laurence; Sanchez, Clément

    2011-02-01

    Titanium oxo-clusters, well-defined monodispersed nano-objects, are appropriate nano-building blocks for the preparation of organic-inorganic materials by a bottom up approach. This critical review proposes to present the different structures of titanium oxo-clusters referenced in the literature and the different strategies followed to build up hybrid materials with these versatile building units. In particular, this critical review cites and reports on the most important papers in the literature, concentrating on recent developments in the field of synthesis, characterization, and the use of titanium oxo-clusters for the construction of advanced hybrid materials (137 references).

  7. An Efficient Data Aggregation Algorithm for Cluster-based Sensor Network

    Directory of Open Access Journals (Sweden)

    Mohammad Mostafizur Rahman Mozumdar

    2009-09-01

    Full Text Available Data aggregation in wireless sensor networks eliminates redundancy to improve bandwidth utilization and energy efficiency of sensor nodes. One node, called the cluster leader, collects data from surrounding nodes and then sends the summarized information to upstream nodes. In this paper, we propose an algorithm to select a cluster leader that will perform data aggregation in a partially connected sensor network. The algorithm reduces the traffic flow inside the network by adaptively selecting the shortest route for packet routing to the cluster leader. We also describe a simulation framework for functional analysis of WSN applications taking our proposed algorithm as an example.

  8. A Clustering Algorithm Using the Tabu Search Approach with Simulated Annealing for Vector Quantization

    Institute of Scientific and Technical Information of China (English)

    CHU Shuchuan; John F. Roddick

    2003-01-01

    In this paper, a cluster generation algorithm for vector quantization using a tabu search approach with simulated annealing is proposed. The main idea of this algorithm is to use the tabu search approach to generate non-local moves for the clusters and apply the simulated annealing technique to select the current best solution, thus improving the cluster generation and reducing the mean squared error. Preliminary experimental results demonstrate that the proposed approach is superior to the tabu search approach with the Generalised Lloyd algorithm.

  9. Scaling up the DBSCAN Algorithm for Clustering Large Spatial Databases Based on Sampling Technique

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and two sampling-based DBSCAN (SDBSCAN) algorithms are developed. One algorithm introduces the sampling technique inside DBSCAN, and the other uses a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
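
    The abstract does not give the details of either SDBSCAN variant. The sketch below is one possible reading of "sampling outside DBSCAN", offered as an assumption rather than the authors' method: run plain DBSCAN on a random sample, then attach each unsampled point to the cluster of its nearest sampled core point if that point lies within eps. The function names, parameters, and grid data are all hypothetical. Note that min_pts usually has to be lowered when only a fraction of the data is clustered.

```python
import math
import random

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including idx itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns (labels, core_flags); noise is labelled NOISE."""
    labels = [None] * len(points)
    core = [False] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        core[i] = True
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                core[j] = True
                queue.extend(j_neighbors)
    return labels, core


def sampled_dbscan(points, eps, min_pts, sample_fraction, seed=0):
    """Cluster a random sample with DBSCAN, then label the remaining points."""
    rng = random.Random(seed)
    n_sample = max(1, int(len(points) * sample_fraction))
    sample_idx = rng.sample(range(len(points)), n_sample)
    sample = [points[i] for i in sample_idx]
    sample_labels, core = dbscan(sample, eps, min_pts)

    labels = [NOISE] * len(points)
    for pos, i in enumerate(sample_idx):
        labels[i] = sample_labels[pos]

    core_points = [(sample[k], sample_labels[k]) for k in range(n_sample) if core[k]]
    sampled = set(sample_idx)
    for i, p in enumerate(points):
        if i in sampled:
            continue
        best = min(core_points, key=lambda c: math.dist(p, c[0]), default=None)
        if best is not None and math.dist(p, best[0]) <= eps:
            labels[i] = best[1]
    return labels


# Hypothetical data: two dense 10 x 10 grids, far apart from each other.
blob_a = [(0.1 * x, 0.1 * y) for x in range(10) for y in range(10)]
blob_b = [(5.0 + 0.1 * x, 5.0 + 0.1 * y) for x in range(10) for y in range(10)]
points = blob_a + blob_b
labels = sampled_dbscan(points, eps=0.5, min_pts=3, sample_fraction=0.5)
```

    In this sketch only the sampled half of the data pays the quadratic neighborhood-search cost; the other half needs a single nearest-core lookup per point.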

  10. K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

    Directory of Open Access Journals (Sweden)

    Cheng Lu

    2015-01-01

    Full Text Available The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be directly applied to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then adapted to handle the interval data, so that the improved AP algorithm becomes applicable to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.

  11. A FLEXIBLE HYBRID GMRES ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    钟宝江

    2001-01-01

    A variant of the hybrid GMRES algorithm of N. M. Nachtigal, L. Reichel, and L. N. Trefethen for solving large nonsymmetric systems of linear equations is presented. This algorithm allows the GMRES polynomial that is re-applied later to be constructed in the course of a restarted GMRES iteration. It is described how the new hybrid scheme may offer significant performance improvements over the old one.

  12. A Simple Sizing Algorithm for Stand-Alone PV/Wind/Battery Hybrid Microgrids

    OpenAIRE

    Jing Li; Wei Wei; Ji Xiang

    2012-01-01

    In this paper, we develop a simple algorithm to determine the required number of generating units of wind-turbine generator and photovoltaic array, and the associated storage capacity for stand-alone hybrid microgrid. The algorithm is based on the observation that the state of charge of battery should be periodically invariant. The optimal sizing of hybrid microgrid is given in the sense that the life cycle cost of system is minimized while the given load power demand can be satisfied without...

  13. Hybrid algorithm for accelerating the double series of Floquet vector modes

    Institute of Scientific and Technical Information of China (English)

    LI Weidong; HONG Wei; HAO Zhangcheng; ZHOU Houxing

    2006-01-01

    In this paper, a hybrid algorithm for accelerating the double series of Floquet vector modes arising in the analysis of frequency selective surfaces (FSS) is presented. The slowly converging asymptotic terms in the double series are first accelerated by the Poisson transformation and the Ewald method, and the remaining series is then accelerated by the Shanks transformation. This results in significant savings in memory and computing time. Numerical examples verify the validity of the hybrid acceleration algorithm.
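
    To show what series acceleration of this kind does, the sketch below applies the classical Shanks transformation, S(A_n) = (A_{n+1} A_{n-1} - A_n^2) / (A_{n+1} + A_{n-1} - 2 A_n), to partial sums A_n. It uses the slowly converging Leibniz series for pi as a stand-in; it does not reproduce the Floquet-mode double series, the Poisson transformation, or the Ewald method from the paper, and the function names are assumptions for this example.

```python
import math


def partial_sums(terms):
    """Running partial sums A_0, A_1, ... of a list of series terms."""
    total, sums = 0.0, []
    for t in terms:
        total += t
        sums.append(total)
    return sums


def shanks(seq):
    """One pass of the Shanks transformation over a sequence of partial sums.

    The output is two entries shorter than the input.
    """
    out = []
    for prev, cur, nxt in zip(seq, seq[1:], seq[2:]):
        denom = nxt + prev - 2.0 * cur
        if denom == 0.0:
            out.append(nxt)  # sequence is already stationary here
        else:
            out.append((nxt * prev - cur * cur) / denom)
    return out


# Leibniz series: pi = 4 * sum_{k >= 0} (-1)^k / (2k + 1), truncated to 12 terms.
terms = [4.0 * (-1) ** k / (2 * k + 1) for k in range(12)]
raw = partial_sums(terms)
once = shanks(raw)
twice = shanks(once)

raw_error = abs(raw[-1] - math.pi)
once_error = abs(once[-1] - math.pi)
twice_error = abs(twice[-1] - math.pi)
```

    With the same 12 terms, each pass of the transformation brings the last estimate closer to pi than the one before it, which is the kind of saving in computed terms that the abstract describes.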

  14. Photoelectron imaging of small aluminum clusters: quantifying s-p hybridization.

    Science.gov (United States)

    Melko, Joshua J; Castleman, A W

    2013-03-07

    Photoelectron imaging experiments and detailed calculations are conducted on Al(n)(-) clusters (n = 3-6) and a calibration method is developed for connecting experimental observations of photoelectron angular distributions to theoretical predictions. It is shown that this method can be used to quantify the degree to which the molecular orbitals are built from s- or p-like atomic orbitals. The highest occupied molecular orbitals of these small aluminum clusters are found to contain varying degrees of s-p mixing, with Al(3)(-) containing the "most hybridized" orbital and Al(4)(-) containing the "least hybridized" orbital. It is shown experimentally that s-p hybridization is already present for the trimer species and, similar to other properties of small metal clusters, oscillates with cluster size.

  15. An Allele Real-Coded Quantum Evolutionary Algorithm Based on Hybrid Updating Strategy.

    Science.gov (United States)

    Zhang, Yu-Xian; Qian, Xiao-Yi; Peng, Hui-Deng; Wang, Jian-Hui

    2016-01-01

    For improving convergence rate and preventing prematurity in quantum evolutionary algorithm, an allele real-coded quantum evolutionary algorithm based on hybrid updating strategy is presented. The real variables are coded with probability superposition of allele. A hybrid updating strategy balancing the global search and local search is presented in which the superior allele is defined. On the basis of superior allele and inferior allele, a guided evolutionary process as well as updating allele with variable scale contraction is adopted. And H ε gate is introduced to prevent prematurity. Furthermore, the global convergence of proposed algorithm is proved by Markov chain. Finally, the proposed algorithm is compared with genetic algorithm, quantum evolutionary algorithm, and double chains quantum genetic algorithm in solving continuous optimization problem, and the experimental results verify the advantages on convergence rate and search accuracy.

  16. An Allele Real-Coded Quantum Evolutionary Algorithm Based on Hybrid Updating Strategy

    Directory of Open Access Journals (Sweden)

    Yu-Xian Zhang

    2016-01-01

    Full Text Available For improving convergence rate and preventing prematurity in quantum evolutionary algorithm, an allele real-coded quantum evolutionary algorithm based on hybrid updating strategy is presented. The real variables are coded with probability superposition of allele. A hybrid updating strategy balancing the global search and local search is presented in which the superior allele is defined. On the basis of superior allele and inferior allele, a guided evolutionary process as well as updating allele with variable scale contraction is adopted. And Hε gate is introduced to prevent prematurity. Furthermore, the global convergence of proposed algorithm is proved by Markov chain. Finally, the proposed algorithm is compared with genetic algorithm, quantum evolutionary algorithm, and double chains quantum genetic algorithm in solving continuous optimization problem, and the experimental results verify the advantages on convergence rate and search accuracy.

  17. Parameter estimation for chaotic systems using a hybrid adaptive cuckoo search with simulated annealing algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Sheng, Zheng, E-mail: 19994035@sina.com [College of Meteorology and Oceanography, PLA University of Science and Technology, Nanjing 211101 (China); Wang, Jun; Zhou, Bihua [National Defense Key Laboratory on Lightning Protection and Electromagnetic Camouflage, PLA University of Science and Technology, Nanjing 210007 (China); Zhou, Shudao [College of Meteorology and Oceanography, PLA University of Science and Technology, Nanjing 211101 (China); Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing 210044 (China)

    2014-03-15

    This paper introduces a novel hybrid optimization algorithm to establish the parameters of chaotic systems. To deal with the weaknesses of the traditional cuckoo search algorithm, an adaptive cuckoo search with simulated annealing algorithm is proposed, which incorporates an adaptive parameter-adjusting operation and a simulated annealing operation into the cuckoo search algorithm. Normally, the parameters of the cuckoo search algorithm are kept constant, which may decrease the efficiency of the algorithm. To balance and enhance the accuracy and convergence rate of the cuckoo search algorithm, the adaptive operation is introduced to tune the parameters properly. In addition, the local search capability of the cuckoo search algorithm is relatively weak, which may decrease the quality of optimization, so the simulated annealing operation is merged into the cuckoo search algorithm to enhance the local search ability and improve the accuracy and reliability of the results. The proposed hybrid algorithm is investigated on the Lorenz chaotic system under noiseless and noisy conditions. The numerical results demonstrate that the method can estimate parameters efficiently and accurately under both conditions. Finally, the results are compared with the traditional cuckoo search algorithm, genetic algorithm, and particle swarm optimization algorithm. Simulation results demonstrate the effectiveness and superior performance of the proposed algorithm.

  18. Parameter estimation for chaotic systems using a hybrid adaptive cuckoo search with simulated annealing algorithm

    Science.gov (United States)

    Sheng, Zheng; Wang, Jun; Zhou, Shudao; Zhou, Bihua

    2014-03-01

    This paper introduces a novel hybrid optimization algorithm to establish the parameters of chaotic systems. To deal with the weaknesses of the traditional cuckoo search algorithm, an adaptive cuckoo search with simulated annealing algorithm is proposed, which incorporates an adaptive parameter-adjusting operation and a simulated annealing operation into the cuckoo search algorithm. Normally, the parameters of the cuckoo search algorithm are kept constant, which may decrease the efficiency of the algorithm. To balance and enhance the accuracy and convergence rate of the cuckoo search algorithm, the adaptive operation is introduced to tune the parameters properly. In addition, the local search capability of the cuckoo search algorithm is relatively weak, which may decrease the quality of optimization, so the simulated annealing operation is merged into the cuckoo search algorithm to enhance the local search ability and improve the accuracy and reliability of the results. The proposed hybrid algorithm is investigated on the Lorenz chaotic system under noiseless and noisy conditions. The numerical results demonstrate that the method can estimate parameters efficiently and accurately under both conditions. Finally, the results are compared with the traditional cuckoo search algorithm, genetic algorithm, and particle swarm optimization algorithm. Simulation results demonstrate the effectiveness and superior performance of the proposed algorithm.

  19. A Hybrid Bacterial Foraging Algorithm For Solving Job Shop Scheduling Problems

    OpenAIRE

    Narendhar, S.; T Amudha

    2012-01-01

    Bio-Inspired computing is the subset of Nature-Inspired computing. Job Shop Scheduling Problem is categorized under popular scheduling problems. In this research work, Bacterial Foraging Optimization was hybridized with Ant Colony Optimization and a new technique Hybrid Bacterial Foraging Optimization for solving Job Shop Scheduling Problem was proposed. The optimal solutions obtained by proposed Hybrid Bacterial Foraging Optimization algorithms are much better when compared with the solution...

  20. Hierarchical trie packet classification algorithm based on expectation-maximization clustering

    Science.gov (United States)

    Bi, Xia-an; Zhao, Junxia

    2017-01-01

    With the development of computer network bandwidth, packet classification algorithms which are able to deal with large-scale rule sets are in urgent need. Among the existing algorithms, researches on packet classification algorithms based on hierarchical trie have become an important packet classification research branch because of their widely practical use. Although hierarchical trie is beneficial to save large storage space, it has several shortcomings such as the existence of backtracking and empty nodes. This paper proposes a new packet classification algorithm, Hierarchical Trie Algorithm Based on Expectation-Maximization Clustering (HTEMC). Firstly, this paper uses the formalization method to deal with the packet classification problem by means of mapping the rules and data packets into a two-dimensional space. Secondly, this paper uses expectation-maximization algorithm to cluster the rules based on their aggregate characteristics, and thereby diversified clusters are formed. Thirdly, this paper proposes a hierarchical trie based on the results of expectation-maximization clustering. Finally, this paper respectively conducts simulation experiments and real-environment experiments to compare the performances of our algorithm with other typical algorithms, and analyzes the results of the experiments. The hierarchical trie structure in our algorithm not only adopts trie path compression to eliminate backtracking, but also solves the problem of low efficiency of trie updates, which greatly improves the performance of the algorithm. PMID:28704476

  1. CHAOS-REGULARIZATION HYBRID ALGORITHM FOR NONLINEAR TWO-DIMENSIONAL INVERSE HEAT CONDUCTION PROBLEM

    Institute of Scientific and Technical Information of China (English)

    王登刚; 刘迎曦; 李守巨

    2002-01-01

    A numerical model of the nonlinear two-dimensional steady inverse heat conduction problem was established, considering thermal conductivity that changes with temperature. Combining the chaos optimization algorithm with the gradient regularization method, a chaos-regularization hybrid algorithm was proposed to solve the established numerical model. The hybrid algorithm combines the advantages of the chaos optimization algorithm with those of the gradient regularization method: the chaos optimization algorithm is used to help the gradient regularization method escape from local optima. Under the assumption that the thermal conductivity varies linearly with temperature, the thermal conductivity and the linear rule were estimated by the present method with the aid of boundary temperature measurements. Numerical simulation results show that good estimates of the thermal conductivity and the linear function can be obtained with arbitrary initial guess values, and that the present hybrid algorithm is much more efficient than the conventional genetic algorithm and the chaos optimization algorithm.

  2. Effective hybrid evolutionary computational algorithms for global optimization and applied to construct prion AGAAAAGA fibril models

    CERN Document Server

    Zhang, Jiapu

    2010-01-01

    Evolutionary algorithms are parallel computing algorithms, and the simulated annealing algorithm is a sequential computing algorithm. This paper inserts simulated annealing into evolutionary computations and successfully develops a hybrid Self-Adaptive Evolutionary Strategy $\mu+\lambda$ method and a hybrid Self-Adaptive Classical Evolutionary Programming method. Numerical results on more than 40 benchmark test problems of global optimization show that the hybrid methods presented in this paper are very effective. Lennard-Jones potential energy minimization is another benchmark for testing new global optimization algorithms. It is studied in this paper through the construction of amyloid fibrils. To date, there is little molecular structural data available on the AGAAAAGA palindrome in the hydrophobic region (113-120) of prion proteins. This region belongs to the N-terminal unstructured region (1-123) of prion proteins, the structure of which has proved hard to determine using NMR spectroscopy or X-ray crystallography ...

  3. Clustering dynamic textures with the hierarchical em algorithm for modeling video.

    Science.gov (United States)

    Mumtaz, Adeel; Coviello, Emanuele; Lanckriet, Gert R G; Chan, Antoni B

    2013-07-01

    Dynamic texture (DT) is a probabilistic generative model, defined over space and time, that represents a video as the output of a linear dynamical system (LDS). The DT model has been applied to a wide variety of computer vision problems, such as motion segmentation, motion classification, and video registration. In this paper, we derive a new algorithm for clustering DT models that is based on the hierarchical EM algorithm. The proposed clustering algorithm is capable of both clustering DTs and learning novel DT cluster centers that are representative of the cluster members in a manner that is consistent with the underlying generative probabilistic model of the DT. We also derive an efficient recursive algorithm for sensitivity analysis of the discrete-time Kalman smoothing filter, which is used as the basis for computing expectations in the E-step of the HEM algorithm. Finally, we demonstrate the efficacy of the clustering algorithm on several applications in motion analysis, including hierarchical motion clustering, semantic motion annotation, and learning bag-of-systems (BoS) codebooks for dynamic texture recognition.

  4. Analysis of basic clustering algorithms for numerical estimation of statistical averages in biomolecules.

    Science.gov (United States)

    Anandakrishnan, Ramu; Onufriev, Alexey

    2008-03-01

    In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first subdivide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between the error bound and the root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms for practical applications. An example of error analysis for such an application, the computation of the average charge of ionizable amino acids in proteins, is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
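
    The basic scheme described above (exact treatment inside each cluster, no interaction between clusters) can be sketched on a toy model. Everything specific here is an assumption for illustration: binary sites x_i in {0, 1}, an energy E(x) = sum_i h_i x_i + sum_{i<j} w_ij x_i x_j expressed in units of k_B T, and the particular numbers and function names. It is not the authors' electrostatics model and does not compute their error bound.

```python
import itertools
import math


def exact_averages(h, w, sites):
    """Boltzmann-average occupancy of each site, summing over all
    2^n microstates of `sites` and ignoring every other site."""
    partition = 0.0
    occupancy = {s: 0.0 for s in sites}
    for state in itertools.product((0, 1), repeat=len(sites)):
        x = dict(zip(sites, state))
        energy = sum(h[i] * x[i] for i in sites)
        energy += sum(w[i][j] * x[i] * x[j]
                      for i, j in itertools.combinations(sites, 2))
        weight = math.exp(-energy)  # energies are in units of k_B T
        partition += weight
        for s in sites:
            occupancy[s] += weight * x[s]
    return {s: occupancy[s] / partition for s in sites}


def clustered_averages(h, w, clusters):
    """Treat each cluster exactly and drop all inter-cluster interactions."""
    result = {}
    for cluster in clusters:
        result.update(exact_averages(h, w, cluster))
    return result


def coupling_matrix(cross):
    """Hypothetical 4-site system: strong pairs (0,1) and (2,3), and a
    single inter-cluster coupling of strength `cross` between sites 1 and 2."""
    w = [[0.0] * 4 for _ in range(4)]
    for i, j, value in [(0, 1, 2.0), (2, 3, 1.5), (1, 2, cross)]:
        w[i][j] = w[j][i] = value
    return w


h = [-1.0, 0.5, 0.0, -0.5]
clusters = [[0, 1], [2, 3]]
all_sites = [0, 1, 2, 3]

exact_weak = exact_averages(h, coupling_matrix(0.1), all_sites)
approx_weak = clustered_averages(h, coupling_matrix(0.1), clusters)
max_error = max(abs(exact_weak[s] - approx_weak[s]) for s in all_sites)

exact_zero = exact_averages(h, coupling_matrix(0.0), all_sites)
approx_zero = clustered_averages(h, coupling_matrix(0.0), clusters)
```

    The full sum visits 2^4 = 16 microstates, while the clustered version visits 2^2 + 2^2 = 8; the gap grows exponentially with system size. When the inter-cluster coupling is zero the two results coincide, and a weak coupling introduces only a small error in this toy case.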

  5. Novel density-based and hierarchical density-based clustering algorithms for uncertain data.

    Science.gov (United States)

    Zhang, Xianchao; Liu, Han; Zhang, Xiaotong

    2017-09-01

    Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues such as loss of uncertainty information, high time complexity, and nonadaptive thresholds have not been addressed well in the previous density-based algorithm FDBSCAN and the hierarchical density-based algorithm FOPTICS. In this paper, we first propose a novel density-based algorithm, PDBSCAN, which improves on FDBSCAN in the following respects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, and direct reachability probability, thus reducing the complexity and solving the issue of the nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify PDBSCAN into an improved version (PDBSCANi) by using a better cluster assignment strategy to ensure that every object is assigned to the most appropriate cluster, thus solving the issue of the nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulty clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm, POPTICS, by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing

  6. Constructing a graph of connections in clustering algorithm of complex objects

    Directory of Open Access Journals (Sweden)

    Татьяна Шатовская

    2015-05-01

    The article describes the results of modifying the Chameleon algorithm. This hierarchical multi-level algorithm consists of several phases: graph construction, coarsening, partitioning, and recovery. Various approaches and algorithms can be used in each phase. The main aim of the work is to study the clustering quality on different data sets when different combinations of algorithms are used at the individual stages, and to improve the construction stage by optimizing the choice of k when building the k-nearest-neighbor graph.
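
    The graph-construction phase mentioned in this abstract builds a k-nearest-neighbor graph, and its result depends directly on the choice of k. The following is a minimal brute-force sketch of that construction, assuming Euclidean distance and points given as coordinate tuples; the function names are illustrative, and the article's own procedure for optimizing k is not reproduced here.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest other points.

    Brute force, O(n^2) distance evaluations; ties are broken by index.
    """
    graph = {}
    for i, p in enumerate(points):
        neighbors = sorted(
            (euclidean(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph[i] = [j for _, j in neighbors[:k]]
    return graph


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
    for k in (1, 2):
        print(k, knn_graph(pts, k))
```

    With a small k the graph splits into tight local groups, while a larger k adds edges between distant groups, which is why the choice of k affects the later coarsening and partitioning phases.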

  7. A scalable and practical one-pass clustering algorithm for recommender system

    Science.gov (United States)

    Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali

    2015-12-01

    K-Means clustering-based recommendation algorithms have been proposed with the claim that they increase the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates as new data arrive, making them unsuitable for dynamic environments. Following this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
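
    The abstract does not spell out the One-Pass algorithm, so the sketch below shows only a generic single-pass ("leader"-style) clustering scheme to illustrate how clusters can be updated incrementally as data arrive. The distance threshold, the running-mean update, and all names are assumptions made for illustration and should not be read as the authors' method.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def one_pass_cluster(points, threshold):
    """Assign each point once, in arrival order.

    A point joins the nearest existing cluster if its centroid lies within
    `threshold`; otherwise it starts a new cluster. Centroids are running
    means, so no second pass over earlier points is needed.
    """
    centroids, counts, labels = [], [], []
    for p in points:
        best, best_dist = None, float("inf")
        for idx, c in enumerate(centroids):
            d = euclidean(p, c)
            if d < best_dist:
                best, best_dist = idx, d
        if best is None or best_dist > threshold:
            centroids.append(tuple(p))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            counts[best] += 1
            n = counts[best]
            # Incremental mean update: c_new = c + (x - c) / n
            centroids[best] = tuple(
                c + (x - c) / n for c, x in zip(centroids[best], p)
            )
            labels.append(best)
    return labels, centroids


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
    print(one_pass_cluster(stream, threshold=1.0))
```

    Because each point is handled once and only the centroids and counts are kept, new data can be absorbed without retraining, which is the property the abstract contrasts with offline K-Means training.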

  8. Two Parallel Swendsen-Wang Cluster Algorithms Using Message-Passing Paradigm

    CERN Document Server

    Lin, Shizeng

    2008-01-01

    In this article, we present two different parallel Swendsen-Wang Cluster (SWC) algorithms using the message-passing interface (MPI). One is based on the Master-Slave Parallel Model (MSPM) and the other on the Data-Parallel Model (DPM). Speedups of 24 with 40 processors and 16 with 37 processors are achieved with the DPM and MSPM, respectively. The speedup of both algorithms at different temperatures and system sizes is carefully examined both experimentally and theoretically, and their efficiency is compared. In the last section, two parallel probability-changing cluster (PCC) algorithms are proposed on the basis of these two parallel SWC algorithms.

  9. An Effective Hybrid Cuckoo Search Algorithm with Improved Shuffled Frog Leaping Algorithm for 0-1 Knapsack Problems

    OpenAIRE

    2014-01-01

    An effective hybrid cuckoo search algorithm (CS) with improved shuffled frog-leaping algorithm (ISFLA) is put forward for solving 0-1 knapsack problem. First of all, with the framework of SFLA, an improved frog-leap operator is designed with the effect of the global optimal information on the frog leaping and information exchange between frog individuals combined with genetic mutation with a small probability. Subsequently, in order to improve the ...

  10. Application of Hybrid Genetic Algorithm Routine in Optimizing Food and Bioengineering Processes.

    Science.gov (United States)

    Tumuluru, Jaya Shankar; McCulloch, Richard

    2016-11-09

    Optimization is a crucial step in the analysis of experimental results. Deterministic methods only converge on local optimums and require exponentially more time as dimensionality increases. Stochastic algorithms are capable of efficiently searching the domain space; however convergence is not guaranteed. This article demonstrates the novelty of the hybrid genetic algorithm (HGA), which combines both stochastic and deterministic routines for improved optimization results. The new hybrid genetic algorithm developed is applied to the Ackley benchmark function as well as case studies in food, biofuel, and biotechnology processes. For each case study, the hybrid genetic algorithm found a better optimum candidate than reported by the sources. In the case of food processing, the hybrid genetic algorithm improved the anthocyanin yield by 6.44%. Optimization of bio-oil production using HGA resulted in a 5.06% higher yield. In the enzyme production process, HGA predicted a 0.39% higher xylanase yield. Hybridization of the genetic algorithm with a deterministic algorithm resulted in an improved optimum compared to statistical methods.
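
    The hybridization idea in this abstract, a stochastic global search followed by a deterministic refinement, can be sketched on the Ackley benchmark function that the abstract names. The genetic stage and the coordinate pattern search below are generic textbook-style choices with arbitrary, assumed parameters; they stand in for the article's HGA routine, whose details the abstract does not give.

```python
import math
import random


def ackley(x):
    """Ackley benchmark function; its global minimum is 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2 * math.pi * v) for v in x) / n
    return (-20 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20 + math.e)


def genetic_stage(f, dim, rng, pop_size=30, generations=40, bound=5.0):
    """Stochastic stage: truncation selection, averaging crossover, Gaussian mutation."""
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f)
        parents = pop[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append([(u + v) / 2 + rng.gauss(0, 0.1)
                             for u, v in zip(a, b)])
        pop = parents + children
    return min(pop, key=f)


def pattern_search(f, x, step=0.5, tol=1e-6):
    """Deterministic stage: coordinate moves, halving the step when stuck."""
    x = list(x)
    best = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = f(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2
    return x, best


if __name__ == "__main__":
    rng = random.Random(0)
    candidate = genetic_stage(ackley, dim=2, rng=rng)
    refined, value = pattern_search(ackley, candidate)
    print(ackley(candidate), value)
```

    The pattern search only ever accepts moves that lower the objective, so the refined value can never be worse than the candidate handed over by the stochastic stage.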

  11. Size-dependent photoabsorption and photoemission of supported silver clusters and silver cluster-biomolecule hybrid systems

    Energy Technology Data Exchange (ETDEWEB)

    Mitric, Roland; Buergel, Christian; Petersen, Jens; Kulesza, Alexander; Bonacic-Koutecky, Vlasta [Humboldt-Universitaet zu Berlin, Institut fuer Chemie, Brook-Taylor-Str. 2, D-12489 Berlin (Germany)

    2008-07-01

    Silver clusters interacting with different environments such as surfaces or biomolecules exhibit fascinating absorption and emission properties which can be exploited for biosensing and optoelectronic applications. We address theoretically the size-dependent structural and optical properties of silver clusters Ag_n (n = 2, 4, 6, 8) supported on an MgO surface, as well as the optical properties of silver-cluster tryptophan hybrid systems Trp-Ag_n^+ (n = 1-9). Our results on supported silver clusters provide insight into the mechanism responsible for absorption and emission patterns arising from the interaction between the excitation within the cluster and the environment. We demonstrate that small clusters such as Ag_4 are good candidates for fluorescence centers in the visible regime. Furthermore, in the Trp-Ag_n^+ hybrid system we identified different types of charge transfer between the silver and biomolecule subunits. Remarkably, we observe a strong reduction of the photofragmentation yield in Trp-Ag_9^+ in comparison with free Ag_9^+, which may be attributed to energy dissipation by fluorescence. Thus, the unique optical properties of supported silver nanoclusters combined with the specific bio-recognition of biomolecules will provide fundamentals for the future development of fluorescent nanocluster-based biochips.

  12. A Novel Automatic Detection System for ECG Arrhythmias Using Maximum Margin Clustering with Immune Evolutionary Algorithm

    Directory of Open Access Journals (Sweden)

    Bohui Zhu

    2013-01-01

    This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. The diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering ECG arrhythmias. First, the raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and the waveform of the ECG signal is detected; then, features are extracted from the ECG signal so that the IEMMC algorithm can cluster different types of arrhythmias. Three performance indicators (sensitivity, specificity, and accuracy) are used to assess the effect of the IEMMC method on ECG arrhythmias. Compared with the K-means and iterSVR algorithms, the IEMMC algorithm performs better not only in clustering results but also in global search ability and convergence ability, which demonstrates its effectiveness for the detection of ECG arrhythmias.

  13. A fast SVM training algorithm based on the set segmentation and k-means clustering

    Institute of Scientific and Technical Information of China (English)

    YANG Xiaowei; LIN Daying; HAO Zhifeng; LIANG Yanchun; LIU Guirong; HAN Xu

    2003-01-01

    At present, training algorithms for support vector machines (SVM) are an important research topic in the field of machine learning. It is a challenging task to improve the efficiency of the algorithm without reducing the generalization performance of the SVM. To address this challenge, a new SVM training algorithm based on set segmentation and k-means clustering is presented in this paper. The idea is to divide the original training data into many subsets, cluster each subset using k-means, and finally train the SVM on the new data set formed by the cluster centroids. Since decomposition algorithms such as SVMlight are among the major methods for solving support vector machines, SVMlight is used in our experiments. Simulations on different types of problems show that the proposed method can efficiently solve not only large linear classification problems but also large nonlinear ones.
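
    The data-reduction step described in this abstract (split the training data into subsets, run k-means on each subset, keep the centroids) can be sketched as follows. This is an illustrative reading rather than the paper's implementation: the contiguous splitting, the seeding of k-means with the first k points, and the function names are assumptions, and the sketch handles the points of a single class only. The SVM training itself is omitted; the returned centroids would be passed to an SVM trainer in place of the full data set.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids are seeded with the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])),
            )
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; an empty group
        # keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def reduce_training_set(points, n_subsets, k):
    """Split points into contiguous subsets, cluster each, collect centroids."""
    size = -(-len(points) // n_subsets)  # ceiling division
    reduced = []
    for start in range(0, len(points), size):
        subset = points[start:start + size]
        reduced.extend(kmeans(subset, min(k, len(subset))))
    return reduced


if __name__ == "__main__":
    data = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0),
            (20.0, 0.0), (30.0, 0.0), (20.0, 2.0), (30.0, 2.0)]
    print(reduce_training_set(data, n_subsets=2, k=2))
```

    Each k-means run sees only one subset, so its cost depends on the subset size rather than on the full data set, and the final training set has at most n_subsets * k points.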

  14. Scheduling algorithm of dual-armed cluster tools with residency time and reentrant constraints

    Institute of Scientific and Technical Information of China (English)

    周炳海; 高忠顺; 陈佳

    2014-01-01

    To solve the scheduling problem of dual-armed cluster tools for wafer fabrications with residency time and reentrant constraints, a heuristic scheduling algorithm was developed. Firstly, on the basis of formulating scheduling problems domain of dual-armed cluster tools, a non-integer programming model was set up with a minimizing objective function of the makespan. Combining characteristics of residency time and reentrant constraints, a scheduling algorithm of searching the optimal operation path of dual-armed transport module was presented under many kinds of robotic scheduling paths for dual-armed cluster tools. Finally, the experiments were designed to evaluate the proposed algorithm. The results show that the proposed algorithm is feasible and efficient for obtaining an optimal scheduling solution of dual-armed cluster tools with residency time and reentrant constraints.

  15. A Load Balancing Algorithm Based on Maximum Entropy Methods in Homogeneous Clusters

    Directory of Open Access Journals (Sweden)

    Long Chen

    2014-10-01

    In order to solve the problems of ill-balanced task allocation, long response time, low throughput, and poor performance when a cluster system assigns tasks, we introduce the concept of entropy from thermodynamics into load balancing algorithms. This paper proposes a new load balancing algorithm for homogeneous clusters based on the Maximum Entropy Method (MEM). By calculating the entropy of the system and using the maximum entropy principle to ensure that each scheduling and migration step follows the direction of increasing entropy, the system can reach a balanced load as quickly as possible, shorten the task execution time, and achieve high performance. The results of simulation experiments show that, compared with traditional algorithms, this algorithm performs better in both the time needed to reach load balance and the degree of balance achieved in a homogeneous cluster system. It also offers a new approach to the load balancing problem of homogeneous cluster systems.
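
    The principle stated in this abstract, that every migration should follow the direction of increasing entropy, can be illustrated with a small sketch. It assumes that the system entropy is the Shannon entropy of each node's share of the total load and that one unit task is moved per step from the busiest node to the idlest one; both are illustrative assumptions, not the paper's definitions.

```python
import math


def load_entropy(loads):
    """Shannon entropy of the load shares; maximal when all loads are equal."""
    total = sum(loads)
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log(v / total) for v in loads if v > 0)


def migrate_once(loads, eps=1e-12):
    """Move one unit task from the busiest to the idlest node, but only if
    doing so raises the entropy by more than `eps` (guards against rounding)."""
    src = max(range(len(loads)), key=loads.__getitem__)
    dst = min(range(len(loads)), key=loads.__getitem__)
    candidate = list(loads)
    candidate[src] -= 1
    candidate[dst] += 1
    if load_entropy(candidate) > load_entropy(loads) + eps:
        return candidate
    return list(loads)


def balance(loads, max_steps=1000):
    """Repeat entropy-increasing migrations until none is available."""
    current = list(loads)
    for _ in range(max_steps):
        nxt = migrate_once(current)
        if nxt == current:
            break
        current = nxt
    return current


if __name__ == "__main__":
    start = [10, 0, 2]
    end = balance(start)
    print(start, load_entropy(start))
    print(end, load_entropy(end))
```

    For n nodes the entropy is bounded above by log(n), reached when every node carries the same load, so the entropy value also serves as a simple measure of how close the cluster is to balance.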

  16. Optimization of Evolutionary Neural Networks Using Hybrid Learning Algorithms

    OpenAIRE

    Abraham, Ajith

    2004-01-01

    Evolutionary artificial neural networks (EANNs) refer to a special class of artificial neural networks (ANNs) in which evolution is another fundamental form of adaptation in addition to learning. Evolutionary algorithms are used to adapt the connection weights, network architecture and learning algorithms according to the problem environment. Even though evolutionary algorithms are well known as efficient global search algorithms, very often they miss the best local solutions in the complex s...

  17. Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters

    KAUST Repository

    Wu, X.

    2011-07-18

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node, and MPI can be used with the communication between nodes. In this paper, we use Scalar Pentadiagonal (SP) and Block Tridiagonal (BT) benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we can compare the performance of the hybrid SP and BT with the MPI counterparts on large-scale multicore clusters, Intrepid (BlueGene/P) at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76 %, and the hybrid BT outperforms the MPI BT by up to 8.58 % on up to 10 000 cores on Intrepid and Jaguar. We also use performance tools and MPI trace libraries available on these clusters to further investigate the performance characteristics of the hybrid SP and BT. © 2011 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

  18. Solving the Capacitated Vehicle Routing Problem Based on Improved Ant-clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Zhang Jiashan

    2015-01-01

    The capacitated vehicle routing problem (CVRP) is NP-hard. Most approaches can solve only small-scale case studies to optimality, and they are time-consuming. To overcome this limitation, this paper presents a novel three-phase heuristic approach for the capacitated vehicle routing problem. The first phase identifies sets of cost-effective feasible clusters through an improved ant-clustering algorithm in which an adaptive strategy is adopted. The second phase assigns clusters to vehicles and sequences them on each tour. The third phase uses a genetic algorithm to order the nodes within the clusters of every tour. The simulation indicates that the algorithm attains high-quality results in a short time.

  19. Kernel Clustering with a Differential Harmony Search Algorithm for Scheme Classification

    Directory of Open Access Journals (Sweden)

    Yu Feng

    2017-01-01

    This paper presents a kernel fuzzy clustering method with a novel differential harmony search algorithm for classifying water diversion scheduling schemes. First, we employed a self-adaptive solution generation strategy and a differential evolution-based population update strategy to improve the classical harmony search. Second, we applied the differential harmony search algorithm to kernel fuzzy clustering to help the clustering method obtain better solutions. Finally, the combination of kernel fuzzy clustering and differential harmony search is applied to water diversion scheduling in East Lake. The proposed method is compared with other methods. The results show that kernel clustering with the differential harmony search algorithm performs well on water diversion scheduling problems.

  20. Optimization of self-interstitial clusters in 3C-SiC with genetic algorithm

    Science.gov (United States)

    Ko, Hyunseok; Kaczmarowski, Amy; Szlufarska, Izabela; Morgan, Dane

    2017-08-01

    Under irradiation, SiC develops damage commonly referred to as black spot defects, which are speculated to be self-interstitial atom clusters. To understand the evolution of these defect clusters and their impacts (e.g., through radiation-induced swelling) on the performance of SiC in nuclear applications, it is important to identify the cluster composition, structure, and shape. In this work, the genetic algorithm code StructOpt was utilized to identify ground-state cluster structures in 3C-SiC. The genetic algorithm was used to explore clusters of up to ∼30 interstitials of C-only, Si-only, and Si-C mixtures embedded in the SiC lattice. We performed the structure search using Hamiltonians from both density functional theory and empirical potentials. The thermodynamic stability of clusters was investigated in terms of their composition (with a focus on Si-only, C-only, and stoichiometric) and shape (spherical vs. planar), as a function of the cluster size (n). Our results suggest that large Si-only clusters are likely unstable, and clusters are predominantly C-only for n ≤ 10 and stoichiometric for n > 10. The results imply that there is an evolution of the shape of the most stable clusters, where small clusters are stable in more spherical geometries while larger clusters are stable in more planar configurations. We also provide an estimated energy vs. size relationship, E(n), for use in future analysis.