c-mean clustering model: Topics by WorldWideScience.org

Sample records for c-mean clustering model

Fuzzy C-Means Clustering Model Data Mining For Recognizing Stock Data Sampling Pattern

Directory of Open Access Journals (Sweden)

Sylvia Jane Annatje Sumarauw

2007-06-01

Full Text Available Capital markets benefit both companies and investors. For investors, the capital market provides two economic advantages, namely dividends and capital gains, and a non-economic one, namely a voting share in the Shareholders General Meeting. But it can also penalize share owners. To guard against this risk, investors should predict the prospects of their companies. Because a share is an abstract commodity, its quality is determined by the validity of the company profile information. Any information on stock value fluctuation from the Jakarta Stock Exchange can be a useful consideration and a good measurement for data analysis. In the context of protecting shareholders from risk, this research focuses on stock data sample categories, or stock data sample patterns, using the Fuzzy c-Means Clustering Model, which provides useful information for investors. The research analyses stock data such as Individual Index, Volume and Amount for the Property and Real Estate Emitter Group at the Jakarta Stock Exchange from January 1 till December 31 of 204. The mining process follows the Cross Industry Standard Process model for Data Mining (CRISP-DM), in the form of a cycle with these steps: Business Understanding, Data Understanding, Data Preparation, Modelling, Evaluation and Deployment. The Fuzzy c-Means Clustering Model is applied in the modelling step. Data mining with the Fuzzy c-Means Clustering Model can analyze stock data in a big database with many complex variables, especially for finding the data sample pattern, and then build a Fuzzy Inference System that turns inputs into outputs based on Fuzzy Logic by recognising the pattern. Keywords: Data Mining, Fuzzy c-Means Clustering Model, Pattern Recognition
Soil-landscape modelling using fuzzy c-means clustering of attribute data derived from a Digital Elevation Model (DEM).

NARCIS (Netherlands)

Bruin, de S.; Stein, A.

1998-01-01

This study explores the use of fuzzy c-means clustering of attribute data derived from a digital elevation model to represent transition zones in the soil-landscape. The conventional geographic model used for soil-landscape description is not able to properly deal with these. Fuzzy c-means
TOWARDS FINDING A NEW KERNELIZED FUZZY C-MEANS CLUSTERING ALGORITHM

Directory of Open Access Journals (Sweden)

Samarjit Das

2014-04-01

Full Text Available The Kernelized Fuzzy C-Means clustering technique is an attempt to improve the performance of the conventional Fuzzy C-Means clustering technique. This technique, in which a kernel-induced distance function replaces the Euclidean distance of conventional Fuzzy C-Means as the similarity measure, has recently gained popularity in the research community. Like the conventional Fuzzy C-Means clustering technique, it suffers from inconsistent performance because the initial centroids are obtained from randomly initialized membership values of the objects. Our present work proposes a new method in which we apply the Subtractive clustering technique of Chiu as a preprocessor to the Kernelized Fuzzy C-Means clustering technique. With this new method we have tried not only to remove the inconsistency of the Kernelized Fuzzy C-Means clustering technique but also to deal with situations where the number of clusters is not predetermined. We also provide a comparison of our method with the Subtractive clustering technique of Chiu and the Kernelized Fuzzy C-Means clustering technique using two validity measures, namely Partition Coefficient and Clustering Entropy.
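The two validity measures named in this abstract can be sketched in a few lines. The sketch below assumes that "Clustering Entropy" refers to Bezdek's partition entropy and that memberships are stored as an (n_samples, n_clusters) matrix with rows summing to 1; the function names are illustrative and are not taken from the paper.

```python
import numpy as np

def partition_coefficient(u):
    """Mean of squared memberships: 1.0 for a crisp partition, 1/c for a fully fuzzy one."""
    u = np.asarray(u, dtype=float)
    return float(np.sum(u ** 2) / u.shape[0])

def partition_entropy(u, eps=1e-12):
    """Mean membership entropy: 0 for a crisp partition, log(c) for a fully fuzzy one."""
    u = np.asarray(u, dtype=float)
    # eps avoids log(0) for memberships that are exactly zero
    return float(-np.sum(u * np.log(u + eps)) / u.shape[0])
```

A higher partition coefficient and a lower partition entropy both indicate a less fuzzy partition, which is how such indices are typically used to compare clustering results.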
Soft Sensor Modeling Based on Multiple Gaussian Process Regression and Fuzzy C-mean Clustering

Directory of Open Access Journals (Sweden)

Xianglin ZHU

2014-06-01

Full Text Available In order to overcome the difficulties of online measurement of some crucial biochemical variables in fermentation processes, a new soft sensor modeling method is presented based on Gaussian process regression and fuzzy C-mean clustering. Considering that a typical fermentation process can be divided into 4 phases (lag phase, exponential growth phase, stable phase and dead phase), the training samples are classified into 4 sub-categories using the fuzzy C-mean clustering algorithm. For each sub-category, the samples are trained using Gaussian process regression and the corresponding soft-sensing sub-model is established. For a new sample, the memberships between this sample and the sub-models are computed based on the Euclidean distance, and the prediction output of the soft sensor is then obtained as the weighted sum. Taking lysine fermentation as an example, simulation and experiment are carried out, and the results show that the presented method achieves better fitting and generalization ability than a radial basis function neural network and a single Gaussian process regression model.
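The membership-weighted combination of sub-models described in this abstract might look like the sketch below. It assumes the standard fuzzy C-means membership formula applied to Euclidean distances from the new sample to the cluster centers, and it uses plain Python callables as hypothetical stand-ins for the trained Gaussian process sub-models; neither detail is confirmed by the abstract.

```python
import numpy as np

def memberships_to_centers(x_new, centers, m=2.0):
    """FCM-style memberships of one sample to each cluster center (they sum to 1)."""
    d = np.linalg.norm(np.asarray(centers, dtype=float) - np.asarray(x_new, dtype=float), axis=1)
    d = np.fmax(d, 1e-12)                 # guard against a zero distance
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def blended_prediction(x_new, centers, sub_models, m=2.0):
    """Weighted sum of sub-model outputs, weighted by the sample's memberships."""
    w = memberships_to_centers(x_new, centers, m)
    preds = np.array([f(x_new) for f in sub_models], dtype=float)
    return float(w @ preds)
```

A sample that sits exactly on one center receives a membership close to 1 for that cluster, so the blended output reduces to that sub-model's prediction.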
[Predicting Incidence of Hepatitis E in China using Fuzzy Time Series Based on Fuzzy C-Means Clustering Analysis].

Science.gov (United States)

Luo, Yi; Zhang, Tao; Li, Xiao-song

2016-05-01

To explore the application of fuzzy time series model based on fuzzy c-means clustering in forecasting monthly incidence of Hepatitis E in mainland China. A predictive model (fuzzy time series method based on fuzzy c-means clustering) was developed using Hepatitis E incidence data in mainland China between January 2004 and July 2014. The incidence data from August 2014 to November 2014 were used to test the fitness of the predictive model. The forecasting results were compared with those resulted from traditional fuzzy time series models. The fuzzy time series model based on fuzzy c-means clustering had 0.0011 mean squared error (MSE) of fitting and 6.9775 × 10⁻⁴ MSE of forecasting, compared with 0.0017 and 0.0014 from the traditional forecasting model. The results indicate that the fuzzy time series model based on fuzzy c-means clustering has a better performance in forecasting incidence of Hepatitis E.
Image Segmentation Method Using Fuzzy C Mean Clustering Based on Multi-Objective Optimization

Science.gov (United States)

Chen, Jinlin; Yang, Chunzhi; Xu, Guangkui; Ning, Li

2018-04-01

Image segmentation is not only one of the hottest topics in digital image processing, but also an important part of computer vision applications. Among image segmentation algorithms, fuzzy C-means (FCM) clustering is an effective and concise one. However, the drawback of FCM is that it is sensitive to image noise. To solve this problem, this paper designs a novel fuzzy C-mean clustering algorithm based on multi-objective optimization. We add a parameter λ to the fuzzy distance measurement formula to improve the multi-objective optimization. The parameter λ adjusts the weights of the pixel local information. In the algorithm, the local correlation of neighboring pixels is added to the improved multi-objective mathematical model to optimize the clustering centers. Two different experimental results show that the novel fuzzy C-means approach performs efficiently, in both quality and computational time, when segmenting images corrupted by different types of noise.
Comparative Performance Of Using PCA With K-Means And Fuzzy C Means Clustering For Customer Segmentation

Directory of Open Access Journals (Sweden)

Fahmida Afrin

2015-08-01

Full Text Available Data mining is the process of analyzing data and discovering useful information. It is sometimes called knowledge discovery. Clustering groups data in such a way that the data in one cluster are similar and the data in different clusters are dissimilar. Many data mining technologies have been developed for customer segmentation. PCA works as a preprocessor for Fuzzy C means and K-means, reducing high-dimensional and noisy data. Many clustering methods have been applied to customer segmentation. In this paper the performance of Fuzzy C means and K-means after applying Principal Component Analysis is analyzed. We analyze the performance of these algorithms on a standard dataset. The results indicate that PCA-based fuzzy clustering produces better results than PCA-based K-means and is a more stable method for customer segmentation.
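The PCA preprocessing step mentioned in this abstract can be illustrated with a minimal NumPy sketch. It assumes PCA is computed by singular value decomposition of the mean-centered data and that the projected scores are then passed to a clustering routine; the function name `pca_project` is hypothetical and the paper's actual pipeline is not described in the abstract.

```python
import numpy as np

def pca_project(x, n_components):
    """Project rows of x onto the first n_components principal axes."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)                       # center each feature
    # Rows of vt are the principal axes, ordered by decreasing singular value
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[:n_components].T
```

The reduced matrix returned here has one row per sample, so it can be fed directly to K-means or fuzzy C-means in place of the original features.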
Drought Forecasting by SPI Index and ANFIS Model Using Fuzzy C-mean Clustering

Directory of Open Access Journals (Sweden)

Mehdi Komasi

2013-08-01

Full Text Available Drought is the interaction between environment and water cycle in the world and affects the natural environment of an area when it persists for a longer period. So, developing a suitable index to forecast the spatial and temporal distribution of drought plays an important role in the planning and management of natural resources and water resource systems. In this article, the drought concept and drought indexes are first introduced, and then fuzzy neural networks and fuzzy C-mean clustering are applied to forecast drought via the standardized precipitation index (SPI). The results of this research indicate that the SPI index is more capable than other indexes such as PDSI (Palmer Drought Severity Index) and PAI (Palfai Aridity Index) in the drought forecasting process. Moreover, an adaptive neuro-fuzzy network combined with C-mean clustering is highly efficient in drought forecasting.
Fuzzy C-means method for clustering microarray data.

Science.gov (United States)

Dembélé, Doulaye; Kastner, Philippe

2003-05-22

Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene for the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes. A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated to a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster. Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
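For readers unfamiliar with the algorithm that most records on this page build on, here is a minimal NumPy sketch of standard fuzzy C-means with the fuzziness parameter m discussed above. It is a textbook version under stated assumptions (Euclidean distance, random initial memberships, a fixed seed), not the authors' Matlab code, and it does not implement their empirical method for choosing m.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, u): c cluster centers and an (n, c) membership matrix."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)             # each row sums to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                     # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.max(np.abs(u_new - u))
        u = u_new
        if shift < tol:
            break
    return centers, u
```

Membership thresholding of the kind the abstract describes could then be written as `core = u.max(axis=1) >= 0.7`, where the cutoff 0.7 is an arbitrary illustrative value.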
Cluster radioactive decay within the preformed cluster model using relativistic mean-field theory densities

International Nuclear Information System (INIS)

Singh, BirBikram; Patra, S. K.; Gupta, Raj K.

2010-01-01

We have studied the (ground-state) cluster radioactive decays within the preformed cluster model (PCM) of Gupta and collaborators [R. K. Gupta, in Proceedings of the 5th International Conference on Nuclear Reaction Mechanisms, Varenna, edited by E. Gadioli (Ricerca Scientifica ed Educazione Permanente, Milano, 1988), p. 416; S. S. Malik and R. K. Gupta, Phys. Rev. C 39, 1992 (1989)]. The relativistic mean-field (RMF) theory is used to obtain the nuclear matter densities for the double folding procedure used to construct the cluster-daughter potential with M3Y nucleon-nucleon interaction including exchange effects. Following the PCM approach, we have deduced empirically the preformation probability $P_0^{\mathrm{emp}}$ from the experimental data on both the α- and exotic cluster-decays, specifically of parents in the trans-lead region having doubly magic $^{208}$Pb or its neighboring nuclei as daughters. Interestingly, the RMF-densities-based nuclear potential supports the concept of preformation for both the α and heavier clusters in radioactive nuclei. $P_0^{\alpha(\mathrm{emp})}$ for α decays is almost constant ($\sim 10^{-2}$ to $10^{-3}$) for all the parent nuclei considered here, and $P_0^{c(\mathrm{emp})}$ for cluster decays of the same parents decrease with the size of clusters emitted from different parents. The results obtained for $P_0^{c(\mathrm{emp})}$ are reasonable and are within two to three orders of magnitude of the well-accepted phenomenological model of Blendowske-Walliser for light clusters.
KMEANS CLUSTERING FOR HIDDEN MARKOV MODEL

NARCIS (Netherlands)

Perrone, M.P.; Connell, S.D.

2004-01-01

An unsupervised kmeans clustering algorithm for hidden Markov models is described and applied to the task of generating subclass models for individual handwritten character classes. The algorithm is compared to a related clustering method and shown to give a relative change in the error rate of as
A Trajectory Regression Clustering Technique Combining a Novel Fuzzy C-Means Clustering Algorithm with the Least Squares Method

Directory of Open Access Journals (Sweden)

Xiangbing Zhou

2018-04-01

Full Text Available Rapidly growing GPS (Global Positioning System) trajectories hide much valuable information, such as city road planning, urban travel demand, and population migration. In order to mine the hidden information and to capture better clustering results, a trajectory regression clustering method (an unsupervised trajectory clustering method) is proposed to reduce local information loss of the trajectory and to avoid getting stuck in the local optimum. Using this method, we first define our new concept of trajectory clustering and construct a novel partitioning (angle-based partitioning) method of line segments; second, the Lagrange-based method and Hausdorff-based K-means++ are integrated in fuzzy C-means (FCM) clustering, which are used to maintain the stability and the robustness of the clustering process; finally, least squares regression model is employed to achieve regression clustering of the trajectory. In our experiment, the performance and effectiveness of our method is validated against real-world taxi GPS data. When comparing our clustering algorithm with the partition-based clustering algorithms (K-means, K-median, and FCM), our experimental results demonstrate that the presented method is more effective and generates a more reasonable trajectory.
Implementation of K-Means Clustering Method for Electronic Learning Model

Science.gov (United States)

Latipa Sari, Herlina; Suranti Mrs., Dewi; Natalia Zulita, Leni

2017-12-01

Teaching and Learning process at SMK Negeri 2 Bengkulu Tengah has applied e-learning system for teachers and students. The e-learning was based on the classification of normative, productive, and adaptive subjects. SMK Negeri 2 Bengkulu Tengah consisted of 394 students and 60 teachers with 16 subjects. The record of e-learning database was used in this research to observe students’ activity pattern in attending class. K-Means algorithm in this research was used to classify students’ learning activities using e-learning, so that it was obtained cluster of students’ activity and improvement of student’s ability. Implementation of K-Means Clustering method for electronic learning model at SMK Negeri 2 Bengkulu Tengah was conducted by observing 10 students’ activities, namely participation of students in the classroom, submit assignment, view assignment, add discussion, view discussion, add comment, download course materials, view article, view test, and submit test. In the e-learning model, the testing was conducted toward 10 students that yielded 2 clusters of membership data (C1 and C2). Cluster 1: with membership percentage of 70% and it consisted of 6 members, namely 1112438 Anggi Julian, 1112439 Anis Maulita, 1112441 Ardi Febriansyah, 1112452 Berlian Sinurat, 1112460 Dewi Anugrah Anwar and 1112467 Eka Tri Oktavia Sari. Cluster 2: with membership percentage of 30% and it consisted of 4 members, namely 1112463 Dosita Afriyani, 1112471 Erda Novita, 1112474 Eskardi and 1112477 Fachrur Rozi.
Clustering Batik Images using Fuzzy C-Means Algorithm Based on Log-Average Luminance

Directory of Open Access Journals (Sweden)

Ahmad Sanmorino

2012-06-01

Full Text Available Batik is a fabric or clothing made with a special staining technique called wax-resist dyeing, and is part of a cultural heritage with high artistic value. In order to improve efficiency and give better semantics to images, some researchers apply clustering algorithms to manage images before they are retrieved. Image clustering is the process of grouping images based on their similarity. In this paper we attempt to provide an alternative method of grouping batik images using the fuzzy c-means (FCM) algorithm based on the log-average luminance of the batik. The FCM clustering algorithm uses fuzzy models that allow every data item to belong to all clusters with different degrees of membership between 0 and 1. Log-average luminance (LAL) is the average value of the lighting in an image. We can compare the lighting of one image with another using LAL. From the experiments that have been made, it can be concluded that the fuzzy c-means algorithm can be used for batik image clustering based on the log-average luminance of each image.
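As an illustration of the feature used in this record, the sketch below computes a log-average luminance for an RGB image. It assumes the common definition exp(mean(log(δ + L))) with a small offset δ, and Rec. 709 luminance weights for converting RGB to luminance; the abstract does not state which definition or weights the authors used.

```python
import numpy as np

def log_average_luminance(rgb, delta=1e-6):
    """Log-average luminance of an (H, W, 3) RGB image with values in [0, 1]."""
    rgb = np.asarray(rgb, dtype=float)
    # Rec. 709 luminance weights (an assumption, not taken from the paper)
    lum = 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]
    # delta keeps log() finite for black pixels
    return float(np.exp(np.mean(np.log(delta + lum))))
```

Each image then contributes a single scalar, so a collection of images becomes a one-dimensional data set that a fuzzy c-means routine can cluster.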
Discrimination of neutrons and γ-rays in liquid scintillators based of fuzzy c-means clustering

International Nuclear Information System (INIS)

Luo Xiaoliang; Liu Guofu; Yang Jun

2011-01-01

A novel method based on fuzzy c-means (FCM) clustering for the discrimination of neutrons and γ-rays in liquid scintillators was presented. The neutrons and γ-rays in the environment were firstly acquired by the portable real-time n-γ discriminator and then discriminated using fuzzy c-means clustering and pulse gradient analysis, respectively. By comparing the results with each other, it is shown that the discrimination results of the fuzzy c-means clustering are consistent with those of the pulse gradient analysis. The decrease in uncertainty and the improvement in discrimination performance of the fuzzy c-means clustering were also observed. (authors)
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

Science.gov (United States)

Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

2018-04-01

In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed without multicollinearity among independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of multiple linear regression model and fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.
α/β-particle radiation identification based on fuzzy C-means clustering

International Nuclear Information System (INIS)

Yang Yijianxia; Yang Lu; Li Wenqiang

2013-01-01

A pulse shape recognition method based on fuzzy C-means clustering for the discrimination of α/β particles was presented. A detection circuit to isolate α/β particles is designed, using a single-probe scintillating detector to acquire α/β particles. By comparing the results to pulse amplitude analysis, it is shown that with fuzzy C-means clustering the α-particle count rate increased by 42.9% and the cross-talk ratio of α-β decreased by 15.9% for 6190 cps 0420 α source; the β-particle count rate increased by 31.8% and the cross-talk ratio of β-α decreased by 7.7% for 05-05 β source. (authors)
3D Building Models Segmentation Based on K-Means++ Cluster Analysis

Science.gov (United States)

Zhang, C.; Mao, B.

2016-10-01

3D mesh model segmentation has drawn increasing attention from the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compressing, texture mapping, model retrieval, etc. Therefore, segmentation is a key problem for 3D mesh model segmentation. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on randomized initial seed points (i.e., centroids), and different randomized centroids can give quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.
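The seeding step that distinguishes K-means++ from plain K-means can be sketched as follows. This is the standard seeding rule (each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen), written on generic point arrays; how the authors derive feature vectors from Collada meshes is not described in the abstract and is not modelled here.

```python
import numpy as np

def kmeans_pp_init(x, k, seed=0):
    """Pick k initial centers from the rows of x using K-means++ seeding."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    centers = [x[rng.integers(n)]]                # first seed: uniform at random
    d2 = np.sum((x - centers[0]) ** 2, axis=1)    # squared distance to nearest seed
    for _ in range(1, k):
        # Assumes at least one point differs from the chosen seeds (d2.sum() > 0)
        idx = rng.choice(n, p=d2 / d2.sum())
        centers.append(x[idx])
        d2 = np.minimum(d2, np.sum((x - x[idx]) ** 2, axis=1))
    return np.array(centers)
```

Because points close to an existing seed have low selection probability, the initial centers tend to be spread out, which is what reduces the sensitivity to initialization that the abstract attributes to plain K-means.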
3D BUILDING MODELS SEGMENTATION BASED ON K-MEANS++ CLUSTER ANALYSIS

Directory of Open Access Journals (Sweden)

C. Zhang

2016-10-01

Full Text Available 3D mesh model segmentation has drawn increasing attention from the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compressing, texture mapping, model retrieval, etc. Therefore, segmentation is a key problem for 3D mesh model segmentation. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on randomized initial seed points (i.e., centroids), and different randomized centroids can give quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.
Subspace K-means clustering.

Science.gov (United States)

Timmerman, Marieke E; Ceulemans, Eva; De Roover, Kim; Van Leeuwen, Karla

2013-12-01

To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study, in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima but that the problem can be reasonably dealt with by using partitions of various cluster procedures as a starting point for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that subspace K-means analysis provides a rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).

Subspace K-means clustering

NARCIS (Netherlands)

Timmerman, Marieke E.; Ceulemans, Eva; De Roover, Kim; Van Leeuwen, Karla

2013-01-01

To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the
Clinical assessment using an algorithm based on clustering Fuzzy c-means

NARCIS (Netherlands)

Guijarro-Rodriguez, A.; Cevallos-Torres, L.; Yepez-Holguin, J.; Botto-Tobar, M.; Valencia-García, R.; Lagos-Ortiz, K.; Alcaraz-Mármol, G.; Del Cioppo, J.; Vera-Lucio, N.; Bucaram-Leverone, M.

2017-01-01

The Fuzzy c-means (FCM) algorithms define a grouping criterion from a function and seek to minimize that function iteratively until an optimal fuzzy partition is obtained. The execution of this algorithm relates each element to the clusters that were determined in the same n-dimensional space,
*K-means and cluster models for cancer signatures.

Science.gov (United States)

Kakushadze, Zura; Yu, Willie

2017-09-01

We present *K-means clustering algorithm and source code by expanding statistical clustering methods applied in https://ssrn.com/abstract=2802753 to quantitative finance. *K-means is statistically deterministic without specifying initial centers, etc. We apply *K-means to extracting cancer signatures from genome data without using nonnegative matrix factorization (NMF). *K-means' computational cost is a fraction of NMF's. Using 1389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.
Comparing clustering models in bank customers: Based on Fuzzy relational clustering approach

Directory of Open Access Journals (Sweden)

Ayad Hendalianpour

2016-11-01

Full Text Available Clustering is a useful way to explore data structures and has been employed in many fields. It organizes a set of objects into groups called clusters, such that the objects within one cluster are highly similar to each other and dissimilar to the objects in other clusters. The K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms are the most popular clustering algorithms because they are easy to implement and fast, but in some cases these algorithms cannot be used. In this paper, a hybrid model for customer clustering is presented and applied to five banks of Fars Province, Shiraz, Iran. The fuzzy relation among customers is defined using their features, described in linguistic and quantitative variables. The bank customers are then grouped according to the K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms and the proposed Fuzzy Relation Clustering (FRC) algorithm. The aim of this paper is to show how to choose the best clustering algorithm based on density-based clustering and to present a new clustering algorithm for both crisp and fuzzy variables. Finally, we apply the proposed approach to five datasets of customer segmentation in banks. The results show the accuracy and high performance of FRC compared with other clustering methods.
Gas load forecasting based on optimized fuzzy c-mean clustering analysis of selecting similar days

Directory of Open Access Journals (Sweden)

Qiu Jing

2017-08-01

Full Text Available Traditional fuzzy c-means (FCM) clustering in short-term load forecasting easily falls into a local optimum and is sensitive to the initial cluster centers. In this paper, we propose to use the global search capability of the particle swarm optimization (PSO) algorithm to avoid these shortcomings, and to use the optimized FCM to select days similar to the forecast day as training samples for support vector machines. This not only strengthens the regularity of the training samples but also ensures the consistency of data characteristics. Experimental results show that the prediction accuracy of this model is better than that of BP neural network and support vector machine (SVM) algorithms.
Model selection for semiparametric marginal mean regression accounting for within-cluster subsampling variability and informative cluster size.

Science.gov (United States)

Shen, Chung-Wei; Chen, Yi-Hau

2018-03-13

We propose a model selection criterion for semiparametric marginal mean regression based on generalized estimating equations. The work is motivated by a longitudinal study on the physical frailty outcome in the elderly, where the cluster size, that is, the number of the observed outcomes in each subject, is "informative" in the sense that it is related to the frailty outcome itself. The new proposal, called Resampling Cluster Information Criterion (RCIC), is based on the resampling idea utilized in the within-cluster resampling method (Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134) and accommodates informative cluster size. The implementation of RCIC, however, is free of performing actual resampling of the data and hence is computationally convenient. Compared with the existing model selection methods for marginal mean regression, the RCIC method incorporates an additional component accounting for variability of the model over within-cluster subsampling, and leads to remarkable improvements in selecting the correct model, regardless of whether the cluster size is informative or not. Applying the RCIC method to the longitudinal frailty study, we identify being female, old age, low income and life satisfaction, and chronic health conditions as significant risk factors for physical frailty in the elderly. © 2018, The International Biometric Society.
Effect of co-operative fuzzy c-means clustering on estimates of three ...

Indian Academy of Sciences (India)

infinite isotropic elastic media in concise matrix ... hydrate and free gas accumulation. 2. AVA method ... wave propagation across the boundaries of hori- zontally .... Flow chart showing the sequence of steps in the present scheme of fuzzy c-mean clustering adapted for AVA ... porosity 0.38, OIL API 28.5, brine salinity 0.07, ...
Modified fuzzy c-means applied to a Bragg grating-based spectral imager for material clustering

Science.gov (United States)

Rodríguez, Aida; Nieves, Juan Luis; Valero, Eva; Garrote, Estíbaliz; Hernández-Andrés, Javier; Romero, Javier

2012-01-01

We have modified the Fuzzy C-Means algorithm for an application related to segmentation of hyperspectral images. Classical fuzzy c-means algorithm uses Euclidean distance for computing sample membership to each cluster. We have introduced a different distance metric, Spectral Similarity Value (SSV), in order to have a more convenient similarity measure for reflectance information. SSV distance metric considers both magnitude difference (by the use of Euclidean distance) and spectral shape (by the use of Pearson correlation). Experiments confirmed that the introduction of this metric improves the quality of hyperspectral image segmentation, creating spectrally more dense clusters and increasing the number of correctly classified pixels.
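A distance that combines magnitude and spectral shape, as this abstract describes, could be sketched as below. The formula used, sqrt(d_e² + (1 − r²)²) with d_e the root-mean-square difference between two spectra and r their Pearson correlation, is one commonly cited form of the Spectral Similarity Value and is an assumption here; the abstract does not give the exact expression or any normalization the authors applied.

```python
import numpy as np

def spectral_similarity_value(x, y):
    """Combine RMS magnitude difference with a Pearson-correlation shape term."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d_e = np.sqrt(np.mean((x - y) ** 2))          # magnitude difference
    r = np.corrcoef(x, y)[0, 1]                   # spectral shape agreement
    # Assumes neither spectrum is constant, otherwise r is undefined
    return float(np.sqrt(d_e ** 2 + (1.0 - r ** 2) ** 2))
```

Two spectra with identical shape but different brightness get a nonzero value from the magnitude term alone, while spectra with similar brightness but different shape are separated by the correlation term.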
*K-means and Cluster Models for Cancer Signatures

OpenAIRE

Kakushadze, Zura; Yu, Willie

2017-01-01

We present *K-means clustering algorithm and source code by expanding statistical clustering methods applied in https://ssrn.com/abstract=2802753 to quantitative finance. *K-means is statistically deterministic without specifying initial centers, etc. We apply *K-means to extracting cancer signatures from genome data without using nonnegative matrix factorization (NMF). *K-means’ computational cost is a fraction of NMF’s. Using 1389 published samples for 14 cancer types, we find that 3 cancer...
APPLICATION OF FUZZY C-MEANS CLUSTERING TECHNIQUE IN VEHICULAR POLLUTION

Directory of Open Access Journals (Sweden)

Samarjit Das

2013-07-01

At present, in most urban areas around the world, vehicular pollution has become one of the key contributors to air pollution because of the exponential increase in traffic. As uncertainty prevails in the process of designating the level of pollution of a particular region, a fuzzy method can be applied to see the membership values of that region to a number of predefined clusters. Also, due to the existence of different pollutants in vehicular pollution, the data used to represent it are in the form of numerical vectors. In our work, we shall apply the fuzzy c-means technique of Bezdek on a dataset representing vehicular pollution to obtain the membership values of pollution due to vehicular emission of a city to one or more of some predefined clusters. We shall also try to assess the benefits of adopting a fuzzy approach over the traditional way of determining the level of pollution of the particular region.
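As an illustration of how fuzzy c-means produces membership values, here is a minimal sketch of the standard alternating updates (memberships, then centers) with fuzzifier m. The data, the function names and the choice to pass initial centers explicitly are assumptions made for the example; they are not taken from the paper.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Membership of every point in every cluster; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        d = [math.dist(x, v) for v in centers]
        if 0.0 in d:
            # The point sits exactly on one or more centers: split full membership among them.
            hits = d.count(0.0)
            rows.append([1.0 / hits if dk == 0.0 else 0.0 for dk in d])
        else:
            rows.append([1.0 / sum((di / dk) ** power for dk in d) for di in d])
    return rows

def fcm(points, init_centers, m=2.0, iters=50):
    """Run a fixed number of fuzzy c-means iterations from the given initial centers."""
    centers = [tuple(v) for v in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        u = fcm_memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            # Each center is the mean of all points weighted by membership^m.
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * x[k] for wj, x in zip(w, points)) / total
                for k in range(dim)))
        centers = new_centers
    return centers, fcm_memberships(points, centers, m)

# Hypothetical two-pollutant readings for six locations (illustrative values only).
readings = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
centers, u = fcm(readings, [readings[0], readings[3]])
```

Each row of `u` gives one location's degree of membership in every cluster, which is the quantity the abstract proposes to read instead of a single hard pollution label.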
Determining the number of clusters for kernelized fuzzy C-means algorithms for automatic medical image segmentation

Directory of Open Access Journals (Sweden)

E.A. Zanaty

2012-03-01

In this paper, we determine the suitable validity criterion of kernelized fuzzy C-means and kernelized fuzzy C-means with spatial constraints for automatic segmentation of magnetic resonance imaging (MRI). To that end, the original Euclidean distance in the FCM is replaced by a Gaussian radial basis function classifier (GRBF), and the corresponding algorithms of FCM methods are derived. The derived algorithms are called the kernelized fuzzy C-means (KFCM) and kernelized fuzzy C-means with spatial constraints (SKFCM). These methods are implemented on eighteen indexes as validation to determine whether the indexes are capable of finding the optimal number of clusters. The performance of segmentation is estimated by applying these methods independently on several datasets to establish which method gives good results and with which indexes. Our test spans various indexes covering the classical and the rather more recent indexes that have enjoyed noticeable success in that field. These indexes are evaluated and compared by applying them on various test images, including synthetic images corrupted with noise of varying levels, and simulated volumetric MRI datasets. Comparative analysis is also presented to show whether the validity index indicates the optimal clustering for our datasets.
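To make the replacement of the Euclidean distance concrete, the sketch below shows the kernel-induced squared distance for a Gaussian kernel. Because K(x, x) = 1 for this kernel, the feature-space distance K(x, x) + K(v, v) - 2 K(x, v) reduces to 2 (1 - K(x, v)). The kernel width convention (dividing by sigma squared) and the function names are assumptions; the abstract does not give them.

```python
import math

def gaussian_kernel(x, v, sigma):
    """Gaussian RBF kernel; the exp(-||x - v||^2 / sigma^2) convention is assumed."""
    squared = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared / sigma ** 2)

def kernel_distance_sq(x, v, sigma):
    """Squared distance between x and v in the kernel feature space."""
    # K(x, x) + K(v, v) - 2 K(x, v) with K(x, x) = K(v, v) = 1.
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
```

Unlike the Euclidean distance, this quantity is bounded by 2, so far-away outliers cannot dominate the center updates; that bounded behavior is one usual motivation for kernelized variants.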
Small traveling clusters in attractive and repulsive Hamiltonian mean-field models.

Science.gov (United States)

Barré, Julien; Yamaguchi, Yoshiyuki Y

2009-03-01

Long-lasting small traveling clusters are studied in the Hamiltonian mean-field model by comparing attractive and repulsive interactions. Nonlinear Landau damping theory predicts that a Gaussian momentum distribution on a spatially homogeneous background permits the existence of traveling clusters in the repulsive case, as in plasma systems, but not in the attractive case. Nevertheless, extending the analysis to a two-parameter family of momentum distributions of Fermi-Dirac type, we theoretically predict the existence of traveling clusters in the attractive case; these findings are confirmed by direct N-body numerical simulations. The parameter region with the traveling clusters is much reduced in the attractive case with respect to the repulsive case.
A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering

Science.gov (United States)

Chahine, Firas Safwan

2012-01-01

Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters

Science.gov (United States)

Wang, Zhihao; Yi, Jing

2016-01-01

To address the shortcoming that the fuzzy c-means algorithm (FCM) needs to know the number of clusters in advance, this paper proposed a new self-adaptive method to determine the optimal number of clusters. Firstly, a density-based algorithm was put forward. The algorithm, according to the characteristics of the dataset, automatically determined the possible maximum number of clusters instead of using the empirical rule √n and obtained the optimal initial cluster centroids, mitigating the limitation of FCM that randomly selected cluster centroids lead the convergence result to a local minimum. Secondly, this paper, by introducing a penalty function, proposed a new fuzzy clustering validity index based on fuzzy compactness and separation, which ensured that when the number of clusters verged on that of objects in the dataset, the value of the clustering validity index did not monotonically decrease toward zero, a behavior that would otherwise cost the index its robustness and its ability to decide the optimal number of clusters. Then, based on these studies, a self-adaptive FCM algorithm was put forward to estimate the optimal number of clusters by an iterative trial-and-error process. Finally, experiments were done on the UCI, KDD Cup 1999, and synthetic datasets, which showed that the method not only effectively determined the optimal number of clusters, but also reduced the number of FCM iterations while giving a stable clustering result. PMID:28042291
The effect of mining data k-means clustering toward students profile model drop out potential

Science.gov (United States)

Purba, Windania; Tamba, Saut; Saragih, Jepronel

2018-04-01

High student success and low student failure reflect the quality of a college. One cause of student failure is dropping out. To address this problem, data mining with K-means clustering was applied. The K-means clustering method was used to cluster students with dropout potential. First, the result data were clustered to obtain information on the condition of all students. The resulting model showed that students are at risk of dropping out because of a lack of enthusiasm for learning, unsupportive parents, a lack of confidence, and poor use of time. The K-means clustering results showed that the students most likely to drop out were in Cluster 1, because it had the lowest Credit Total System, Quality Total, and Grade Point Average (GPA) values compared with Clusters 2 and 3.
A simple and fast method to determine the parameters for fuzzy c-means cluster analysis

DEFF Research Database (Denmark)

Schwämmle, Veit; Jensen, Ole Nørregaard

2010-01-01

MOTIVATION: Fuzzy c-means clustering is widely used to identify cluster structures in high-dimensional datasets, such as those obtained in DNA microarray and quantitative proteomics experiments. One of its main limitations is the lack of a computationally fast method to set optimal values...... of algorithm parameters. Wrong parameter values may either lead to the inclusion of purely random fluctuations in the results or ignore potentially important data. The optimal solution has parameter values for which the clustering does not yield any results for a purely random dataset but which detects cluster...... formation with maximum resolution on the edge of randomness. RESULTS: Estimation of the optimal parameter values is achieved by evaluation of the results of the clustering procedure applied to randomized datasets. In this case, the optimal value of the fuzzifier follows common rules that depend only...
Automatic detection of multiple UXO-like targets using magnetic anomaly inversion and self-adaptive fuzzy c-means clustering

Science.gov (United States)

Yin, Gang; Zhang, Yingtang; Fan, Hongbo; Ren, Guoquan; Li, Zhining

2017-12-01

We have developed a method for automatically detecting UXO-like targets based on magnetic anomaly inversion and self-adaptive fuzzy c-means clustering. Magnetic anomaly inversion methods are used to estimate the initial locations of multiple UXO-like sources. Although these initial locations have some errors with respect to the real positions, they form dense clouds around the actual positions of the magnetic sources. Then we use the self-adaptive fuzzy c-means clustering algorithm to cluster these initial locations. The estimated number of cluster centroids represents the number of targets and the cluster centroids are regarded as the locations of magnetic targets. Effectiveness of the method has been demonstrated using synthetic datasets. Computational results show that the proposed method can be applied to the case of several UXO-like targets that are randomly scattered within a confined, shallow subsurface volume. A field test was carried out to test the validity of the proposed method and the experimental results show that the prearranged magnets can be detected unambiguously and located precisely.
Monopole excitations of the 12C nucleus in the cluster model

International Nuclear Information System (INIS)

Mikhelashvili, T.Ya.; Shirokov, A.M.; Smirnov, Yu.F.

1990-01-01

The monopole excitations of the 12C nucleus are studied in the 3α-cluster model. The 3α-continuum is taken into account by means of scattering theory in the harmonic oscillator representation. Only the 'true' three-body scattering is considered. The role of the continuum is essential. Particularly, at excitation energies between 12-25 MeV, instead of a number of sharp resonances, a single smooth resonance on a broad pedestal arises. The pedestal may be easily misinterpreted as a 'background' in experimental studies. (author)
An improved K-means clustering method for cDNA microarray image segmentation.

Science.gov (United States)

Wang, T N; Li, T J; Shao, G F; Wu, S X

2015-07-14

Microarray technology is a powerful tool for human genetic research and other biomedical applications. Numerous improvements to the standard K-means algorithm have been carried out to complete the image segmentation step. However, most of the previous studies classify the image into two clusters. In this paper, we propose a novel K-means algorithm, which first classifies the image into three clusters, and then one of the three clusters is divided as the background region and the other two clusters, as the foreground region. The proposed method was evaluated on six different data sets. The analyses of accuracy, efficiency, expression values, special gene spots, and noise images demonstrate the effectiveness of our method in improving the segmentation quality.
Approximate fuzzy C-means (AFCM) cluster analysis of medical magnetic resonance image (MRI) data

International Nuclear Information System (INIS)

DelaPaz, R.L.; Chang, P.J.; Bernstein, R.; Dave, J.V.

1987-01-01

The authors describe the application of an approximate fuzzy C-means (AFCM) clustering algorithm as a data dimension reduction approach to medical magnetic resonance images (MRI). Image data consisted of one T1-weighted, two T2-weighted, and one T2*-weighted (magnetic susceptibility) image for each cranial study and a matrix of 10 images generated from 10 combinations of TE and TR for each body lymphoma study. All images were obtained with a 1.5 Tesla imaging system (GE Signa). Analyses were performed on over 100 MR image sets with a variety of pathologies. The cluster analysis was operated in an unsupervised mode and computational overhead was minimized by utilizing a table look-up approach without adversely affecting accuracy. Image data were first segmented into 2 coarse clusters, each of which was then subdivided into 16 fine clusters. The final tissue classifications were presented as color-coded anatomically-mapped images and as two and three dimensional displays of cluster center data in selected feature space (minimum spanning tree). Fuzzy cluster analysis appears to be a clinically useful dimension reduction technique which results in improved diagnostic specificity of medical magnetic resonance images

Comparative Analysis of the Fuzzy C-Means and K-Means Algorithms

OpenAIRE

Yohannes, Yohannes

2016-01-01

Clustering is a technique for grouping data based on data similarity. It is widely used in computer science, particularly in image processing, pattern recognition, and data mining. Many algorithms are used for data clustering. The algorithms most commonly used for data clustering are Fuzzy C-Means and K-Means. Fuzzy C-Means is a clustering algorithm in which data are grouped into a data cluster center with...
A Fault Diagnosis Approach for Gas Turbine Exhaust Gas Temperature Based on Fuzzy C-Means Clustering and Support Vector Machine

Directory of Open Access Journals (Sweden)

Zhi-tao Wang

2015-01-01

As an important gas path performance parameter of a gas turbine, exhaust gas temperature (EGT) can represent the thermal health condition of the gas turbine. In order to monitor and diagnose the EGT effectively, a fusion approach based on the fuzzy C-means (FCM) clustering algorithm and the support vector machine (SVM) classification model is proposed in this paper. Considering the distribution characteristics of gas turbine EGT, the FCM clustering algorithm is used to realize clustering analysis and obtain the state pattern, on the basis of which the preclassification of EGT is completed. Then, an SVM multiclassification model is designed to carry out the state pattern recognition and fault diagnosis. As an example, the historical monitoring data of EGT from an industrial gas turbine is analyzed and used to verify the performance of the fusion fault diagnosis approach presented in this paper. The results show that this approach can make full use of the unsupervised feature extraction ability of the FCM clustering algorithm and the sample classification generalization properties of the SVM multiclassification model, which offers an effective way to realize the online condition recognition and fault diagnosis of gas turbine EGT.
Normalization based K means Clustering Algorithm

OpenAIRE

Virmani, Deepali; Taneja, Shweta; Malhotra, Geetika

2015-01-01

K-means is an effective clustering technique used to separate similar data into groups based on initial centroids of clusters. In this paper, a normalization-based K-means clustering algorithm (N-K means) is proposed. The proposed N-K means clustering algorithm applies normalization to the available data prior to clustering and calculates initial centroids based on weights. Experimental results demonstrate the improvement of the proposed N-K means clustering algorithm over existing...
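The abstract does not say which normalization N-K means uses. As a sketch of the general idea of normalizing before clustering, the example below rescales every feature to the range [0, 1] (min-max normalization); that choice and the function name are assumptions.

```python
def min_max_normalize(rows):
    """Rescale every column of a list of numeric tuples to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column carries no information, so it is mapped to 0.0.
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]
```

Without such a step, a feature measured in large units dominates the Euclidean distances that K-means relies on, so cluster assignments mostly reflect that single feature.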
Developing the fuzzy c-means clustering algorithm based on maximum entropy for multitarget tracking in a cluttered environment

Science.gov (United States)

Chen, Xiao; Li, Yaan; Yu, Jing; Li, Yuxing

2018-01-01

For fast and more effective implementation of tracking multiple targets in a cluttered environment, we propose a multiple-target tracking (MTT) algorithm called maximum entropy fuzzy c-means clustering joint probabilistic data association that combines fuzzy c-means clustering and the joint probabilistic data association (PDA) algorithm. The algorithm uses the membership value to express the probability of the target originating from measurement. The membership value is obtained through the fuzzy c-means clustering objective function optimized by the maximum entropy principle. When considering the effect of the public measurement, we use a correction factor to adjust the association probability matrix to estimate the state of the target. As this algorithm avoids confirmation matrix splitting, it can solve the high computational load problem of the joint PDA algorithm. The results of simulations and analysis conducted for tracking neighboring parallel targets and crossing targets in cluttered environments of different densities show that the proposed algorithm can realize MTT quickly and efficiently in a cluttered environment. Further, the performance of the proposed algorithm remains constant with increasing process noise variance. The proposed algorithm has the advantages of efficiency and low computational load, which can ensure optimum performance when tracking multiple targets in a dense cluttered environment.
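In maximum-entropy fuzzy clustering, the memberships generally take an exponential (softmax-like) form in the squared distances. The sketch below shows that general form only; the parameter name `alpha`, its default value, and the use of plain squared Euclidean distance are assumptions, and the paper's tracking-specific distances and correction factor are not reproduced.

```python
import math

def max_entropy_memberships(x, centers, alpha=1.0):
    """Memberships proportional to exp(-alpha * squared distance), summing to 1."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
    # Subtract the smallest distance before exponentiating for numerical stability.
    base = min(d2)
    weights = [math.exp(-alpha * (d - base)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]
```

A larger `alpha` makes the memberships sharper (closer to a hard assignment), while a smaller value spreads membership more evenly across centers.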
Optimal Sizing for Wind/PV/Battery System Using Fuzzy c-Means Clustering with Self-Adapted Cluster Number

Directory of Open Access Journals (Sweden)

Xin Liu

2017-01-01

Integrating wind generation, photovoltaic power, and battery storage to form hybrid power systems has been recognized to be promising in renewable energy development. However, considering the system complexity and uncertainty of renewable energies, such as wind and solar types, it is difficult to obtain practical solutions for these systems. In this paper, optimal sizing for a wind/PV/battery system is realized by trade-offs between technical and economic factors. Firstly, the fuzzy c-means clustering algorithm was modified with self-adapted parameters to extract useful information from historical data. Furthermore, the Markov model is combined to determine the chronological system states of natural resources and load. Finally, a power balance strategy is introduced to guide the optimization process with the genetic algorithm to establish the optimal configuration with minimized cost while guaranteeing reliability and environmental factors. A case of island hybrid power system is analyzed, and the simulation results are compared with the general FCM method and chronological method to validate the effectiveness of the mentioned method.
Android Malware Classification Using K-Means Clustering Algorithm

Science.gov (United States)

Hamid, Isredza Rahmi A.; Syafiqah Khalid, Nur; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Chai Wen, Chuah

2017-08-01

Malware is designed to gain access to or damage a computer system without the user's knowledge. In addition, attackers exploit malware to commit crime or fraud. This paper proposes an Android malware classification approach based on the K-Means clustering algorithm. We evaluate the proposed model in terms of accuracy using machine learning algorithms. Two datasets, Virus Total and Malgenome, were selected to demonstrate the K-Means clustering algorithm. We classify the Android malware into three clusters: ransomware, scareware and goodware. Nine features were considered for each type of dataset: Lock Detected, Text Detected, Text Score, Encryption Detected, Threat, Porn, Law, Copyright and Moneypak. We used IBM SPSS Statistic software for data classification and WEKA tools to evaluate the built cluster. The proposed K-Means clustering algorithm shows promising results with high accuracy when tested using the Random Forest algorithm.
Profiling Local Optima in K-Means Clustering: Developing a Diagnostic Technique

Science.gov (United States)

Steinley, Douglas

2006-01-01

Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying samples sizes, clusters, and dimensions; (d) different multivariate…
Search for 12C+12C clustering in 24Mg ground state

Indian Academy of Sciences (India)

2017-01-04

Jan 4, 2017 ... Finite-range knockout theory predictions were much larger for (12C,212C) reaction, indicating a very small 12C−12C clustering in 24Mg. (g.s.) . Our present results contradict most of the proposed heavy cluster (12C+12C) structure models for the ground state of 24Mg. Keywords. Direct nuclear reactions ...
A Variational Level Set Model Combined with FCMS for Image Clustering Segmentation

Directory of Open Access Journals (Sweden)

Liming Tang

2014-01-01

The fuzzy C means clustering algorithm with spatial constraint (FCMS) is effective for image segmentation. However, it lacks essential smoothing constraints on the cluster boundaries and sufficient robustness to noise. Samson et al. proposed a variational level set model for image clustering segmentation, which can get smooth cluster boundaries and closed cluster regions due to the use of the level set scheme. However, it is very sensitive to noise since it is actually a hard C means clustering model. In this paper, based on Samson’s work, we propose a new variational level set model combined with FCMS for image clustering segmentation. Compared with FCMS clustering, the proposed model can get smooth cluster boundaries and closed cluster regions due to the use of the level set scheme. In addition, a block-based energy is incorporated into the energy functional, which enables the proposed model to be more robust to noise than FCMS clustering and Samson’s model. Some experiments on synthetic and real images are performed to assess the performance of the proposed model. Compared with some classical image segmentation models, the proposed model has a better performance for images contaminated by different noise levels.
K-means Clustering: Lloyd's algorithm

Indian Academy of Sciences (India)

K-means Clustering: Lloyd's algorithm. Refines clusters iteratively. Cluster points using Voronoi partitioning of the centers; Centroids of the clusters determine the new centers. Bad example k = 3, n = 4.
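The two alternating steps named on the slide (assign each point to its nearest center, then move each center to the centroid of its points) can be sketched as follows. The function name and the choice to pass initial centers explicitly are assumptions for illustration.

```python
import math

def lloyd(points, init_centers, max_iter=100):
    """Lloyd's algorithm: alternate Voronoi assignment and centroid update."""
    centers = [tuple(c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the Voronoi cell of its nearest center.
        labels = [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
                  for p in points]
        # Update step: each center moves to the centroid of its assigned points.
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center where it was
        if new_centers == centers:
            break  # fixed point reached: labels and centers agree
        centers = new_centers
    return centers, labels

corners = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = lloyd(corners, [(0.0, 0.0), (10.0, 0.0)])  # splits left pair from right pair
poor = lloyd(corners, [(5.0, 0.0), (5.0, 1.0)])   # stays stuck splitting bottom from top
```

The second call shows that the result depends on initialization: it stops immediately at a fixed point whose clusters are much wider than those of the first call. This is in the spirit of the "bad example" the slide mentions, though not necessarily the same configuration.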
Coexistence of Cluster Structure and Mean-field-type Structure in Medium-weight Nuclei

International Nuclear Information System (INIS)

Taniguchi, Yasutaka; Horiuchi, Hisashi; Kimura, Masaaki

2006-01-01

We have studied the coexistence of cluster structure and mean-field-type structure in 20Ne and 40Ca using Antisymmetrized Molecular Dynamics (AMD) + Generator Coordinate Method (GCM). By energy variation with a new constraint for clustering, we calculate cluster structure wave functions. Superposing cluster structure wave functions and mean-field-type structure wave functions, we found that 8Be-12C, α-36Ar and 12C-28Si cluster structures are important components of the K^π = 0_3^+ band of 20Ne, of the normal deformed band of 40Ca and of the super deformed band of 40Ca, respectively.
Land cover classification using reformed fuzzy C-means

Indian Academy of Sciences (India)

This paper explains the task of land cover classification using reformed fuzzy C means. Clustering is the assignment of objects into groups called clusters so that objects from the same cluster are more similar to each other than objects from different clusters. The most basic attribute for clustering of an image is its luminance ...
Query by example video based on fuzzy c-means initialized by fixed clustering center

Science.gov (United States)

Hou, Sujuan; Zhou, Shangbo; Siddique, Muhammad Abubakar

2012-04-01

Currently, the high complexity of video contents has posed the following major challenges for fast retrieval: (1) efficient similarity measurements, and (2) efficient indexing on the compact representations. A video-retrieval strategy based on fuzzy c-means (FCM) is presented for querying by example. Initially, the query video is segmented and represented by a set of shots, each shot can be represented by a key frame, and then we used video processing techniques to find visual cues to represent the key frame. Next, because the FCM algorithm is sensitive to the initializations, here we initialized the cluster center by the shots of query video so that users could achieve appropriate convergence. After an FCM cluster was initialized by the query video, each shot of query video was considered a benchmark point in the aforesaid cluster, and each shot in the database possessed a class label. The similarity between the shots in the database with the same class label and benchmark point can be transformed into the distance between them. Finally, the similarity between the query video and the video in database was transformed into the number of similar shots. Our experimental results demonstrated the performance of this proposed approach.
Brain vascular image segmentation based on fuzzy local information C-means clustering

Science.gov (United States)

Hu, Chaoen; Liu, Xia; Liang, Xiao; Hui, Hui; Yang, Xin; Tian, Jie

2017-02-01

Light sheet fluorescence microscopy (LSFM) is a powerful optical-resolution fluorescence microscopy technique that enables observation of the mouse brain vascular network at cellular resolution. However, micro-vessel structures exhibit intensity inhomogeneity in LSFM images, which makes extracting line structures difficult. In this work, we developed a vascular image segmentation method that enhances vessel details and should be useful for estimating statistics such as micro-vessel density. Since the eigenvalues of the Hessian matrix and their signs describe different geometric structures in images, which makes it possible to construct a vascular similarity function and enhance line signals, the main idea of our method is to cluster the pixel values of the enhanced image. Our method comprises three steps: 1) calculate the multiscale gradients and the differences between eigenvalues of the Hessian matrix; 2) to generate the enhanced micro-vessel structures, train a feed-forward neural network on 2.26 million pixels to handle the correlations between multiscale gradients and the differences between eigenvalues; 3) use fuzzy local information c-means clustering (FLICM) to cluster the pixel values of the enhanced line signals. To verify the feasibility and effectiveness of this method, mouse brain vascular images were acquired with a commercial light-sheet microscope in our lab. The segmentation experiments showed that the Dice similarity coefficient can reach up to 85%. The results illustrate that our approach to extracting line structures of blood vessels dramatically improves the vascular image and enables accurate extraction of blood vessels in LSFM images.
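For readers unfamiliar with the evaluation measure, the Dice similarity coefficient of two binary masks is twice the size of their overlap divided by the sum of their sizes. The sketch below works on flattened masks; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity of two equal-length binary masks (truthy = vessel pixel)."""
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    size = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size == 0 else 2.0 * overlap / size
```

A value of 1.0 means the segmentation matches the reference exactly, and 0.0 means no overlap at all.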
Fractal dimension to classify the heart sound recordings with KNN and fuzzy c-mean clustering methods

Science.gov (United States)

Juniati, D.; Khotimah, C.; Wardani, D. E. K.; Budayasa, K.

2018-01-01

Heart abnormalities can be detected from heart sounds. A heart sound can be heard directly with a stethoscope or indirectly with a phonocardiograph, a machine that records heart sounds. This paper presents the implementation of fractal dimension theory to classify phonocardiograms into a normal heart sound, a murmur, or an extrasystole. The main algorithm used to calculate the fractal dimension was Higuchi’s algorithm. Classifying the phonocardiograms took two steps: feature extraction and classification. For feature extraction, we used the Discrete Wavelet Transform (DWT) to decompose the heart sound signal into several sub-bands depending on the selected level. After the decomposition process, the signal was processed using the Fast Fourier Transform (FFT) to determine the spectral frequency. The fractal dimension of the FFT output was calculated using Higuchi’s algorithm. The fractal dimensions of all phonocardiograms were classified with the KNN and fuzzy c-mean clustering methods. Based on the research results, the best accuracy obtained was 86.17%, achieved with feature extraction by DWT decomposition at level 3, kmax = 50, 5-fold cross-validation, and 5 neighbors in the K-NN algorithm. Meanwhile, for fuzzy c-mean clustering, the accuracy was 78.56%.
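As a rough illustration of the fractal dimension step, here is a minimal sketch of Higuchi's method: for each delay k up to kmax, average the normalized curve lengths of the k subsampled series, then take the slope of log length against log(1/k). The function name is an assumption, and the DWT and FFT preprocessing described in the abstract is not included.

```python
import math

def higuchi_fd(x, kmax):
    """Higuchi fractal dimension of a 1-D signal; needs 2 <= kmax < len(x)."""
    n = len(x)
    log_inv_k, log_length = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):  # starting offsets 0 .. k-1
            steps = (n - 1 - m) // k
            if steps < 1:
                continue
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Normalize for the number of samples actually used at this delay.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        # A perfectly constant signal has zero length and would make this log fail.
        log_length.append(math.log(sum(lengths) / len(lengths)))
    # Least-squares slope of log L(k) against log(1/k) is the fractal dimension.
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_length) / len(log_length)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_length))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den
```

A smooth signal such as a straight ramp gives a value close to 1, while increasingly irregular signals approach 2; the resulting number is the single feature that the classifiers in the abstract operate on.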
Stability-integrated Fuzzy C means segmentation for spatial ...

Indian Academy of Sciences (India)

V ROYNA DAISY

2018-03-16

Mar 16, 2018 ... clusters and including spatial information to basic Fuzzy C Means clustering are done in .... modify the objective function with Kernel distance function .... spatial information, thus making it sensitive to noise and outliers.
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.

Science.gov (United States)

Schulz, Tizian; Stoye, Jens; Doerr, Daniel

2018-05-08

Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Optimizing Energy Consumption in Vehicular Sensor Networks by Clustering Using Fuzzy C-Means and Fuzzy Subtractive Algorithms

Science.gov (United States)

Ebrahimi, A.; Pahlavani, P.; Masoumi, Z.

2017-09-01

Traffic monitoring and management in urban intelligent transportation systems (ITS) can be carried out based on vehicular sensor networks. In a vehicular sensor network, vehicles equipped with sensors such as GPS can act as mobile sensors for sensing the urban traffic and sending the reports to a traffic monitoring center (TMC) for traffic estimation. The energy consumption of the sensor nodes is a main problem in wireless sensor networks (WSNs); moreover, it is the most important feature in designing these networks. Clustering the sensor nodes is considered an effective solution to reduce the energy consumption of WSNs. Each cluster should have a Cluster Head (CH) and a number of nodes located within its supervision area. The cluster heads are responsible for gathering and aggregating the information of their clusters. They then transmit the information to the data collection center. Hence, the use of clustering decreases the volume of transmitted information and, consequently, reduces the energy consumption of the network. In this paper, the Fuzzy C-Means (FCM) and Fuzzy Subtractive algorithms are employed to cluster sensors and investigate their performance on the energy consumption of sensors. It can be seen that the FCM and Fuzzy Subtractive algorithms reduced the energy consumption of vehicle sensors by up to 90.68% and 92.18%, respectively. Comparing the performance of the algorithms shows a 1.5 percent improvement for the Fuzzy Subtractive algorithm.
The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey

Energy Technology Data Exchange (ETDEWEB)

Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer,; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U.,

2005-03-01

We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers $\approx 2600$ square degrees of sky and ranges in redshift from z = 0.02 to z = 0.17. The mean cluster membership is 36 galaxies (with redshifts) brighter than r = 17.7, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity ($L_r$), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the $\Lambda$CDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is $\simeq 90\%$ complete and 95% pure above $M_{200} = 1 \times 10^{14}\,h^{-1} M_\odot$ and within $0.03 \le z \le 0.12$. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within $0.03 \le z \le 0.12$. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the $L_r$ of a cluster is a more robust estimator of the halo mass ($M_{200}$) than the galaxy line-of-sight velocity dispersion or the richness of the cluster.
A curvature-based weighted fuzzy c-means algorithm for point clouds de-noising

Science.gov (United States)

Cui, Xin; Li, Shipeng; Yan, Xiutian; He, Xinhua

2018-04-01

A novel algorithm is proposed in this paper to remove noise from three-dimensional scattered point clouds and smooth the data without damaging sharp geometric features. A feature-preserving weight is added to the fuzzy c-means algorithm, yielding a curvature-weighted fuzzy c-means clustering algorithm. First, large-scale outliers are removed using the statistics of neighboring points within radius r. Then, the algorithm estimates the curvature of the point cloud data using a conicoid parabolic fitting method and calculates the curvature feature value. Finally, the proposed clustering algorithm is used to calculate the weighted cluster centers, which are regarded as the new points. The experimental results show that this approach handles noise of different scales and intensities in point clouds with high precision while preserving features. It is also robust to different noise models.
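
The first step described above (removing large-scale outliers from the statistics of r-radius neighbors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `remove_radius_outliers` and the `min_neighbors` threshold are assumptions, since the abstract does not say which neighborhood statistic or cut-off is used.

```python
import math

def remove_radius_outliers(points, r, min_neighbors):
    """Keep points that have at least `min_neighbors` other points within radius r.

    `min_neighbors` is a hypothetical threshold chosen for illustration.
    """
    kept = []
    for i, p in enumerate(points):
        # Count neighbors of p (excluding p itself) inside the r-ball.
        count = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= r
        )
        if count >= min_neighbors:
            kept.append(p)
    return kept

cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
filtered = remove_radius_outliers(cloud, r=0.5, min_neighbors=2)
```

The brute-force neighbor search is quadratic in the number of points; a spatial index would be the natural replacement for real point clouds.
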

Choosing the Number of Clusters in K-Means Clustering

Science.gov (United States)

Steinley, Douglas; Brusco, Michael J.

2011-01-01

Steinley (2007) provided a lower bound for the sum-of-squares error criterion function used in K-means clustering. In this article, on the basis of the lower bound, the authors propose a method to distinguish between 1 cluster (i.e., a single distribution) versus more than 1 cluster. Additionally, conditional on indicating there are multiple…
Implementation of KD-Tree K-Means Clustering for Document Clustering [Implementasi KD-Tree K-Means Clustering untuk Klasterisasi Dokumen]

Directory of Open Access Journals (Sweden)

Eric Budiman Gosno

2013-09-01

Full Text Available Document clustering is the process of grouping documents automatically and without supervision. It is a problem frequently encountered in fields such as text mining and information retrieval. Document clustering methods with high accuracy and time efficiency are needed to improve web search engine results and for filtering. One well-known clustering method that has been applied to document clustering is K-Means Clustering. However, K-Means Clustering is sensitive to the choice of initial cluster centers, so a poor choice of initial centers causes K-Means Clustering to become trapped in a local optimum. KD-Tree K-Means Clustering is an improvement on K-Means Clustering. It uses the K-Dimensional Tree data structure and density values when initializing the cluster centers. In this paper, the KD-Tree K-Means Clustering algorithm is implemented for the document clustering problem. On the 20 newsgroup data set, the document clustering produced by KD-Tree K-Means Clustering has a distortion value 3×10^5 lower than the mean distortion of K-Means Clustering, and an NIG value 0.09 better than that of K-Means Clustering.
A Hybrid Method for Image Segmentation Based on Artificial Fish Swarm Algorithm and Fuzzy c-Means Clustering

Directory of Open Access Journals (Sweden)

Li Ma

2015-01-01

Full Text Available Image segmentation plays an important role in medical image processing. Fuzzy c-means (FCM) clustering is one of the popular clustering algorithms for medical image segmentation. However, FCM depends on the initial clustering centers, falls easily into local optima, and is sensitive to noise. To solve these problems, this paper proposes a hybrid artificial fish swarm algorithm (HAFSA). The proposed algorithm combines the artificial fish swarm algorithm (AFSA) with FCM, exploiting the global optimization search and parallel computing ability of AFSA to find a superior result. Meanwhile, the Metropolis criterion and a noise reduction mechanism are introduced into AFSA to enhance the convergence rate and antinoise ability. An artificial grid graph and Magnetic Resonance Imaging (MRI) are used in the experiments, and the experimental results show that the proposed algorithm has stronger antinoise ability and higher precision. A number of evaluation indicators also demonstrate that HAFSA outperforms FCM and suppressed FCM (SFCM).
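
To make the dependence on initial centers concrete, here is a minimal sketch of the standard FCM iteration on one-dimensional data. It is plain textbook FCM, not HAFSA; the function name `fuzzy_c_means`, the fuzzifier default `m=2.0` and the fixed iteration count are assumptions made for illustration.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Standard fuzzy c-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: memberships[i][k] is the degree to which
        # point i belongs to cluster k; each row sums to 1.
        memberships = []
        for x in points:
            d = [abs(x - c) for c in centers]
            if 0.0 in d:
                # The point coincides with one or more centers.
                row = [1.0 if dk == 0.0 else 0.0 for dk in d]
                total = sum(row)
                row = [value / total for value in row]
            else:
                row = [1.0 / sum((dk / dj) ** power for dj in d) for dk in d]
            memberships.append(row)
        # Center update: mean of the points weighted by membership ** m.
        centers = [
            sum((memberships[i][k] ** m) * points[i] for i in range(len(points)))
            / sum(memberships[i][k] ** m for i in range(len(points)))
            for k in range(len(centers))
        ]
    # The memberships returned are those used for the final center update.
    return centers, memberships

data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
final_centers, final_u = fuzzy_c_means(data, centers=[0.0, 6.0])
```

The result depends on the `centers` argument; a global search such as the one the abstract describes is one way to choose that starting point.
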
Segmentation of Brain Tissues from Magnetic Resonance Images Using Adaptively Regularized Kernel-Based Fuzzy C-Means Clustering

Directory of Open Access Journals (Sweden)

Ahmed Elazab

2015-01-01

Full Text Available An adaptively regularized kernel-based fuzzy C-means clustering framework is proposed for segmentation of brain magnetic resonance images. The framework takes the form of three algorithms, in which the local average grayscale is replaced by the grayscale of the average filter, median filter, and devised weighted images, respectively. The algorithms employ the heterogeneity of grayscales in the neighborhood and exploit this measure for local contextual information and replace the standard Euclidean distance with Gaussian radial basis kernel functions. The main advantages are adaptiveness to local context, enhanced robustness to preserve image details, independence of clustering parameters, and decreased computational costs. The algorithms have been validated against both synthetic and clinical magnetic resonance images with different types and levels of noise and compared with 6 recent soft clustering algorithms. Experimental results show that the proposed algorithms are superior in preserving image details and segmentation accuracy while maintaining a low computational complexity.
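
As a worked note on the kernel substitution mentioned above, and assuming the usual kernel-induced distance (the abstract gives neither the exact formula nor the kernel width, so $\sigma$ is a hypothetical parameter), replacing the Euclidean distance with a Gaussian radial basis kernel can be written as:

$$
K(x, v) = \exp\!\left(-\frac{\lVert x - v \rVert^2}{\sigma^2}\right), \qquad
\lVert \phi(x) - \phi(v) \rVert^2 = K(x, x) - 2K(x, v) + K(v, v) = 2\bigl(1 - K(x, v)\bigr).
$$

Because $K(x, x) = 1$ for this kernel, the squared feature-space distance is bounded above by 2, so a single outlying pixel cannot dominate the objective the way it can with the squared Euclidean distance.
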
OPTIMIZING ENERGY CONSUMPTION IN VEHICULAR SENSOR NETWORKS BY CLUSTERING USING FUZZY C-MEANS AND FUZZY SUBTRACTIVE ALGORITHMS

Directory of Open Access Journals (Sweden)

A. Ebrahimi

2017-09-01

Full Text Available Traffic monitoring and management in urban intelligent transportation systems (ITS) can be carried out based on vehicular sensor networks. In a vehicular sensor network, vehicles equipped with sensors such as GPS can act as mobile sensors that sense urban traffic and send reports to a traffic monitoring center (TMC) for traffic estimation. The energy consumption of the sensor nodes is a major problem in wireless sensor networks (WSNs); moreover, it is the most important feature in designing these networks. Clustering the sensor nodes is considered an effective solution for reducing the energy consumption of WSNs. Each cluster should have a Cluster Head (CH) and a number of nodes located within its supervision area. The cluster heads are responsible for gathering and aggregating the information of their clusters. They then transmit the information to the data collection center. Hence, clustering decreases the volume of transmitted information and, consequently, reduces the energy consumption of the network. In this paper, Fuzzy C-Means (FCM) and Fuzzy Subtractive algorithms are employed to cluster sensors, and their effect on the energy consumption of the sensors is investigated. The results show that the FCM and Fuzzy Subtractive algorithms reduced the energy consumption of vehicle sensors by up to 90.68% and 92.18%, respectively. Comparing the two algorithms indicates an improvement of about 1.5 percent for the Fuzzy Subtractive algorithm.
Identification of spatiotemporal nutrient patterns in a coastal bay via an integrated k-means clustering and gravity model.

Science.gov (United States)

Chang, Ni-Bin; Wimberly, Brent; Xuan, Zhemin

2012-03-01

This study presents an integrated k-means clustering and gravity model (IKCGM) for investigating the spatiotemporal patterns of nutrient and associated dissolved oxygen levels in Tampa Bay, Florida. By using a k-means clustering analysis to first partition the nutrient data into a user-specified number of subsets, it is possible to discover the spatiotemporal patterns of nutrient distribution in the bay and capture the inherent linkages of hydrodynamic and biogeochemical features. Such patterns may then be combined with a gravity model to link the nutrient source contribution from each coastal watershed to the generated clusters in the bay to aid in the source proportion analysis for environmental management. The clustering analysis was carried out based on 1 year (2008) water quality data composed of 55 sample stations throughout Tampa Bay collected by the Environmental Protection Commission of Hillsborough County. In addition, hydrological and river water quality data of the same year were acquired from the United States Geological Survey's National Water Information System to support the gravity modeling analysis. The results show that the k-means model with 8 clusters is the optimal choice, in which cluster 2 at Lower Tampa Bay had the minimum values of total nitrogen (TN) concentrations, chlorophyll a (Chl-a) concentrations, and ocean color values in every season as well as the minimum concentration of total phosphorus (TP) in three consecutive seasons in 2008. The datasets indicate that Lower Tampa Bay is an area with limited nutrient input throughout the year. Cluster 5, located in Middle Tampa Bay, displayed elevated TN concentrations, ocean color values, and Chl-a concentrations, suggesting that high values of colored dissolved organic matter are linked with some nutrient sources. The data presented by the gravity modeling analysis indicate that the Alafia River Basin is the major contributor of nutrients in terms of both TP and TN values in all seasons
Search for $^{12}$C+$^{12}$C clustering in $^{24}$Mg ground state

Indian Academy of Sciences (India)

In the backdrop of many models, the heavy cluster structure of the ground state of $^{24}$Mg has been probed experimentally for the first time using the heavy cluster knockout reaction $^{24}$Mg($^{12}$C, 2$^{12}$C)$^{12}$C in the quasifree scattering kinematic domain. In the ($^{12}$C, 2$^{12}$C) reaction, the direct $^{12}$C-knockout cross-section was ...
Comparative Investigation of Guided Fuzzy Clustering and Mean Shift Clustering for Edge Detection in Electrical Resistivity Tomography Images of Mineral Deposits

Science.gov (United States)

Ward, Wil; Wilkinson, Paul; Chambers, Jon; Bai, Li

2014-05-01

Geophysical surveying using electrical resistivity tomography (ERT) can be used as a rapid non-intrusive method to investigate mineral deposits [1]. One of the key challenges with this approach is to find a robust automated method to assess and characterise deposits on the basis of an ERT image. Recent research applying edge detection techniques has yielded a framework that can successfully locate geological interfaces in ERT images using a minimal assumption data clustering technique, the guided fuzzy clustering method (gfcm) [2]. Non-parametric clustering techniques are statistically grounded methods of image segmentation that do not require any assumptions about the distribution of data under investigation. This study is a comparison of two such methods to assess geological structure based on the resistivity images. In addition to gfcm, a method called mean-shift clustering [3] is investigated with comparisons directed at accuracy, computational expense, and degree of user interaction. Neither approach requires the number of clusters as input (a common parameter and often impractical); rather, they are based on a similar theory that data can be clustered based on peaks in the probability density function (pdf) of the data. Each local maximum in these functions represents the modal value of a particular population corresponding to a cluster, and as such the data are assigned based on their relationships to these modal values. The two methods differ in that gfcm approximates the pdf using kernel density estimation and identifies population means, assigning cluster membership probabilities to each resistivity value in the model based on its distance from the distribution averages, whereas in mean-shift clustering the density function is not calculated, but a gradient ascent method creates a vector that leads each datum towards high density distributions iteratively using weighted kernels to calculate locally dense regions. The only parameter needed in both methods
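
The mean-shift idea described above (each datum climbs towards a local density peak using kernel-weighted averages, without ever evaluating the density itself) can be sketched in one dimension. This is a generic illustration, not the study's code: the Gaussian kernel, the `bandwidth` value and the rule for merging converged points are assumptions.

```python
import math

def mean_shift_1d(data, bandwidth, iters=100, tol=1e-6):
    """Shift every datum uphill to a density mode, then group nearby modes."""
    modes = []
    for x in data:
        for _ in range(iters):
            # Gaussian kernel weights of all data relative to the current position.
            weights = [math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in data]
            new_x = sum(w * y for w, y in zip(weights, data)) / sum(weights)
            shift = abs(new_x - x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)
    # Points whose converged positions lie within one bandwidth share a cluster.
    centers, labels = [], []
    for mode in modes:
        for k, center in enumerate(centers):
            if abs(mode - center) < bandwidth:
                labels.append(k)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels

resistivity = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
found_centers, found_labels = mean_shift_1d(resistivity, bandwidth=1.0)
```

Note that the number of clusters is an output here rather than an input, which matches the property the abstract highlights.
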
Single pass kernel k-means clustering method

Indian Academy of Sciences (India)

In unsupervised classification, kernel k-means clustering method has been shown to perform better than conventional k-means clustering method in ... 518501, India; Department of Computer Science and Engineering, Jawaharlal Nehru Technological University, Anantapur College of Engineering, Anantapur 515002, India ...
K-means clustering versus validation measures: a data-distribution perspective.

Science.gov (United States)

Xiong, Hui; Wu, Junjie; Chen, Jian

2009-04-01

K-means is a well-known and widely used partitional clustering method. While there are considerable research efforts to characterize the key features of the K-means clustering algorithm, further investigation is needed to understand how data distributions can have impact on the performance of K-means clustering. To that end, in this paper, we provide a formal and organized study of the effect of skewed data distributions on K-means clustering. Along this line, we first formally illustrate that K-means tends to produce clusters of relatively uniform size, even if input data have varied "true" cluster sizes. In addition, we show that some clustering validation measures, such as the entropy measure, may not capture this uniform effect and provide misleading information on the clustering performance. Viewed in this light, we provide the coefficient of variation (CV) as a necessary criterion to validate the clustering results. Our findings reveal that K-means tends to produce clusters in which the variations of cluster sizes, as measured by CV, are in a range of about 0.3-1.0. Specifically, for data sets with large variation in "true" cluster sizes (e.g., CV > 1.0), K-means reduces variation in resultant cluster sizes to less than 1.0. In contrast, for data sets with small variation in "true" cluster sizes (e.g., CV < 0.3), K-means increases variation in resultant cluster sizes to greater than 0.3. In other words, for the earlier two cases, K-means produces the clustering results which are away from the "true" cluster distributions.
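
The coefficient of variation of cluster sizes used as a validation criterion above is simple to compute from a label vector. The sketch below assumes the sample standard deviation (divisor n - 1); the abstract does not state which variant the authors use, and the function name `cluster_size_cv` is illustrative.

```python
import statistics
from collections import Counter

def cluster_size_cv(labels):
    """Coefficient of variation (std / mean) of the cluster sizes.

    Assumes the sample standard deviation; needs at least two clusters.
    """
    sizes = list(Counter(labels).values())
    return statistics.stdev(sizes) / statistics.mean(sizes)

balanced = [0] * 10 + [1] * 10 + [2] * 10   # sizes 10, 10, 10
skewed = [0] * 1 + [1] * 1 + [2] * 10       # sizes 1, 1, 10
cv_balanced = cluster_size_cv(balanced)
cv_skewed = cluster_size_cv(skewed)
```

Comparing this value for the "true" classes and for the K-means output is one way to see the uniformizing effect the abstract reports.
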
Segmentation of Breast Cancer USG (Ultrasonography) Images Using Fuzzy C-Means Clustering [Segmentasi Citra USG (Ultrasonography) Kanker Payudara Menggunakan Fuzzy C-Means Clustering]

Directory of Open Access Journals (Sweden)

Ri Munarto

2018-01-01

Full Text Available Health is a valuable asset for survival and can be used as a parameter of the quality of human life. Yet some people tend to neglect their health and disregard diseases that may strike them and eventually cause death. The leading disease causing death worldwide is cancer. There are many types of cancer, but breast cancer causes the most deaths each year. In Indonesia, more than 80% of cases are found at an advanced stage, and the incidence is estimated at 12 per 10,000 women. These numbers will grow without measures such as prevention or early diagnosis. The growing number of breast cancer patients is inversely proportional to the percentage of patient complaints about doctors' diagnoses from breast cancer USG (Ultrasonography), which is 20%. The problem is that ultrasound imaging is distorted by speckle noise. The proposed solution makes it easier for doctors to diagnose the presence and shape of breast cancer using USG. Speckle noise in USG images is reduced effectively using SRAD (Speckle Reducing Anisotropic Diffusion). The filtered images are then segmented using Fuzzy C-Means Clustering, with an accuracy of 91.43% on 35 breast cancer USG images.
Efficient privacy preserving K-means clustering in a three-party setting

NARCIS (Netherlands)

Beye, Michael; Erkin, Zekeriya; Lagendijk, Reginald L.

2011-01-01

User clustering is a common operation in online social networks, for example to recommend new friends. In previous work [5], Erkin et al. proposed a privacy-preserving K-means clustering algorithm for the semi-honest model, using homomorphic encryption and multi-party computation. This paper makes
Performance Evaluation of Incremental K-means Clustering Algorithm

OpenAIRE

Chakraborty, Sanjay; Nagwani, N. K.

2014-01-01

The incremental K-means clustering algorithm has already been proposed and analysed in paper [Chakraborty and Nagwani, 2011]. It is an innovative approach that is applicable to periodically incremental environments and to dealing with bulk updates. In this paper, the performance of this incremental K-means clustering algorithm is evaluated using an air pollution database. This paper also describes the comparison on the performance evaluations between existing K-means clustering and i...
Single pass kernel k-means clustering method

Indian Academy of Sciences (India)

paper proposes a simple and faster version of the kernel k-means clustering ... It has been considered as an important tool ... On the other hand, kernel-based clustering methods, like kernel k-means clus- ..... able at the UCI machine learning repository (Murphy 1994). ... All the data sets have only numeric valued features.
Soil data clustering by using K-means and fuzzy K-means algorithm

Directory of Open Access Journals (Sweden)

E. Hot

2016-06-01

Full Text Available A problem of soil clustering based on the chemical characteristics of soil, and proper visual representation of the obtained results, is analysed in the paper. To that aim, K-means and fuzzy K-means algorithms are adapted for soil data clustering. A database of soil characteristics sampled in Montenegro is used for a comparative analysis of implemented algorithms. The procedure of setting proper values for control parameters of fuzzy K-means is illustrated on the used database. In addition, validation of clustering is made through visualisation. Classified soil data are presented on the static Google map and dynamic Open Street Map.
Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster

Science.gov (United States)

Syakur, M. A.; Khotimah, B. K.; Rochman, E. M. S.; Satoto, B. D.

2018-04-01

Clustering is a data mining technique used to analyse large volumes of varied data. It is the process of grouping data into clusters so that the objects in each cluster are as similar as possible to one another and different from the objects in other clusters. SMEs in Indonesia have a variety of customers but no mapping of those customers, so they do not know which customers are loyal and which are not. Customer mapping groups customer profiles to facilitate analysis and policy-making for SMEs in the production of goods, especially batik sales. The researchers combine the K-Means method with the elbow method to make k-means more efficient and effective in processing large amounts of data. K-Means Clustering is a local optimization method that is sensitive to the choice of initial cluster centers, so a poor choice of initial centers leads to high error and poor clustering results. The K-means algorithm also has difficulty determining the best number of clusters, so the elbow method is used to find it. The results show that the elbow method can yield the same number of clusters K for different amounts of data. The number of clusters determined by the elbow method then serves as the default for the characterization process in the case study. K-means produced the best clusters based on SSE values on 500 clusters of batik visitors. The SSE shows a sharp decrease at K = 3, so K = 3 is taken as the cut-off point for the best number of clusters.
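
The elbow procedure described above amounts to running k-means for several values of K and inspecting the sum of squared errors (SSE). The sketch below uses toy one-dimensional data and a deterministic initialisation; both are assumptions for illustration and are unrelated to the batik customer data or to the authors' code.

```python
def kmeans_sse(points, k, iters=100):
    """Run 1-D k-means (Lloyd's algorithm) and return the final SSE."""
    pts = sorted(points)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the starting centers evenly over the sorted data.
    centers = [pts[(2 * j + 1) * len(pts) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            groups[nearest].append(x)
        # Empty groups keep their previous center.
        new_centers = [
            sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min((x - c) ** 2 for c in centers) for x in pts)

toy = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
sse_curve = {k: kmeans_sse(toy, k) for k in range(1, 5)}
```

On this toy data the curve falls steeply up to K = 3 and is nearly flat afterwards, which is the kind of bend the elbow method looks for; choosing the bend automatically would need an extra rule that the abstract does not specify.
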
MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering

Directory of Open Access Journals (Sweden)

Ashlock Daniel

2009-08-01

Full Text Available Abstract. Background: Uncovering subtypes of disease from microarray samples has important clinical implications such as survival time and sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of the high dimensional microarray clusters, which limits their performance. Results: We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs by varying the number of clusters and identifies clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we newly devised the entropy-plot to control the separation of singletons or small clusters. MULTI-K, unlike the simple k-means or other widely used methods, was able to capture clusters with complex and high-dimensional structures accurately. MULTI-K outperformed other methods including a recently developed ensemble clustering algorithm in tests with five simulated and eight real gene-expression data sets. Conclusion: The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors.
MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering.

Science.gov (United States)

Kim, Eun-Youn; Kim, Seon-Young; Ashlock, Daniel; Nam, Dougu

2009-08-22

Uncovering subtypes of disease from microarray samples has important clinical implications such as survival time and sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of the high dimensional microarray clusters, which limits their performance. We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs by varying the number of clusters and identifies clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we newly devised the entropy-plot to control the separation of singletons or small clusters. MULTI-K, unlike the simple k-means or other widely used methods, was able to capture clusters with complex and high-dimensional structures accurately. MULTI-K outperformed other methods including a recently developed ensemble clustering algorithm in tests with five simulated and eight real gene-expression data sets. The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors.
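
The notion of co-membership across several k-means runs can be illustrated with a small sketch. It only computes how often each pair of samples lands in the same cluster over a set of label vectors; it is not the MULTI-K algorithm itself (whose C++ code is available from the authors), and the function name `co_membership` and the example label vectors are hypothetical.

```python
def co_membership(runs):
    """Fraction of runs in which each pair of samples shares a cluster.

    `runs` is a list of label vectors, one per clustering run, all of the
    same length; the runs may use different numbers of clusters.
    """
    n = len(runs[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels in runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0 / len(runs)
    return matrix

# Three hypothetical runs over four samples, with 2, 2 and 3 clusters.
example_runs = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 2, 2]]
co = co_membership(example_runs)
```

Pairs with values near 1 are grouped together regardless of the number of clusters used, which is the robustness signal the abstract refers to.
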
Multivariate spatial condition mapping using subtractive fuzzy cluster means.

Science.gov (United States)

Sabit, Hakilo; Al-Anbuky, Adnan

2014-10-13

Wireless sensor networks are usually deployed for monitoring given physical phenomena taking place in a specific space and over a specific duration of time. The spatio-temporal distribution of these phenomena often correlates to certain physical events. To appropriately characterise these events-phenomena relationships over a given space for a given time frame, we require continuous monitoring of the conditions. WSNs are perfectly suited for these tasks, due to their inherent robustness. This paper presents a subtractive fuzzy cluster means algorithm and its application in data stream mining for wireless sensor systems over a cloud-computing-like architecture, which we call sensor cloud data stream mining. Benchmarking on standard mining algorithms, the k-means and the FCM algorithms, we have demonstrated that the subtractive fuzzy cluster means model can perform high quality distributed data stream mining tasks comparable to centralised data stream mining.
Extended Traffic Crash Modelling through Precision and Response Time Using Fuzzy Clustering Algorithms Compared with Multi-layer Perceptron

Directory of Open Access Journals (Sweden)

Iman Aghayan

2012-11-01

Full Text Available This paper compares two fuzzy clustering algorithms – fuzzy subtractive clustering and fuzzy C-means clustering – to a multi-layer perceptron neural network for their ability to predict the severity of crash injuries and to estimate the response time on the traffic crash data. Four clustering algorithms – hierarchical, K-means, subtractive clustering, and fuzzy C-means clustering – were used to obtain the optimum number of clusters based on the mean silhouette coefficient and R-value before applying the fuzzy clustering algorithms. The best-fit algorithms were selected according to two criteria: precision (root mean square, R-value, mean absolute errors, and sum of square error) and response time (t). The highest R-value was obtained for the multi-layer perceptron (0.89), demonstrating that the multi-layer perceptron had a high precision in traffic crash prediction among the prediction models, and that it was stable even in the presence of outliers and overlapping data. Meanwhile, in comparison with other prediction models, fuzzy subtractive clustering provided the lowest value for response time (0.284 second), 9.28 times faster than the time of multi-layer perceptron, meaning that it could lead to developing an on-line system for processing data from detectors and/or a real-time traffic database. The model can be extended through improvements based on additional data through induction procedure.

Cluster-cluster clustering

International Nuclear Information System (INIS)

Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S. (Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)

1985-01-01

The cluster correlation function $\xi_c(r)$ is compared with the particle correlation function $\xi(r)$ in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, $\xi_c$ and $\xi$ are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of $\xi_c$ increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), $\xi_c$ is steeper than $\xi$, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of $\xi_c$ found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references
Quantum correlated cluster mean-field theory applied to the transverse Ising model.

Science.gov (United States)

Zimmer, F M; Schmidt, M; Maziero, Jonas

2016-06-01

Mean-field theory (MFT) is one of the main available tools for analytical calculations entailed in investigations regarding many-body systems. Recently, there has been a surge of interest in ameliorating this kind of method, mainly with the aim of incorporating geometric and correlation properties of these systems. The correlated cluster MFT (CCMFT) is an improvement that succeeded quite well in doing that for classical spin systems. Nevertheless, even the CCMFT presents some deficiencies when applied to quantum systems. In this article, we address this issue by proposing the quantum CCMFT (QCCMFT), which, in contrast to its former approach, uses general quantum states in its self-consistent mean-field equations. We apply the introduced QCCMFT to the transverse Ising model in honeycomb, square, and simple cubic lattices and obtain fairly good results both for the Curie temperature of thermal phase transition and for the critical field of quantum phase transition. Actually, our results match those obtained via exact solutions, series expansions or Monte Carlo simulations.
Model of cholera dissemination using geographic information systems and fuzzy clustering means: case study, Chabahar, Iran.

Science.gov (United States)

Pezeshki, Z; Tafazzoli-Shadpour, M; Mansourian, A; Eshrati, B; Omidi, E; Nejadqoli, I

2012-10-01

Cholera is spread by drinking water or eating food that is contaminated by bacteria, and is related to climate changes. Several epidemics have occurred in Iran, the most recent of which was in 2005 with 1133 cases and 12 deaths. This study investigated the incidence of cholera over a 10-year period in Chabahar district, a region with one of the highest incidence rates of cholera in Iran. It was a descriptive retrospective study of data on patients with Eltor and NAG cholera reported to the Iranian Centre of Disease Control between 1997 and 2006. Data on the prevalence of cholera were gathered through a surveillance system, and a spatial database was developed using geographic information systems (GIS) to describe the relation of spatial and climate variables to cholera incidence. The fuzzy clustering (fuzzy C-means) method and statistical analysis based on logistic regression were used to develop a model of cholera dissemination. The variables were demographic characteristics, specifications of cholera infection, climate conditions and some geographical parameters. The incidence of cholera was found to be significantly related to higher temperature and humidity, lower precipitation, shorter distance to the eastern border of Iran and local health centres, and longer distance to the district health centre. The fuzzy C-means algorithm showed that clusters were geographically distributed in distinct regions. For planning, managing and monitoring any public health programme, GIS provide ideal platforms for the convergence of disease-specific information, analysis and computation of new data for statistical analysis. Copyright © 2012 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
Current and Future Tests of the Algebraic Cluster Model of 12C

Science.gov (United States)

Gai, Moshe

2017-07-01

A new theoretical approach to clustering in the frame of the Algebraic Cluster Model (ACM) has been developed. It predicts, in 12C, rotation-vibration structure with rotational bands of an oblate equilateral triangular symmetric spinning top with a D3h symmetry characterized by the sequence of states: 0+, 2+, 3-, 4±, 5- with degenerate 4+ and 4- (parity doublet) states. Our newly measured second 2+ state in 12C allows the first study of rotation-vibration structure in 12C. The newly measured 5- and 4- states fit very well the predicted ground state rotational band structure with the predicted sequence of states: 0+, 2+, 3-, 4±, 5- with almost degenerate 4+ and 4- (parity doublet) states. Such a D3h symmetry is characteristic of triatomic molecules, but it is observed in the ground state rotational band of 12C for the first time in a nucleus. We discuss predictions of the ACM of other rotation-vibration bands in 12C such as the (0+) Hoyle band and the (1-) bending mode with prediction of (“missing”) 3- and 4- states that may shed new light on clustering in 12C and light nuclei. In particular, the observation (or non-observation) of the predicted (“missing”) states in the Hoyle band will allow us to determine the geometrical arrangement of the three alpha particles composing the Hoyle state at 7.6542 MeV in 12C. We discuss proposed research programs at the Darmstadt S-DALINAC and at the newly constructed ELI-NP facility near Bucharest to test the predictions of the ACM in isotopes of carbon.
Integrating an artificial intelligence approach with k-means clustering to model groundwater salinity: the case of Gaza coastal aquifer (Palestine)

Science.gov (United States)

Alagha, Jawad S.; Seyam, Mohammed; Md Said, Md Azlin; Mogheir, Yunes

2017-12-01

Artificial intelligence (AI) techniques have increasingly become efficient alternative modeling tools in the water resources field, particularly when the modeled process is influenced by complex and interrelated variables. In this study, two AI techniques, artificial neural networks (ANNs) and support vector machine (SVM), were employed to achieve deeper understanding of the salinization process (represented by chloride concentration) in complex coastal aquifers influenced by various salinity sources. Both models were trained using 11 years of groundwater quality data from 22 municipal wells in Khan Younis Governorate, Gaza, Palestine. Both techniques showed satisfactory prediction performance: the mean absolute percentage error (MAPE) and correlation coefficient (R) for the test data set were, respectively, about 4.5 and 99.8% for the ANNs model, and 4.6 and 99.7% for the SVM model. The performance of the developed models was further noticeably improved by preprocessing the wells data set with a k-means clustering method and then applying the AI techniques separately to each cluster. The models developed with clustered data were associated with higher performance, ease of use and simplicity. They can be employed as an analytical tool to investigate the influence of input variables on coastal aquifer salinity, which is of great importance for understanding salinization processes, leading to more effective water-resources-related planning and decision making.
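The two evaluation measures quoted in this abstract, MAPE and the correlation coefficient R, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code; it assumes R is the Pearson correlation between observed and predicted values, that no observed value is zero, and that neither series is constant. The function names are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)
```

In a cluster-then-model workflow such as the one described, these measures would be computed on the held-out test set of each cluster separately.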
Determination System Of Food Vouchers For the Poor Based On Fuzzy C-Means Method

Science.gov (United States)

Anamisa, D. R.; Yusuf, M.; Syakur, M. A.

2018-01-01

Food vouchers are a government program to tackle poverty in rural communities. The program aims to help poor households obtain enough food and nutrients from carbohydrates. Several factors influence eligibility for a food voucher, such as job, monthly income, taxes, electricity bill, size of house, number of family members, education certificate and amount of rice consumed every week. The distribution of vouchers often runs into problems: vouchers are misdirected, and the selection of recipients is still subjective. Existing decision-making solutions have not resolved this. This research applies the Fuzzy C-Means method to decision making, calculating the change of each partition matrix and of each cluster. The authors hope to contribute by obtaining better results with Fuzzy C-Means than with other methods for this case study. The Fuzzy C-Means method is a clustering method that has an organized and scattered cluster structure with regular patterns on two-dimensional datasets. Each cluster is sorted by the proximity of each data element to the cluster centroid to obtain a ranking. Various trials were conducted for grouping and ranking the proposed recipients of food vouchers based on the quota of each village. In testing, the Fuzzy C-Means method was able to determine the recipients of the food voucher with satisfactory results. The fulfillment rate for food voucher recipients was 80% to 90%, in tests using data from 115 Family Cards in 6 villages. The quality of the result is affected by the number of iterations, set to 20, and the number of clusters, set to 3.
K-Means Cluster Analysis for Grouping Student Ability (ANALISIS CLUSTER K-MEANS DALAM PENGELOMPOKAN KEMAMPUAN MAHASISWA)
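The partition-matrix update and the ranking by proximity to the centroid described in this abstract can be sketched as below. This is a generic textbook Fuzzy C-Means in plain Python, not the authors' implementation; the fuzzifier m = 2, the Euclidean distance, the random initial partition matrix and both function names are assumptions. The defaults of 3 clusters and 20 iterations mirror the settings the abstract reports.

```python
import math
import random

def fuzzy_c_means(points, c=3, m=2.0, iterations=20, seed=0):
    """Return (centers, u), where u[k][i] is the membership of point k in cluster i."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial partition matrix; every row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    for _ in range(iterations):
        # Cluster centers are means weighted by membership raised to the power m.
        centers = []
        for i in range(c):
            weights = [u[k][i] ** m for k in range(n)]
            weight_sum = sum(weights)
            centers.append(tuple(
                sum(weights[k] * points[k][d] for k in range(n)) / weight_sum
                for d in range(dim)))
        # Memberships are recomputed from the relative distances to all centers.
        for k in range(n):
            dists = [max(math.dist(points[k], ctr), 1e-12) for ctr in centers]
            for i in range(c):
                u[k][i] = 1.0 / sum(
                    (dists[i] / dists[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return centers, u

def rank_by_proximity(points, centers, u):
    """Assign each point to its strongest cluster, then sort by distance to the centroid."""
    clusters = {i: [] for i in range(len(centers))}
    for k, point in enumerate(points):
        best = max(range(len(centers)), key=lambda j: u[k][j])
        clusters[best].append((math.dist(point, centers[best]), k))
    return {i: [k for _, k in sorted(members)] for i, members in clusters.items()}
```

Here each point would be a tuple of numeric household attributes; how categorical factors such as job or education are encoded is not stated in the abstract and is left open.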

Directory of Open Access Journals (Sweden)

B. Poerwanto

2016-12-01

Full Text Available Abstract. Cluster Analysis, K-Means Algorithm, Student Classification. This study aims to classify students based on learning outcomes in the Basic Statistics course (DDS), measured by attendance, assignments, the midterm exam (UTS), and the final exam (UAS), for later use in evaluating learning in subjects that require good quantitative analysis skills. This study uses k-means cluster analysis to classify the students into three groups based on learning outcomes. After grouping, there are 3 students in the low category, 27 in the medium category and over 70% in the high category. Keywords: Cluster Analysis, K-Means Algorithm, Student Classification, Universitas Cokroaminoto Palopo
An image segmentation method based on fuzzy C-means clustering and Cuckoo search algorithm

Science.gov (United States)

Wang, Mingwei; Wan, Youchuan; Gao, Xianjun; Ye, Zhiwei; Chen, Maolin

2018-04-01

Image segmentation is a significant step in image analysis and machine vision. Many approaches have been presented on this topic; among them, fuzzy C-means (FCM) clustering is one of the most widely used methods because of its high efficiency and its ability to handle the ambiguity of images. However, the success of FCM cannot be guaranteed because it is easily trapped in a locally optimal solution. Cuckoo search (CS) is a novel evolutionary algorithm, which has been tested on some optimization problems and proved to be highly efficient. Therefore, a new segmentation technique that blends FCM with the CS algorithm is put forward in the paper. Further, the proposed method has been evaluated on several images and compared with other existing FCM techniques such as genetic algorithm (GA) based FCM and particle swarm optimization (PSO) based FCM in terms of fitness value. Experimental results indicate that the proposed method is robust and adaptive and performs better than the other methods considered in the paper.
The Mean and Scatter of the Velocity Dispersion-Optical Richness Relation for MaxBCG Galaxy Clusters

Energy Technology Data Exchange (ETDEWEB)

Becker, M.R.; McKay, T.A.; /Michigan U.; Koester, B.; /Chicago U., Astron. Astrophys. Ctr.; Wechsler, R.H.; /KIPAC, Menlo Park /SLAC /Stanford U., Phys. Dept.; Rozo, E.; /Ohio State U.; Evrard, A.; /Michigan U. /Michigan U., MCTP; Johnston, D.; /Caltech, JPL; Sheldon, E.; /New York U.; Annis, J.; /Fermilab; Lau, E.; /Chicago U., Astron. Astrophys. Ctr.; Nichol, R.; /Portsmouth U., ICG; Miller, C.; /Michigan U.

2007-06-05

The distribution of galaxies in position and velocity around the centers of galaxy clusters encodes important information about cluster mass and structure. Using the maxBCG galaxy cluster catalog identified from imaging data obtained in the Sloan Digital Sky Survey, we study the BCG-galaxy velocity correlation function. By modeling its non-Gaussianity, we measure the mean and scatter in velocity dispersion at fixed richness. The mean velocity dispersion increases from 202 ± 10 km s^-1 for small groups to more than 854 ± 102 km s^-1 for large clusters. We show the scatter to be at most 40.5 ± 3.5%, declining to 14.9 ± 9.4% in the richest bins. We test our methods in the C4 cluster catalog, a spectroscopic cluster catalog produced from the Sloan Digital Sky Survey DR2 spectroscopic sample, and in mock galaxy catalogs constructed from N-body simulations. Our methods are robust, measuring the scatter to well within one-sigma of the true value, and the mean to within 10%, in the mock catalogs. By convolving the scatter in velocity dispersion at fixed richness with the observed richness space density function, we measure the velocity dispersion function of the maxBCG galaxy clusters. Although velocity dispersion and richness do not form a true mass-observable relation, the relationship between velocity dispersion and mass is theoretically well characterized and has low scatter. Thus our results provide a key link between theory and observations up to the velocity bias between dark matter and galaxies.
Number of Clusters and the Quality of Hybrid Predictive Models in Analytical CRM

Directory of Open Access Journals (Sweden)

Łapczyński Mariusz

2014-08-01

Full Text Available Making more accurate marketing decisions by managers requires building effective predictive models. Typically, these models specify the probability of a customer belonging to a particular category, group or segment. The analytical CRM categories refer to customers interested in starting cooperation with the company (acquisition models), customers who purchase additional products (cross- and up-sell models) or customers intending to resign from the cooperation (churn models). When building predictive models, researchers use analytical tools from various disciplines with an emphasis on their best performance. This article attempts to build a hybrid predictive model combining decision trees (C&RT algorithm) and cluster analysis (k-means). During the experiments, five different cluster validity indices and eight datasets were used. The performance of the models was evaluated using popular measures such as accuracy, precision, recall, G-mean, F-measure and lift in the first and in the second decile. The authors tried to find a connection between the number of clusters and the models' quality.
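Several of the performance measures named in this abstract follow directly from a binary confusion matrix, as the sketch below shows. It is an illustration only: it assumes G-mean is the geometric mean of recall (sensitivity) and specificity, which is one common definition the abstract does not confirm, it assumes F-measure means the balanced F1 score, and it assumes every denominator is nonzero. Lift per decile is omitted because it needs ranked scores rather than counts.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Compute common classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    specificity = tn / (tn + fp)        # share of actual negatives that are found
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "g_mean": math.sqrt(recall * specificity),
        "f_measure": 2 * precision * recall / (precision + recall),
    }
```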
Improved R2* liver iron concentration assessment using a novel fuzzy c-mean clustering scheme

International Nuclear Information System (INIS)

Saiviroonporn, Pairash; Viprakasit, Vip; Krittayaphong, Rungroj

2015-01-01

In thalassemia patients, R2* liver iron concentration (LIC) measurement is a common clinical tool for assessing iron overload and for determining necessary chelator dose and evaluating its efficacy. Despite the importance of accurate LIC measurement, existing methods suffer from LIC variability, especially at the severe iron overload range due to inclusion of vessel parts in LIC calculation. In this study, we build upon previous Fuzzy C-Mean (FCM) clustering work to formulate a scheme with superior performance in segmenting vessel pixels from the parenchyma. Our method (MIX-FCM) combines our novel 2D-FCM with the existing 1D-FCM algorithm. This study further assessed possible optimal clustering parameters (OP scheme) and proposed a semi-automatic (SA) scheme for routine clinical application. Segmentation of liver parenchyma and vessels was performed on T2* images and their LIC maps in 196 studies from 147 thalassemia major patients. We used manual segmentation as the reference. 1D-FCM clustering was performed on the acquired image alone and 2D-FCM used both the acquired image and its LIC data. To execute the MIX-FCM method, the best outcome (OP-MIX-FCM) was selected from the aforementioned methods and was compared to the SA-MIX-FCM scheme. We used the percent value of the normalized interquartile range (nIQR) to its median to evaluate the variability of all methods. 2D-FCM clustering is more effective than 1D-FCM clustering at the severe overload range only, but inferior for other ranges (where 1D-FCM provides suitable results). This complementary performance between the two methods allows MIX-FCM to improve results for all ranges. OP-MIX-FCM clustering error was 2.1 ± 2.3 %, compared with 10.3 ± 9.9 % and 7.0 ± 11.9 % from 1D- and 2D-FCM clustering, respectively. SA-MIX-FCM result was comparable to OP-MIX-FCM result, with both schemes showing ability to decrease overall nIQR by approximately 30 %. Our proposed 2D-FCM algorithm is not as superior to 1D-FCM as
Clustering Using Boosted Constrained k-Means Algorithm

Directory of Open Access Journals (Sweden)

Masayuki Okabe

2018-03-01

Full Text Available This article proposes a constrained clustering algorithm, consisting of a constrained k-means algorithm enhanced by the boosting principle, that performs competitively with state-of-the-art methods while requiring less computation time. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learning-based methods. Since it simply adds a function into the data assignment process of the k-means algorithm to check for constraint violations, it often exploits only a small number of constraints. Metric learning-based methods, which exploit constraints to create a new metric for data similarity, have shown promising results, although the methods proposed so far are often slow depending on the amount of data or number of feature dimensions. We present a method that exploits the advantages of the constrained k-means and metric learning approaches. It incorporates a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle into a constrained k-means algorithm. In the framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which works as a weak learner under the boosting principle. Experimental results for 12 data sets from 3 data sources demonstrated that our method performs competitively with state-of-the-art constrained clustering methods for most data sets and that it takes much less computation time. Experimental evaluation demonstrated the effectiveness of controlling the constraint priorities by using the boosting principle and that our constrained k-means algorithm functions correctly as a weak learner of boosting.
Predicting the mean cycle time as a function of throughput and product mix for cluster tool workstations using EPT-based aggregate modeling

NARCIS (Netherlands)

Veeger, C.P.L.; Etman, L.F.P.; Herk, van J.; Rooda, J.E.

2009-01-01

Predicting the mean cycle time as a function of throughput and product mix is helpful in making the production planning for cluster tools. To predict the mean cycle time, detailed simulation models may be used. However, detailed models require much development time, and it may not be possible to
Identify High-Quality Protein Structural Models by Enhanced K-Means.

Science.gov (United States)

Wu, Hongjie; Li, Haiou; Jiang, Min; Chen, Cheng; Lv, Qiang; Wu, Chuang

2017-01-01

Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means.
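The squared-distance seeding that the abstract refers to as K-means++ can be sketched as follows. This is the standard K-means++ initialisation written in plain Python, not the authors' code; it assumes each decoy is represented by a numeric feature tuple compared with Euclidean distance, whereas structure-specific distances such as RMSD between decoys would need a different distance function. The function name is hypothetical.

```python
import random

def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids; far-away points are more likely to be chosen."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from every point to its nearest already-chosen center.
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
              for p in points]
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform choice.
            centers.append(rng.choice(points))
            continue
        # Sample the next center with probability proportional to squared distance.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers
```

The returned centroids would then be handed to an ordinary K-means loop in place of uniformly random starting points.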
Canonical PSO Based K-Means Clustering Approach for Real Datasets.

Science.gov (United States)

Dey, Lopamudra; Chakraborty, Sanjay

2014-01-01

The significance and application of clustering spread over various fields. Clustering is an unsupervised process in data mining, which is why proper evaluation of the results and measurement of the compactness and separability of the clusters are important issues. The procedure of evaluating the results of a clustering algorithm is known as a cluster validity measure. Different types of indices are used to solve different types of problems, and index selection depends on the kind of data available. This paper first proposes a Canonical PSO based K-means clustering algorithm, analyses some important clustering indices (intercluster, intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and hierarchical clustering algorithms. The paper also describes the nature of the clusters and compares the performance of these clustering algorithms according to the validity assessment. It identifies which of these algorithms is most desirable for producing compact clusters on these particular real-life datasets. In short, it deals with the behaviour of these clustering algorithms with respect to validation indices and presents the evaluation results in mathematical and graphical forms.
An extended k-means technique for clustering moving objects

Directory of Open Access Journals (Sweden)

Omnia Ossama

2011-03-01

Full Text Available The k-means algorithm is one of the basic clustering techniques that is used in many data mining applications. In this paper we present a novel pattern-based clustering algorithm that extends the k-means algorithm for clustering moving object trajectory data. The proposed algorithm uses a key feature of moving object trajectories, namely their direction, as a heuristic to determine the number of clusters for the k-means algorithm. In addition, we use the silhouette coefficient as a measure of the quality of our proposed approach. Finally, we present experimental results on both real and synthetic data that show the performance and accuracy of our proposed technique.
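The silhouette coefficient used here as a quality measure can be computed as in the sketch below. It follows the usual definition (mean over points of (b - a) / max(a, b), with a the mean distance to the point's own cluster and b the mean distance to the nearest other cluster) and assumes Euclidean distance on numeric tuples; how the paper measures distance between trajectories is not stated in the abstract. The function name is hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(point, q) for q in other) / len(other)
                for other_label, other in clusters.items() if other_label != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)
```

Values close to 1 indicate compact, well-separated clusters; values below 0 suggest points sit closer to a neighbouring cluster than to their own.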
Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering

Science.gov (United States)

Rizvi, Farheen

2013-01-01

The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show a trade-off between algorithm performance versus computational efficiency as the number of cluster centroids and algorithm iterations are increased.
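The reduction described above, in which an expensive elevation lookup is evaluated once per cluster centroid instead of once per footprint, can be sketched as follows. This is a plain Lloyd-style K-means in Python under stated assumptions: points are (latitude, longitude) tuples compared with Euclidean distance, initial centroids are a random sample of the points, and `elevation_fn` is a hypothetical stand-in for the mission's elevation model, which the abstract does not describe.

```python
import math
import random

def k_means(points, k, max_iter=50, seed=0):
    """Return (centroids, labels) after Lloyd iterations from random initial points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def reduce_resolution(points, k, elevation_fn):
    """Call elevation_fn once per centroid and share the value with the whole cluster."""
    centroids, labels = k_means(points, k)
    centroid_elevation = [elevation_fn(c) for c in centroids]
    return [centroid_elevation[label] for label in labels]
```

With 400 footprint pairs and k = 60, as in the abstract, `elevation_fn` would be called 60 times rather than 400.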
Cluster Correlation in Mixed Models

Science.gov (United States)

Gardini, A.; Bonometto, S. A.; Murante, G.; Yepes, G.

2000-10-01

We evaluate the dependence of the cluster correlation length, rc, on the mean intercluster separation, Dc, for three models with critical matter density, vanishing vacuum energy (Λ=0), and COBE normalization: a tilted cold dark matter (tCDM) model (n=0.8) and two blue mixed models with two light massive neutrinos, yielding Ωh=0.26 and 0.14 (MDM1 and MDM2, respectively). All models approach the observational value of σ8 (and hence the observed cluster abundance) and are consistent with the observed abundance of damped Lyα systems. Mixed models have a motivation in recent results of neutrino physics; they also agree with the observed value of the ratio σ8/σ25, yielding the spectral slope parameter Γ, and nicely fit Las Campanas Redshift Survey (LCRS) reconstructed spectra. We use parallel AP3M simulations, performed in a wide box (of side 360 h^-1 Mpc) and with high mass and distance resolution, enabling us to build artificial samples of clusters, whose total number and mass range allow us to cover the same Dc interval inspected through Automatic Plate Measuring Facility (APM) and Abell cluster clustering data. We find that the tCDM model performs substantially better than n=1 critical density CDM models. Our main finding, however, is that mixed models provide a surprisingly good fit to cluster clustering data.
Multiresolution edge detection using enhanced fuzzy c-means clustering for ultrasound image speckle reduction

Energy Technology Data Exchange (ETDEWEB)

Tsantis, Stavros [Department of Medical Physics, School of Medicine, University of Patras, Rion, GR 26504 (Greece); Spiliopoulos, Stavros; Karnabatidis, Dimitrios [Department of Radiology, School of Medicine, University of Patras, Rion, GR 26504 (Greece); Skouroliakou, Aikaterini [Department of Energy Technology Engineering, Technological Education Institute of Athens, Athens 12210 (Greece); Hazle, John D. [Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030 (United States); Kagadis, George C., E-mail: gkagad@gmail.com, E-mail: George.Kagadis@med.upatras.gr, E-mail: GKagadis@mdanderson.org [Department of Medical Physics, School of Medicine, University of Patras, Rion, GR 26504, Greece and Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030 (United States)

2014-07-15

Purpose: Speckle suppression in ultrasound (US) images of various anatomic structures via a novel speckle noise reduction algorithm. Methods: The proposed algorithm employs an enhanced fuzzy c-means (EFCM) clustering and multiresolution wavelet analysis to distinguish edges from speckle noise in US images. The edge detection procedure involves a coarse-to-fine strategy with spatial and interscale constraints so as to classify wavelet local maxima distribution at different frequency bands. As an outcome, an edge map across scales is derived, whereas the wavelet coefficients that correspond to speckle are suppressed in the inverse wavelet transform, yielding the denoised US image. Results: A total of 34 thyroid, liver, and breast US examinations were performed on a Logiq 9 US system. Each of these images was subjected to the proposed EFCM algorithm and, for comparison, to commercial speckle reduction imaging (SRI) software and another well-known denoising approach, Pizurica's method. The quantification of the speckle suppression performance in the selected set of US images was carried out via Speckle Suppression Index (SSI) with results of 0.61, 0.71, and 0.73 for EFCM, SRI, and Pizurica's methods, respectively. Peak signal-to-noise ratios of 35.12, 33.95, and 29.78 and edge preservation indices of 0.94, 0.93, and 0.86 were found for the EFCM, SRI, and Pizurica's method, respectively, demonstrating that the proposed method achieves superior speckle reduction performance and edge preservation properties. Based on two independent radiologists’ qualitative evaluation, the proposed method significantly improved image characteristics over standard baseline B mode images and those processed with Pizurica's method. Furthermore, it yielded results similar to those for SRI for breast and thyroid images and significantly better results than SRI for liver imaging, thus improving diagnostic accuracy in both superficial and in-depth structures. Conclusions: A
Utility of K-Means clustering algorithm in differentiating apparent diffusion coefficient values between benign and malignant neck pathologies

Science.gov (United States)

Srinivasan, A.; Galbán, C.J.; Johnson, T.D.; Chenevert, T.L.; Ross, B.D.; Mukherji, S.K.

2014-01-01

Purpose: The objective of our study was to analyze the differences between apparent diffusion coefficient (ADC) partitions (created using the K-Means algorithm) between benign and malignant neck lesions and evaluate its benefit in distinguishing these entities. Material and methods: MRI studies of 10 benign and 10 malignant proven neck pathologies were post-processed on a PC using in-house software developed in MATLAB (The MathWorks, Inc., Natick, MA). Lesions were manually contoured by two neuroradiologists with the ADC values within each lesion clustered into two (low ADC-ADCL, high ADC-ADCH) and three partitions (ADCL, intermediate ADC-ADCI, ADCH) using the K-Means clustering algorithm. An unpaired two-tailed Student’s t-test was performed for all metrics to determine statistical differences in the means between the benign and malignant pathologies. Results: A statistically significant difference between the mean ADCL clusters in benign and malignant pathologies was seen in the 3 cluster models of both readers (p=0.03, 0.022 respectively) and the 2 cluster model of reader 2 (p=0.04), with the other metrics (ADCH, ADCI, whole lesion mean ADC) not revealing any significant differences. Receiver operating characteristics curves demonstrated the quantitative difference in mean ADCH and ADCL in both the 2 and 3 cluster models to be predictive of malignancy (2 clusters: p=0.008, area under curve=0.850; 3 clusters: p=0.01, area under curve=0.825). Conclusion: The K-Means clustering algorithm that generates partitions of large datasets may provide a better characterization of neck pathologies and may be of additional benefit in distinguishing benign and malignant neck pathologies compared to whole lesion mean ADC alone. PMID:20007723
Optimasi Pusat Cluster Awal K-Means dengan Algoritma Genetika Pada Pengelompokan Dokumen

OpenAIRE

Fauzi, Muhammad

2017-01-01

147038065 Clustering a data set of documents based on certain data points in the documents is an easy way to organize documents for further work. The K-Means clustering algorithm is an iterative clustering algorithm that partitions a set of entities into K clusters. Unfortunately, the resulting K-Means clusters depend on the initial cluster centers, which are generally assigned randomly. In this research, determining initial K-Means cluster centers for document clustering are investi...
Cluster analysis of polymers using laser-induced breakdown spectroscopy with K-means

Science.gov (United States)

Yangmin, GUO; Yun, TANG; Yu, DU; Shisong, TANG; Lianbo, GUO; Xiangyou, LI; Yongfeng, LU; Xiaoyan, ZENG

2018-06-01

Laser-induced breakdown spectroscopy (LIBS) combined with the K-means algorithm was employed to automatically differentiate industrial polymers under atmospheric conditions. The unsupervised learning algorithm K-means was utilized for clustering the LIBS dataset measured from twenty kinds of industrial polymers. To prevent interference from metallic elements, three atomic emission lines (C I 247.86 nm, H I 656.3 nm, and O I 777.3 nm) and one molecular line C–N (0, 0) 388.3 nm were used. The cluster analysis results were obtained through an iterative process. The Davies–Bouldin index was employed to determine the initial number of clusters. The average relative standard deviation values of characteristic spectral lines were used as the iterative criterion. With the proposed approach, the classification accuracy for twenty kinds of industrial polymers reached 99.6%. The results demonstrated that this approach has great potential for industrial polymer recycling by LIBS.
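The Davies–Bouldin index mentioned above scores a clustering by comparing within-cluster scatter with the separation between centroids; lower values indicate better-separated clusters, so candidate cluster counts can be compared by computing it for each. The sketch below is the textbook definition in plain Python, assuming Euclidean distance and distinct centroids; the feature tuples would be the line intensities named in the abstract, and the function name is hypothetical.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelled clustering (lower is better)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    # Centroid of each cluster and mean distance of its members to that centroid.
    centroids = {label: tuple(sum(axis) / len(members) for axis in zip(*members))
                 for label, members in clusters.items()}
    scatter = {label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
               for label, members in clusters.items()}
    names = list(clusters)
    total = 0.0
    for i in names:
        # Worst-case similarity of cluster i to any other cluster.
        total += max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                     for j in names if j != i)
    return total / len(names)
```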
Cluster-cluster correlations in the two-dimensional stationary Ising-model

International Nuclear Information System (INIS)

Klassmann, A.

1997-01-01

In numerical integration of the Cahn-Hilliard equation, which describes Ostwald ripening in a two-phase matrix, N. Masbaum showed that spatial correlations between clusters scale with respect to the mean cluster size (itself a function of time). T. B. Liverpool showed by Monte Carlo simulations for the Ising model that the analogous correlations have a similar form. Both demonstrated that immediately around each cluster there is some depletion area followed by something like a ring of clusters of the same size as the original one. More precisely, it has been shown that the distribution of clusters around a given cluster looks like a sine curve decaying exponentially with respect to the distance to a constant value.
Segmentation of dermatoscopic images by frequency domain filtering and k-means clustering algorithms.

Science.gov (United States)

Rajab, Maher I

2011-11-01

Since the introduction of epiluminescence microscopy (ELM), image analysis tools have been extended to the field of dermatology, in an attempt to algorithmically reproduce clinical evaluation. Accurate image segmentation of skin lesions is one of the key steps for useful, early and non-invasive diagnosis of cutaneous melanomas. This paper proposes two image segmentation algorithms based on frequency domain processing and k-means clustering/fuzzy k-means clustering. The two methods are capable of segmenting and extracting the true border that reveals the global structure irregularity (indentations and protrusions), which may suggest excessive cell growth or regression of a melanoma. As a pre-processing step, Fourier low-pass filtering is applied to reduce the surrounding noise in a skin lesion image. A quantitative comparison of the techniques is enabled by the use of synthetic skin lesion images that model lesions covered with hair to which Gaussian noise is added. The proposed techniques are also compared with an established optimal-based thresholding skin-segmentation method. It is demonstrated that for lesions with a range of different border irregularity properties, the k-means clustering and fuzzy k-means clustering segmentation methods provide the best performance over a range of signal to noise ratios. The proposed segmentation techniques are also demonstrated to have similar performance when tested on real skin lesions representing high-resolution ELM images. This study suggests that the segmentation results obtained using a combination of low-pass frequency filtering and k-means or fuzzy k-means clustering are superior to the result that would be obtained by using k-means or fuzzy k-means clustering segmentation methods alone. © 2011 John Wiley & Sons A/S.
Nanocomposite metal/plasma polymer films prepared by means of gas aggregation cluster source

Energy Technology Data Exchange (ETDEWEB)

Polonskyi, O.; Solar, P.; Kylian, O.; Drabik, M.; Artemenko, A.; Kousal, J.; Hanus, J.; Pesicka, J.; Matolinova, I. [Charles University in Prague, Faculty of Mathematics and Physics, V Holesovickach 2, 18000 Prague 8 (Czech Republic); Kolibalova, E. [Tescan, Libusina trida 21, 632 00 Brno (Czech Republic); Slavinska, D. [Charles University in Prague, Faculty of Mathematics and Physics, V Holesovickach 2, 18000 Prague 8 (Czech Republic); Biederman, H., E-mail: bieder@kmf.troja.mff.cuni.cz [Charles University in Prague, Faculty of Mathematics and Physics, V Holesovickach 2, 18000 Prague 8 (Czech Republic)

2012-04-02

Nanocomposite metal/plasma polymer films have been prepared by simultaneous plasma polymerization using a mixture of Ar/n-hexane and metal cluster beams. A simple compact cluster gas aggregation source is described and characterized with emphasis on the determination of the amount of charged clusters and their size distribution. It is shown that the fraction of neutral, positively and negatively charged nanoclusters leaving the gas aggregation source is largely influenced by the operational conditions used. In addition, it is demonstrated that a large portion of Ag clusters is positively charged, especially when higher currents are used for their production. Deposition of nanocomposite Ag/C:H plasma polymer films by means of the cluster gas aggregation source is described in detail. Basic characterization of the films is performed using transmission electron microscopy, ultraviolet-visible and Fourier-transform infrared spectroscopies. It is shown that the morphology, structure and optical properties of nanocomposites prepared in this way differ significantly from those fabricated by means of magnetron sputtering of an Ag target in an Ar/n-hexane mixture.
Prediction of settled water turbidity and optimal coagulant dosage in drinking water treatment plant using a hybrid model of k-means clustering and adaptive neuro-fuzzy inference system

Science.gov (United States)

Kim, Chan Moon; Parnichkun, Manukid

2017-11-01

Coagulation is an important process in drinking water treatment to attain acceptable treated water quality. However, the determination of coagulant dosage is still a challenging task for operators, because coagulation is a nonlinear and complicated process. Feedback control to achieve the desired treated water quality is difficult due to the lengthy process time. In this research, a hybrid of k-means clustering and adaptive neuro-fuzzy inference system (k-means-ANFIS) is proposed for the settled water turbidity prediction and the optimal coagulant dosage determination using full-scale historical data. To build a model that adapts well to different process states of the influent water, raw water quality data are classified into four clusters according to their properties by a k-means clustering technique. The sub-models are developed individually on the basis of each clustered data set. Results reveal that the sub-models constructed by a hybrid k-means-ANFIS perform better than not only a single ANFIS model, but also seasonal models by artificial neural network (ANN). The final model consisting of sub-models shows more accurate and consistent prediction ability than a single model of ANFIS and a single model of ANN based on all five evaluation indices. Therefore, the hybrid model of k-means-ANFIS can be employed as a robust tool for managing both treated water quality and production costs simultaneously.
Clustering performance comparison using K-means and expectation maximization algorithms.

Science.gov (United States)

Jung, Yong Gyu; Kang, Min Soo; Heo, Jun

2014-11-14

Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
The global kernel k-means algorithm for clustering in feature space.

Science.gov (United States)

Tzortzis, Grigorios F; Likas, Aristidis C

2009-07-01

Kernel k-means is an extension of the standard k-means clustering algorithm that identifies nonlinearly separable clusters. In order to overcome the cluster initialization problem associated with this method, we propose the global kernel k-means algorithm, a deterministic and incremental approach to kernel-based clustering. Our method adds one cluster at each stage, through a global search procedure consisting of several executions of kernel k-means from suitable initializations. This algorithm does not depend on cluster initialization, identifies nonlinearly separable clusters, and, due to its incremental nature and search procedure, locates near-optimal solutions avoiding poor local minima. Furthermore, two modifications are developed to reduce the computational cost that do not significantly affect the solution quality. The proposed methods are extended to handle weighted data points, which enables their application to graph partitioning. We experiment with several data sets and the proposed approach compares favorably to kernel k-means with random restarts.
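For readers unfamiliar with kernel k-means, the key step is measuring the squared distance between a point and a cluster mean in feature space using kernel evaluations only. The sketch below shows that step and a nearest-cluster assignment; it is not the global kernel k-means procedure of the paper. All function names and the `gamma` default are illustrative assumptions, and points are assumed to be tuples of floats.

```python
import math

def linear_kernel(a, b):
    """Plain dot product; with it, kernel k-means reduces to ordinary k-means."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def kernel_distance_sq(x, cluster, kernel):
    """Squared feature-space distance from x to the mean of `cluster`.

    ||phi(x) - m||^2 = K(x,x) - (2/n) sum_y K(x,y) + (1/n^2) sum_{y,z} K(y,z)
    """
    n = len(cluster)
    cross = sum(kernel(x, y) for y in cluster) / n
    within = sum(kernel(y, z) for y in cluster for z in cluster) / (n * n)
    return kernel(x, x) - 2.0 * cross + within

def assign(x, clusters, kernel):
    """Index of the cluster whose feature-space mean is closest to x."""
    return min(
        range(len(clusters)),
        key=lambda i: kernel_distance_sq(x, clusters[i], kernel),
    )
```

With the linear kernel the distance equals the ordinary squared Euclidean distance to the centroid, which gives a simple sanity check for the formula.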
Utility of the k-means clustering algorithm in differentiating apparent diffusion coefficient values of benign and malignant neck pathologies.

Science.gov (United States)

Srinivasan, A; Galbán, C J; Johnson, T D; Chenevert, T L; Ross, B D; Mukherji, S K

2010-04-01

Does the K-means algorithm do a better job of differentiating benign and malignant neck pathologies compared to only mean ADC? The objective of our study was to analyze the differences between ADC partitions to evaluate whether the K-means technique can be of additional benefit to whole-lesion mean ADC alone in distinguishing benign and malignant neck pathologies. MR imaging studies of 10 benign and 10 malignant proved neck pathologies were postprocessed on a PC by using in-house software developed in Matlab. Two neuroradiologists manually contoured the lesions, with the ADC values within each lesion clustered into 2 (low, ADC-ADC(L); high, ADC-ADC(H)) and 3 partitions (ADC(L); intermediate, ADC-ADC(I); ADC(H)) by using the K-means clustering algorithm. An unpaired 2-tailed Student t test was performed for all metrics to determine statistical differences in the means of the benign and malignant pathologies. A statistically significant difference between the mean ADC(L) clusters in benign and malignant pathologies was seen in the 3-cluster models of both readers (P = .03 and .022, respectively) and the 2-cluster model of reader 2 (P = .04), with the other metrics (ADC(H), ADC(I); whole-lesion mean ADC) not revealing any significant differences. ROC curves demonstrated the quantitative differences in mean ADC(H) and ADC(L) in both the 2- and 3-cluster models to be predictive of malignancy (2 clusters: P = .008, area under curve = 0.850; 3 clusters: P = .01, area under curve = 0.825). The K-means clustering algorithm that generates partitions of large datasets may provide a better characterization of neck pathologies and may be of additional benefit in distinguishing benign and malignant neck pathologies compared with whole-lesion mean ADC alone.
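The study's in-house Matlab software is not described in the abstract. As a minimal sketch of the underlying idea, the code below runs Lloyd's K-means on scalar values and splits them into a low and a high partition. The function `kmeans_1d`, the starting centers, and the sample values are all hypothetical and are not data from the study.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's algorithm on scalars; `centers` holds the initial guesses.

    Returns the final centers and the groups from the last assignment step.
    """
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Made-up voxel values standing in for ADC measurements within one lesion.
adc = [0.6, 0.7, 0.8, 1.6, 1.7, 1.8]
centers, groups = kmeans_1d(adc, centers=[min(adc), max(adc)])
low_mean, high_mean = centers
```

The per-lesion `low_mean` and `high_mean` correspond to the kind of partition means that the abstract compares between benign and malignant groups.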
Worst-case and smoothed analysis of k-means clustering with Bregman divergences

NARCIS (Netherlands)

Manthey, Bodo; Röglin, H.

2013-01-01

The $k$-means method is the method of choice for clustering large-scale data sets and it performs exceedingly well in practice despite its exponential worst-case running-time. To narrow the gap between theory and practice, $k$-means has been studied in the semi-random input model of smoothed
Simultaneous determination of aquifer parameters and zone structures with fuzzy c-means clustering and meta-heuristic harmony search algorithm

Science.gov (United States)

Ayvaz, M. Tamer

2007-11-01

This study proposes an inverse solution algorithm through which both the aquifer parameters and the zone structure of these parameters can be determined based on a given set of observations on piezometric heads. In the zone structure identification problem, the fuzzy c-means (FCM) clustering method is used. The association of the zone structure with the transmissivity distribution is accomplished through an optimization model. The meta-heuristic harmony search (HS) algorithm, which is conceptualized using the musical process of searching for a perfect state of harmony, is used as an optimization technique. The optimum parameter zone structure is identified based on three criteria, which are the residual error, parameter uncertainty, and structure discrimination. A numerical example given in the literature is solved to demonstrate the performance of the proposed algorithm. Also, a sensitivity analysis is performed to test the performance of the HS algorithm for different sets of solution parameters. Results indicate that the proposed solution algorithm is an effective way of simultaneously identifying aquifer parameters and their corresponding zone structures.
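The abstract does not spell out the FCM step. A minimal sketch of the standard fuzzy c-means updates is given below, assuming scalar data for brevity and the common fuzzifier m = 2. The names `fcm_memberships` and `fcm_centers` and the sample values are hypothetical and are not taken from the study.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership u[j][i] of point j in cluster i (standard FCM update)."""
    u = []
    for x in points:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:
            # Point coincides with one or more centers: share full
            # membership among those centers to avoid division by zero.
            row = [1.0 if di == 0.0 else 0.0 for di in d]
        else:
            row = [di ** (-2.0 / (m - 1.0)) for di in d]
        s = sum(row)
        u.append([r / s for r in row])
    return u

def fcm_centers(points, u, m=2.0):
    """Cluster centers as membership-weighted means (weights u ** m)."""
    k = len(u[0])
    return [
        sum((u[j][i] ** m) * x for j, x in enumerate(points))
        / sum(u[j][i] ** m for j in range(len(points)))
        for i in range(k)
    ]

# One update cycle on made-up values with two initial centers.
points = [0.0, 1.0, 9.0, 10.0]
u = fcm_memberships(points, centers=[0.5, 9.5])
new_centers = fcm_centers(points, u)
```

Unlike hard k-means, every point keeps a graded membership in every cluster, and each row of `u` sums to one.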
Measuring the Mean and Scatter of the X-ray Luminosity -- Optical Richness Relation for maxBCG Galaxy Clusters

Energy Technology Data Exchange (ETDEWEB)

Rykoff, E.S.; McKay, T.A.; Becker, M.A.; Evrard, A.; Johnston, D.E.; Koester, B.P.; Rozo, E.; Sheldon, E.S.; Wechsler, Risa H.

2007-10-02

We interpret and model the statistical weak lensing measurements around 130,000 groups and clusters of galaxies in the Sloan Digital Sky Survey presented by Sheldon et al. (2007). We present non-parametric inversions of the 2D shear profiles to the mean 3D cluster density and mass profiles in bins of both optical richness and cluster i-band luminosity. Since the mean cluster density profile is proportional to the cluster-mass correlation function, the mean profile is spherically symmetric by the assumptions of large-scale homogeneity and isotropy. We correct the inferred 3D profiles for systematic effects, including non-linear shear and the fact that cluster halos are not all precisely centered on their brightest galaxies. We also model the measured cluster shear profile as a sum of contributions from the brightest central galaxy, the cluster dark matter halo, and neighboring halos. We infer the relations between mean cluster virial mass and optical richness and luminosity over two orders of magnitude in cluster mass; the virial mass at fixed richness or luminosity is determined with a precision of approximately 13% including both statistical and systematic errors. We also constrain the halo concentration parameter and halo bias as a function of cluster mass; both are in good agreement with predictions from N-body simulations of LCDM models. The methods employed here will be applicable to deeper, wide-area optical surveys that aim to constrain the nature of the dark energy, such as the Dark Energy Survey, the Large Synoptic Survey Telescope and space-based surveys.
Automatic online spike sorting with singular value decomposition and fuzzy C-mean clustering

Directory of Open Access Journals (Sweden)

Oliynyk Andriy

2012-08-01

Full Text Available Abstract Background: Understanding how neurons contribute to perception, motor functions and cognition requires the reliable detection of spiking activity of individual neurons during a number of different experimental conditions. An important problem in computational neuroscience is thus to develop algorithms to automatically detect and sort the spiking activity of individual neurons from extracellular recordings. While many algorithms for spike sorting exist, the problem of accurate and fast online sorting still remains a challenging issue. Results: Here we present a novel software tool, called FSPS (Fuzzy SPike Sorting), which is designed to optimize: (i) fast and accurate detection, (ii) offline sorting and (iii) online classification of neuronal spikes with very limited or null human intervention. The method is based on a combination of Singular Value Decomposition for fast and highly accurate pre-processing of spike shapes, unsupervised Fuzzy C-mean, high-resolution alignment of extracted spike waveforms, optimal selection of the number of features to retain, automatic identification of the number of clusters, and quantitative quality assessment of resulting clusters independent of their size. After being trained on a short testing data stream, the method can reliably perform supervised online classification and monitoring of single neuron activity. The generalized procedure has been implemented in our FSPS spike sorting software (available free for non-commercial academic applications at the address: http://www.spikesorting.com) using LabVIEW (National Instruments, USA). We evaluated the performance of our algorithm both on benchmark simulated datasets with different levels of background noise and on real extracellular recordings from premotor cortex of Macaque monkeys. The results of these tests showed an excellent accuracy in discriminating low-amplitude and overlapping spikes under strong background noise. The performance of our method is
Automatic online spike sorting with singular value decomposition and fuzzy C-mean clustering.

Science.gov (United States)

Oliynyk, Andriy; Bonifazzi, Claudio; Montani, Fernando; Fadiga, Luciano

2012-08-08

Understanding how neurons contribute to perception, motor functions and cognition requires the reliable detection of spiking activity of individual neurons during a number of different experimental conditions. An important problem in computational neuroscience is thus to develop algorithms to automatically detect and sort the spiking activity of individual neurons from extracellular recordings. While many algorithms for spike sorting exist, the problem of accurate and fast online sorting still remains a challenging issue. Here we present a novel software tool, called FSPS (Fuzzy SPike Sorting), which is designed to optimize: (i) fast and accurate detection, (ii) offline sorting and (iii) online classification of neuronal spikes with very limited or null human intervention. The method is based on a combination of Singular Value Decomposition for fast and highly accurate pre-processing of spike shapes, unsupervised Fuzzy C-mean, high-resolution alignment of extracted spike waveforms, optimal selection of the number of features to retain, automatic identification of the number of clusters, and quantitative quality assessment of resulting clusters independent of their size. After being trained on a short testing data stream, the method can reliably perform supervised online classification and monitoring of single neuron activity. The generalized procedure has been implemented in our FSPS spike sorting software (available free for non-commercial academic applications at the address: http://www.spikesorting.com) using LabVIEW (National Instruments, USA). We evaluated the performance of our algorithm both on benchmark simulated datasets with different levels of background noise and on real extracellular recordings from premotor cortex of Macaque monkeys. The results of these tests showed an excellent accuracy in discriminating low-amplitude and overlapping spikes under strong background noise. The performance of our method is competitive with respect to other robust spike
Automatic detection of erythemato-squamous diseases using k-means clustering.

Science.gov (United States)

Ubeyli, Elif Derya; Doğdu, Erdoğan

2010-04-01

A new approach based on the implementation of k-means clustering is presented for automated detection of erythemato-squamous diseases. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. The studied domain contained records of patients with known diagnosis. The k-means clustering algorithm's task was to classify the data points, in this case the patients with attribute data, to one of the five clusters. The algorithm was used to detect the five erythemato-squamous diseases when 33 features defining five disease indications were used. The purpose is to determine an optimum classification scheme for this problem. The present research demonstrated that the features well represent the erythemato-squamous diseases and the k-means clustering algorithm's task achieved high classification accuracies for only five erythemato-squamous diseases.
Linear regression models and k-means clustering for statistical analysis of fNIRS data.

Science.gov (United States)

Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro

2015-02-01

We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.
C-Vine copula mixture model for clustering of residential electrical load pattern data

OpenAIRE

Sun, M; Konstantelos, I; Strbac, G

2016-01-01

The ongoing deployment of residential smart meters in numerous jurisdictions has led to an influx of electricity consumption data. This information presents a valuable opportunity to suppliers for better understanding their customer base and designing more effective tariff structures. In the past, various clustering methods have been proposed for meaningful customer partitioning. This paper presents a novel finite mixture modeling framework based on C-vine copulas (CVMM) for carrying out cons...
A Variable-Selection Heuristic for K-Means Clustering.

Science.gov (United States)

Brusco, Michael J.; Cradit, J. Dennis

2001-01-01

Presents a variable selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. Subjected the heuristic to Monte Carlo testing across more than 2,200 datasets. Results indicate that the heuristic is extremely effective at eliminating masking variables. (SLD)
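The adjusted Rand index mentioned above compares two partitions of the same items and corrects the raw agreement for chance. A minimal sketch of the standard pair-counting formula follows, assuming at least two items. The function name `adjusted_rand_index` is hypothetical and the label lists are made up.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two labelings of the same items (n >= 2)."""
    n = len(labels_a)
    # Contingency counts: items sharing a label pair (a, b).
    pairs = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_ij - expected) / (max_index - expected)

same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])     # relabeled copy
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # no shared pairs
```

A value of 1 means the partitions agree up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate less agreement than expected by chance.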
Developing cluster strategy of apples dodol SMEs by integration K-means clustering and analytical hierarchy process method

Science.gov (United States)

Mustaniroh, S. A.; Effendi, U.; Silalahi, R. L. R.; Sari, T.; Ala, M.

2018-03-01

The purposes of this research were to determine the grouping of apples dodol small and medium enterprises (SMEs) in Batu City and to determine an appropriate development strategy for each cluster. The method used for clustering SMEs was k-means. The Analytical Hierarchy Process (AHP) approach was then applied to determine the development strategy priority for each cluster. The variables used in grouping include production capacity per month, length of operation, investment value, average sales revenue per month, amount of SME assets, and the number of workers. The factors considered in AHP included industry cluster, government, and related and supporting industries. Data were collected using questionnaires and interviews. SME respondents were selected among apples dodol SMEs in Batu City using purposive sampling. The results showed that two clusters were formed from five apples dodol SMEs. The 1st cluster of apples dodol SMEs, classified as small enterprises, included SME A, SME C, and SME D. The 2nd cluster of apples dodol SMEs, classified as medium enterprises, consisted of SME B and SME E. The AHP results indicated that the priority development strategy for the 1st cluster of apples dodol SMEs was improving quality and product standardisation, while for the 2nd cluster it was increasing marketing access.

Cluster-based analysis of multi-model climate ensembles

Science.gov (United States)

Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.

2018-06-01

Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ~20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ~62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and
Merging K-means with hierarchical clustering for identifying general-shaped groups.

Science.gov (United States)

Peterson, Anna D; Ghosh, Arka P; Maitra, Ranjan

2018-01-01

Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and K-means clustering are two approaches but have different strengths and weaknesses. For instance, hierarchical clustering identifies groups in a tree-like structure but suffers from computational complexity in large datasets while K-means clustering is efficient but designed to identify homogeneous spherically-shaped clusters. We present a hybrid non-parametric clustering approach that amalgamates the two methods to identify general-shaped clusters and that can be applied to larger datasets. Specifically, we first partition the dataset into spherical groups using K-means. We next merge these groups using hierarchical methods with a data-driven distance measure as a stopping criterion. Our proposal has the potential to reveal groups with general shapes and structure in a dataset. We demonstrate good performance on several simulated and real datasets.
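The two-stage idea can be illustrated with a small sketch of the second stage: merging K-means groups by single linkage on their centroids. This is not the authors' method. In particular, the fixed `threshold` below is a stand-in assumption for the data-driven stopping criterion described in the abstract, and the function name `merge_groups` and the centroids are hypothetical.

```python
import math

def merge_groups(centroids, threshold):
    """Single-linkage merging of centroids; returns one group id per centroid.

    Repeatedly merges the closest pair of centroids that lie in different
    groups, stopping when no such pair is within `threshold`.
    """
    labels = list(range(len(centroids)))
    while True:
        best = None
        for i in range(len(centroids)):
            for j in range(i + 1, len(centroids)):
                if labels[i] != labels[j]:
                    d = math.dist(centroids[i], centroids[j])
                    if d <= threshold and (best is None or d < best[0]):
                        best = (d, labels[i], labels[j])
        if best is None:
            return labels
        _, keep, drop = best
        # Relabel every member of the dropped group into the kept group.
        labels = [keep if lab == drop else lab for lab in labels]

# Made-up centroids: three along an elongated shape plus one far away.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
merged = merge_groups(centroids, threshold=1.5)
```

Chaining nearby spherical groups in this way is what lets the combined result follow an elongated, non-spherical shape.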
Cost-effectiveness of psychotherapy for cluster C personality disorders: a decision-analytic model in The Netherlands

NARCIS (Netherlands)

Soeteman, D.I.; Verheul, R.; Meerman, A.M.M.A.; Ziegler, U.; Rossum, B.V.; Delimon, J.; Rijnierse, P.; Thunnissen, M.; Busschbach, J.J.V.; Kim, J.J.

2011-01-01

Objective: To conduct a formal economic evaluation of various dosages of psychotherapy for patients with avoidant, dependent, and obsessive-compulsive (ie, cluster C) personality disorders (Structured Interview for DSM-IV Personality criteria). Method: We developed a decision-analytic model to
An improved K-means clustering algorithm in agricultural image segmentation

Science.gov (United States)

Cheng, Huifeng; Peng, Hui; Liu, Shanmei

Image segmentation is the first important step to image analysis and image processing. In this paper, according to color crops image characteristics, we firstly transform the color space of image from RGB to HIS, and then select proper initial clustering center and cluster number in application of mean-variance approach and rough set theory followed by clustering calculation in such a way as to automatically segment color component rapidly and extract target objects from background accurately, which provides a reliable basis for identification, analysis, follow-up calculation and process of crops images. Experimental results demonstrate that improved k-means clustering algorithm is able to reduce the computation amounts and enhance precision and accuracy of clustering.
On the shell model connection of the cluster model

International Nuclear Information System (INIS)

Cseh, J.; Levai, G.; Kato, K.

2000-01-01

Complete text of publication follows. The interrelation of basic nuclear structure models is a longstanding problem. The connection between the spherical shell model and the quadrupole collective model has been studied extensively, and symmetry considerations proved to be especially useful in this respect. A collective band was interpreted in the shell model language long ago as a set of states (of the valence nucleons) with a specific SU(3) symmetry. Furthermore, the energies of these rotational states are obtained to a good approximation as eigenvalues of an SU(3) dynamically symmetric shell model Hamiltonian. On the other hand the relation of the shell model and cluster model is less well explored. The connection of the harmonic oscillator (i.e. SU(3)) bases of the two approaches is known, but it was established only for the unrealistic harmonic oscillator interactions. Here we investigate the question: Can an SU(3) dynamically symmetric interaction provide a similar connection between the spherical shell model and the cluster model, like the one between the shell and collective models? In other words: can the energy of the states of the cluster bands, defined by specific SU(3) symmetries, be obtained from a shell model Hamiltonian (with SU(3) dynamical symmetry)? We carried out calculations within the framework of the semimicroscopic algebraic cluster model, in which not only the cluster model space is obtained from the full shell model space by an SU(3) symmetry-dictated truncation, but SU(3) dynamically symmetric interactions are also applied. Actually, Hamiltonians of this kind proved to be successful in describing the gross features of cluster states in a wide energy range. The novel feature of the present work is that we apply exclusively shell model interactions. The energies obtained from such a Hamiltonian for several bands of the ($^{12}$C, $^{14}$C, $^{16}$O, $^{20}$Ne, $^{40}$Ca) + α systems turn out to be in good agreement with the experimental
On the shell-model-connection of the cluster model

International Nuclear Information System (INIS)

Cseh, J.

2000-01-01

Complete text of publication follows. The interrelation of basic nuclear structure models is a longstanding problem. The connection between the spherical shell model and the quadrupole collective model has been studied extensively, and symmetry considerations proved to be especially useful in this respect. A collective band was interpreted in the shell model language long ago [1] as a set of states (of the valence nucleons) with a specific SU(3) symmetry. Furthermore, the energies of these rotational states are obtained to a good approximation as eigenvalues of an SU(3) dynamically symmetric shell model Hamiltonian. On the other hand the relation of the shell model and cluster model is less well explored. The connection of the harmonic oscillator (i.e. SU(3)) bases of the two approaches is known [2] but it was established only for the unrealistic harmonic oscillator interactions. Here we investigate the question: Can an SU(3) dynamically symmetric interaction provide a similar connection between the spherical shell model and the cluster model, like the one between the shell and collective models? In other words: can the energy of the states of the cluster bands, defined by specific SU(3) symmetries, be obtained from a shell model Hamiltonian (with SU(3) dynamical symmetry)? We carried out calculations within the framework of the semimicroscopic algebraic cluster model [3,4] in order to find an answer to this question, which seems to be affirmative. In particular, the energies obtained from such a Hamiltonian for several bands of the ($^{12}$C, $^{14}$C, $^{16}$O, $^{20}$Ne, $^{40}$Ca) + α systems turn out to be in good agreement with the experimental values. The present results show that the simple and transparent SU(3) connection between the spherical shell model and the cluster model is valid not only for the harmonic oscillator interactions, but for much more general (SU(3) dynamically symmetric) Hamiltonians as well, which result in realistic energy spectra. Via
Fine‐Grained Mobile Application Clustering Model Using Retrofitted Document Embedding

Directory of Open Access Journals (Sweden)

Yeo‐Chan Yoon

2017-08-01

Full Text Available In this paper, we propose a fine‐grained mobile application clustering model using retrofitted document embedding. To automatically determine the clusters and their numbers with no predefined categories, the proposed model initializes the clusters based on title keywords and then merges similar clusters. For improved clustering performance, the proposed model distinguishes between an accurate clustering step with titles and an expansive clustering step with descriptions. During the accurate clustering step, an automatically tagged set is constructed as a result. This set is utilized to learn a high‐performance document vector. During the expansive clustering step, more applications are then classified using this document vector. Experimental results showed that the purity of the proposed model increased by 0.19, and the entropy decreased by 1.18, compared with the K‐means algorithm. In addition, the mean average precision improved by more than 0.09 in a comparison with a support vector machine classifier.
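
The purity and entropy figures quoted above are external cluster-quality measures. As a minimal sketch, assuming the common textbook definitions (majority-label purity and size-weighted base-2 Shannon entropy, which may differ from the exact variants used in the paper), they can be computed as follows. The function name and the toy labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(cluster_ids, true_labels):
    """Return (purity, entropy) of a hard clustering against reference labels."""
    if len(cluster_ids) != len(true_labels) or not cluster_ids:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group the reference labels by the cluster each item was assigned to.
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, true_labels):
        members[cid].append(label)

    n = len(cluster_ids)
    purity = 0.0
    entropy = 0.0
    for labels in members.values():
        counts = Counter(labels)
        # Purity: share of items that carry their cluster's majority label.
        purity += max(counts.values()) / n
        # Entropy: Shannon entropy (base 2) of the labels inside the cluster,
        # weighted by the relative size of the cluster.
        h = -sum((c / len(labels)) * math.log2(c / len(labels))
                 for c in counts.values())
        entropy += (len(labels) / n) * h
    return purity, entropy


# Hypothetical toy data: two clusters of three apps each.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["game", "game", "tool", "tool", "tool", "tool"]
print(purity_and_entropy(clusters, labels))
```

Higher purity and lower entropy both indicate clusters that are more homogeneous with respect to the reference labels.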
Support Vector Data Descriptions and k-Means Clustering: One Class?

Science.gov (United States)

Gornitz, Nico; Lima, Luiz Alberto; Muller, Klaus-Robert; Kloft, Marius; Nakajima, Shinichi

2017-09-27

We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and k-means clustering into a single formulation. This allows both methods to benefit from one another, i.e., by adding flexibility using multiple spheres for SVDDs and increasing anomaly resistance and flexibility through kernels to k-means. In particular, our approach leads to a new interpretation of k-means as a regularized mode seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.
A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.

Science.gov (United States)

Brusco, Michael J; Shireman, Emilie; Steinley, Douglas

2017-09-01

The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
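
One way to see why K-median is a natural fit for dichotomous data: the coordinate-wise median of 0/1 values is simply the majority value, so cluster centres stay binary, whereas K-means centres become proportions. The sketch below illustrates only this centre-update step under that assumption; it is not the simulation procedure of the study, and all names and data are hypothetical.

```python
def hamming(a, b):
    """Manhattan distance between two 0/1 vectors (count of differing entries)."""
    return sum(x != y for x, y in zip(a, b))


def median_center(rows):
    """Coordinate-wise median of 0/1 rows, i.e. the majority value (ties go to 0)."""
    n = len(rows)
    return [1 if 2 * sum(col) > n else 0 for col in zip(*rows)]


def mean_center(rows):
    """Coordinate-wise mean of 0/1 rows, i.e. the proportion of ones."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical cluster of three objects measured on three dichotomous variables.
cluster = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
center = median_center(cluster)
print(center)                                       # binary centre
print(mean_center(cluster))                         # fractional centre
print(sum(hamming(row, center) for row in cluster)) # within-cluster L1 cost
```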
Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

Directory of Open Access Journals (Sweden)

M.F.M. Yunoh

2015-06-01

Full Text Available This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using the wavelet transform, and are then used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, the maximum value of the objective function, 11.58, is seen for two centroid clusters. The minimum objective function value, 8.06, is found for five centroid clusters. The objective function is therefore lowest when the number of clusters equals five, which is taken as the best clustering for the dataset.
Finding reproducible cluster partitions for the k-means algorithm.

Science.gov (United States)

Lisboa, Paulo J G; Etchells, Terence A; Jarman, Ian H; Chambers, Simon J

2013-01-01

K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset.
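
The dependence on initialisation described above can be explored with a small experiment: run k-means from several random starts, record the SSQ of each run, and measure how closely each partition agrees with the lowest-SSQ one. The sketch below uses plain Lloyd iterations and the Rand index as a simple agreement measure; the Rand index is a stand-in chosen here for illustration, not the stability index of the paper, and the synthetic blobs are hypothetical.

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iter=100):
    """Lloyd's algorithm from a random start; returns (assignments, ssq)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as initial centres
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_assign == assign:
            break  # assignments stable: a local minimum was reached
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    ssq = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return assign, ssq


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical synthetic data: three Gaussian blobs of 20 points each.
rng = random.Random(42)
blobs = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
         for cx, cy in [(0, 0), (5, 0), (0, 5)] for _ in range(20)]
runs = [kmeans(blobs, 3, seed) for seed in range(10)]
best_assign, best_ssq = min(runs, key=lambda r: r[1])
for assign, ssq in runs:
    print(round(ssq, 2), round(rand_index(assign, best_assign), 3))
```

Runs with nearly equal SSQ but a Rand index well below 1 would be examples of the situation the abstract warns about.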
Performance Analysis of Entropy Methods on K Means in Clustering Process

Science.gov (United States)

Dicky Syahputra Lubis, Mhd.; Mawengkang, Herman; Suwilo, Saib

2017-12-01

K Means is a non-hierarchical data clustering method that attempts to partition existing data into one or more clusters / groups. This method partitions the data so that data with the same characteristics are grouped into the same cluster and data with different characteristics are grouped into other clusters. The purpose of this data clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize the variation between clusters. However, the main disadvantage of this method is that the number k is often not known beforehand. Furthermore, a randomly chosen starting point may cause two points that lie close together to be selected as two centroids. Therefore, the entropy method is used to determine the starting point in K Means; this method can be used to determine weights and to take a decision from a set of alternatives. Entropy is able to investigate the harmony in discrimination among a multitude of data sets. Under the entropy criterion, the attribute with the highest variation in values receives the highest weight. The entropy method can thus help K Means determine the starting point, which is usually chosen at random. As a result, clustering with K Means converges faster with the help of the entropy method, requiring fewer iterations than the standard K Means process. Using the postoperative patient dataset from the UCI Machine Learning Repository, with only 12 records as a worked example, the entropy method reaches the desired end result in only 2 iterations.
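
The weighting step mentioned above can be illustrated with the standard entropy weight method, in which attributes whose values vary more across records receive larger weights. This is a minimal sketch of that general method only; the abstract does not state how the weights are turned into starting centroids, so that step is left out, and the function name and sample table are hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method: one weight per column of a non-negative table."""
    n = len(rows)
    m = len(rows[0])
    if n < 2:
        raise ValueError("need at least two rows")
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0 or min(column) < 0:
            raise ValueError("columns must be non-negative with a positive sum")
        # Share of each record in the column total.
        shares = [v / total for v in column]
        # Normalised Shannon entropy in [0, 1]; 0 * log(0) is treated as 0.
        e = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Degree of divergence: low entropy means high discriminating power.
        divergences.append(1.0 - e)
    total_div = sum(divergences)
    if total_div == 0:
        return [1.0 / m] * m  # all columns uniform: fall back to equal weights
    return [d / total_div for d in divergences]


# Hypothetical table: three records, two attributes.
table = [[5, 1], [5, 2], [6, 9]]
print(entropy_weights(table))
```

In this toy table the second attribute varies far more than the first, so it receives most of the weight.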
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm.

Science.gov (United States)

Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong

2016-01-01

In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis.
HIV infection and hepatitis C virus genotype 1a are associated with phylogenetic clustering among people with recently acquired hepatitis C virus infection.

Science.gov (United States)

Bartlett, Sofia R; Jacka, Brendan; Bull, Rowena A; Luciani, Fabio; Matthews, Gail V; Lamoury, Francois M J; Hellard, Margaret E; Hajarizadeh, Behzad; Teutsch, Suzy; White, Bethany; Maher, Lisa; Dore, Gregory J; Lloyd, Andrew R; Grebely, Jason; Applegate, Tanya L

2016-01-01

The aim of this study was to identify factors associated with phylogenetic clustering among people with recently acquired hepatitis C virus (HCV) infection. Participants with an available sample at the time of HCV detection were selected from three studies: the Australian Trial in Acute Hepatitis C (ATAHC) and the Hepatitis C Incidence and Transmission Study in prison (HITS-p) and in the community (HITS-c). HCV RNA was extracted and the Core to E2 region of HCV sequenced. Clusters were identified from maximum likelihood trees with 1000 bootstrap replicates using 90% bootstrap and 5% genetic distance threshold. Among 225 participants with available Core-E2 sequence (ATAHC, n=113; HITS-p, n=90; and HITS-c, n=22), HCV genotype prevalence was: G1a: 38% (n=86), G1b: 5% (n=12), G2a: 1% (n=2), G2b: 5% (n=11), G3a: 48% (n=109), G6a: 1% (n=2) and G6l: 1% (n=3). Of participants included in phylogenetic trees, 22% of participants were in a pair/cluster (G1a-35%, 30/85, mean maximum genetic distance=0.031; G3a-11%, 12/106, mean maximum genetic distance=0.021; other genotypes-21%, 6/28, mean maximum genetic distance=0.023). Among HCV/HIV co-infected participants, 50% (18/36) were in a pair/cluster, compared to 16% (30/183) with HCV mono-infection (P=infection [vs. HCV mono-infection; adjusted odds ratio (AOR) 4.24; 95%CI 1.91, 9.39], and HCV G1a infection (vs. other HCV genotypes; AOR 3.33, 95%CI 0.14, 0.61). HCV treatment and prevention strategies, including enhanced antiviral therapy, should be optimised. The impact of targeting of HCV treatment as prevention to populations with higher phylogenetic clustering, such as those with HIV co-infection, could be explored through mathematical modelling. Copyright © 2015 Elsevier B.V. All rights reserved.
Multi-cluster dynamics in CΛ13 and analogy to clustering in 12C

Directory of Open Access Journals (Sweden)

Y. Funaki

2017-10-01

Full Text Available We investigate the structure of CΛ13 and discuss the differences and similarities between the structures of C12 and CΛ13 by asking whether the linear-chain and gaslike cluster states, which are proposed to appear in C12, survive, and whether new structure states appear. We introduce a microscopic cluster model called the Hyper-Tohsaki–Horiuchi–Schuck–Röpke (H-THSR) wave function, which is an extended version of the THSR wave function designed to describe Λ hypernuclei. We obtained two bound states and two resonance (quasi-bound) states for Jπ=0+ in CΛ13, corresponding to the four 0+ states in C12. However, an inversion of level ordering between the spectra of C12 and CΛ13 is shown to occur, i.e. the 03+ and 04+ states in CΛ13 correspond to the 04+ and 03+ states in C12, respectively. The additional Λ particle strongly reduces the sizes of the 02+ and 03+ states in CΛ13, but the shrinkage of the 04+ state is only half that of the other states, in spite of the fact that the attractive Λ-N interaction contracts the nucleus considerably when the Λ particle occupies an S-orbit. In conclusion, the Hoyle state becomes quite a compact object with a BeΛ9+α configuration in CΛ13 and is no longer a gaslike state composed of the 3α clusters. Instead, the 04+ state in CΛ13, coming from the C12(03+) state, appears as a gaslike state composed of an α+α+Λ5He configuration, i.e. the Hoyle analog state. A linear-chain state in a Λ hypernucleus is predicted for the first time to exist as the 03+ state in CΛ13, with a more shrunk arrangement of the 3α clusters along the z-axis than the 3α linear-chain configuration realized in the C12(04+) state. All the excited states are shown to appear around the corresponding cluster-decay threshold, reflecting the threshold rule.
Spatial cluster modelling

CERN Document Server

Lawson, Andrew B

2002-01-01

Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...
Comparison of K-means and fuzzy c-means algorithm performance for automated determination of the arterial input function.

Science.gov (United States)

Yin, Jiandong; Sun, Hongzan; Yang, Jiawen; Guo, Qiyong

2014-01-01

The arterial input function (AIF) plays a crucial role in the quantification of cerebral perfusion parameters. The traditional method for AIF detection is based on manual operation, which is time-consuming and subjective. Two automatic methods have been reported that are based on two frequently used clustering algorithms: fuzzy c-means (FCM) and K-means. However, it is still not clear which is better for AIF detection. Hence, we compared the performance of these two clustering methods using both simulated and clinical data. The results demonstrate that K-means analysis can yield more accurate and robust AIF results, although it takes longer to execute than the FCM method. We consider that this longer execution time is trivial relative to the total time required for image manipulation in a PACS setting, and is acceptable if an ideal AIF is obtained. Therefore, the K-means method is preferable to FCM in AIF detection.
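
For readers comparing the two algorithms, the main difference is that fuzzy c-means assigns each point a graded membership in every cluster rather than a single label. The following is a minimal sketch of the standard FCM iteration (membership-weighted centre update followed by a distance-based membership update, with fuzzifier m); it is a generic illustration, not the AIF detection pipeline of the study, and the parameter defaults and sample data are hypothetical.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, u) where u[i][j] is the membership of point i in cluster j."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(len(points))]
            total = sum(w)
            centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)))
        # Membership update: inversely related to squared distance.
        new_u = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, ctr)) for ctr in centers]
            if min(d2) == 0.0:
                # A point lying exactly on a centre belongs fully to it.
                hits = [1.0 if x == 0.0 else 0.0 for x in d2]
                s = sum(hits)
                new_u.append([h / s for h in hits])
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new_u.append([v / s for v in inv])
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical 1-D data with two well-separated groups.
data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, memberships = fuzzy_c_means(data, c=2)
print(sorted(centers))
print([round(row[0], 3) for row in memberships])
```

Taking the largest membership per point recovers a hard partition comparable to a K-means result.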
What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm.

Science.gov (United States)

Raykov, Yordan P; Boukouvalas, Alexis; Baig, Fahd; Little, Max A

The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
2-Way k-Means as a Model for Microbiome Samples.

Science.gov (United States)

Jackson, Weston J; Agarwal, Ipsita; Pe'er, Itsik

2017-01-01

Motivation: Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project.
Support vector machine and fuzzy C-mean clustering-based comparative evaluation of changes in motor cortex electroencephalogram under chronic alcoholism.

Science.gov (United States)

Kumar, Surendra; Ghosh, Subhojit; Tetarway, Suhash; Sinha, Rakesh Kumar

2015-07-01

In this study, the magnitude and spatial distribution of frequency spectrum in the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. The EEG signals were recorded from subjects with chronic alcoholic conditions (n = 20) and a control group (n = 20). Data were taken from the motor cortex region and divided into five sub-bands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features was reduced by linear discriminant analysis, and the features were classified by support vector machine (SVM) and fuzzy C-mean clustering. The maximum classification accuracy (88 %) with SVM clustering was achieved with the EEG spectral features with absolute power frequency on the F4 channel. Among the bands, relatively higher classification accuracy was found over the theta band and beta-2 band in most of the channels when computed with the EEG features of relative power. Electrode-wise, CZ, C3 and P4 showed more alteration. Considering the good classification accuracy obtained by SVM with relative band power features in most of the EEG channels of the motor cortex, it can be suggested that a noninvasive automated online diagnostic system for the chronic alcoholic condition can be developed with the help of EEG signals.

Cluster dynamics modeling of the effect of high dose irradiation and helium on the microstructure of austenitic stainless steels

Energy Technology Data Exchange (ETDEWEB)

Brimbal, Daniel, E-mail: Daniel.brimbal@areva.com [AREVA NP, Tour AREVA, 1 Place Jean Millier, 92084 Paris La Défense (France); Fournier, Lionel [AREVA NP, Tour AREVA, 1 Place Jean Millier, 92084 Paris La Défense (France); Barbu, Alain [Alain Barbu Consultant, 6 Avenue Pasteur Martin Luther King, 78230 Le Pecq (France)

2016-01-15

A mean field cluster dynamics model has been developed in order to study the effect of high dose irradiation and helium on the microstructural evolution of metals. In this model, self-interstitial clusters, stacking-fault tetrahedra and helium-vacancy clusters are taken into account, in a configuration well adapted to austenitic stainless steels. For small helium-vacancy cluster sizes, the densities of each small cluster are calculated. However, for large sizes, only the mean number of helium atoms per cluster size is calculated. This aspect allows us to calculate the evolution of the microstructural features up to high irradiation doses in a few minutes. It is shown that the presence of stacking-fault tetrahedra notably reduces cavity sizes below 400 °C, but they have little influence on the microstructure above this temperature. The binding energies of vacancies to cavities are calculated using a new method essentially based on ab initio data. It is shown that helium has little effect on the cavity microstructure at 300 °C. However, at higher temperatures, even small helium production rates such as those typical of sodium-fast-reactors induce a notable increase in cavity density compared to an irradiation without helium. - Highlights: • Irradiation of steels with helium is studied through a new cluster dynamics model. • There is only a small effect of helium on cavity distributions in PWR conditions. • An increase in helium production causes an increase in cavity density over 500 °C. • The role of helium is to stabilize cavities via reduced emission of vacancies.
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

International Nuclear Information System (INIS)

Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard

2014-01-01

Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space
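
The step of building a Markov state model from cluster assignments can be sketched as counting transitions between clusters at a fixed lag time and normalising each row. The example below is a simplified illustration that assumes hard cluster labels; it does not implement the fuzzy C-means with transition-based assignment described in the abstract, and the function name and toy trajectory are hypothetical.

```python
def transition_matrix(states, n_states, lag=1):
    """Row-stochastic transition matrix estimated from a discrete trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Count every observed jump from the state at time t to the state at t + lag.
    for a, b in zip(states, states[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that are never left within the trajectory keep an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical trajectory of cluster labels, one label per saved frame.
trajectory = [0, 0, 1, 1, 1, 0, 0, 1]
for row in transition_matrix(trajectory, n_states=2):
    print(row)
```

Increasing `lag` is the usual way to check whether the estimated dynamics are approximately Markovian at longer time scales.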
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

Energy Technology Data Exchange (ETDEWEB)

Nedialkova, Lilia V.; Amat, Miguel A. [Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544 (United States); Kevrekidis, Ioannis G., E-mail: yannis@princeton.edu, E-mail: gerhard.hummer@biophys.mpg.de [Department of Chemical and Biological Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544 (United States); Hummer, Gerhard, E-mail: yannis@princeton.edu, E-mail: gerhard.hummer@biophys.mpg.de [Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438 Frankfurt am Main (Germany)

2014-09-21

Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.
Robust K-Median and K-Means Clustering Algorithms for Incomplete Data

Directory of Open Access Journals (Sweden)

Jinhua Li

2016-01-01

Full Text Available Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
Charge form factors and alpha-cluster internal structure of 12C

International Nuclear Information System (INIS)

Luk'yanov, V.K.; Zemlyanaya, E.V.; Kadrev, D.N.; Antonov, A.N.; Spasova, K.; Anagnostatos, G.S.; Ginis, P.; Giapitzakis, J.

1999-01-01

The transition densities and form factors of 0+, 2+, and 3- states in 12C are calculated in the alpha-cluster model using the triangle frame with clusters in the vertices. The wave functions of nucleons in the alpha clusters are taken as they were obtained in the framework of the models used for the description of the 4He form factor and momentum distribution, which are based on the one-body density matrix construction. They contain effects of the short-range NN correlations, as well as the d-shell admixtures in 4He. Calculations and the comparison with the experimental data show that visible effects on the form and magnitude of the 12C form factors take place, especially at relatively large momentum transfers.
Fuzzy Modeled K-Cluster Quality Mining of Hidden Knowledge for Decision Support

OpenAIRE

S. Parkash Kumar; K. S. Ramaswami

2011-01-01

Problem statement: This work presents Fuzzy Modeled K-means Cluster Quality Mining of hidden knowledge for decision support. The cluster quality metrics are measured based on the number of clusters, the number of objects in each cluster and its cohesiveness, and the precision and recall values. The fuzzy k-means approach is adapted using a heuristic method that iterates over the clusters to form efficient, valid clusters. With the obtained data clusters, quality assessment is made by predictive mining using...
Effects of cluster-shell competition and BCS-like pairing in 12C

Science.gov (United States)

Matsuno, H.; Itagaki, N.

2017-12-01

The antisymmetrized quasi-cluster model (AQCM) was proposed to describe α-cluster and jj-coupling shell models on the same footing. In this model, the cluster-shell transition is characterized by two parameters, R representing the distance between α clusters and Λ describing the breaking of α clusters, and the contribution of the spin-orbit interaction, very important in the jj-coupling shell model, can be taken into account starting with the α-cluster model wave function. Not only the closure configurations of the major shells but also the subclosure configurations of the jj-coupling shell model can be described starting with the α-cluster model wave functions; however, the particle-hole excitations of single particles have not been fully established yet. In this study we show that the framework of AQCM can be extended even to the states with the character of single-particle excitations. For ^{12}C, two-particle-two-hole (2p2h) excitations from the subclosure configuration of 0p_{3/2} corresponding to a BCS-like pairing are described, and these shell model states are coupled with the three α-cluster model wave functions. The correlation energy from the optimal configuration can be estimated not only in the cluster part but also in the shell model part. We try to pave the way to establish a generalized description of the nuclear structure.
4C radio sources in clusters of galaxies

International Nuclear Information System (INIS)

McHardy, I.M.

1979-01-01

Observations of a complete sample of 4C and 4CT radio sources in Abell clusters with the Cambridge One-Mile telescope are analysed. It is concluded that radio sources are strongly concentrated towards the cluster centres and are equally likely to be found in clusters of any richness. The probability of a galaxy of a given absolute magnitude producing a source above a given luminosity does not depend on cluster membership. 4C and 4CT radio sources in clusters, selected at 178 MHz, occur preferentially in Bautz-Morgan (BM) class I clusters, whereas those selected at 1.4 GHz do not. The most powerful radio source in the cluster is almost always associated with the optically brightest galaxy. The average spectrum of 4C sources in the range 408 to 1407 MHz is steeper in BM class I than in other classes. Spectra also steepen with cluster richness. The morphology of 4C sources in clusters depends strongly on BM class and, in particular, radio-trail sources occur only in BM classes II, II-III and III. (author)
A New Soft Computing Method for K-Harmonic Means Clustering.

Science.gov (United States)

Yeh, Wei-Chang; Jiang, Yunzhi; Chen, Yee-Fen; Chen, Zhe

2016-01-01

The K-harmonic means clustering algorithm (KHM) is a new clustering method used to group data such that the sum of the harmonic averages of the distances between each entity and all cluster centroids is minimized. Because it is less sensitive to initialization than K-means (KM), many researchers have recently been attracted to studying KHM. In this study, the proposed iSSO-KHM is based on an improved simplified swarm optimization (iSSO) and integrates a variable neighborhood search (VNS) for KHM clustering. As evidence of the utility of the proposed iSSO-KHM, we present extensive computational results on eight benchmark problems. From the computational results, the comparison appears to support the superiority of the proposed iSSO-KHM over previously developed algorithms for all experiments in the literature.
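The objective described in the abstract above (a sum of harmonic averages of the distances from each entity to all centroids) can be sketched in a few lines. This is only a minimal illustration, not the iSSO-KHM method itself: the function name `khm_objective`, the distance power `p`, and the small floor `eps` that guards against division by zero are assumptions introduced here.

```python
import math


def khm_objective(points, centroids, p=1.0, eps=1e-12):
    """Sum, over all points, of the harmonic average of d(x, c_j)**p over centroids.

    p and eps are illustrative assumptions: p=1 uses plain distances as in the
    abstract's wording, and eps floors the powered distance to avoid dividing by zero.
    """
    k = len(centroids)
    total = 0.0
    for x in points:
        # Harmonic average of k values d_1..d_k is k / sum(1 / d_j).
        inverse_sum = 0.0
        for c in centroids:
            powered = math.dist(x, c) ** p
            inverse_sum += 1.0 / max(powered, eps)
        total += k / inverse_sum
    return total


if __name__ == "__main__":
    data = [(1.0, 0.0), (2.5, 0.0)]
    centers = [(0.0, 0.0), (3.0, 0.0)]
    print(khm_objective(data, centers))
```

Because every centroid contributes to every point's term, a point that is close to at least one centroid adds little to the total, which is one way to read the abstract's remark about lower sensitivity to initialization.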
Complex time series analysis of PM10 and PM2.5 for a coastal site using artificial neural network modelling and k-means clustering

Science.gov (United States)

Elangasinghe, M. A.; Singhal, N.; Dirks, K. N.; Salmond, J. A.; Samarasinghe, S.

2014-09-01

This paper uses artificial neural networks (ANN), combined with k-means clustering, to understand the complex time series of PM10 and PM2.5 concentrations at a coastal location of New Zealand based on data from a single site. Out of available meteorological parameters from the network (wind speed, wind direction, solar radiation, temperature, relative humidity), key factors governing the pattern of the time series concentrations were identified through input sensitivity analysis performed on the trained neural network model. The transport pathways of particulate matter under these key meteorological parameters were further analysed through bivariate concentration polar plots and k-means clustering techniques. The analysis shows that the external sources such as marine aerosols and local sources such as traffic and biomass burning contribute equally to the particulate matter concentrations at the study site. These results are in agreement with the results of receptor modelling by the Auckland Council based on Positive Matrix Factorization (PMF). Our findings also show that contrasting concentration-wind speed relationships exist between marine aerosols and local traffic sources resulting in very noisy and seemingly large random PM10 concentrations. The inclusion of cluster rankings as an input parameter to the ANN model showed a statistically significant (p advanced air dispersion models.
Nuclear clustering - a cluster core model study

International Nuclear Information System (INIS)

Paul Selvi, G.; Nandhini, N.; Balasubramaniam, M.

2015-01-01

Nuclear clustering, similar to other clustering phenomena in nature, is a much warranted study, since it would help us in understanding the nature of binding of the nucleons inside the nucleus, closed shell behaviour when the system is highly deformed, and dynamics and structure at extremes. Several models account for the clustering phenomenon of nuclei. We present in this work a cluster core model study of nuclear clustering in light mass nuclei.
Security and Correctness Analysis on Privacy-Preserving k-Means Clustering Schemes

Science.gov (United States)

Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi

Due to the fast development of the Internet and related IT technologies, it has become increasingly easy to access large amounts of data. k-means clustering is a powerful and frequently used technique in data mining. Many research papers on privacy-preserving k-means clustering have been published. In this paper, we analyze the existing privacy-preserving k-means clustering schemes based on cryptographic techniques. We show that those schemes cause privacy breaches and cannot output correct results due to faults in the protocol construction. Furthermore, we analyze our proposal as an option to address these problems, although it still leaks intermediate information during the computation.
Clustering Educational Digital Library Usage Data: A Comparison of Latent Class Analysis and K-Means Algorithms

Science.gov (United States)

Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei

2013-01-01

This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…
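CLUSTERING PENENTUAN POTENSI KEJAHATAN DAERAH DI KOTA BANJARBARU DENGAN METODE K-MEANS [Clustering to Determine Regional Crime Potential in the City of Banjarbaru Using the K-Means Method]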

Directory of Open Access Journals (Sweden)

Sri Rahayu

2016-09-01

Abstract: Within the police, the data held in the database can be used to produce crime reports, to anticipate future crimes, and so on. Because so much data is stored, data mining can process it to find knowledge that is useful to the police. One well-known data mining technique is clustering. The purpose of grouping (clustering) data can be divided into two: grouping for understanding and grouping for use. K-Means is one of the simplest and most common clustering methods. It is a non-hierarchical (partitioning) method that seeks to partition the existing data into two or more groups, so that data with the same characteristics are put into the same group and data with different characteristics are put into other groups. The purpose of this grouping is to minimize the objective function set in the grouping process, which generally seeks to minimize the variation within a group and maximize the variation between groups. The data mined to determine the clustering of potential crime areas are the crime data for Banjarbaru held by the Banjarbaru city police. This study therefore aims to examine the stages of the clustering technique and to build a clustering that determines potential crime areas in the city of Banjarbaru. Keywords: Clustering, Data mining, K-Means, K-Means Clustering
Distributed k-Means Algorithm and Fuzzy c-Means Algorithm for Sensor Networks Based on Multiagent Consensus Theory.

Science.gov (United States)

Qin, Jiahu; Fu, Weiming; Gao, Huijun; Zheng, Wei Xing

2016-03-03

This paper is concerned with developing a distributed k-means algorithm and a distributed fuzzy c-means algorithm for wireless sensor networks (WSNs) where each node is equipped with sensors. The underlying topology of the WSN is supposed to be strongly connected. The consensus algorithm in multiagent consensus theory is utilized to exchange the measurement information of the sensors in the WSN. To obtain a faster convergence speed as well as a higher possibility of reaching the global optimum, a distributed k-means++ algorithm is first proposed to find the initial centroids before executing the distributed k-means algorithm and the distributed fuzzy c-means algorithm. The proposed distributed k-means algorithm is capable of partitioning the data observed by the nodes into measure-dependent groups which have small in-group and large out-group distances, while the proposed distributed fuzzy c-means algorithm is capable of partitioning the data observed by the nodes into different measure-dependent groups with degrees of membership values ranging from 0 to 1. Simulation results show that the proposed distributed algorithms can achieve almost the same results as those given by the centralized clustering algorithms.
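The "degrees of membership values ranging from 0 to 1" mentioned above can be illustrated with the standard centralized fuzzy c-means membership update. The sketch below is not the distributed consensus-based algorithm of the paper; it only shows the textbook formula, and the function name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centroids, m=2.0):
    """Textbook fuzzy c-means memberships: u_ij = 1 / sum_k (d_ij / d_ik)**(2/(m-1)).

    m > 1 is the fuzzifier (an assumed default of 2.0). Each returned row holds
    one point's memberships in every cluster and sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centroids]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centroids: share the full
            # membership among those centroids to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        row = []
        for d_j in dists:
            denom = sum((d_j / d_k) ** exponent for d_k in dists)
            row.append(1.0 / denom)
        memberships.append(row)
    return memberships


if __name__ == "__main__":
    print(fcm_memberships([(1.0, 0.0)], [(0.0, 0.0), (3.0, 0.0)]))
```

With m = 2, a point at distance 1 from one centroid and 2 from the other receives memberships 0.8 and 0.2, whereas hard k-means would assign it entirely to the nearer centroid.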
A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

OpenAIRE

Li, Guohui; Zhang, Songling; Yang, Hong

2017-01-01

Aiming at the irregularity of nonlinear signal and its predicting difficulty, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain the finite number of intrinsic mode functions (IMFs) and residuals. Secondly, the fuzzy c-means is used to cluster the decomposed components, and then the deep belief network (DBN) is used to predict it. Finally, the reconstructed ...
Clustering for Binary Data Sets by Using Genetic Algorithm-Incremental K-means

Science.gov (United States)

Saharan, S.; Baragona, R.; Nor, M. E.; Salleh, R. M.; Asrah, N. M.

2018-04-01

This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To fill this gap, a promising technique for analysing this type of data, namely Genetic Algorithms (GA), became the main subject of this research. For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster binary data streams. In GAIKM, the objective function was based on a few sufficient statistics that may be easily and quickly calculated on binary numbers. The implementation of IKM gives an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared to other clustering algorithms and to IKM itself. In conclusion, GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering, and paves the way for future research involving missing data and outliers.
An Initial Seed Selection Algorithm for K-means Clustering of Georeferenced Data to Improve Replicability of Cluster Assignments for Mapping Application

OpenAIRE

Khan, Fouad

2016-01-01

K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed to overcome this problem and has been shown to have better accuracy and computational efficiency than k-means. In many clustering problems though -such as when classifying georeferenced data for mapping applications- standardization of clustering methodolo...
Data Clustering

Science.gov (United States)

Wagstaff, Kiri L.

2012-03-01

clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, x_i ∈ R^D. The clustering goal is to produce an organization P of the items in X that optimizes an objective function f : P → R, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure d : X × X → R of the distance between two items. A partitioning algorithm produces a set of clusters P = {c_1, ..., c_k} such that the clusters are nonoverlapping (c_i ∩ c_j = ∅ for i ≠ j) subsets of the data set (∪_i c_i = X). Hierarchical algorithms produce a series of partitions P = {p_1, ..., p_n}. For a complete hierarchy, the number of partitions n' = n, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster c_j is represented by a model m_j, such as the cluster center or a Gaussian distribution. The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a
Quantum mean-field approximation for lattice quantum models: Truncating quantum correlations and retaining classical ones

Science.gov (United States)

Malpetti, Daniele; Roscilde, Tommaso

2017-02-01

The mean-field approximation is at the heart of our understanding of complex systems, despite its fundamental limitation of completely neglecting correlations between the elementary constituents. In a recent work [Phys. Rev. Lett. 117, 130401 (2016), 10.1103/PhysRevLett.117.130401], we have shown that in quantum many-body systems at finite temperature, two-point correlations can be formally separated into a thermal part and a quantum part and that quantum correlations are generically found to decay exponentially at finite temperature, with a characteristic, temperature-dependent quantum coherence length. The existence of these two different forms of correlation in quantum many-body systems suggests the possibility of formulating an approximation, which affects quantum correlations only, without preventing the correct description of classical fluctuations at all length scales. Focusing on lattice boson and quantum Ising models, we make use of the path-integral formulation of quantum statistical mechanics to introduce such an approximation, which we dub quantum mean-field (QMF) approach, and which can be readily generalized to a cluster form (cluster QMF or cQMF). The cQMF approximation reduces to cluster mean-field theory at T = 0, while at any finite temperature it produces a family of systematically improved, semi-classical approximations to the quantum statistical mechanics of the lattice theory at hand. Contrary to standard MF approximations, the correct nature of thermal critical phenomena is captured by any cluster size. In the two exemplary cases of the two-dimensional quantum Ising model and of two-dimensional quantum rotors, we study systematically the convergence of the cQMF approximation towards the exact result, and show that the convergence is typically linear or sublinear in the boundary-to-bulk ratio of the clusters as T → 0, while it becomes faster than linear as T grows. These results pave the way towards the development of semiclassical numerical

Clusters of galaxies associated with quasars. I. 3C 206

International Nuclear Information System (INIS)

Ellingson, E.; Yee, H.K.C.; Green, R.F.; Kinman, T.D.

1989-01-01

Multislit spectroscopy and three-color CCD photometry of the galaxies in the cluster associated with the quasar 3C 206 (PKS 0837-12) at z = 0.198 are presented. This cluster is the richest environment of any low-redshift quasar observed in an Abell richness class 1 cluster. The cluster has a very flattened structure and a very concentrated core about the quasar. Most of the galaxies in this field have colors and luminosities consistent with normal galaxies at this redshift. The background-corrected blue fraction of galaxies is consistent with values for other rich clusters. The existence of several blue galaxies in the concentrated cluster core is an anomaly for a region of such high galaxy density, however, suggesting the absence of a substantial intracluster medium. This claim is supported by the Fanaroff-Riley (1974) class II morphology of the radio source. The velocity dispersion calculated from 11 spectroscopically confirmed cluster members is 500 ± 110 km/s, which is slightly lower than the average for Abell class 1 clusters. A high frequency of interaction between the quasar host galaxy and cluster core members at low relative velocities, and a low intracluster gas pressure, may comprise a favorable environment for quasar activity. The properties of the cluster of galaxies associated with 3C 206 are consistent with this model. 59 refs
Integrating K-means Clustering with Kernel Density Estimation for the Development of a Conditional Weather Generation Downscaling Model

Science.gov (United States)

Chen, Y.; Ho, C.; Chang, L.

2011-12-01

In previous decades, the climate change caused by global warming has increased the frequency of extreme hydrological events. Water supply shortages caused by extreme events create great challenges for water resource management. To evaluate future climate variations, general circulation models (GCMs) are the most widely known tools, which show possible weather conditions under pre-defined CO2 emission scenarios announced by the IPCC. Because the study area of GCMs is the entire earth, the grid sizes of GCMs are much larger than the basin scale. To overcome the gap, a statistical downscaling technique can transform the regional scale weather factors into basin scale precipitations. Statistical downscaling techniques can be divided into three categories: transfer function, weather generator and weather type. The first two categories describe the relationships between the weather factors and precipitations based, respectively, on deterministic algorithms, such as linear or nonlinear regression and ANN, and stochastic approaches, such as Markov chain theory and statistical distributions. The weather type method clusters weather factors, which are high-dimensional and continuous variables, into weather types, which are a limited number of discrete states. In this study, the proposed downscaling model integrates the weather type, using the K-means clustering algorithm, and the weather generator, using the kernel density estimation. The study area is the Shihmen basin in northern Taiwan. The research process contains two steps, a calibration step and a synthesis step. Three sub-steps were used in the calibration step. First, weather factors, such as pressures, humidities and wind speeds, obtained from NCEP and the precipitations observed from rainfall stations were collected for downscaling. Second, the K-means clustering grouped the weather factors into four weather types. Third, the Markov chain transition matrices and the
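Once each day has been assigned a weather type by clustering, a first-order Markov transition matrix can be estimated by counting consecutive pairs of types. The sketch below is a generic illustration under that assumption and is not taken from the study; the function name `transition_matrix` and the uniform fallback for states that never occur as a starting state are choices made here.

```python
def transition_matrix(labels, n_states):
    """Estimate first-order Markov transition probabilities from a label sequence.

    labels: sequence of integer weather types in 0..n_states-1, in time order.
    Rows with no observed outgoing transition fall back to a uniform
    distribution (an assumption made for this sketch).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix


if __name__ == "__main__":
    # Hypothetical daily weather types, e.g. produced by a K-means step.
    daily_types = [0, 1, 1, 0, 1, 2, 2, 3, 0]
    for row in transition_matrix(daily_types, 4):
        print(row)
```

Entry (i, j) of the result is the estimated probability that a day of type i is followed by a day of type j, so every row sums to 1.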
A new method of spatio-temporal topographic mapping by correlation coefficient of K-means cluster.

Science.gov (United States)

Li, Ling; Yao, Dezhong

2007-01-01

It would be of the utmost interest to map correlated sources in the working human brain by Event-Related Potentials (ERPs). This work develops a new method to map correlated neural sources based on the time courses of the scalp ERP waveforms. The ERP data are classified first by k-means cluster analysis, and then the Correlation Coefficients (CC) between the original data of each electrode channel and the time course of each cluster centroid are calculated and utilized as the mapping variable on the scalp surface. With a normalized 4-concentric-sphere head model with radius 1, the performance of the method is evaluated by simulated data. CC, between simulated four sources (s(1)-s(4)) and the estimated cluster centroids (c(1)-c(4)), and the distances (Ds), between the scalp projection points of s(1)-s(4) and those of c(1)-c(4), are utilized as the evaluation indexes. Applied to four sources with two of them partially correlated (with maximum mutual CC = 0.4892), CC (Ds) between s(1)-s(4) and c(1)-c(4) are larger (smaller) than 0.893 (0.108) for noise levels NSRclusters located at left, right occipital and frontal. The estimated vectors of the contra-occipital area demonstrate that attention to the stimulus location produces increased amplitude of the P1 and N1 components over the contra-occipital scalp. The estimated vector in the frontal area displays two large processing negativity waves around 100 ms and 250 ms when subjects are attentive, and there is a small negative wave around 140 ms and a P300 when subjects are inattentive. The results of simulations and real Visual Evoked Potentials (VEPs) data demonstrate the validity of the method in mapping correlated sources. This method may be an objective, heuristic and important tool to study the properties of cerebral neural networks in cognitive and clinical neurosciences.
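The mapping variable described above, the correlation coefficient between each electrode's time course and a cluster centroid's time course, can be sketched as follows. This is a minimal illustration assuming the Pearson correlation is meant; the function names `pearson` and `cc_map` and the convention of returning 0 for a constant signal are assumptions, not details from the paper.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cc_map(channels, centroid):
    """One correlation value per electrode channel against a centroid time course."""
    return [pearson(channel, centroid) for channel in channels]


if __name__ == "__main__":
    # Hypothetical three-sample time courses for three electrodes.
    electrodes = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
    centroid_course = [2.0, 4.0, 6.0]
    print(cc_map(electrodes, centroid_course))
```

Repeating `cc_map` for each centroid gives one scalp map per cluster, with values near +1 or -1 marking electrodes whose waveforms follow that centroid.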
Hybrid K-means Dan Particle Swarm Optimization Untuk Clustering Nasabah Kredit [Hybrid K-means and Particle Swarm Optimization for Clustering Credit Customers]

Directory of Open Access Journals (Sweden)

Yusuf Priyo Anggodo

2017-05-01

Credit is the largest source of revenue for a bank. However, banks have to be selective in deciding which clients can receive credit. This issue is becoming increasingly complex, because granting credit to the wrong customers can harm the bank, and because a large number of parameters are involved in assessing customer credit. Clustering is one way to address this issue. K-means is a simple and popular clustering method. However, pure K-means cannot provide optimum solutions, so it needs to be improved to obtain them. One optimization method that handles such problems well is particle swarm optimization (PSO). PSO is very helpful in clustering because it optimizes the central point of each cluster. Several improvements are made to get better results from PSO. The first is time-variant inertia, which makes the inertia weight w change dynamically at each iteration. The second is velocity clamping, which controls the particle velocity to reach the best position. In addition, to overcome premature convergence, PSO is hybridized with random injection. The results of this research provide optimum results for clustering credit customers. The tests showed that hybrid PSO K-means gives the best results compared with K-means and PSO K-means, with silhouette values for K-means, PSO K-means, and hybrid PSO K-means of 0.57343, 0.792045, and 1, respectively. Keywords: Credit, Clustering, PSO, K-means, Random Injection
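Two of the PSO refinements named above, time-variant inertia and velocity clamping, can be sketched for a single particle. The abstract does not give formulas, so the linearly decreasing inertia schedule, the parameter defaults (`w_max`, `w_min`, `c1`, `c2`, `v_max`) and the function names below are assumptions based on common PSO practice rather than the authors' settings.

```python
def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight from w_max at t=0 to w_min at t=t_max (assumed schedule)."""
    return w_max - (w_max - w_min) * t / t_max


def clamp(value, limit):
    """Restrict a velocity component to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def update_velocity(velocity, position, personal_best, global_best,
                    w, r1, r2, c1=2.0, c2=2.0, v_max=1.0):
    """One standard PSO velocity update followed by velocity clamping.

    r1 and r2 are random numbers in [0, 1] supplied by the caller so that the
    function stays deterministic and easy to test.
    """
    new_velocity = []
    for v, x, pb, gb in zip(velocity, position, personal_best, global_best):
        raw = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(clamp(raw, v_max))
    return new_velocity


if __name__ == "__main__":
    w = inertia(t=5, t_max=10)
    print(update_velocity([0.1, -0.2], [0.0, 0.0], [0.5, 0.5], [1.0, -1.0],
                          w=w, r1=0.3, r2=0.7))
```

In a K-means hybrid, each particle position would encode a full set of cluster centers; that encoding and the random injection step are left out of this sketch.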
Fitting Latent Cluster Models for Networks with latentnet

Directory of Open Access Journals (Sweden)

Pavel N. Krivitsky

2007-12-01

latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002) suggested an approach to modeling networks based on positing the existence of a latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet, social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007). The package implements Bayesian inference for the models based on a Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering). It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.
Vertebra identification using template matching modelmp and K-means clustering.

Science.gov (United States)

Larhmam, Mohamed Amine; Benjelloun, Mohammed; Mahmoudi, Saïd

2014-03-01

Accurate vertebra detection and segmentation are essential steps for automating the diagnosis of spinal disorders. This study is dedicated to vertebra alignment measurement, the first step in a computer-aided diagnosis tool for cervical spine trauma. Automated vertebral segment alignment determination is a challenging task due to low contrast imaging and noise. A software tool for segmenting vertebrae and detecting subluxations has clinical significance. A robust method was developed and tested for cervical vertebra identification and segmentation that extracts parameters used for vertebra alignment measurement. Our contribution involves a novel combination of a template matching method and an unsupervised clustering algorithm. In this method, we build a geometric vertebra mean model. To achieve vertebra detection, manual selection of the region of interest is performed initially on the input image. Subsequent preprocessing is done to enhance image contrast and detect edges. Candidate vertebra localization is then carried out by using a modified generalized Hough transform (GHT). Next, an adapted cost function is used to compute local voted centers and filter boundary data. Thereafter, a K-means clustering algorithm is applied to obtain clusters distribution corresponding to the targeted vertebrae. These clusters are combined with the vote parameters to detect vertebra centers. Rigid segmentation is then carried out by using GHT parameters. Finally, cervical spine curves are extracted to measure vertebra alignment. The proposed approach was successfully applied to a set of 66 high-resolution X-ray images. Robust detection was achieved in 97.5 % of the 330 tested cervical vertebrae. An automated vertebral identification method was developed and demonstrated to be robust to noise and occlusion. This work presents a first step toward an automated computer-aided diagnosis system for cervical spine trauma detection.
flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding.

Science.gov (United States)

Ge, Yongchao; Sealfon, Stuart C

2012-08-01

For flow cytometry data, there are two common approaches to the unsupervised clustering problem: one is based on the finite mixture model and the other on spatial exploration of the histograms. The former is computationally slow and has difficulty identifying clusters of irregular shapes. The latter approach cannot be applied directly to high-dimensional data as the computational time and memory become unmanageable and the estimated histogram is unreliable. An algorithm without these two problems would be very useful. In this article, we combine ideas from the finite mixture model and histogram spatial exploration. This new algorithm, which we call flowPeaks, can be applied directly to high-dimensional data and can identify irregularly shaped clusters. The algorithm first uses the K-means algorithm with a large K to partition the cell population into many small clusters. These partitioned data allow the generation of a smoothed density function using the finite mixture model. All local peaks are exhaustively searched by exploring the density function and the cells are clustered by the associated local peak. The algorithm flowPeaks is automatic, fast and reliable and robust to cluster shape and outliers. This algorithm has been applied to flow cytometry data and it has been compared with state of the art algorithms, including Misty Mountain, FLOCK, flowMeans, flowMerge and FLAME. The R package flowPeaks is available at https://github.com/yongchao/flowPeaks. Contact: yongchao.ge@mssm.edu. Supplementary data are available at Bioinformatics online.
Solvable single-species aggregation-annihilation model for chain-shaped cluster growth

International Nuclear Information System (INIS)

Ke Jianhong; Lin Zhenquan; Zheng Yizhuang; Chen Xiaoshuang; Lu Wei

2007-01-01

We propose a single-species aggregation-annihilation model, in which an aggregation reaction between two clusters produces an active cluster and an annihilation reaction produces an inert one. By means of the mean-field rate equation, we investigate the kinetic scaling behaviours of three distinct systems. The results show that: (i) for the general aggregation-annihilation system, the size distribution of active clusters consistently approaches the conventional scaling form; (ii) for the system with the self-degeneration of the cluster's activities, it takes the modified scaling form; and (iii) for the system with the self-closing of active clusters, it does not scale. Moreover, the size distribution of inert clusters with small size takes a power-law form, while that of large inert clusters obeys the scaling law. The results also show that all active clusters will eventually transform into inert ones and the inert clusters of any size can be produced by such an aggregation-annihilation process. This model can be used to mimic the chain-shaped cluster growth and can provide some useful predictions for the kinetic behaviour of the system.
Parallel k-means++

Energy Technology Data Exchange (ETDEWEB)

2017-04-04

A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
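For reference, the serial k-means++ seed selection that the record above parallelizes can be sketched as follows. This is a plain sequential illustration of the Arthur and Vassilvitskii seeding rule, not the GPU, OpenMP or Cray XMT code described in the record; the function name `kmeans_pp_seeds` and the optional `rng` argument are assumptions made for this sketch.

```python
import math
import random


def kmeans_pp_seeds(points, k, rng=None):
    """Choose k initial seeds: the first uniformly at random, each later one with
    probability proportional to its squared distance from the nearest seed so far."""
    rng = rng or random.Random()
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Squared distance from every point to its nearest already-chosen seed.
        weights = [min(math.dist(p, s) ** 2 for s in seeds) for p in points]
        total = sum(weights)
        if total == 0.0:
            # All points coincide with existing seeds; fall back to a uniform pick.
            seeds.append(rng.choice(points))
            continue
        threshold = rng.random() * total
        running = 0.0
        chosen = points[-1]  # guard against floating-point round-off
        for point, weight in zip(points, weights):
            running += weight
            if running > threshold:
                chosen = point
                break
        seeds.append(chosen)
    return seeds


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1)]
    print(kmeans_pp_seeds(data, 2, rng=random.Random(0)))
```

The distance computation inside the `while` loop touches every point for every new seed, which is the part that lends itself most naturally to parallel execution.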
A Comparison of AOP Classification Based on Difficulty, Importance, and Frequency by Cluster Analysis and Standardized Mean

International Nuclear Information System (INIS)

Choi, Sun Yeong; Jung, Wondea

2014-01-01

In Korea, there are plants that have more than one hundred kinds of abnormal operation procedures (AOPs). Therefore, operators have started to recognize the importance of classifying the AOPs. They should pay attention to those AOPs that require emergency measures against an abnormal status that has a more serious effect on plant safety and/or occurs often. We suggested a measure for prioritizing AOPs for training purposes based on difficulty, importance, and frequency. A DIF analysis, based on how difficult a task is, how important it is, and how frequently it occurs, is a well-known method of assessing performance, prioritizing training needs and planning. We used an SDIF-mean (Standardized DIF-mean) to prioritize AOPs in the previous paper. For the SDIF-mean, we standardized each of the three kinds of data. The results of this research will be utilized not only to understand the AOP characteristics at a job analysis level but also to develop an effective AOP training program. The purpose of this paper is to perform a cluster analysis for an AOP classification and to compare its results with those obtained by a standardized mean based on difficulty, importance, and frequency. In this paper, we categorized AOPs into three groups by a cluster analysis based on D, I, and F. Clustering is the classification of similar objects into groups so that each group shares some common characteristics. In addition, we compared the result of the cluster analysis in this paper with the classification result by the SDIF-mean in the previous paper. From the comparison, we found that a reevaluation may be required to assign a training interval for the AOPs of group C' in the previous paper, which have a lower SDIF-mean. The reason for this is that some of the AOPs of group C' have quite high D and I values while they have the lowest frequencies. From an educational point of view, AOPs in group which have the highest difficulty and importance, but
A Comparison of AOP Classification Based on Difficulty, Importance, and Frequency by Cluster Analysis and Standardized Mean

Energy Technology Data Exchange (ETDEWEB)

Choi, Sun Yeong; Jung, Wondea [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

2014-05-15

In Korea, some plants have more than one hundred kinds of abnormal operation procedures (AOPs), and operators have therefore started to recognize the importance of classifying them. They should pay particular attention to AOPs that require emergency measures against an abnormal status that has a more serious effect on plant safety and/or occurs often. We suggested a measure for prioritizing AOPs for training purposes based on difficulty, importance, and frequency. DIF analysis, which is based on how difficult a task is, how important it is, and how frequently it occurs, is a well-known method for assessing performance, prioritizing training needs, and planning. In the previous paper we used an SDIF-mean (Standardized DIF-mean) to prioritize AOPs; for the SDIF-mean, we standardized each of the three kinds of data. The results of this research will be utilized not only to understand AOP characteristics at a job analysis level but also to develop an effective AOP training program. The purpose of this paper is to perform a cluster analysis for AOP classification and to compare its results with those obtained by a standardized mean based on difficulty, importance, and frequency. In this paper, we categorized AOPs into three groups by a cluster analysis based on D, I, and F. Clustering is the classification of similar objects into groups so that each group shares some common characteristics. In addition, we compared the result of the cluster analysis in this paper with the classification result by the SDIF-mean in the previous paper. From the comparison, we found that a reevaluation may be required to assign a training interval for the AOPs of group C' in the previous paper, which have lower SDIF-means. The reason is that some of the AOPs of group C' have quite high D and I values while having the lowest frequencies. From an educational point of view, AOPs in group which have the highest difficulty and importance, but
[Time for cluster C personality disorders: state of the art].

Science.gov (United States)

Hutsebaut, J; Willemsen, E M C; Van, H L

Compared to cluster B personality disorders, the assessment and treatment of people with obsessive-compulsive, dependent, and avoidant personality disorders (cluster C) receive little attention in research and clinical practice. This review presents the current state of affairs with regard to cluster C personality disorders. A systematic literature search was conducted using the main databases. Cluster C personality disorders are present in approximately 3-9% of the general population. In about half of the cases of mood, anxiety, and eating disorders, there is co-morbid cluster C pathology. This has a major influence on the progression of symptoms, treatment effectiveness, and potential relapse. Hardly any well-conducted randomized studies on the treatment of cluster C disorders exist. Open cohort studies, however, show strong, lasting treatment effects. Given the frequent occurrence of cluster C personality disorders, the burden of disease, the associated societal costs, and the prognostic implications of a co-morbid cluster C personality disorder, early detection and treatment of these disorders are warranted.
Detection of sensor degradation using K-means clustering and support vector regression in nuclear power plant

International Nuclear Information System (INIS)

Seo, Inyong; Ha, Bokam; Lee, Sungwoo; Shin, Changhoon; Lee, Jaeyong; Kim, Seongjun

2011-01-01

In a nuclear power plant (NPP), periodic sensor calibrations are required to assure that sensors are operating correctly. However, only a few faulty sensors are found and need to be rectified. For the safe operation of an NPP and the reduction of unnecessary calibration, on-line calibration monitoring is needed. In this study, an on-line calibration monitoring method called KPCSVR, which uses k-means clustering and principal component based Auto-Associative support vector regression (PCSVR), is proposed for nuclear power plants. The k-means clustering method was used to reduce the training time of the model. Response surface methodology is employed to efficiently determine the optimal values of the support vector regression hyperparameters. The proposed KPCSVR model was confirmed with actual plant data of Kori Nuclear Power Plant Unit 3, measured from the primary and secondary systems of the plant, and compared with the PCSVR model. By using data clustering, the average accuracy of PCSVR improved from 1.228×10^{-4} to 0.472×10^{-4} and the average sensitivity of PCSVR from 0.0930 to 0.0909, which results in good detection of sensor drift. Moreover, the training time is greatly reduced from 123.5 to 31.5 sec. (author)
Privacy-Preserving k-Means Clustering under Multiowner Setting in Distributed Cloud Environments

Directory of Open Access Journals (Sweden)

Hong Rong

2017-01-01

With the advent of the big data era, clients who lack computational and storage resources tend to outsource data mining tasks to cloud service providers in order to improve efficiency and reduce costs. It is also increasingly common for clients to perform collaborative mining to maximize profits. However, due to the rise of privacy leakage issues, the data contributed by clients should be encrypted using their own keys. This paper focuses on privacy-preserving k-means clustering over joint datasets encrypted under multiple keys. Unfortunately, existing outsourced k-means protocols are impractical because they are not only restricted to a single-key setting but also inefficient and nonscalable for distributed cloud computing. To address these issues, we propose a set of privacy-preserving building blocks and an outsourced k-means clustering protocol under the Spark framework. Theoretical analysis shows that our scheme protects the confidentiality of the joint database and mining results, as well as access patterns, under the standard semihonest model with relatively small computational overhead. Experimental evaluations on real datasets also demonstrate its efficiency improvements compared with existing approaches.
Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

Science.gov (United States)

de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

2006-01-01

K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…
Neuro-fuzzy system modeling based on automatic fuzzy clustering

Institute of Scientific and Technical Information of China (English)

Yuangang TANG; Fuchun SUN; Zengqi SUN

2005-01-01

A neuro-fuzzy system model based on automatic fuzzy clustering is proposed. A hybrid model identification algorithm is also developed to decide the model structure and model parameters. The algorithm mainly includes three parts: 1) Automatic fuzzy C-means (AFCM), which is applied to generate fuzzy rules automatically and then fix the size of the neuro-fuzzy network, by which the complexity of system design is reduced greatly at the price of the fitting capability; 2) Recursive least square estimation (RLSE), which is used to update the parameters of the Takagi-Sugeno model employed to describe the behavior of the system; 3) A gradient descent algorithm, which is also proposed for the fuzzy values according to the back-propagation algorithm of neural networks. Finally, modeling the dynamical equation of a two-link manipulator with the proposed approach is illustrated to validate the feasibility of the method.
Single-cluster dynamics for the random-cluster model

NARCIS (Netherlands)

Deng, Y.; Qian, X.; Blöte, H.W.J.

2009-01-01

We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the q-state Potts model to noninteger values q>1. Its results for static quantities are in a satisfactory agreement with those
C=C bond cleavage on neutral VO3(V2O5)n clusters.

Science.gov (United States)

Dong, Feng; Heinbuch, Scott; Xie, Yan; Bernstein, Elliot R; Rocca, Jorge J; Wang, Zhe-Chen; Ding, Xun-Lei; He, Sheng-Gui

2009-01-28

The reactions of neutral vanadium oxide clusters with alkenes (ethylene, propylene, 1-butene, and 1,3-butadiene) are investigated by experiments and density functional theory (DFT) calculations. Single photon ionization through extreme ultraviolet radiation (EUV, 46.9 nm, 26.5 eV) is used to detect neutral cluster distributions and reaction products. In the experiments, we observe products (V(2)O(5))(n)VO(2)CH(2), (V(2)O(5))(n)VO(2)C(2)H(4), (V(2)O(5))(n)VO(2)C(3)H(4), and (V(2)O(5))(n)VO(2)C(3)H(6), for neutral V(m)O(n) clusters in reactions with C(2)H(4), C(3)H(6), C(4)H(6), and C(4)H(8), respectively. The observation of these products indicates that the C=C bonds of alkenes can be broken on neutral oxygen rich vanadium oxide clusters with the general structure VO(3)(V(2)O(5))(n=0,1,2...). DFT calculations demonstrate that the reaction VO(3) + C(3)H(6) --> VO(2)C(2)H(4) + H(2)CO is thermodynamically favorable and overall barrierless at room temperature. They also provide a mechanistic explanation for the general reaction in which the C=C double bond of alkenes is broken on VO(3)(V(2)O(5))(n=0,1,2...) clusters. A catalytic cycle for alkene oxidation on vanadium oxide is suggested based on our experimental and theoretical investigations. The reactions of V(m)O(n) with C(6)H(6) and C(2)F(4) are also investigated by experiments. The products VO(2)(V(2)O(5))(n)C(6)H(4) are observed for dehydration reactions between V(m)O(n) clusters and C(6)H(6). No product is detected for V(m)O(n) clusters reacting with C(2)F(4). The mechanisms of the reactions between VO(3) and C(2)F(4)/C(6)H(6) are also investigated by calculations at the B3LYP/TZVP level.
Group analyses of connectivity-based cortical parcellation using repeated k-means clustering.

Science.gov (United States)

Nanetti, Luca; Cerliani, Leonardo; Gazzola, Valeria; Renken, Remco; Keysers, Christian

2009-10-01

K-means clustering has become a popular tool for connectivity-based cortical segmentation using Diffusion Weighted Imaging (DWI) data. A sometimes ignored issue is, however, that the output of the algorithm depends on the initial placement of starting points, and that different sets of starting points therefore could lead to different solutions. In this study we explore this issue. We apply k-means clustering a thousand times to the same DWI dataset collected in 10 individuals to segment two brain regions: the SMA-preSMA on the medial wall, and the insula. At the level of single subjects, we found that in both brain regions, repeatedly applying k-means indeed often leads to a variety of rather different cortical based parcellations. By assessing the similarity and frequency of these different solutions, we show that approximately 256 k-means repetitions are needed to accurately estimate the distribution of possible solutions. Using nonparametric group statistics, we then propose a method to employ the variability of clustering solutions to assess the reliability with which certain voxels can be attributed to a particular cluster. In addition, we show that the proportion of voxels that can be attributed significantly to either cluster in the SMA and preSMA is relatively higher than in the insula and discuss how this difference may relate to differences in the anatomy of these regions.
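
The dependence on starting points described in this abstract can be reproduced on a toy example. The sketch below is a generic illustration in plain Python, not the authors' pipeline: it runs Lloyd's k-means from many random initializations and counts how often each distinct partition is reached. All function names are hypothetical.

```python
import random
from collections import Counter

def kmeans(points, k, rng, iters=100):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    centers = rng.sample(points, k)  # random initial placement of the centers
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def canonical(labels):
    """Relabel clusters by order of first appearance so permuted labelings compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(l, len(mapping)) for l in labels)

def solution_counts(points, k, repeats, seed=0):
    """Count how often each distinct partition is reached over repeated runs."""
    rng = random.Random(seed)
    return Counter(canonical(kmeans(points, k, rng)) for _ in range(repeats))

# Four corners of a 3 x 1 rectangle: depending on the start, k-means can settle
# on the left/right split or on the worse top/bottom split.
corners = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
print(solution_counts(corners, k=2, repeats=50))
```

The relative frequencies in the resulting counter play the role of the distribution of solutions that the study estimates; on real data, many more repetitions are needed before those frequencies stabilize.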
Cluster model of the nucleus

International Nuclear Information System (INIS)

Horiuchi, H.; Ikeda, K.

1986-01-01

This article reviews the development of the cluster model study. The stress is put on two points: one is how the cluster structure has come to be regarded as a fundamental structure in light nuclei together with the shell-model structure, and the other is how the cluster model is at present extended to and connected with studies of various subjects, many of which are in neighbouring fields. The authors then present the main theme with detailed explanations of the fundamentals of the microscopic cluster model, which have promoted the development of the cluster model. Examples of the microscopic cluster model study of light nuclear structure are given.

Implementation of the Rule-of-Thumb Approach for Optimizing the K-Means Clustering Algorithm; Implementasi Pendekatan Rule-Of-Thumb untuk Optimasi Algoritma K-Means Clustering

Directory of Open Access Journals (Sweden)

M Nishom

2018-05-01

In the big data era, data clustering has attracted great interest from researchers, and many grouping algorithms have been proposed in recent times. However, as technology evolves, data volumes continue to grow and data formats are increasingly varied, making massive data grouping a huge and challenging task. To overcome this problem, various methods for data grouping have been studied, among them K-Means. However, this method still has some shortcomings, among them its sensitivity to the chosen number of clusters (K). In this paper we discuss the implementation of the rule-of-thumb approach and data normalization on the K-Means method to determine the number of clusters, or K value, dynamically in data grouping. The results show that the approach has a significant impact on the data grouping (in terms of time, number of iterations, and absence of outliers).
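
As an illustration only, the two preprocessing steps this abstract names can be sketched as follows. The abstract does not state which rule of thumb the author used; the widely quoted heuristic k ≈ sqrt(n/2) is assumed here, and both function names are hypothetical.

```python
import math

def rule_of_thumb_k(n_samples):
    """Assumed heuristic k ~ sqrt(n / 2); the abstract does not give its formula."""
    return max(1, round(math.sqrt(n_samples / 2)))

def min_max_normalize(rows):
    """Rescale every column of a list of equal-length rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero span; use 1.0 to avoid dividing by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]
```

Normalizing first keeps a feature with a large numeric range from dominating the Euclidean distances that K-Means relies on; the heuristic then gives a starting value of K that depends only on the number of samples.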
A hybrid sales forecasting scheme by combining independent component analysis with K-means clustering and support vector regression.

Science.gov (United States)

Lu, Chi-Jie; Chang, Chi-Chang

2014-01-01

Sales forecasting plays an important role in operating a business since it can be used to determine the inventory level required to meet consumer demand and avoid under- or overstocking. Improving the accuracy of sales forecasting has therefore become an important issue. This study proposes a hybrid sales forecasting scheme that combines independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then fed to the K-means algorithm to cluster the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
Differential Spatio-temporal Multiband Satellite Image Clustering using K-means Optimization With Reinforcement Programming

Directory of Open Access Journals (Sweden)

Irene Erlyn Wina Rachmawan

2015-06-01

Deforestation is one of the crucial issues in Indonesia, which now has the world's highest deforestation rate. Multispectral imagery, in turn, is a rich source of data for studying spatial and temporal environmental change such as deforested areas. This research presents differential image processing methods for detecting changes caused by deforestation. Our differential image processing algorithms extract and indicate the affected area automatically. The proposed approach extracts information from multiband satellite images and calculates the deforested area by year using a temporal dataset. However, multiband satellite images are large and therefore difficult to handle for segmentation. K-Means clustering is commonly considered a powerful clustering algorithm because of its ability to cluster big data. However, K-Means is sensitive to its initially generated centroids, which can lead to poor performance. In this paper we propose a new approach that optimizes K-Means clustering using Reinforcement Programming in order to cluster multispectral images. We build a new mechanism for generating initial centroids by implementing exploration and exploitation knowledge from Reinforcement Programming. This optimization leads to a better K-means clustering result. We select multispectral images from Landsat 7 over the past ten years in Medawai, Borneo, Indonesia, and apply segmentation into two areas: deforested land and forest. We ran a series of experiments and compared the results of K-means with Reinforcement Programming as the initial-centroid optimizer against normal K-means without the optimization process. Keywords: Deforestation, Multispectral images, Landsat, automatic clustering, K-means.
Balanced Cluster Head Selection Based on Modified k-Means in a Distributed Wireless Sensor Network

OpenAIRE

Periyasamy, Sasikumar; Khara, Sibaram; Thangavelu, Shankar

2016-01-01

A major problem with Wireless Sensor Networks (WSNs) is the maximization of effective network lifetime through minimization of energy usage in the network nodes. A modified k-means (Mk-means) algorithm for clustering was proposed which includes three cluster heads (simultaneously chosen) for each cluster. These cluster heads (CHs) use a load sharing mechanism to rotate as the active cluster head, which conserves residual energy of the nodes, thereby extending network lifetime. Moreover, it re...
An effective correlated mean-field theory applied in the spin-1/2 Ising ferromagnetic model

Energy Technology Data Exchange (ETDEWEB)

Roberto Viana, J.; Salmon, Octávio R. [Universidade Federal do Amazonas – UFAM, Manaus 69077-000, AM (Brazil); Ricardo de Sousa, J. [Universidade Federal do Amazonas – UFAM, Manaus 69077-000, AM (Brazil); National Institute of Science and Technology for Complex Systems, Universidade Federal do Amazonas, 3000, Japiim, 69077-000 Manaus, AM (Brazil); Neto, Minos A.; Padilha, Igor T. [Universidade Federal do Amazonas – UFAM, Manaus 69077-000, AM (Brazil)

2014-11-15

We developed a new treatment for mean-field theory applied to spin systems, denominated effective correlated mean-field (ECMF). We apply this theory to study the spin-1/2 Ising ferromagnetic model with nearest-neighbor interactions on a square lattice. We use clusters of finite sizes and study the criticality of the ferromagnetic system, where we obtain a convergence of the critical temperature to the value k_{B}T_{c}/J ≃ 2.27905 ± 0.00141. The behavior of the magnetic and thermodynamic properties is also obtained, using the condition of minimum energy of the physical system. - Highlights: • We developed spin models to study real magnetic systems. • We study the thermodynamic and magnetic properties of ferromagnetism. • We enhanced a mean-field theory applied to spin models.
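
For context, the quoted critical temperature can be set against two standard textbook values for the same model. These are not taken from the abstract: the single-site (Weiss) mean-field estimate and Onsager's exact result for the square lattice.

```latex
% Single-site (Weiss) mean field with coordination number z
m = \tanh\!\left(\frac{z J m}{k_B T}\right)
\;\Rightarrow\; \frac{k_B T_c^{\mathrm{MF}}}{J} = z = 4 \quad \text{(square lattice)}

% Exact (Onsager) result for the square lattice
\frac{k_B T_c}{J} = \frac{2}{\ln\!\left(1 + \sqrt{2}\right)} \approx 2.269
```

The ECMF value reported above lies within roughly 0.4% of the exact result, whereas the single-site mean field overestimates it by about 76%.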
Sleep stages identification in patients with sleep disorder using k-means clustering

Science.gov (United States)

Fadhlullah, M. U.; Resahya, A.; Nugraha, D. F.; Yulita, I. N.

2018-05-01

Data mining is a computational intelligence discipline in which a large dataset is processed with a certain method to look for patterns within it. These patterns are then used in real-time applications or to develop new knowledge. It is a valuable tool for solving complex problems, discovering new knowledge, analyzing data, and making decisions. Clustering is used to find the patterns that lie inside a large dataset: it groups data that look similar so that a pattern becomes visible. Several algorithms exist for assigning the data to the corresponding clusters. This research used data from patients who suffer from sleep disorders and aims to help medical practitioners reduce the time required to classify the sleep stages of such patients. The study used the K-Means algorithm and silhouette evaluation and found that 3 clusters are optimal for this dataset, which means the data can be divided into 3 sleep stages.
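
The silhouette evaluation mentioned here can be sketched in a few lines. This is a generic illustration using only the standard library, not the authors' code; the function name is hypothetical, and the labelings are assumed to come from separate K-Means runs with different K.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette s = (b - a) / max(a, b) over all points.

    Assumes at least two clusters and that the points are not all identical.
    """
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes s = 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items() if m != l
        )
        total += (b - a) / max(a, b)
    return total / len(points)
```

To choose the number of clusters, one would compute this score for the labeling obtained at each candidate K and keep the K with the highest mean silhouette.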
Wave failure at strong coupling in intracellular Ca^{2+} signaling system with clustered channels

Science.gov (United States)

Li, Xiang; Wu, Yuning; Gao, Xuejuan; Cai, Meichun; Shuai, Jianwei

2018-01-01

As an important intracellular signal, Ca^{2+} ions control diverse cellular functions. In this paper, we discuss Ca^{2+} signaling with a two-dimensional model in which the inositol 1,4,5-trisphosphate (IP_{3}) receptor channels are distributed in clusters on the endoplasmic reticulum membrane. The wave failure at large Ca^{2+} diffusion coupling is discussed in detail in the model. We show that with varying model parameters the wave failure is a robust behavior with either deterministic or stochastic channel dynamics. We suggest that the wave failure should be a general behavior in inhomogeneous diffusing systems with clustered excitable regions and may occur in biological Ca^{2+} signaling systems.
Covalent functionalization of octagraphene with magnetic octahedral B6- and non-planar C6- clusters

Science.gov (United States)

Chigo-Anota, E.; Cárdenas-Jirón, G.; Salazar Villanueva, M.; Bautista Hernández, A.; Castro, M.

2017-10-01

The interaction of the magnetic octahedral boron (B6-) and non-planar carbon (C6-) clusters with a semimetal nanosheet of octa-graphene (C64H24) in the gas phase is studied by means of DFT calculations. The results reveal that the non-planar-1 (anion) carbon cluster exhibits structural stability, low chemical reactivity, and magnetic (1.0 Bohr magneton) and semiconductor behavior. In addition, chemisorption occurs when the stable B6- and C6- clusters are adsorbed on octa-graphene nanosheets. Such adsorption generates high polarity, and the low reactivity remains as in the individual pristine cases. Electronic charge transfer occurs from the clusters toward the nanosheets, reducing the work function of the complexes and inducing magnetic behavior in the functionalized sheets. The quantum descriptors obtained for these systems reveal that they are feasible candidates for the design of molecular circuits, magnetic devices, and nano-vehicles for drug delivery.
Cluster Based Text Classification Model

DEFF Research Database (Denmark)

Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

2011-01-01

We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...
Towards enhancement of performance of K-means clustering using nature-inspired optimization algorithms.

Science.gov (United States)

Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan

2014-01-01

Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.
Model of spatial analysis of electric power market using the Fuzzy C-Means technical classification; Modelo de analise espacial de mercado de energia eletrica utilizando a tecnica de classificacao Fuzzy C-Means

Energy Technology Data Exchange (ETDEWEB)

Neto, J.C. [Companhia Energetica de Goias (CELG-D), Goiania, GO (Brazil)], E-mail: joao.cn@celg.com.br; Lima, W.S. [Votorantim Siderurgia, Resende, Rio de Janeiro, RJ (Brazil). Gerencia Geral de Tecnologia], E-mail: wagner.lima@vmetais.com.br

2009-07-01

Power distribution companies live with an antagonistic reality: increasing energy demand, due to steady economic and population growth, and limited financial resources to expand their networks. Therefore, it is essential to improve the planning of the power distribution system so that available resources are better applied. In this context, a Geographic Information System combined with clustering and classification techniques can enhance the planning process, giving the planner a more complete picture of the distributor's consumer market. This paper presents a system that uses a Geographic Information System combined with the Fuzzy C-Means clustering and classification technique, with the aim of analyzing the distribution of network load and the performance of the technique. Each grouping performed leads to a spatial representation (scenario). This, together with an index measuring the performance of the grouping (intra-group and inter-group) implemented in this work, provides a favorable environment for spatial analysis of the electric power market.
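
For readers unfamiliar with the technique named in this record, a minimal sketch of the standard Fuzzy C-Means iteration follows. It is a generic textbook version under assumed defaults (fuzzifier m = 2, random data points as initial centers), not the system described in the paper, and it does not include the paper's intra-group/inter-group index.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships[point][cluster])."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]  # assumed initialization
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        u = []
        for p in points:
            d = [max(math.dist(p, v), 1e-12) for v in centers]  # guard zero distance
            u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                      for j in range(c)])
        # Center update: mean of the points weighted by u_ij^m
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            new_centers.append([sum(wi * p[a] for wi, p in zip(w, points)) / sum(w)
                                for a in range(len(points[0]))])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

Unlike K-means, each point receives a degree of membership in every cluster, and the memberships of a point sum to 1; a hard classification can be obtained by taking the cluster with the largest membership.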
Analysis of the dynamical cluster approximation for the Hubbard model

OpenAIRE

Aryanpour, K.; Hettler, M. H.; Jarrell, M.

2002-01-01

We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...
Hierarchical modularization of biochemical pathways using fuzzy-c means clustering.

Science.gov (United States)

de Luis Balaguer, Maria A; Williams, Cranos M

2014-08-01

Biological systems that are representative of regulatory, metabolic, or signaling pathways can be highly complex. Mathematical models that describe such systems inherit this complexity. As a result, these models can often fail to provide a path toward the intuitive comprehension of these systems. More coarse information that allows a perceptive insight of the system is sometimes needed in combination with the model to understand control hierarchies or lower level functional relationships. In this paper, we present a method to identify relationships between components of dynamic models of biochemical pathways that reside in different functional groups. We find primary relationships and secondary relationships. The secondary relationships reveal connections that are present in the system, which current techniques that only identify primary relationships are unable to show. We also identify how relationships between components dynamically change over time. This results in a method that provides the hierarchy of the relationships among components, which can help us to understand the low level functional structure of the system and to elucidate potential hierarchical control. As a proof of concept, we apply the algorithm to the epidermal growth factor signal transduction pathway, and to the C3 photosynthesis pathway. We identify primary relationships among components that are in agreement with previous computational decomposition studies, and identify secondary relationships that uncover connections among components that current computational approaches were unable to reveal.
K-Means Clustering for Problems with Periodic Attributes

Czech Academy of Sciences Publication Activity Database

Vejmelka, Martin; Musílek, P.; Paluš, Milan; Pelikán, Emil

2009-01-01

Vol. 23, No. 4 (2009), pp. 721-743. ISSN 0218-0014. R&D Projects: GA AV ČR 1ET400300513. EU Projects: European Commission (XE) 517133 - BRACCIA. Institutional research plan: CEZ:AV0Z10300504. Keywords: clustering algorithms; similarity measures; K-means; periodic attributes. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.512 (year: 2009).
Cluster Mean-Field Approach to the Steady-State Phase Diagram of Dissipative Spin Systems

Directory of Open Access Journals (Sweden)

Jiasen Jin

2016-07-01

We show that short-range correlations have a dramatic impact on the steady-state phase diagram of quantum driven-dissipative systems. This effect, never observed in equilibrium, follows from the fact that ordering in the steady state is of dynamical origin, and is established only at very long times, whereas in thermodynamic equilibrium it arises from the properties of the (free) energy. To this end, by combining the cluster methods extensively used in equilibrium phase transitions with quantum trajectories and tensor-network techniques, we extend them to nonequilibrium phase transitions in dissipative many-body systems. We analyze in detail a model of spins-1/2 on a lattice interacting through an XYZ Hamiltonian, each of them coupled to an independent environment that induces incoherent spin flips. In the steady-state phase diagram derived from our cluster approach, the location of the phase boundaries and even its topology radically change, introducing reentrance of the paramagnetic phase as compared to the single-site mean field where correlations are neglected. Furthermore, a stability analysis of the cluster mean field indicates a susceptibility towards a possible incommensurate ordering, not present if short-range correlations are ignored.
Probable alpha and 14C cluster emission from hyper Ac nuclei

International Nuclear Information System (INIS)

Santhosh, K.P.

2013-01-01

A systematic study of the probability for the emission of ^{4}He and ^{14}C clusters from hyper ^{207-234}_{Λ}Ac and non-strange normal ^{207-234}Ac nuclei is performed for the first time using our fission model, the Coulomb and proximity potential model (CPPM). The predicted half-lives show that hyper ^{207-234}_{Λ}Ac nuclei are unstable against ^{4}He emission and that ^{14}C emission from hyper ^{217-228}_{Λ}Ac is favorable for measurement. Our study also shows that hyper ^{207-234}_{Λ}Ac nuclei are stable against hyper ^{4}_{Λ}He and ^{14}_{Λ}C emission. The role of neutron shell closure (N = 126) in the hyper ^{214}_{Λ}Fr daughter and the role of proton/neutron shell closure (Z ∼ 82, N = 126) in the hyper ^{210}_{Λ}Bi daughter are also revealed. As hyper-nuclei decay to normal nuclei by mesonic/non-mesonic decay, and since most of the predicted half-lives for ^{4}He and ^{14}C emission from normal Ac nuclei are favourable for measurement, we presume that alpha and ^{14}C cluster emission from hyper Ac nuclei can be detected in the laboratory in a cascade (two-step) process. (orig.)
On the Equivalence of Nonnegative Matrix Factorization and K-means- Spectral Clustering

Energy Technology Data Exchange (ETDEWEB)

Ding, Chris; He, Xiaofeng; Simon, Horst D.; Jin, Rong

2005-12-04

We provide a systematic analysis of nonnegative matrix factorization (NMF) relating to data clustering. We generalize the usual $X = FG^{T}$ decomposition to the symmetric $W = HH^{T}$ and $W = HSH^{T}$ decompositions. We show that (1) $W = HH^{T}$ is equivalent to Kernel K-means clustering and the Laplacian-based spectral clustering. (2) $X = FG^{T}$ is equivalent to simultaneous clustering of rows and columns of a bipartite graph. We emphasize the importance of orthogonality in NMF and the soft clustering nature of NMF. These results are verified with experiments on face images and newsgroups.
Study of ^{14}C Cluster Decay Half-Lives of Heavy Deformed Nuclei

Science.gov (United States)

Shamami, S. Rahimi; Pahlavani, M. R.

2018-01-01

A theoretical model based on deformed Woods-Saxon, Coulomb and centrifugal terms is constructed to evaluate the half-lives for the cluster radioactivity of various super heavy nuclei. Deformation has been applied to all parts of the potential containing the nuclear barrier for cluster decay. Also, both parent and daughter nuclei are considered to be deformed. The calculated results of ^{14}C cluster radioactivity half-lives are compared with available experimental data. A satisfactory agreement between theoretical and measured data is achieved. Also, the obtained half-lives for each decay family agree with the Geiger-Nuttall law.
Long-term surface EMG monitoring using K-means clustering and compressive sensing

Science.gov (United States)

Balouchestani, Mohammadreza; Krishnan, Sridhar

2015-05-01

In this work, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory in combination with the K-Singular Value Decomposition (K-SVD) method for clustering of long-term recordings of surface Electromyography (sEMG) signals. The long-term monitoring of sEMG signals aims at recording the electrical activity produced by muscles, which is a very useful procedure for treatment and diagnostic purposes as well as for detection of various pathologies. The proposed algorithm is examined for three scenarios of sEMG signals including a healthy person (sEMG-Healthy), a patient with myopathy (sEMG-Myopathy), and a patient with neuropathy (sEMG-Neuropathy), respectively. The proposed algorithm can easily scan large sEMG datasets of long-term sEMG recording. We test the proposed algorithm with Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) dimensionality reduction methods. Then, the output of the proposed algorithm is fed to K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers in order to calculate the clustering performance. The proposed algorithm achieves a classification accuracy of 99.22%. This ability allows reducing 17% of Average Classification Error (ACE), 9% of Training Error (TE), and 18% of Root Mean Square Error (RMSE). The proposed algorithm also reduces clustering energy consumption by 14% compared to the existing K-Means clustering algorithm.
Towards Enhancement of Performance of K-Means Clustering Using Nature-Inspired Optimization Algorithms

Directory of Open Access Journals (Sweden)

Simon Fong

2014-01-01

Full Text Available Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.

Towards Enhancement of Performance of K-Means Clustering Using Nature-Inspired Optimization Algorithms

Science.gov (United States)

Deb, Suash; Yang, Xin-She

2014-01-01

Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730
Combining symmetry collective states with coupled-cluster theory: Lessons from the Agassi model Hamiltonian

Science.gov (United States)

Hermes, Matthew R.; Dukelsky, Jorge; Scuseria, Gustavo E.

2017-06-01

The failure of single-reference coupled-cluster theory for strongly correlated many-body systems is flagged at the mean-field level by the spontaneous breaking of one or more physical symmetries of the Hamiltonian. Restoring the symmetry of the mean-field determinant by projection reveals that coupled-cluster theory fails because it factorizes high-order excitation amplitudes incorrectly. However, symmetry-projected mean-field wave functions do not account sufficiently for dynamic (or weak) correlation. Here we pursue a merger of symmetry projection and coupled-cluster theory, following previous work along these lines that utilized the simple Lipkin model system as a test bed [J. Chem. Phys. 146, 054110 (2017), 10.1063/1.4974989]. We generalize the concept of a symmetry-projected mean-field wave function to the concept of a symmetry projected state, in which the factorization of high-order excitation amplitudes in terms of low-order ones is guided by symmetry projection and is not exponential, and combine them with coupled-cluster theory in order to model the ground state of the Agassi Hamiltonian. This model has two separate channels of correlation and two separate physical symmetries which are broken under strong correlation. We show how the combination of symmetry collective states and coupled-cluster theory is effective in obtaining correlation energies and order parameters of the Agassi model throughout its phase diagram.
Forecasting hourly global solar radiation using hybrid k-means and nonlinear autoregressive neural network models

International Nuclear Information System (INIS)

Benmouiza, Khalil; Cheknane, Ali

2013-01-01

Highlights: • An unsupervised clustering algorithm with a neural network model was explored. • The forecasting results of solar radiation time series and the comparison of their performance were simulated. • A new method was proposed combining the k-means algorithm and NAR network to provide better prediction results. - Abstract: In this paper, we review our work on forecasting hourly global horizontal solar radiation based on the combination of an unsupervised k-means clustering algorithm and artificial neural networks (ANN). The k-means algorithm focuses on extracting useful information from the data, with the aim of modeling the time series behavior and finding patterns of the input space by clustering the data. On the other hand, nonlinear autoregressive (NAR) neural networks are powerful computational models for modeling and forecasting nonlinear time series. Taking advantage of both methods, a new method was proposed combining the k-means algorithm and NAR network to provide better forecasting results.
Are judgments a form of data clustering? Reexamining contrast effects with the k-means algorithm.

Science.gov (United States)

Boillaud, Eric; Molina, Guylaine

2015-04-01

A number of theories have been proposed to explain in precise mathematical terms how statistical parameters and sequential properties of stimulus distributions affect category ratings. Various contextual factors such as the mean, the midrange, and the median of the stimuli; the stimulus range; the percentile rank of each stimulus; and the order of appearance have been assumed to influence judgmental contrast. A data clustering reinterpretation of judgmental relativity is offered wherein the influence of the initial choice of centroids on judgmental contrast involves 2 combined frequency and consistency tendencies. Accounts of the k-means algorithm are provided, showing good agreement with effects observed on multiple distribution shapes and with a variety of interaction effects relating to the number of stimuli, the number of response categories, and the method of skewing. Experiment 1 demonstrates that centroid initialization accounts for contrast effects obtained with stretched distributions. Experiment 2 demonstrates that the iterative convergence inherent to the k-means algorithm accounts for the contrast reduction observed across repeated blocks of trials. The concept of within-cluster variance minimization is discussed, as is the applicability of a backward k-means calculation method for inferring, from empirical data, the values of the centroids that would serve as a representation of the judgmental context. (c) 2015 APA, all rights reserved.
Optimization Approach for Multi-scale Segmentation of Remotely Sensed Imagery under k-means Clustering Guidance

Directory of Open Access Journals (Sweden)

WANG Huixian

2015-05-01

Full Text Available In order to adapt to land cover segmentation at different scales, an optimized approach under the guidance of k-means clustering for multi-scale segmentation is proposed. At first, small-scale segmentation and k-means clustering are used to process the original images; then the result of k-means clustering is used to guide the object merging procedure, in which the Otsu threshold method is used to automatically select the impact factor of k-means clustering; finally we obtain segmentation results which are applicable to objects of different scales. The FNEA method is taken as an example, and segmentation experiments are done using a simulated image and a real remote sensing image from the GeoEye-1 satellite. Qualitative and quantitative evaluation demonstrates that the proposed method can obtain high quality segmentation results.
Re-weighted Discriminatively Embedded K-Means for Multi-view Clustering.

Science.gov (United States)

Xu, Jinglin; Han, Junwei; Nie, Feiping; Li, Xuelong

2017-02-08

In recent years, more and more multi-view data have been used in many real world applications. This kind of data (such as image data) is high dimensional and obtained from different feature extractors, which represent distinct perspectives of the data. How to cluster such data efficiently is a challenge. In this paper, we propose a novel multi-view clustering framework, called Re-weighted Discriminatively Embedded K-Means (RDEKM), for this task. The proposed method is a multi-view least-absolute residual model which induces robustness to efficiently mitigate the influence of outliers and realizes dimension reduction during multi-view clustering. Specifically, the proposed model is an unsupervised optimization scheme which utilizes Iterative Re-weighted Least Squares to solve the least-absolute residual and adaptively controls the distribution of multiple weights in a re-weighted manner based only on its own low-dimensional subspaces and a common clustering indicator matrix. Furthermore, theoretical analysis (including optimality and convergence analysis) and the optimization algorithm are also presented. Compared to several state-of-the-art multi-view clustering methods, the proposed method substantially improves the accuracy of the clustering results on widely used benchmark datasets, which demonstrates the superiority of the proposed work.
High-conductance states in a mean-field cortical network model

DEFF Research Database (Denmark)

Lerchner, Alexander; Ahmadi, Mandana; Hertz, John

2004-01-01

Measured responses from visual cortical neurons show that spike times tend to be correlated rather than exactly Poisson distributed. Fano factors vary and are usually greater than 1, indicating a tendency toward spikes being clustered. We show that this behavior emerges naturally in a balanced cortical network model with random connectivity and conductance-based synapses. We employ mean-field theory with correctly colored noise to describe temporal correlations in the neuronal activity. Our results illuminate the connection between two independent experimental findings: high-conductance states of cortical neurons in their natural environment, and variable non-Poissonian spike statistics with Fano factors greater than 1. (C) 2004 Elsevier B.V. All rights reserved.
Open source clustering software.

Science.gov (United States)

de Hoon, M J L; Imoto, S; Nolan, J; Miyano, S

2004-06-12

We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.
Ckmeans.1d.dp: Optimal k-means Clustering in One Dimension by Dynamic Programming.

Science.gov (United States)

Wang, Haizhou; Song, Mingzhou

2011-12-01

The heuristic k-means algorithm, widely used for cluster analysis, does not guarantee optimality. We developed a dynamic programming algorithm for optimal one-dimensional clustering. The algorithm is implemented as an R package called Ckmeans.1d.dp. We demonstrate its advantage in optimality and runtime over the standard iterative k-means algorithm.
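The idea of exact one-dimensional clustering by dynamic programming can be sketched in a few lines of Python. This is only an illustration of the general technique under simple assumptions (a plain O(k·n²) recurrence over sorted data), not the code of the Ckmeans.1d.dp package; the function name `optimal_kmeans_1d` is hypothetical.

```python
def optimal_kmeans_1d(values, k):
    """Return (minimal within-cluster sum of squares, clusters) for 1-D data."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us compute the squared-deviation cost of xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i, j):
        # Sum of squared deviations from the mean for the slice xs[i:j], j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best cost of splitting the first j sorted points into m clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # the last cluster is xs[i:j]
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    cut[m][j] = i

    # Follow the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return cost[k][n], clusters
```

Because an optimal one-dimensional partition always consists of contiguous runs of the sorted values, the recurrence only has to consider where the last cluster starts, which is what makes the exact solution tractable here.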
Mathematical modelling of complex contagion on clustered networks

Science.gov (United States)

O'sullivan, David J.; O'Keeffe, Gary; Fennell, Peter; Gleeson, James

2015-09-01

The spreading of behavior, such as the adoption of a new innovation, is influenced by the structure of social networks that interconnect the population. In the experiments of Centola (Science, 2010), adoption of new behavior was shown to spread further and faster across clustered-lattice networks than across corresponding random networks. This implies that the “complex contagion” effects of social reinforcement are important in such diffusion, in contrast to “simple” contagion models of disease-spread which predict that epidemics would grow more efficiently on random networks than on clustered networks. To accurately model complex contagion on clustered networks remains a challenge because the usual assumptions (e.g. of mean-field theory) regarding tree-like networks are invalidated by the presence of triangles in the network; the triangles are, however, crucial to the social reinforcement mechanism, which posits an increased probability of a person adopting behavior that has been adopted by two or more neighbors. In this paper we modify the analytical approach that was introduced by Hebert-Dufresne et al. (Phys. Rev. E, 2010), to study disease-spread on clustered networks. We show how the approximation method can be adapted to a complex contagion model, and confirm the accuracy of the method with numerical simulations. The analytical results of the model enable us to quantify the level of social reinforcement that is required to observe, as in Centola’s experiments, faster diffusion on clustered topologies than on random networks.
Mathematical modelling of complex contagion on clustered networks

Directory of Open Access Journals (Sweden)

David J. P. O'Sullivan

2015-09-01

Full Text Available The spreading of behavior, such as the adoption of a new innovation, is influenced by the structure of social networks that interconnect the population. In the experiments of Centola (Science, 2010), adoption of new behavior was shown to spread further and faster across clustered-lattice networks than across corresponding random networks. This implies that the complex contagion effects of social reinforcement are important in such diffusion, in contrast to simple contagion models of disease-spread which predict that epidemics would grow more efficiently on random networks than on clustered networks. To accurately model complex contagion on clustered networks remains a challenge because the usual assumptions (e.g. of mean-field theory) regarding tree-like networks are invalidated by the presence of triangles in the network; the triangles are, however, crucial to the social reinforcement mechanism, which posits an increased probability of a person adopting behavior that has been adopted by two or more neighbors. In this paper we modify the analytical approach that was introduced by Hebert-Dufresne et al. (Phys. Rev. E, 2010), to study disease-spread on clustered networks. We show how the approximation method can be adapted to a complex contagion model, and confirm the accuracy of the method with numerical simulations. The analytical results of the model enable us to quantify the level of social reinforcement that is required to observe, as in Centola’s experiments, faster diffusion on clustered topologies than on random networks.
Effect of clustering on the mechanical properties of SiC particulate-reinforced aluminum alloy 2024 metal matrix composites

International Nuclear Information System (INIS)

Hong, Soon-Jik; Kim, Hong-Moule; Huh, Dae; Suryanarayana, C.; Chun, Byong Sun

2003-01-01

Al 2024-SiC metal matrix composite (MMC) powders produced by centrifugal atomization were hot extruded to investigate the effect of clustering on their mechanical properties. Fracture toughness and tension tests were conducted on specimens reinforced with different volume fractions of SiC. A model was proposed to suggest that the strength of the MMCs could be estimated from the load transfer model approach that takes into consideration the extent of clustering. This model has been successful in predicting the experimentally observed strength and fracture toughness values of the Al 2024-SiC MMCs. On the basis of experimental observations, it is suggested that the strength of particulate-reinforced MMCs may be calculated from the relation $\sigma_y = \sigma_m V_m + \sigma_r (V_r - V_c) - \sigma_r V_c$, where $\sigma$ and $V$ represent the yield strength and volume fraction, respectively, and the subscripts m, r, and c represent the matrix, reinforcement, and clusters, respectively.
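The strength relation quoted in this abstract is simple enough to evaluate directly. The short Python sketch below does so; the function name and all numerical inputs are hypothetical placeholders chosen only to show the arithmetic, not values taken from the paper.

```python
def mmc_yield_strength(sigma_m, sigma_r, v_m, v_r, v_c):
    """Evaluate sigma_y = sigma_m*V_m + sigma_r*(V_r - V_c) - sigma_r*V_c.

    sigma_m, sigma_r: yield strengths of matrix and reinforcement (same units).
    v_m, v_r, v_c:    volume fractions of matrix, reinforcement and clusters.
    """
    return sigma_m * v_m + sigma_r * (v_r - v_c) - sigma_r * v_c


# Illustrative inputs only (hypothetical, in MPa and volume fractions).
no_clusters = mmc_yield_strength(300.0, 1000.0, 0.9, 0.1, 0.0)
with_clusters = mmc_yield_strength(300.0, 1000.0, 0.9, 0.1, 0.02)
print(no_clusters, with_clusters)
```

With V_c = 0 the relation reduces to the ordinary rule of mixtures; each unit of clustered volume fraction is subtracted twice, since the two reinforcement terms combine to σ_r (V_r - 2 V_c).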
Nonlinear damage effect in graphene synthesis by C-cluster ion implantation

International Nuclear Information System (INIS)

Zhang Rui; Zhang Zaodi; Wang Zesong; Wang Shixu; Wang Wei; Fu Dejun; Liu Jiarui

2012-01-01

We present few-layer graphene synthesis by negative carbon cluster ion implantation with C_1, C_2, and C_4 at energies below 20 keV. The small C-clusters were produced by a source of negative ions by cesium sputtering with medium beam current. We show that the nonlinear effect in cluster-induced damage is favorable for graphene precipitation compared with monomer carbon ions. The nonlinear damage effect in cluster ion implantation shows a positive impact on disorder reduction, film uniformity, and surface smoothness in graphene synthesis.
Percolation technique for galaxy clustering

Science.gov (United States)

Klypin, Anatoly; Shandarin, Sergei F.

1993-01-01

We study percolation in mass and galaxy distributions obtained in 3D simulations of the CDM, C + HDM, and the power law (n = -1) models in the Omega = 1 universe. Percolation statistics is used here as a quantitative measure of the degree to which a mass or galaxy distribution is of a filamentary or cellular type. The very fast code used calculates the statistics of clusters along with the direct detection of percolation. We found that the two parameters mu(infinity), characterizing the size of the largest cluster, and mu-squared, characterizing the weighted mean size of all clusters excluding the largest one, are extremely useful for evaluating the percolation threshold. An advantage of using these parameters is their low sensitivity to boundary effects. We show that both the CDM and the C + HDM models are extremely filamentary both in mass and galaxy distribution. The percolation thresholds for the mass distributions are determined.
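To make the two statistics concrete, the following Python sketch computes a largest-cluster fraction and a size-weighted mean cluster size on a small occupied-cell grid. Everything here is an assumption for illustration: the abstract does not give the exact normalisation of mu(infinity) or mu-squared, the grid is 2-D rather than 3-D, clusters are joined through nearest neighbours, and the function names are hypothetical.

```python
from collections import deque


def cluster_sizes(occupied):
    """Sizes of nearest-neighbour clusters in a set of occupied (row, col) cells."""
    seen = set()
    sizes = []
    for start in occupied:
        if start in seen:
            continue
        # Breadth-first flood fill over the cluster containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            r, c = queue.popleft()
            size += 1
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in occupied and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return sizes


def percolation_stats(occupied):
    """Return (largest-cluster fraction, weighted mean size of the other clusters)."""
    sizes = sorted(cluster_sizes(occupied), reverse=True)
    largest, rest = sizes[0], sizes[1:]
    frac_largest = largest / sum(sizes)
    # Assumed convention: sum(s^2) / sum(s) over all clusters except the largest.
    weighted_mean = sum(s * s for s in rest) / sum(rest) if rest else 0.0
    return frac_largest, weighted_mean
```

Near a percolation threshold the largest-cluster fraction rises sharply while the weighted mean size of the remaining clusters peaks, which is why the pair is convenient for locating the threshold.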
A hybrid sequential approach for data clustering using K-Means and ...

African Journals Online (AJOL)

Experiments on four kinds of data sets have been conducted. The obtained results are compared with K-Means, PSO, Hybrid, K-Means+Genetic Algorithm and it has been found that the proposed algorithm generates more accurate, robust and better clustering results. International Journal of Engineering, Science and ...
An Improved Fuzzy C-Means Algorithm for the Implementation of Demand Side Management Measures

Directory of Open Access Journals (Sweden)

Ioannis Panapakidis

2017-09-01

Full Text Available Load profiling refers to a procedure that leads to the formulation of daily load curves and consumer classes regarding the similarity of the curve shapes. This procedure incorporates a set of unsupervised machine learning algorithms. While many crisp clustering algorithms have been proposed for grouping load curves into clusters, only one soft clustering algorithm is utilized for the aforementioned purpose, namely the Fuzzy C-Means (FCM) algorithm. Since the benefits of soft clustering are demonstrated in a variety of applications, the potential of introducing a novel modification of the FCM in the electricity consumer clustering process is examined. Additionally, this paper proposes a novel Demand Side Management (DSM) strategy for load management of consumers that are eligible for the implementation of Real-Time Pricing (RTP) schemes. The DSM strategy is formulated as a constrained optimization problem that can be easily solved, making it a useful tool for retailers’ decision-making framework in competitive electricity markets.
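For readers unfamiliar with soft clustering, the standard Fuzzy C-Means iteration can be sketched as below. This is the textbook FCM update, not the modified algorithm proposed in the paper; the function name, default parameters, and the toy data in the usage note are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook FCM: returns (centers, memberships) for a list of equal-length vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(c):
            w = [u[j][i] ** m for j in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[j] * points[j][d] for j in range(n)) / wsum for d in range(dim)]
            )

        # Membership update from the distances to the new centres.
        shift = 0.0
        for j in range(n):
            dist = [
                max(sum((points[j][d] - centers[i][d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for i in range(c)
            ]
            for i in range(c):
                new = 1.0 / sum((dist[i] / dist[k]) ** exponent for k in range(c))
                shift = max(shift, abs(new - u[j][i]))
                u[j][i] = new
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u
```

Called on a toy set such as `[[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]` with `c=2`, each point receives a membership in both clusters that sums to 1, which is the property that distinguishes FCM from crisp methods such as k-means. In a load-profiling setting each vector would be a daily load curve rather than a single number.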
Fuzzy Clustering

DEFF Research Database (Denmark)

Berks, G.; Keyserlingk, Diedrich Graf von; Jantzen, Jan

2000-01-01

A symptom is a condition indicating the presence of a disease, especially when regarded as an aid in diagnosis. Symptoms are the smallest units indicating the existence of a disease. A syndrome on the other hand is an aggregate, set or cluster of concurrent symptoms which together indicate...... and clustering are the basic concerns in medicine. Classification depends on definitions of the classes and their required degree of participant of the elements in the cases' symptoms. In medicine imprecise conditions are the rule and therefore fuzzy methods are much more suitable than crisp ones. Fuzzy c-mean clustering is an easy and well improved tool, which has been applied in many medical fields. We used c-mean fuzzy clustering after feature extraction from an aphasia database. Factor analysis was applied on a correlation matrix of 26 symptoms of language disorders and led to five factors. The factors...
Plasma cluster acceleration by means of external magnetic fields

International Nuclear Information System (INIS)

Kracik, J.; Maloch, J.; Sobra, K.

1975-01-01

Electromagnetic shock tubes are used not only for the creation and study of shock waves but also for pulsed plasma acceleration. By applying rail acceleration, the external magnetic field perpendicular to the plasma cluster velocity can be increased. The present work confirms, theoretically and experimentally, the influence of the external magnetic field on plasma cluster acceleration when the 'snow plough' model is used. (Auth.)
A Cluster-based Approach Towards Detecting and Modeling Network Dictionary Attacks

Directory of Open Access Journals (Sweden)

A. Tajari Siahmarzkooh

2016-12-01

Full Text Available In this paper, we provide an approach to detect network dictionary attacks using a data set collected as flows, from which a clustered graph is derived. These flows provide an aggregated view of the network traffic in which the exchanged packets in the network are considered so that more internally connected nodes would be clustered. We show that dictionary attacks could be detected through some parameters, namely the number and the weight of clusters in time series and their evolution over time. Additionally, a Markov model based on the average weight of clusters is also created. Finally, by means of our suggested model, we demonstrate that artificial clusters of the flows are created for normal and malicious traffic. The results of the proposed approach on the CAIDA 2007 data set suggest a high accuracy for the model and, therefore, it provides a proper method for detecting the dictionary attack.
K-means-clustering-based fiber nonlinearity equalization techniques for 64-QAM coherent optical communication system.

Science.gov (United States)

Zhang, Junfeng; Chen, Wei; Gao, Mingyi; Shen, Gangxiang

2017-10-30

In this work, we proposed two k-means-clustering-based algorithms to mitigate the fiber nonlinearity for 64-quadrature amplitude modulation (64-QAM) signal, the training-sequence assisted k-means algorithm and the blind k-means algorithm. We experimentally demonstrated the proposed k-means-clustering-based fiber nonlinearity mitigation techniques in 75-Gb/s 64-QAM coherent optical communication system. The proposed algorithms have reduced clustering complexity and low data redundancy and they are able to quickly find appropriate initial centroids and select correctly the centroids of the clusters to obtain the global optimal solutions for large k value. We measured the bit-error-ratio (BER) performance of 64-QAM signal with different launched powers into the 50-km single mode fiber and the proposed techniques can greatly mitigate the signal impairments caused by the amplified spontaneous emission noise and the fiber Kerr nonlinearity and improve the BER performance.

Elastic K-means using posterior probability.

Science.gov (United States)

Zheng, Aihua; Jiang, Bo; Li, Yan; Zhang, Xuehan; Ding, Chris

2017-01-01

The widely used K-means clustering is a hard clustering algorithm. Here we propose an Elastic K-means clustering model (EKM) using posterior probability with soft capability, where each data point can belong to multiple clusters fractionally, and show the benefit of the proposed Elastic K-means. Furthermore, in many applications, besides vector attributes information, pairwise relations (graph information) are also available. Thus we integrate EKM with Normalized Cut graph clustering into a single clustering formulation. Finally, we provide several matrix inequalities which are useful for matrix formulations of learning models. Based on these results, we prove the correctness and the convergence of EKM algorithms. Experimental results on six benchmark datasets demonstrate the effectiveness of the proposed EKM and its integrated model.
Paternal age related schizophrenia (PARS): Latent subgroups detected by k-means clustering analysis.

Science.gov (United States)

Lee, Hyejoo; Malaspina, Dolores; Ahn, Hongshik; Perrin, Mary; Opler, Mark G; Kleinhaus, Karine; Harlap, Susan; Goetz, Raymond; Antonius, Daniel

2011-05-01

Paternal age related schizophrenia (PARS) has been proposed as a subgroup of schizophrenia with distinct etiology, pathophysiology and symptoms. This study uses a k-means clustering analysis approach to generate hypotheses about differences between PARS and other cases of schizophrenia. We studied PARS (operationally defined as not having any family history of schizophrenia among first and second-degree relatives and fathers' age at birth ≥ 35 years) in a series of schizophrenia cases recruited from a research unit. Data were available on demographic variables, symptoms (Positive and Negative Syndrome Scale; PANSS), cognitive tests (Wechsler Adult Intelligence Scale-Revised; WAIS-R) and olfaction (University of Pennsylvania Smell Identification Test; UPSIT). We conducted a series of k-means clustering analyses to identify clusters of cases containing high concentrations of PARS. Two analyses generated clusters with high concentrations of PARS cases. The first analysis (N=136; PARS=34) revealed a cluster containing 83% PARS cases, in which the patients showed a significant discrepancy between verbal and performance intelligence. The mean paternal and maternal ages were 41 and 33, respectively. The second analysis (N=123; PARS=30) revealed a cluster containing 71% PARS cases, of which 93% were females; the mean age of onset of psychosis, at 17.2, was significantly early. These results strengthen the evidence that PARS cases differ from other patients with schizophrenia. Hypothesis-generating findings suggest that features of PARS may include a discrepancy between verbal and performance intelligence, and in females, an early age of onset. These findings provide a rationale for separating these phenotypes from others in future clinical, genetic and pathophysiologic studies of schizophrenia and in considering responses to treatment. Copyright © 2011 Elsevier B.V. All rights reserved.
Evaluation of stability of k-means cluster ensembles with respect to random initialization.

Science.gov (United States)

Kuncheva, Ludmila I; Vetrov, Dmitry P

2006-11-01

Many clustering algorithms, including cluster ensembles, rely on a random component. Stability of the results across different runs is considered to be an asset of the algorithm. The cluster ensembles considered here are based on k-means clusterers. Each clusterer is assigned a random target number of clusters, k and is started from a random initialization. Here, we use 10 artificial and 10 real data sets to study ensemble stability with respect to random k, and random initialization. The data sets were chosen to have a small number of clusters (two to seven) and a moderate number of data points (up to a few hundred). Pairwise stability is defined as the adjusted Rand index between pairs of clusterers in the ensemble, averaged across all pairs. Nonpairwise stability is defined as the entropy of the consensus matrix of the ensemble. An experimental comparison with the stability of the standard k-means algorithm was carried out for k from 2 to 20. The results revealed that ensembles are generally more stable, markedly so for larger k. To establish whether stability can serve as a cluster validity index, we first looked at the relationship between stability and accuracy with respect to the number of clusters, k. We found that such a relationship strongly depends on the data set, varying from almost perfect positive correlation (0.97, for the glass data) to almost perfect negative correlation (-0.93, for the crabs data). We propose a new combined stability index to be the sum of the pairwise individual and ensemble stabilities. This index was found to correlate better with the ensemble accuracy. Following the hypothesis that a point of stability of a clustering algorithm corresponds to a structure found in the data, we used the stability measures to pick the number of clusters. The combined stability index gave best results.
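The pairwise stability measure described in this abstract, the adjusted Rand index averaged over all pairs of clusterers, can be sketched with the Python standard library alone. The function names are hypothetical, and the label lists stand in for the outputs of repeated clustering runs; how the paper generates those runs is not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same points."""
    n = len(a)
    sum_ij = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every point in one cluster.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def pairwise_stability(labelings):
    """Average adjusted Rand index over all pairs of labelings."""
    scores = [adjusted_rand_index(x, y) for x, y in combinations(labelings, 2)]
    return sum(scores) / len(scores)
```

The index is invariant to how clusters are numbered, so two runs that find the same partition under different labels still score 1, which is what makes it suitable for comparing independently initialised clusterers.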
The k-means clustering technique: General considerations and implementation in Mathematica

Directory of Open Access Journals (Sweden)

Laurence Morissette

2013-02-01

Full Text Available Data clustering techniques are valuable tools for researchers working with large databases of multivariate data. In this tutorial, we present a simple yet powerful one: the k-means clustering technique, through three different algorithms: the Forgy/Lloyd algorithm, the MacQueen algorithm and the Hartigan and Wong algorithm. We then present an implementation in Mathematica and various examples of the different options available to illustrate the application of the technique.
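As a rough illustration of the first of these algorithms, the batch Forgy/Lloyd iteration alternates an assignment step and a mean-update step until the centers stop moving. The sketch below is written in Python rather than Mathematica, is not the tutorial's implementation, and assumes the caller supplies the initial centers; the function name is hypothetical.

```python
def lloyd_kmeans(points, centers, max_iter=100):
    """Batch (Forgy/Lloyd-style) k-means on tuples of floats."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
            for pt in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels
```

The MacQueen and Hartigan and Wong variants differ mainly in when centers are updated (after each reassignment rather than once per pass), which this sketch does not cover.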
A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

Science.gov (United States)

Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

2016-02-01

Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.
Development of a Semi-Automatic Technique for Flow Estimation using Optical Flow Registration and k-means Clustering on Two Dimensional Cardiovascular Magnetic Resonance Flow Images

DEFF Research Database (Denmark)

Brix, Lau; Christoffersen, Christian P. V.; Kristiansen, Martin Søndergaard

was then categorized into groups by the k-means clustering method. Finally, the cluster containing the vessel under investigation was selected manually by a single mouse click. All calculations were performed on a Nvidia 8800 GTX graphics card using the Compute Unified Device Architecture (CUDA) extension to the C...
Investigation of clustering effects in the reaction pp→ppπ+π+π-π- at 19 GeV/c

International Nuclear Information System (INIS)

Allan, J.; Blomqvist, G.

1975-07-01

Possible production of high multiplicity clusters of secondaries in the reaction pp→ppπ⁺π⁺π⁻π⁻ at 19 GeV/c is investigated. The experimental distribution of dispersion versus mean for the pion rapidities shows, compared to simple one component models, an excess of events in the regions where a single diffraction dissociation process is expected to populate. A method based on the Cramér-von Mises statistical test combined with an operational method for selection of quasi two body reactions is used for investigation of clustering effects in phase space caused by different reaction mechanisms. The analysis indicates that the distribution of experimental events in phase space has mainly two population centers, one consisting of events with the kinematical configuration expected from a single diffraction dissociation process. (Auth.)
Topic modeling for cluster analysis of large biological and medical datasets.

Science.gov (United States)

Zhao, Weizhong; Zou, Wen; Chen, James J

2014-01-01

The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting
An improved initialization center k-means clustering algorithm based on distance and density

Science.gov (United States)

Duan, Yanling; Liu, Qun; Xia, Shuyin

2018-04-01

The random choice of initial cluster centers in the k-means algorithm makes clustering results sensitive to outlying samples and unstable across repeated runs. To address this, a center initialization method based on larger distance and higher density is proposed. The reciprocal of the weighted average distance is used to represent sample density, and the samples with larger distance and higher density are selected as the initial cluster centers to optimize the clustering results. A clustering evaluation method based on distance and density is then designed to verify the feasibility and practicality of the algorithm. Experimental results on UCI data sets show that the algorithm has a certain stability and practicality.
AN EFFICIENT INITIALIZATION METHOD FOR K-MEANS CLUSTERING OF HYPERSPECTRAL DATA

Directory of Open Access Journals (Sweden)

A. Alizade Naeini

2014-10-01

Full Text Available K-means is definitely the most frequently used partitional clustering algorithm in the remote sensing community. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of cluster centers. This problem worsens for high-dimensional data such as hyperspectral remotely sensed imagery. To tackle this problem, in this paper, the spectral signatures of the endmembers in the image scene are extracted and used as the initial positions of the cluster centers. For this purpose, in the first step, a Neyman–Pearson detection theory based eigen-thresholding method (i.e., the HFC method) has been employed to estimate the number of endmembers in the image. Afterwards, the spectral signatures of the endmembers are obtained using the Minimum Volume Enclosing Simplex (MVES) algorithm. Eventually, these spectral signatures are used to initialize the k-means clustering algorithm. The proposed method is implemented on a hyperspectral dataset acquired by the ROSIS sensor with 103 spectral bands over the Pavia University campus, Italy. For comparative evaluation, two other commonly used initialization methods (i.e., the Bradley & Fayyad (BF) and Random methods) are implemented and compared. The confusion matrix, overall accuracy and Kappa coefficient are employed to assess the methods’ performance. The evaluations demonstrate that the proposed solution outperforms the other initialization methods and can be applied for unsupervised classification of hyperspectral imagery for landcover mapping.
Automatic classification of canine PRG neuronal discharge patterns using K-means clustering.

Science.gov (United States)

Zuperku, Edward J; Prkic, Ivana; Stucke, Astrid G; Miller, Justin R; Hopp, Francis A; Stuth, Eckehard A

2015-02-01

Respiratory-related neurons in the parabrachial-Kölliker-Fuse (PB-KF) region of the pons play a key role in the control of breathing. The neuronal activities of these pontine respiratory group (PRG) neurons exhibit a variety of inspiratory (I), expiratory (E), phase spanning and non-respiratory related (NRM) discharge patterns. Due to the variety of patterns, it can be difficult to classify them into distinct subgroups according to their discharge contours. This report presents a method that automatically classifies neurons according to their discharge patterns and derives an average subgroup contour of each class. It is based on the K-means clustering technique and it is implemented via SigmaPlot User-Defined transform scripts. The discharge patterns of 135 canine PRG neurons were classified into seven distinct subgroups. Additional methods for choosing the optimal number of clusters are described. Analysis of the results suggests that the K-means clustering method offers a robust objective means of both automatically categorizing neuron patterns and establishing the underlying archetypical contours of subtypes based on the discharge patterns of group of neurons. Published by Elsevier B.V.
Hierarchical Adaptive Means (HAM) clustering for hardware-efficient, unsupervised and real-time spike sorting.

Science.gov (United States)

Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G

2014-09-30

This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
Determining characteristic principal clusters in the “cluster-plus-glue-atom” model

International Nuclear Information System (INIS)

Du, Jinglian; Wen, Bin; Melnik, Roderick; Kawazoe, Yoshiyuki

2014-01-01

The “cluster-plus-glue-atom” model can easily describe the structure of complex metallic alloy phases. However, the biggest obstacle limiting the application of this model is that it is difficult to determine the characteristic principal cluster. In the case when interatomic force constants (IFCs) inside the cluster lead to stronger interaction than the interaction between the clusters, a new rule for determining the characteristic principal cluster in the “cluster-plus-glue-atom” model has been proposed on the basis of IFCs. To verify this new rule, the alloy phases in Cu–Zr and Al–Ni–Zr systems have been tested, and our results indicate that the present new rule for determining characteristic principal clusters is effective and reliable
Fuzzy cluster means algorithm for the diagnosis of confusable disease

African Journals Online (AJOL)

... end platform while Microsoft Access was used as the database application. The system gives a measure of each disease within a set of confusable disease. The proposed system had a classification accuracy of 60%. Keywords: Artificial Intelligence, expert system Fuzzy cluster – means Algorithm, physician, Diagnosis ...
K-mean clustering algorithm for processing signals from compound semiconductor detectors

International Nuclear Information System (INIS)

Tada, Tsutomu; Hitomi, Keitaro; Wu, Yan; Kim, Seong-Yun; Yamazaki, Hiromichi; Ishii, Keizo

2011-01-01

The K-mean clustering algorithm was employed for processing signal waveforms from TlBr detectors. The signal waveforms were classified based on their shapes, which reflect the charge collection process in the detector. The classified signal waveforms were processed individually to suppress the pulse height variation of signals due to the charge collection loss. The obtained energy resolution of a ¹³⁷Cs spectrum measured with a 0.5 mm thick TlBr detector was 1.3% FWHM by employing 500 clusters.
Raman spectroscopy of few-layer graphene prepared by C2–C6 cluster ion implantation

International Nuclear Information System (INIS)

Wang, Z.S.; Zhang, R.; Zhang, Z.D.; Huang, Z.H.; Liu, C.S.; Fu, D.J.; Liu, J.R.

2013-01-01

Few-layer graphene has been prepared on 300 nm-thick Ni films by C2–C6 cluster ion implantation at 20 keV/cluster. Raman spectroscopy reveals significant influence of the number of atoms in the cluster, the implantation dose, and thermal treatment on the structure of the graphene layers. In particular, the graphene samples exhibit a sharp G peak at 1584 cm⁻¹ and 2D peaks at 2711–2717 cm⁻¹. The I_G/I_2D ratios higher than 1.70 and I_G/I_D ratio as high as 1.95 confirm that graphene sheets with low density of defects have been synthesized with much improved quality by ion implantation with larger clusters of C4–C6
New experimental investigation of cluster structures in ¹⁰Be and ¹⁶C neutron-rich nuclei

Science.gov (United States)

Dell'Aquila, L.; Acosta, D.; Auditore, L.; Cardella, G.; De Filippo, E.; De Luca, S.; Francalanza, L.; Gnoffo, B.; Lanzalone, G.; Lombardo, I.; Martorana, N. S.; Norella, S.; Pagano, A.; Pagano, E. V.; Papa, M.; Pirrone, S.; Politi, G.; Quattrocchi, L.; Rizzo, F.; Russotto, P.; Trifirò, A.; Trimarchi, M.; Verde, G.; Vigilante, M.

2017-11-01

The existence of cluster structures in ¹⁰Be and ¹⁶C neutron-rich isotopes is investigated via projectile break-up reactions induced on a polyethylene (CH₂) target. We used a fragmentation beam constituted by 55 MeV/u ¹⁰Be and 49 MeV/u ¹⁶C beams provided by the FRIBs facility at INFN-LNS. Invariant mass spectra of ⁴He + ⁶He and ⁶He + ¹⁰Be breakup fragments are reconstructed by means of the CHIMERA 4π detector to investigate the presence of excited states of projectile nuclei characterized by cluster structure. In the first case, we suggest the presence of a new state in ¹⁰Be at 13.5 MeV. A non-vanishing yield corresponding to 20.6 MeV excitation energy of ¹⁶C was observed in the ⁶He + ¹⁰Be cluster decay channel. To improve the results of the present analysis, a new experiment has been performed recently, taking advantage of the coupling of CHIMERA and FARCOS. In the paper we describe the data reduction process of the new experiment together with preliminary results.
Modeling of correlated data with informative cluster sizes: An evaluation of joint modeling and within-cluster resampling approaches.

Science.gov (United States)

Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S

2017-08-01

Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
Cluster evolution and critical cluster sizes for the square and triangular lattice Ising models using lattice animals and Monte Carlo simulations

NARCIS (Netherlands)

Eising, G.; Kooi, B. J.

2012-01-01

Growth and decay of clusters at temperatures below T_c have been studied for a two-dimensional Ising model for both square and triangular lattices using Monte Carlo (MC) simulations and the enumeration of lattice animals. For the lattice animals, all unique cluster configurations with their internal
USING THE K-MEANS ALGORITHM TO DETERMINE EMPLOYEES ELIGIBLE TO ATTEND THE ASSESSMENT CENTER FOR CLUSTERING THE SDP PROGRAM

Directory of Open Access Journals (Sweden)

Iin Parlina

2018-01-01

Full Text Available Data mining is a technique for processing large amounts of data for grouping. Data mining offers several grouping methods; the one used by the author here is K-Means. The author groups the 2017 SDP program list data to find out which employees are eligible to pass the SDP program and so may register for the Assessment Center. The grouping is based on the criteria in the SDP program data. In this study, the author applies the K-Means clustering algorithm to group SDP program data at PT.Bank Syariah. Entry into the SDP program is generally decided by the program's provisions and parameters alone, but in this study the grouping follows SDP program criteria such as employee discipline, employee work targets, and SDP program compliance. The author uses these criteria so that the resulting grouping is more optimal. The aim of the grouping is to form SDP groups within the SDP program using the K-Means clustering algorithm. The grouping yields three groups: Pass (Lolos), Almost Pass (Hampir Lolos), and Fail (Tidak Lolos). The cluster centers are Cluster-1 = 8;66;13, Cluster-2 = 10;71;14, and Cluster-3 = 7;60;12. These cluster centers were obtained after several iterations, yielding optimal cluster centers.

Co-clustering models, algorithms and applications

CERN Document Server

Govaert, Gérard

2013-01-01

Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixture
Clustering Dycom

KAUST Repository

Minku, Leandro L.

2017-10-06

Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom's predictive performance. Aims: This paper investigates whether clustering methods can be used to help finding good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom's CC subsets.
K-means clustering for support construction in diffractive imaging.

Science.gov (United States)

Hattanda, Shunsuke; Shioya, Hiroyuki; Maehara, Yosuke; Gohara, Kazutoshi

2014-03-01

A method for constructing an object support based on K-means clustering of the object-intensity distribution is newly presented in diffractive imaging. This releases the adjustment of unknown parameters in the support construction, and it is well incorporated with the Gerchberg and Saxton diagram. A simple numerical simulation reveals that the proposed method is effective for dynamically constructing the support without an initial prior support.
A Decentralized Fuzzy C-Means-Based Energy-Efficient Routing Protocol for Wireless Sensor Networks

Directory of Open Access Journals (Sweden)

Osama Moh’d Alia

2014-01-01

Full Text Available Energy conservation in wireless sensor networks (WSNs) is a vital consideration when designing wireless networking protocols. In this paper, we propose a Decentralized Fuzzy Clustering Protocol, named DCFP, which minimizes total network energy dissipation to promote maximum network lifetime. The process of constructing the infrastructure for a given WSN is performed only once at the beginning of the protocol at a base station, which remains unchanged throughout the network’s lifetime. In this initial construction step, a fuzzy C-means algorithm is adopted to allocate sensor nodes into their most appropriate clusters. Subsequently, the protocol runs its rounds where each round is divided into a CH-Election phase and a Data Transmission phase. In the CH-Election phase, the election of new cluster heads is done locally in each cluster where a new multicriteria objective function is proposed to enhance the quality of elected cluster heads. In the Data Transmission phase, the sensing and data transmission from each sensor node to their respective cluster head is performed and cluster heads in turn aggregate and send the sensed data to the base station. Simulation results demonstrate that the proposed protocol improves network lifetime, data delivery, and energy consumption compared to other well-known energy-efficient protocols.
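The initial construction step relies on fuzzy C-means, whose core is an alternating update of soft memberships and weighted centers. The Python sketch below shows the standard textbook updates only; the function names, the fuzzifier default `m=2.0`, and the treatment of node positions as coordinate tuples are assumptions, and nothing here reproduces DCFP's cluster-head election or its objective function.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[j][i]: membership of point j in cluster i (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for pt in points:
        dists = [math.dist(pt, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a center: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists])
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Recompute each center as the membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * pt[d] for w, pt in zip(weights, points)) / total
            for d in range(dim)))
    return centers
```

Calling the two functions in turn until the centers stop changing gives the usual iteration; a sensor node would then be allocated to the cluster in which its membership is largest.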
A decentralized fuzzy C-means-based energy-efficient routing protocol for wireless sensor networks.

Science.gov (United States)

Alia, Osama Moh'd

2014-01-01

Energy conservation in wireless sensor networks (WSNs) is a vital consideration when designing wireless networking protocols. In this paper, we propose a Decentralized Fuzzy Clustering Protocol, named DCFP, which minimizes total network energy dissipation to promote maximum network lifetime. The process of constructing the infrastructure for a given WSN is performed only once at the beginning of the protocol at a base station, which remains unchanged throughout the network's lifetime. In this initial construction step, a fuzzy C-means algorithm is adopted to allocate sensor nodes into their most appropriate clusters. Subsequently, the protocol runs its rounds where each round is divided into a CH-Election phase and a Data Transmission phase. In the CH-Election phase, the election of new cluster heads is done locally in each cluster where a new multicriteria objective function is proposed to enhance the quality of elected cluster heads. In the Data Transmission phase, the sensing and data transmission from each sensor node to their respective cluster head is performed and cluster heads in turn aggregate and send the sensed data to the base station. Simulation results demonstrate that the proposed protocol improves network lifetime, data delivery, and energy consumption compared to other well-known energy-efficient protocols. PMID:25162060
Generalized Smoluchowski equation with correlation between clusters

International Nuclear Information System (INIS)

Sittler, Lionel

2008-01-01

In this paper we compute new reaction rates of the Smoluchowski equation which takes into account correlations. The new rate $K = K^{MF} + K^{C}$ is the sum of two terms. The first term is the known Smoluchowski rate with the mean-field approximation. The second takes into account a correlation between clusters. For this purpose we introduce the average path of a cluster. We relate the length of this path to the reaction rate of the Smoluchowski equation. We solve the implicit dependence between the average path and the density of clusters. We show that this correlation length is the same for all clusters. Our result depends strongly on the spatial dimension $d$. The mean-field term $K^{MF}_{i,j} = (D_i + D_j)(r_j + r_i)^{d-2}$, which vanishes for $d = 1$ and is valid up to logarithmic correction for $d = 2$, is the usual rate found with the Smoluchowski model without correlation (where $r_i$ is the radius and $D_i$ is the diffusion constant of the cluster). We compute a new rate: the correlation rate $K^{C}_{i,j} = (D_i + D_j)(r_j + r_i)^{d-1} M((d-1)/d_f)$ is valid for $d \geq 1$ (where $M(\alpha) = \sum_{i=1}^{+\infty} i^{\alpha} N_i$ is the moment of the density of clusters and $d_f$ is the fractal dimension of the cluster). The result is valid for a large class of diffusion processes and mass-radius relations. This approach confirms some analytical solutions in $d = 1$ found with other methods. We also show Monte Carlo simulations which illustrate some exact new solvable models
CUSTOMER SEGMENTATION ANALYSIS USING A COMBINATION OF THE RFM MODEL AND CLUSTERING TECHNIQUES

Directory of Open Access Journals (Sweden)

Beta Estri Adiana

2018-04-01

Full Text Available Intense competition in business motivates small and medium enterprises (SMEs) to manage customer services as well as possible. Customer loyalty can be improved by grouping customers into several groups and determining appropriate and effective marketing strategies for each group. Customer segmentation can be performed with a data mining approach using a clustering method. The main purpose of this paper is to segment customers and measure their loyalty to an SME’s product. The study uses the CRISP-DM method, which consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment. The K-Means algorithm is used for cluster formation, and RapidMiner is the tool used to evaluate the resulting clusters. Cluster formation is based on RFM (recency, frequency, monetary) analysis. The Davies Bouldin Index (DBI) is used to find the optimal number of clusters (k). The customers are divided into 3 clusters: the first cluster contains 30 customers in the typical customer category, the second contains 8 customers in the superstar customer category, and the third contains 89 customers in the dormant customer category.
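To make the RFM step concrete, the Python sketch below derives recency, frequency and monetary features from a transaction list and rescales them before clustering. The study itself used RapidMiner; the function names, the `(customer, date, amount)` record layout and the min-max scaling are assumptions made for illustration only.

```python
from datetime import date


def rfm_table(transactions, today):
    """Map customer -> (recency in days, frequency, monetary total)."""
    table = {}
    for customer, day, amount in transactions:
        last, freq, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), freq + 1, total + amount)
    return {
        c: ((today - last).days, freq, total)
        for c, (last, freq, total) in table.items()
    }


def min_max_scale(rows):
    """Scale each column to [0, 1] so no single RFM feature dominates distances."""
    cols = list(zip(*rows))
    spans = [(min(c), max(c) - min(c)) for c in cols]
    return [
        tuple((v - lo) / span if span else 0.0 for v, (lo, span) in zip(row, spans))
        for row in rows
    ]
```

The scaled rows can then be passed to any k-means routine, with the number of clusters chosen by an index such as Davies Bouldin.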
Prediction of chemotherapeutic response in bladder cancer using K-means clustering of dynamic contrast-enhanced (DCE)-MRI pharmacokinetic parameters.

Science.gov (United States)

Nguyen, Huyen T; Jia, Guang; Shah, Zarine K; Pohar, Kamal; Mortazavi, Amir; Zynger, Debra L; Wei, Lai; Yang, Xiangyu; Clark, Daniel; Knopp, Michael V

2015-05-01

To apply k-means clustering of two pharmacokinetic parameters derived from 3T dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to predict the chemotherapeutic response in bladder cancer at the mid-cycle timepoint. With the predetermined number of three clusters, k-means clustering was performed on nondimensionalized Amp and kep estimates of each bladder tumor. Three cluster volume fractions (VFs) were calculated for each tumor at baseline and mid-cycle. The changes of three cluster VFs from baseline to mid-cycle were correlated with the tumor's chemotherapeutic response. Receiver-operating-characteristics curve analysis was used to evaluate the performance of each cluster VF change as a biomarker of chemotherapeutic response in bladder cancer. The k-means clustering partitioned each bladder tumor into cluster 1 (low kep and low Amp), cluster 2 (low kep and high Amp), cluster 3 (high kep and low Amp). The changes of all three cluster VFs were found to be associated with bladder tumor response to chemotherapy. The VF change of cluster 2 presented with the highest area-under-the-curve value (0.96) and the highest sensitivity/specificity/accuracy (96%/100%/97%) with a selected cutoff value. The k-means clustering of the two DCE-MRI pharmacokinetic parameters can characterize the complex microcirculatory changes within a bladder tumor to enable early prediction of the tumor's chemotherapeutic response. © 2014 Wiley Periodicals, Inc.
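The cluster volume fraction (VF) and its change from baseline to mid-cycle can be sketched in Python as below. This assumes voxel labels are already available as integers 0 to 2 (the abstract numbers the clusters 1 to 3) and that cluster identities are matched across the two timepoints; the function names are hypothetical and this is not the authors' pipeline.

```python
from collections import Counter


def volume_fractions(labels, n_clusters=3):
    """Fraction of tumor voxels that fall in each cluster."""
    counts = Counter(labels)
    return [counts.get(k, 0) / len(labels) for k in range(n_clusters)]


def vf_changes(baseline_labels, midcycle_labels, n_clusters=3):
    """Per-cluster change in volume fraction from baseline to mid-cycle."""
    base = volume_fractions(baseline_labels, n_clusters)
    mid = volume_fractions(midcycle_labels, n_clusters)
    return [m - b for b, m in zip(base, mid)]
```

Each per-cluster change would then be evaluated as a candidate biomarker, for example against a cutoff chosen by receiver-operating-characteristic analysis.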
"K"-Means Clustering and Mixture Model Clustering: Reply to McLachlan (2011) and Vermunt (2011)

Science.gov (United States)

Steinley, Douglas; Brusco, Michael J.

2011-01-01

McLachlan (2011) and Vermunt (2011) each provided thoughtful replies to our original article (Steinley & Brusco, 2011). This response serves to incorporate some of their comments while simultaneously clarifying our position. We argue that greater caution against overparameterization must be taken when assuming that clusters are highly elliptical…
Risk Mapping of Cutaneous Leishmaniasis via a Fuzzy C Means-based Neuro-Fuzzy Inference System

Science.gov (United States)

Akhavan, P.; Karimi, M.; Pahlavani, P.

2014-10-01

Identifying pathogenic factors and how they spread in the environment has recently become a global concern. Cutaneous leishmaniasis (CL), caused by Leishmania, is a parasitic disease that can be transmitted to humans by Phlebotomus sand fly vectors. Studies show that economic situation, cultural issues, and environmental and ecological conditions can affect the prevalence of this disease. In this study, data mining is used to predict the CL prevalence rate and obtain a risk map, based on the environmental parameters that affect CL, using a neuro-fuzzy system. Neuro-fuzzy systems combine the learning capacity of neural networks with the reasoning power of fuzzy systems, which makes them very efficient to use. To predict the CL prevalence rate, an adaptive neuro-fuzzy inference system was applied, with fuzzy C-means clustering used to determine the initial membership functions. Given the high incidence of CL in Ilam province, the counties of Ilam, Mehran, and Dehloran were examined and evaluated. The CL prevalence rate in 2012 was predicted from maps of the relevant environmental and topographic properties, including temperature, moisture, annual rainfall, vegetation and elevation. Results indicate that the model with the fuzzy C-means clustering structure yields acceptable RMSE values for both training and checking data, which supports our analyses. Using the proposed data mining approach, the spatial distribution pattern of the disease and the vulnerable areas become identifiable, and the map can be used by public health experts and decision makers as a useful tool in management and optimal decision-making.
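Fuzzy C-means assigns each sample a degree of membership in every cluster, which is what makes it a natural source of initial membership functions for a neuro-fuzzy system. The sketch below shows the standard update rules on a single, made-up input variable; the data values, the fuzzifier m = 2 and the initial centers are assumptions for illustration, not the settings used in the study.

```python
def memberships_for(xs, centers, m=2.0):
    """Membership of each sample in each cluster: u_i = 1 / sum_k (d_i / d_k)^(2/(m-1))."""
    expo = 2.0 / (m - 1.0)
    rows = []
    for x in xs:
        d = [max(abs(x - c), 1e-12) for c in centers]  # guard against zero distance
        rows.append([1.0 / sum((di / dk) ** expo for dk in d) for di in d])
    return rows

def fuzzy_c_means(xs, init_centers, m=2.0, iters=100):
    """1-D fuzzy C-means: alternate membership and weighted-center updates."""
    centers = list(init_centers)
    for _ in range(iters):
        u = memberships_for(xs, centers, m)
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, xs)) / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    # Recompute memberships so they match the final centers.
    return centers, memberships_for(xs, centers, m)

# Hypothetical values of one environmental input (arbitrary units).
samples = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.2, 8.8]
centers, memberships = fuzzy_c_means(samples, init_centers=[0.0, 5.0, 10.0])

print([round(c, 2) for c in centers])
print([round(u, 3) for u in memberships[0]])  # memberships of the first sample
```

Each row of `memberships` sums to 1, and the cluster centers could then serve as the centers of the initial "low", "medium" and "high" fuzzy sets for that input.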
Hierarchical modeling of cluster size in wildlife surveys

Science.gov (United States)

Royle, J. Andrew

2008-01-01

Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
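The cluster-size bias described here can be made concrete with a small expected-value calculation. Both the cluster-size distribution and the detection function below are hypothetical; the only property that matters is that detection probability increases with cluster size.

```python
import math

# Hypothetical population distribution of cluster sizes: P(size = s).
sizes = [1, 2, 3, 4, 5]
pop_freq = [0.40, 0.25, 0.15, 0.12, 0.08]

def detect_prob(size, rate=0.3):
    """Assumed detection function: larger clusters are easier to see."""
    return 1.0 - math.exp(-rate * size)

# True mean cluster size in the population.
pop_mean = sum(s * f for s, f in zip(sizes, pop_freq))

# Among detected clusters, size s appears in proportion to P(size = s) * p(s).
weights = [f * detect_prob(s) for s, f in zip(sizes, pop_freq)]
sample_mean = sum(s * w for s, w in zip(sizes, weights)) / sum(weights)

print(round(pop_mean, 3), round(sample_mean, 3))
```

Because the observed mean exceeds the population mean, multiplying an estimated number of clusters by the observed mean cluster size would overstate abundance, which is the positive bias the abstract refers to.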
Analysis of k-means clustering approach on the breast cancer Wisconsin dataset.

Science.gov (United States)

Dubey, Ashutosh Kumar; Gupta, Umesh; Jain, Sonal

2016-11-01

Breast cancer is one of the most common cancers worldwide and is most frequently found in women. Early detection of breast cancer makes a cure possible; therefore, a large number of studies are under way to identify methods that can detect breast cancer in its early stages. This study aimed to assess the effects of the k-means clustering algorithm with different computation measures, such as centroid, distance, split method, epoch, attribute, and iteration, and to identify the combination of measures with the potential for highly accurate clustering. The k-means algorithm was used to evaluate the impact of clustering using centroid initialization, distance measures, and split methods. The experiments were performed using the breast cancer Wisconsin (BCW) diagnostic dataset. Foggy and random centroids were used for the centroid initialization. For the foggy centroid, the first centroid was calculated based on random values. For the random centroid, the initial centroid was taken as (0, 0). The results were obtained by employing the k-means algorithm and are discussed for different cases with variable parameters. The calculations were based on the centroid (foggy/random), distance (Euclidean/Manhattan/Pearson), split (simple/variance), threshold (constant epoch/same centroid), attribute (2-9), and iteration (4-10). An average positive prediction accuracy of approximately 92% was obtained with this approach. Better results were found for the same centroid and the highest variance. The results achieved using Euclidean and Manhattan distances were better than those using the Pearson correlation. The findings of this work provide an extensive understanding of the computational parameters that can be used with k-means. The results indicate that k-means has the potential to classify the BCW dataset.
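The choice of distance measure changes which centroid a sample is assigned to, which is one reason the three measures compared here can give different clusterings. The sketch below is an illustration only: the vectors are invented, and defining the Pearson-based distance as 1 minus the correlation coefficient is an assumption, since the abstract does not state the exact form used.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson_distance(a, b):
    """Assumed form: 1 - Pearson correlation, so perfectly correlated vectors are at distance 0."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)

def nearest(point, centroids, dist):
    """Index of the centroid closest to the point under the given distance."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))

# Hypothetical sample and centroids with three attributes each.
sample = [2.0, 4.0, 6.0]
centroids = [[10.0, 20.0, 30.0], [3.0, 3.0, 5.0]]

by_euclidean = nearest(sample, centroids, euclidean)
by_manhattan = nearest(sample, centroids, manhattan)
by_pearson = nearest(sample, centroids, pearson_distance)
print(by_euclidean, by_manhattan, by_pearson)
```

Euclidean and Manhattan distance both pick the second centroid because it is close in magnitude, while the correlation-based distance picks the first because it has the same profile shape at a different scale.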
Cluster analysis of midlatitude oceanic cloud regimes: mean properties and temperature sensitivity

Directory of Open Access Journals (Sweden)

N. D. Gordon

2010-07-01

Full Text Available Clouds play an important role in the climate system by reducing the amount of shortwave radiation reaching the surface and the amount of longwave radiation escaping to space. Accurate simulation of clouds in computer models remains elusive, however, pointing to a lack of understanding of the connection between large-scale dynamics and cloud properties. This study uses a k-means clustering algorithm to group 21 years of satellite cloud data over midlatitude oceans into seven clusters, and demonstrates that the cloud clusters are associated with distinct large-scale dynamical conditions. Three clusters correspond to low-level cloud regimes with different cloud fraction and cumuliform or stratiform characteristics, but all occur under large-scale descent and a relatively dry free troposphere. Three clusters correspond to vertically extensive cloud regimes with tops in the middle or upper troposphere, and they differ according to the strength of large-scale ascent and enhancement of tropospheric temperature and humidity. The final cluster is associated with a lower troposphere that is dry and an upper troposphere that is moist and experiencing weak ascent and horizontal moist advection.

Since the present balance of reflection of shortwave and absorption of longwave radiation by clouds could change as the atmosphere warms from increasing anthropogenic greenhouse gases, we must also better understand how increasing temperature modifies cloud and radiative properties. We therefore undertake an observational analysis of how midlatitude oceanic clouds change with temperature when dynamical processes are held constant (i.e., partial derivative with respect to temperature). For each of the seven cloud regimes, we examine the difference in cloud and radiative properties between warm and cold subsets. To avoid misinterpreting a cloud response to large-scale dynamical forcing as a cloud response to temperature, we require horizontal and vertical
Exotic nuclei in self-consistent mean-field models

International Nuclear Information System (INIS)

Bender, M.; Rutz, K.; Buervenich, T.; Reinhard, P.-G.; Maruhn, J. A.; Greiner, W.

1999-01-01

We discuss two widely used nuclear mean-field models, the relativistic mean-field model and the (nonrelativistic) Skyrme-Hartree-Fock model, and their capability to describe exotic nuclei with emphasis on neutron-rich tin isotopes and superheavy nuclei. (c) 1999 American Institute of Physics
Estimating Single and Multiple Target Locations Using K-Means Clustering with Radio Tomographic Imaging in Wireless Sensor Networks

Science.gov (United States)

2015-03-26

clustering is an algorithm that has been used in data mining applications such as machine learning applications, pattern recognition, hyper-spectral imagery...
Search for 12C+12C clustering in 24Mg ground state

Indian Academy of Sciences (India)

B N Joshi, Arun K Jain, D C Biswas, B V John, Y K Gupta, L S Danu, R P Vind, G K Prajapati, S Mukhopadhyay, A Saxena. Pramana – Journal of Physics, Volume 88, Issue 2, February 2017, Article ID 29 ...
Testing dark energy and dark matter cosmological models with clusters of galaxies

Energy Technology Data Exchange (ETDEWEB)

Boehringer, Hans [Max-Planck-Institut fuer Extraterrestrische Physik, Garching (Germany)

2008-07-01

Galaxy clusters are, as the largest building blocks of our Universe, ideal probes to study the large-scale structure and to test cosmological models. The principal approach and the status of this research are reviewed. Clusters lend themselves to tests in several ways: the cluster mass function, the spatial clustering, the evolution of both functions with redshift, and the internal composition can be used to constrain cosmological parameters. X-ray observations are currently the best means of obtaining the relevant data on the galaxy cluster population. We illustrate in particular all the above-mentioned methods with our ROSAT-based cluster surveys. The mass calibration of clusters is an important issue that is currently being addressed with XMM-Newton and Chandra studies. Based on current experience, we provide an outlook for future research, especially with eROSITA.
Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

Science.gov (United States)

Steinley, Douglas; Brusco, Michael J.

2011-01-01

This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…

Pharmacokinetic analysis and k-means clustering of DCEMR images for radiotherapy outcome prediction of advanced cervical cancers.

Science.gov (United States)

Andersen, Erlend K F; Kristensen, Gunnar B; Lyng, Heidi; Malinen, Eirik

2011-08-01

Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, $K^{\mathrm{trans}}$ and $v_e$, were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control.
An implementation of the relational k-means algorithm

OpenAIRE

Szalkai, Balázs

2013-01-01

A C# implementation of a generalized k-means variant called relational k-means is described here. Relational k-means is a generalization of the well-known k-means clustering method which works for non-Euclidean scenarios as well. The input is an arbitrary distance matrix, as opposed to the traditional k-means method, where the clustered objects need to be identified with vectors.
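The idea of running k-means from a distance matrix alone can be sketched as below. This is a Python illustration rather than the C# implementation the abstract describes. It relies on the standard identity that, for Euclidean data, the squared distance from object i to the centroid of cluster C equals the mean of D[i][j]^2 over j in C minus the sum of D[j][l]^2 over all j, l in C divided by 2|C|^2. The toy distance matrix and the starting labels are assumptions.

```python
def relational_kmeans(D, k, init_labels, max_iter=100):
    """k-means driven only by a pairwise distance matrix D (no coordinates needed)."""
    n = len(D)
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Within-cluster term: sum of squared pairwise distances / (2 * |C|^2).
        within = [
            sum(D[j][l] ** 2 for j in C for l in C) / (2 * len(C) ** 2) if C else 0.0
            for C in clusters
        ]
        new_labels = []
        for i in range(n):
            # Squared distance from object i to each (implicit) cluster centroid.
            costs = [
                sum(D[i][j] ** 2 for j in C) / len(C) - within[c] if C else float("inf")
                for c, C in enumerate(clusters)
            ]
            new_labels.append(costs.index(min(costs)))
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Toy example: distances between six objects that happen to lie on a line.
positions = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in positions] for a in positions]

labels = relational_kmeans(D, k=2, init_labels=[0, 1, 0, 1, 0, 1])
print(labels)
```

The algorithm never touches `positions` after the matrix is built, so any dissimilarity matrix could be supplied in its place; for non-Euclidean inputs the formula is used as the definition of distance to a cluster center.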
"K"-Means May Perform as well as Mixture Model Clustering but May Also Be Much Worse: Comment on Steinley and Brusco (2011)

Science.gov (United States)

Vermunt, Jeroen K.

2011-01-01

Steinley and Brusco (2011) presented the results of a huge simulation study aimed at evaluating cluster recovery of mixture model clustering (MMC) both for the situation where the number of clusters is known and is unknown. They derived rather strong conclusions on the basis of this study, especially with regard to the good performance of…
FORMATION OF A INNOVATION REGIONAL CLUSTER MODEL

Directory of Open Access Journals (Sweden)

G. S. Merzlikina

2015-01-01

Full Text Available Summary. A study of scientific and methodological approaches to building and developing innovation clusters revealed problems in how the functions of innovation clusters and production clusters are assigned. In view of these problems, the authors distinguish the concepts of innovation cluster and production cluster and explain the notion of an innovation-production cluster. The main goal of this article is to reveal existing organizational problems in cluster building and in successful cluster development. An analysis of regional cluster building revealed a typical practical structure of interaction among cluster members. This structure has the following drawbacks: no orientation of the cluster toward the market environment; no system for building and developing long-term relations among members; ineffective management of information, financial and material flows within the cluster; poorly delineated competences and zones of responsibility among cluster members; lack of transparency of the cluster's activities; low adaptability to environmental changes; and difficulty in using cluster members' intellectual property and in commercializing hi-tech products. Taken together, these problems reduce the viability of existing innovation cluster-building models and the practical feasibility of implementing a cluster. The authors therefore propose an upgraded innovation-production cluster-building model with a more efficient business process management system, which includes an advanced innovation cluster structure, a competence matrix and subcluster zones of responsibility. The proposed model differs from others in its use of a unified innovative product development control center, which also controls production and marketing.
A User-Adaptive Algorithm for Activity Recognition Based on K-Means Clustering, Local Outlier Factor, and Multivariate Gaussian Distribution

Directory of Open Access Journals (Sweden)

Shizhen Zhao

2018-06-01

Full Text Available Mobile activity recognition is significant to the development of human-centric pervasive applications including elderly care, personalized recommendations, etc. Nevertheless, the distribution of inertial sensor data can be influenced to a great extent by varying users. This means that the performance of an activity recognition classifier trained on one user's dataset will degrade when transferred to others. In this study, we focus on building a personalized classifier to detect four categories of human activities: light intensity activity, moderate intensity activity, vigorous intensity activity, and fall. In order to solve the problem caused by different distributions of inertial sensor signals, a user-adaptive algorithm based on K-Means clustering, local outlier factor (LOF), and multivariate Gaussian distribution (MGD) is proposed. To automatically cluster and annotate a specific user's activity data, an improved K-Means algorithm with a novel initialization method is designed. By quantifying how informative each sample is in a labeled individual dataset, the most profitable samples can be selected for activity recognition model adaptation. Through experiments, we conclude that our proposed models can adapt to new users with good recognition performance.
Modelling baryonic effects on galaxy cluster mass profiles

Science.gov (United States)

Shirasaki, Masato; Lau, Erwin T.; Nagai, Daisuke

2018-06-01

Gravitational lensing is a powerful probe of the mass distribution of galaxy clusters and cosmology. However, accurate measurements of the cluster mass profiles are limited by uncertainties in cluster astrophysics. In this work, we present a physically motivated model of baryonic effects on the cluster mass profiles, which self-consistently takes into account the impact of baryons on the concentration as well as mass accretion histories of galaxy clusters. We calibrate this model using the Omega500 hydrodynamical cosmological simulations of galaxy clusters with varying baryonic physics. Our model will enable us to simultaneously constrain cluster mass, concentration, and cosmological parameters using stacked weak lensing measurements from upcoming optical cluster surveys.
Magnetic cluster mean-field description of spin glasses in amorphous La-Gd-Au alloys

International Nuclear Information System (INIS)

Poon, S.J.; Durand, J.

1978-03-01

Bulk magnetic properties of splat-cooled amorphous alloys of composition $\mathrm{La}_{80-x}\mathrm{Gd}_x\mathrm{Au}_{20}$ ($0 \le x \le 80$) were studied. Zero-field susceptibility, high-field magnetization (up to 75 kOe) and saturated remanence were measured between 1.8 and 290 K. Data were analyzed using a cluster mean-field approximation for the spin-glass and mictomagnetic alloys ($x \le 56$). Mean-field theories can account for the experimental freezing-temperatures of dilute spin-glasses in which the Ruderman-Kittel-Kasuya-Yosida interaction is dominant. For the dilute alloys, the role of amorphousness on the magnetic interactions is discussed. By extending the mean-field approximation, the concentrated spin-glasses are represented by rigid ferromagnetic clusters as individual spin-entities interacting via random forces. Scaling laws for the magnetization $M$ and saturation remanent magnetization $M_{rs}$ are obtained and presented graphically for the $x \le 32$ alloys, in which $M/x = g(H/x^*, T/x)$ and $M_{rs}(T)/x = [M_{rs}(0)/x] \exp(-\alpha^* T/x^p)$, where $x^*$ is the concentration of clusters, $\alpha^*$ is a constant, and $p$ is the freezing-temperature exponent given by $T_M \propto x^p$. It is found that $p = 1$ and $1.3$ for the regions $4 \le x \le 40$ respectively. An attempt is also made to account for the freezing temperatures of concentrated spin glasses. The strength of the interaction among clusters is determined from high-field magnetization measurements using the Larkin-Smith method modified for clusters. It is shown that for the $x < 24$ alloys, the size of the clusters can be correlated to the structural short-range order in the amorphous state. More concentrated alloys are marked by the emergence of cluster percolation.
A Kondo cluster-glass model for spin glass Cerium alloys

International Nuclear Information System (INIS)

Zimmer, F M; Magalhaes, S G; Coqblin, B

2011-01-01

There are clear indications that the presence of disorder in Ce alloys, such as Ce(Ni,Cu) or Ce(Pd,Rh), is responsible for the existence of a cluster spin glass state which changes continuously into inhomogeneous ferromagnetism at low temperatures. We present a study of the competition between magnetism and the Kondo effect in a cluster-glass model composed of a random inter-cluster interaction term and an intra-cluster one, which contains an intra-site Kondo interaction $J_k$ and an inter-site ferromagnetic one $J_0$. The random interaction is given by the van Hemmen type of randomness, which allows the problem to be solved without the use of the replica method. The inter-cluster term is solved within the cluster mean-field theory and the remaining intra-cluster interactions can be treated by exact diagonalization. Results show the behavior of the cluster glass order parameter and the Kondo correlation function for several sizes of the clusters, $J_k$, $J_0$ and values of the ferromagnetic inter-cluster average interaction $I_0$. In particular, for small $J_k$, the magnetic solution is strongly dependent on $I_0$ and $J_0$ and a Kondo cluster-glass or a mixed phase can be obtained, while, for large $J_k$, the Kondo effect is still dominant, both in good agreement with experiment in Ce(Ni,Cu) or Ce(Pd,Rh) alloys.
Formation mechanism of solute clusters under neutron irradiation in ferritic model alloys and in a reactor pressure vessel steel: clusters of defects

International Nuclear Information System (INIS)

Meslin-Chiffon, E.

2007-11-01

The embrittlement of the reactor pressure vessel (RPV) under irradiation is partly due to the formation of point defects (PD) and solute clusters. The aim of this work was to gain more insight into the formation mechanisms of solute clusters in low copper ([Cu] = 0.1 wt%) FeCu and FeCuMnNi model alloys, in a copper-free FeMnNi model alloy and in a low copper French RPV steel (16MND5). These materials were neutron-irradiated at around 300 °C in a test reactor. Solute clusters were characterized by tomographic atom probe, whereas PD clusters were simulated with a rate theory numerical code calibrated under cascade damage conditions using transmission electron microscopy analysis. The comparison between experiments and simulation reveals that a heterogeneous irradiation-induced solute precipitation/segregation probably occurs on PD clusters. (author)
Experimental Tests of the Algebraic Cluster Model

Science.gov (United States)

Gai, Moshe

2018-02-01

The Algebraic Cluster Model (ACM) of Bijker and Iachello that was proposed already in 2000 has been recently applied to 12C and 16O with much success. We review the current status in 12C with the outstanding observation of the ground state rotational band composed of the spin-parity states of: 0+, 2+, 3-, 4± and 5-. The observation of the 4± parity doublet is a characteristic of (tri-atomic) molecular configuration where the three alpha- particles are arranged in an equilateral triangular configuration of a symmetric spinning top. We discuss future measurement with electron scattering, 12C(e,e’) to test the predicted B(Eλ) of the ACM.
Nucleus and cytoplasm segmentation in microscopic images using K-means clustering and region growing.

Science.gov (United States)

Sarrafzadeh, Omid; Dehnavi, Alireza Mehri

2015-01-01

Segmentation of leukocytes acts as the foundation for all automated image-based hematological disease recognition systems. Most of the time, hematologists are interested in evaluating white blood cells only. Digital image processing techniques can help them in their analysis and diagnosis. The main objective of this paper is to detect leukocytes in a blood smear microscopic image and segment them into their two dominant elements, nucleus and cytoplasm. The segmentation is conducted in two stages of K-means clustering. First, the nuclei are segmented using K-means clustering. Then, a proposed method based on region growing is applied to separate the connected nuclei. Next, the nuclei are subtracted from the original image. Finally, the cytoplasm is segmented using the second stage of K-means clustering. The results indicate that the proposed method is able to extract the nucleus and cytoplasm regions accurately and works well even when there is no significant contrast between the components in the image. In this paper, a method based on K-means clustering and region growing is proposed in order to detect leukocytes in a blood smear microscopic image and segment their components, the nucleus and the cytoplasm. Because the region-growing step of the algorithm relies on edge information, it cannot separate connected nuclei accurately where edges are poor; it requires at least a weak edge to exist between the nuclei. The nucleus and cytoplasm segments of a leukocyte can be used for feature extraction and classification, which leads to automated leukemia detection.
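The two-stage use of K-means described above can be sketched on a flattened list of gray levels. This is a simplified illustration under several assumptions: intensities are one-dimensional, each stage uses k = 2, the darkest cluster is taken as the structure of interest, and the region-growing step that separates touching nuclei is omitted. The pixel values are invented.

```python
def kmeans_1d(values, k, iters=50):
    """1-D k-means with centers initialized evenly across the value range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, l in zip(values, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels

def darkest_cluster_mask(values):
    """True for values that fall in the darker of two k-means clusters."""
    centers, labels = kmeans_1d(values, 2)
    dark = centers.index(min(centers))
    return [l == dark for l in labels]

# Hypothetical gray levels: nuclei are darkest, cytoplasm mid-gray, background bright.
pixels = [45, 40, 50, 145, 150, 140, 235, 240, 230, 42, 148, 238]

# Stage 1: separate the nuclei from everything else.
nucleus = darkest_cluster_mask(pixels)

# Remove the nuclei, then stage 2: separate cytoplasm from background.
rest_idx = [i for i, is_nucleus in enumerate(nucleus) if not is_nucleus]
cytoplasm_rest = darkest_cluster_mask([pixels[i] for i in rest_idx])

segmentation = ["nucleus" if is_nucleus else "background" for is_nucleus in nucleus]
for i, is_cytoplasm in zip(rest_idx, cytoplasm_rest):
    if is_cytoplasm:
        segmentation[i] = "cytoplasm"

print(segmentation)
```

Removing the nuclei before the second pass matters: with the darkest pixels gone, the remaining contrast between cytoplasm and background dominates the clustering.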
Origin of nanodiamonds from Allende constrained by statistical analysis of C isotopes from small clusters of acid residue by NanoSIMS

Science.gov (United States)

Lewis, Josiah B.; Floss, Christine; Gyngard, Frank

2018-01-01

Meteoritic nanodiamonds carry noble gases with anomalies in their stable isotopes that have drawn attention to their potentially presolar origin. Measurements of 12C/13C isotope ratios of presolar nanodiamonds are essential to understanding their origins, but bulk studies do not show notable deviations from the solar system 12C/13C ratio. We implemented a technique using secondary ion mass spectrometry with maximized spatial resolution to measure carbon isotopes in the smallest clusters of nanodiamonds possible. We measured C and Si from clusters containing as few as 1000 nanodiamonds, the smallest clusters of nanodiamonds measured to date by traditional mass spectrometry. This allowed us to investigate many possible complex compositions of the nanodiamonds, both through direct methods and statistical analysis of the distributions of observed isotopic ratios. Analysis of the breadth of distributions of carbon isotopic ratios for a number of ∼1000-nanodiamond aggregates indicates that the 12C/13C ratio may be drawn from multiple Gaussian distributions about different isotopic ratios, which implies the presence of presolar material. The mean isotopic ratio is consistent with the solar system value, so presolar components are required to be either low in concentration, or to have a mean ratio close to that of the solar system. Supernovae are likely candidates for the source of such a presolar component, although asymptotic giant branch stars are not excluded. A few aggregates show deviations from the mean 12C/13C ratio large enough to be borderline detections of enrichments in 13C. These could be caused by the presence of a small population of nanodiamonds formed from sources that produce extremely 13C-rich material, such as J-stars, novae, born-again asymptotic giant branch stars, or supernovae. Of these possible sources, only supernovae would account for the anomalous noble gases carried in the nanodiamonds.
Structure and tensile properties of Fe-Cr model alloy strengthened by nano-scale NbC particles derived from controlled crystallization of Nb-rich clusters

Energy Technology Data Exchange (ETDEWEB)

Dai, Lei [College of Materials and Chemical Engineering, Three Gorges University, Yichang 443002 (China); Guo, Qianying [State Key Lab of Hydraulic Engineering Simulation and Safety, School of Material Science and Engineering, Tianjin University, Tianjin 300354 (China); Liu, Yongchang, E-mail: licmtju@163.com [State Key Lab of Hydraulic Engineering Simulation and Safety, School of Material Science and Engineering, Tianjin University, Tianjin 300354 (China); Yu, Liming; Li, Huijun [State Key Lab of Hydraulic Engineering Simulation and Safety, School of Material Science and Engineering, Tianjin University, Tianjin 300354 (China)

2016-09-30

This article describes the microstructural evolution and tensile properties of an Fe-Cr model alloy strengthened by nano-scale NbC particles. According to the results obtained from X-ray diffraction and transmission electron microscopy with an energy dispersive spectrometer, bcc ultrafine grains and a disordered phase of Nb-rich nano-clusters were observed in the milled powders. Hot pressing (HP) resulted in nearly equiaxed ferritic grains and dispersed nano-scale NbC (~8 nm) particles. The microstructure studies reveal that the NbC nanoparticles form by nucleation and growth of the Nb-rich nano-clusters, involving diffusion of their components. At room temperature the material exhibits an ultimate tensile strength of 700 MPa, a yield strength of 650 MPa, and a total elongation of 11.7%. The fracture surface studies reveal that a typical ductile fracture mode occurred during the tensile test.
Comparison of Cluster C personality disorders in couples with ...

African Journals Online (AJOL)

Comparison of Cluster C personality disorders in couples with normal divorce. ... Also purposeful sampling was used to select individuals. ... that the personality disorder group C, there is no significant difference between men and women.
Cost-effectiveness of psychotherapy for cluster C personality disorders: A decision-analytic model in The Netherlands

NARCIS (Netherlands)

D.I. Soeteman (Djora); R. Verheul (Roel); A.M.M.A. Meerman (Anke); U.M. Ziegler (Uli); B. van Rossum (Bert); J. Delimon (Jos); P. Rijnierse (Piet); M.M. Thunnissen (Moniek); J.J. van Busschbach (Jan); J.J. Kim (Julie)

2011-01-01

Objective: To conduct a formal economic evaluation of various dosages of psychotherapy for patients with avoidant, dependent, and obsessive-compulsive (i.e., cluster C) personality disorders (Structured Interview for DSM-IV Personality criteria). Method: We developed a decision-analytic
Cluster structure in Cf nuclei

International Nuclear Information System (INIS)

Singh, Shailesh K.; Biswal, S.K.; Bhuyan, M.; Patra, S.K.; Gupta, R.K.

2014-01-01

Owing to the availability of advanced experimental facilities, it is possible to probe nuclei down to the nucleon level very precisely and to analyze their internal structure, which helps to resolve some puzzling problems in the decay of nuclei. Recently, relativistic nuclear collisions confirmed the α-cluster type structure in 12C, which is a milestone for cluster structure in nuclei. The clustering phenomena in light and intermediate elements of the nuclear chart are very interesting. Our group has done a lot of work on the clustering behaviour of nuclei. In this paper, various aspects of clustering in the isotopes of the Cf nucleus, including the fission state, are discussed. The experimentally known isotope 242Cf is taken for the analysis. The relativistic mean field model with the well-established NL3 parameter set is used. To obtain the exact ground state configuration of the isotopes, the potential energy surface is minimized by the constraint method. The clustering structure of other Cf isotopes is also discussed.
The updated geodetic mean dynamic topography model – DTU15MDT

DEFF Research Database (Denmark)

Knudsen, Per; Andersen, Ole Baltazar; Maximenko, Nikolai

An update to the global mean dynamic topography model DTU13MDT is presented. For DTU15MDT the newer gravity model EIGEN-6C4 has been combined with the DTU15MSS mean sea surface model to construct this global mean dynamic topography model. The EIGEN-6C4 is derived using the full series of GOCE data...

Segmentation of Mushroom and Cap width Measurement using Modified K-Means Clustering Algorithm

Directory of Open Access Journals (Sweden)

Eser Sert

2014-01-01

Full Text Available Mushroom is one of the commonly consumed foods. Image processing is an effective way to examine visual features and detect the size of a mushroom. We developed software to segment a mushroom in a picture and to measure the cap width of the mushroom. The K-Means clustering method is used for the process. K-Means is one of the most successful clustering methods. In our study we customized the algorithm to get the best result and tested it. In the system, the mushroom picture is first filtered and its histograms are balanced, and then segmentation is performed. Results showed that the customized algorithm performed better segmentation than the classical K-Means algorithm. Tests performed on the designed software showed that segmentation of pictures with complex backgrounds is performed with high accuracy, and the caps of 20 mushrooms were measured with 2.281% relative error.
A K-means multivariate approach for clustering independent components from magnetoencephalographic data.

Science.gov (United States)

Spadone, Sara; de Pasquale, Francesco; Mantini, Dante; Della Penna, Stefania

2012-09-01

Independent component analysis (ICA) is typically applied on functional magnetic resonance imaging, electroencephalographic and magnetoencephalographic (MEG) data due to its data-driven nature. In these applications, ICA needs to be extended from single to multi-session and multi-subject studies for interpreting and assigning a statistical significance at the group level. Here a novel strategy for analyzing MEG independent components (ICs) is presented, Multivariate Algorithm for Grouping MEG Independent Components K-means based (MAGMICK). The proposed approach is able to capture spatio-temporal dynamics of brain activity in MEG studies by running ICA at subject level and then clustering the ICs across sessions and subjects. Distinctive features of MAGMICK are: i) the implementation of an efficient set of "MEG fingerprints" designed to summarize properties of MEG ICs as they are built on spatial, temporal and spectral parameters; ii) the implementation of a modified version of the standard K-means procedure to improve its data-driven character. This algorithm groups the obtained ICs automatically estimating the number of clusters through an adaptive weighting of the parameters and a constraint on the ICs independence, i.e. components coming from the same session (at subject level) or subject (at group level) cannot be grouped together. The performances of MAGMICK are illustrated by analyzing two sets of MEG data obtained during a finger tapping task and median nerve stimulation. The results demonstrate that the method can extract consistent patterns of spatial topography and spectral properties across sessions and subjects that are in good agreement with the literature. In addition, these results are compared to those from a modified version of affinity propagation clustering method. The comparison, evaluated in terms of different clustering validity indices, shows that our methodology often outperforms the clustering algorithm. Eventually, these results are
Clusters of cultures: diversity in meaning of family value and gender role items across Europe.

Science.gov (United States)

van Vlimmeren, Eva; Moors, Guy B D; Gelissen, John P T M

2017-01-01

Survey data are often used to map cultural diversity by aggregating scores of attitude and value items across countries. However, this procedure only makes sense if the same concept is measured in all countries. In this study we argue that when (co)variances among sets of items are similar across countries, these countries share a common way of assigning meaning to the items. Clusters of cultures can then be observed by doing a cluster analysis on the (co)variance matrices of sets of related items. This study focuses on family values and gender role attitudes. We find four clusters of cultures that assign a distinct meaning to these items, especially in the case of gender roles. Some of these differences reflect response style behavior in the form of acquiescence. Adjusting for this style effect impacts on country comparisons hence demonstrating the usefulness of investigating the patterns of meaning given to sets of items prior to aggregating scores into cultural characteristics.
Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering.

Science.gov (United States)

Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

2015-01-01

Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted, and the feature vector is formed relying on the concept of event related synchronization (ERS) and desynchronization (ERD).
[Research on K-means clustering segmentation method for MRI brain image based on selecting multi-peaks in gray histogram].

Science.gov (United States)

Chen, Zhaoxue; Yu, Haizhong; Chen, Hao

2013-12-01

To solve the problem that traditional K-means clustering selects its initial cluster centers randomly, we propose a new K-means segmentation algorithm based on robustly selecting the 'peaks' standing for White Matter, Gray Matter and Cerebrospinal Fluid in the multi-peak gray histogram of an MRI brain image. The new algorithm takes the gray values of the selected histogram 'peaks' as the initial K-means cluster centers and can segment the MRI brain image into three tissue types more effectively, accurately and stably. Extensive experiments have shown that the proposed algorithm overcomes many shortcomings of the traditional K-means clustering method, such as low efficiency, poor accuracy, poor robustness and long running time. The histogram 'peak' selection idea of the proposed segmentation method is more widely applicable.
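The idea of replacing random initial centers with histogram peaks can be sketched as below. The abstract does not say how the peaks are selected "robustly", so the greedy rule here (take the most frequent gray levels that are at least `min_separation` apart) is only an assumed stand-in, and `histogram_peak_seeds` is a hypothetical name. The returned values would then be passed as initial centers to any standard K-means loop.

```python
from collections import Counter


def histogram_peak_seeds(gray_values, k=3, min_separation=20):
    """Pick k well-separated histogram peaks to seed K-means (assumed heuristic)."""
    hist = Counter(gray_values)
    # Candidate gray levels, most frequent first (ties broken by gray level).
    candidates = sorted(hist, key=lambda g: (-hist[g], g))
    seeds = []
    for g in candidates:
        # Accept a level only if it is far enough from every seed chosen so far,
        # so that neighbouring bins of the same peak are not picked twice.
        if all(abs(g - s) >= min_separation for s in seeds):
            seeds.append(g)
        if len(seeds) == k:
            break
    return sorted(seeds)


# Toy "image": three tissue-like intensity groups with a little spread.
toy = [40] * 5 + [42] * 3 + [110] * 6 + [112] * 2 + [200] * 4 + [198]
print(histogram_peak_seeds(toy))
```

Because the seeds are derived from the data rather than drawn at random, repeated runs start from the same centers, which is the stability property the abstract emphasizes.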
The clustered nucleus-cluster structures in stable and unstable nuclei

International Nuclear Information System (INIS)

Freer, Martin

2007-01-01

The subject of clustering has a lineage which runs throughout the history of nuclear physics. Its attraction is the simplification of the often uncorrelated behaviour of independent particles to organized and coherent quasi-crystalline structures. In this review the ideas behind the development of clustering in light nuclei are investigated, mostly from the standpoint of the harmonic oscillator framework. This allows a unifying description of alpha-conjugate and neutron-rich nuclei alike. More sophisticated models of clusters are explored, such as antisymmetrized molecular dynamics. A number of contemporary topics in clustering are touched upon: the 3α-cluster state in 12C, nuclear molecules and clustering at the drip-line. Finally, an understanding of the 12C+12C resonances in 24Mg, within the framework of the theoretical ideas developed in the review, is presented.
Genetic and environmental influences on dimensional representations of DSM-IV cluster C personality disorders: a population-based multivariate twin study.

Science.gov (United States)

Reichborn-Kjennerud, Ted; Czajkowski, Nikolai; Neale, Michael C; Ørstavik, Ragnhild E; Torgersen, Svenn; Tambs, Kristian; Røysamb, Espen; Harris, Jennifer R; Kendler, Kenneth S

2007-05-01

The DSM-IV cluster C Axis II disorders include avoidant (AVPD), dependent (DEPD) and obsessive-compulsive (OCPD) personality disorders. We aimed to estimate the genetic and environmental influences on dimensional representations of these disorders and examine the validity of the cluster C construct by determining to what extent common familial factors influence the individual PDs. PDs were assessed using the Structured Interview for DSM-IV Personality (SIDP-IV) in a sample of 1386 young adult twin pairs from the Norwegian Institute of Public Health Twin Panel (NIPHTP). A single-factor independent pathway multivariate model was applied to the number of endorsed criteria for the three cluster C disorders, using the statistical modeling program Mx. The best-fitting model included genetic and unique environmental factors only, and equated parameters for males and females. Heritability ranged from 27% to 35%. The proportion of genetic variance explained by a common factor was 83, 48 and 15% respectively for AVPD, DEPD and OCPD. Common genetic and environmental factors accounted for 54% and 64% respectively of the variance in AVPD and DEPD but only 11% of the variance in OCPD. Cluster C PDs are moderately heritable. No evidence was found for shared environmental or sex effects. Common genetic and individual environmental factors account for a substantial proportion of the variance in AVPD and DEPD. However, OCPD appears to be largely etiologically distinct from the other two PDs. The results do not support the validity of the DSM-IV cluster C construct in its present form.
Hopfield-K-Means clustering algorithm: A proposal for the segmentation of electricity customers

Energy Technology Data Exchange (ETDEWEB)

Lopez, Jose J.; Aguado, Jose A.; Martin, F.; Munoz, F.; Rodriguez, A.; Ruiz, Jose E. [Department of Electrical Engineering, University of Malaga, C/ Dr. Ortiz Ramos, sn., Escuela de Ingenierias, 29071 Malaga (Spain)

2011-02-15

Customer classification aims at providing electric utilities with a volume of information to enable them to establish different types of tariffs. Several methods have been used to segment electricity customers, including, among others, the hierarchical clustering, Modified Follow the Leader and K-Means methods. These, however, entail problems with the pre-allocation of the number of clusters (Follow the Leader), randomness of the solution (K-Means) and improvement of the solution obtained (hierarchical algorithm). Another segmentation method used is Hopfield's autonomous recurrent neural network, although the solution obtained only guarantees that it is a local minimum. In this paper, we present the Hopfield-K-Means algorithm in order to overcome these limitations. This approach eliminates the randomness of the initial solution provided by K-Means based algorithms and it moves closer to the global optimum. The proposed algorithm is also compared against other customer segmentation and characterization techniques, on the basis of relative validation indexes. Finally, the results obtained by this algorithm with a set of 230 electricity customers (residential, industrial and administrative) are presented. (author)
Energy spectra of vibron and cluster models in molecular and nuclear systems

Science.gov (United States)

Jalili Majarshin, A.; Sabri, H.; Jafarizadeh, M. A.

2018-03-01

The relation of the algebraic cluster model, i.e., of the vibron model and its extension, to the collective structure is discussed. In the first section of the paper, we study the energy spectra of the vibron model for the diatomic molecule; we then derive the rotation-vibration spectrum of the 2α, 3α and 4α configurations in the low-lying spectra of the 8Be, 12C and 16O nuclei. All vibrational and rotational states with ground and excited A, E and F states appear to have been observed. Moreover, the transitional descriptions of the vibron model and the α-cluster model are considered by using an infinite-dimensional algebraic method based on the affine \widehat{SU(1,1)} Lie algebra. The calculated energy spectra are compared with experimental data. Applications to the rotation-vibration spectrum for the diatomic molecule and many-body nuclear clusters indicate that there are solvable models and that they can be approximated very well using the transitional theory.
A two-stage method for microcalcification cluster segmentation in mammography by deformable models

International Nuclear Information System (INIS)

Arikidis, N.; Kazantzi, A.; Skiadopoulos, S.; Karahaliou, A.; Costaridou, L.; Vassiou, K.

2015-01-01

Purpose: Segmentation of microcalcification (MC) clusters in x-ray mammography is a difficult task for radiologists. Accurate segmentation is a prerequisite for quantitative image analysis of MC clusters and subsequent feature extraction and classification in computer-aided diagnosis schemes. Methods: In this study, a two-stage semiautomated segmentation method of MC clusters is investigated. The first stage is targeted to accurate and time efficient segmentation of the majority of the particles of an MC cluster, by means of a level set method. The second stage is targeted to shape refinement of selected individual MCs, by means of an active contour model. Both methods are applied in the framework of a rich scale-space representation, provided by the wavelet transform at integer scales. Segmentation reliability of the proposed method in terms of inter- and intraobserver agreements was evaluated in a case sample of 80 MC clusters originating from the digital database for screening mammography, corresponding to 4 morphology types (punctate: 22, fine linear branching: 16, pleomorphic: 18, and amorphous: 24) of MC clusters, assessing radiologists’ segmentations quantitatively by two distance metrics (Hausdorff distance, HDIST_cluster, and average of minimum distance, AMINDIST_cluster) and the area overlap measure (AOM_cluster). The effect of the proposed segmentation method on MC cluster characterization accuracy was evaluated in a case sample of 162 pleomorphic MC clusters (72 malignant and 90 benign). Ten MC cluster features, targeted to capture morphologic properties of individual MCs in a cluster (area, major length, perimeter, compactness, and spread), were extracted and a correlation-based feature selection method yielded a feature subset to feed in a support vector machine classifier. Classification performance of the MC cluster features was estimated by means of the area under receiver operating characteristic curve (Az ± Standard Error) utilizing tenfold cross
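The three agreement measures named above can be illustrated on two small sets of pixel coordinates. The Hausdorff distance is the standard one; the abstract does not define the other two, so the symmetric average of minimum distances and the intersection-over-union form of the area overlap used here are assumptions, and all function names are hypothetical.

```python
import math


def directed_min_dists(a, b):
    """For every point in a, the distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff_distance(a, b):
    """Largest nearest-neighbour distance, taken over both directions."""
    return max(max(directed_min_dists(a, b)), max(directed_min_dists(b, a)))


def average_min_distance(a, b):
    """Mean nearest-neighbour distance over both directions (assumed definition)."""
    d = directed_min_dists(a, b) + directed_min_dists(b, a)
    return sum(d) / len(d)


def area_overlap(mask_a, mask_b):
    """Shared pixels divided by the union of pixels (assumed definition)."""
    a, b = set(mask_a), set(mask_b)
    return len(a & b) / len(a | b)


# Two toy segmentations of the same object, as lists of (row, col) pixels.
seg_1 = [(0, 0), (1, 0)]
seg_2 = [(0, 0), (4, 0)]
print(hausdorff_distance(seg_1, seg_2))    # 3.0
print(average_min_distance(seg_1, seg_2))  # 1.0
print(area_overlap(seg_1, seg_2))
```

The Hausdorff distance reacts to the single worst-placed pixel, while the average of minimum distances and the overlap ratio summarize agreement over the whole outline, which is presumably why the study reports all three.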
Modelling clustering of vertically aligned carbon nanotube arrays.

Science.gov (United States)

Schaber, Clemens F; Filippov, Alexander E; Heinlein, Thorsten; Schneider, Jörg J; Gorb, Stanislav N

2015-08-06

Previous research demonstrated that arrays of vertically aligned carbon nanotubes (VACNTs) exhibit strong frictional properties. Experiments indicated a strong decrease of the friction coefficient from the first to the second sliding cycle in repetitive measurements on the same VACNT spot, but stable values in consecutive cycles. VACNTs form clusters under shear applied during friction tests, and self-organization stabilizes the mechanical properties of the arrays. With increasing load in the range between 300 µN and 4 mN applied normally to the array surface during friction tests the size of the clusters increases, while the coefficient of friction decreases. To better understand the experimentally obtained results, we formulated and numerically studied a minimalistic model, which reproduces the main features of the system with a minimum of adjustable parameters. We calculate the van der Waals forces between the spherical friction probe and bunches of the arrays using the well-known Morse potential function to predict the number of clusters, their size, instantaneous and mean friction forces and the behaviour of the VACNTs during consecutive sliding cycles and at different normal loads. The data obtained by the model calculations coincide very well with the experimental data and can help in adapting VACNT arrays for biomimetic applications.
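For reference, the Morse potential mentioned above has the standard form V(r) = D(1 - exp(-a(r - r0)))^2 - D, with well depth D, width parameter a and equilibrium distance r0. A minimal sketch of the potential and the corresponding force follows; the parameter values are arbitrary illustrations, not those of the VACNT model, and the function names are hypothetical.

```python
import math


def morse_potential(r, depth, width, r_eq):
    """Morse potential, shifted so that V -> 0 as r -> infinity."""
    x = math.exp(-width * (r - r_eq))
    return depth * (1.0 - x) ** 2 - depth


def morse_force(r, depth, width, r_eq):
    """Force F = -dV/dr: positive (repulsive) for r < r_eq, negative (attractive) beyond."""
    x = math.exp(-width * (r - r_eq))
    return -2.0 * depth * width * (1.0 - x) * x


# Illustrative parameters in arbitrary units.
D, a, r0 = 2.0, 1.5, 1.0
for r in (0.8, 1.0, 1.5, 3.0):
    print(r, morse_potential(r, D, a, r0), morse_force(r, D, a, r0))
```

At r = r0 the potential reaches its minimum of -D and the force vanishes; at larger separations the attraction decays exponentially, which gives a short-ranged interaction of the kind used between the probe and the nanotube bunches.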
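Sistem Pendukung Keputusan Pemilihan Line-up Pemain Sepak Bola Menggunakan Metode Fuzzy Multiple Attribute Decision Making dan K-Means Clustering [Decision Support System for Selecting a Football Line-up Using Fuzzy Multiple Attribute Decision Making and K-Means Clustering]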

Directory of Open Access Journals (Sweden)

Aldi Nurzahputra

2017-07-01

Full Text Available In football, the coach selects the player line-up based on the players' statistical performance. This research implements a decision support system (DSS) that uses the FMADM SAW method to select players by weighting several criteria: goals, assists, saves, clean sheets, yellow cards, red cards, games played, and own goals. Player performance is then assessed using K-Means clustering with two clusters, cluster_cukup ("adequate") and cluster_baik ("good"). The system uses data on Manchester City players in the Forward, Midfielder, Defender and Goalkeeper positions. The purpose of this research is to apply the FMADM and K-Means clustering methods in the system. The results show that the player statistics can be processed by the FMADM method for line-up selection and that performance can be assessed by the K-Means clustering method. With this system, selection and assessment are carried out objectively and give the football coach options for making decisions.
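The SAW (Simple Additive Weighting) step can be sketched as follows: each criterion is normalized (value / column maximum for benefit criteria such as goals, column minimum / value for cost criteria such as yellow cards) and the normalized values are combined in a weighted sum. The player figures and weights below are invented for illustration, the function name `saw_scores` is hypothetical, and cost criteria are assumed to be non-zero so that the cost normalization is defined.

```python
def saw_scores(matrix, weights, is_benefit):
    """Simple Additive Weighting: one score per row (alternative) of the matrix."""
    n_criteria = len(weights)
    columns = [[row[j] for row in matrix] for j in range(n_criteria)]
    scores = []
    for row in matrix:
        total = 0.0
        for j in range(n_criteria):
            if is_benefit[j]:
                normalized = row[j] / max(columns[j])  # larger is better
            else:
                normalized = min(columns[j]) / row[j]  # smaller is better
            total += weights[j] * normalized
        scores.append(total)
    return scores


# Hypothetical forwards: [goals, assists, yellow cards]
players = ["A", "B", "C"]
stats = [[10, 4, 2],
         [5, 10, 1],
         [8, 8, 4]]
weights = [0.5, 0.3, 0.2]        # assumed weights, summing to 1
is_benefit = [True, True, False]  # yellow cards are a cost criterion

scores = saw_scores(stats, weights, is_benefit)
ranking = [p for _, p in sorted(zip(scores, players), reverse=True)]
print(scores, ranking)
```

With these invented numbers player B ranks first despite scoring the fewest goals, which shows how the weights and the cost criterion shape the selection.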
Three-Dimensional Modeling of Fracture Clusters in Geothermal Reservoirs

Energy Technology Data Exchange (ETDEWEB)

Ghassemi, Ahmad [Univ. of Oklahoma, Norman, OK (United States)

2017-08-11

The objective of this project is to develop a 3-D numerical model for simulating mode I, II, and III (tensile, shear, and out-of-plane) propagation of multiple fractures and fracture clusters to accurately predict geothermal reservoir stimulation using the virtual multi-dimensional internal bond (VMIB). Effective development of enhanced geothermal systems can significantly benefit from improved modeling of hydraulic fracturing. In geothermal reservoirs, where the temperature can reach or exceed 350 °C, thermal and poro-mechanical processes play an important role in fracture initiation and propagation. In this project, hydraulic fracturing of hot subsurface rock mass will be numerically modeled by extending the virtual multiple internal bond theory and implementing it in WARP3D, a three-dimensional finite element code for solid mechanics. The new constitutive model along with the poro-thermoelastic computational algorithms will allow modeling the initiation and propagation of clusters of fractures, and extension of pre-existing fractures. The work will enable the industry to realistically model stimulation of geothermal reservoirs. The project addresses the Geothermal Technologies Office objective of accurately predicting geothermal reservoir stimulation (GTO technology priority item). The project goal will be attained by: (i) development of the VMIB method for application to 3D analysis of fracture clusters; (ii) development of poro- and thermoelastic material sub-routines for use in 3D finite element code WARP3D; (iii) implementation of VMIB and the new material routines in WARP3D to enable simulation of clusters of fractures while accounting for the effects of the pore pressure, thermal stress and inelastic deformation; (iv) simulation of 3D fracture propagation and coalescence and formation of clusters, and comparison with laboratory compression tests; and (v) application of the model to interpretation of injection experiments (planned by our
Computational Modeling of Radiation Phenomenon in SiC for Nuclear Applications

Science.gov (United States)

Ko, Hyunseok

Silicon carbide (SiC) has been investigated as a promising nuclear material owing to its superior thermo-mechanical properties and low neutron cross-section. While the interest in SiC has been increasing, the lack of fundamental understanding of many radiation phenomena is an important issue. More specifically, these phenomena in SiC include fission gas transport, radiation-induced defects and their evolution, radiation effects on mechanical stability, matrix brittleness of SiC composites, and low thermal conductivities of SiC composites. To better design SiC and SiC composite materials for various nuclear applications, understanding each phenomenon and its significance under specific reactor conditions is important. In this thesis, we used various modeling approaches to understand the fundamental radiation phenomena in SiC for nuclear applications in three aspects: (a) fission product diffusion through SiC, (b) optimization of thermodynamically stable self-interstitial atom clusters, (c) interface effects in SiC composites and their change upon radiation. In (a), the fission product transport work, we proposed that Ag/Cs diffusion in high energy grain boundaries may be the upper boundary in unirradiated SiC at relevant temperatures, and that radiation enhanced diffusion is responsible for the fast diffusion measured in post-irradiated fuel particles. For (b), the self-interstitial cluster work, thermodynamically stable clusters are identified as a function of cluster size, shape, and composition using a genetic algorithm. We found that there are compositional and configurational transitions for stable clusters as the cluster size increases. For (c), the interface effect in SiC composites, we investigated a recently proposed interface, namely that of CNT reinforced SiC composites. The analytical model suggests that CNT/SiC composites have attractive mechanical and thermal properties, and these fortify the argument that SiC composites are good candidate materials for the cladding
A self-consistent model of rich clusters of galaxies. I. The galactic component of a cluster

International Nuclear Information System (INIS)

Konyukov, M.V.

1985-01-01

It is shown that obtaining the distribution function for the galactic component of a cluster ultimately reduces to solving the boundary-value problem for the gravitational potential of a self-consistent field. The distribution function is determined by two main parameters. An algorithm is constructed for the solution of the problem, and a program is set up to solve it. It is used to establish the region of values of the parameters in the problem for which solutions exist. The scheme proposed is extended to the case where there exists in the cluster a separate central body with a known density distribution (for example, a cD galaxy). A method is indicated for the estimation of the parameters of the model from the results of observations of clusters of galaxies in the optical range.
Cluster regression model and level fluctuation features of Van Lake, Turkey

Directory of Open Access Journals (Sweden)

Z. Şen

1999-02-01

Full Text Available Lake water levels change under the influences of natural and/or anthropogenic environmental conditions. Among these influences are the climate change, greenhouse effects and ozone layer depletions which are reflected in the hydrological cycle features over the lake drainage basins. Lake levels are among the most significant hydrological variables that are influenced by different atmospheric and environmental conditions. Consequently, lake level time series in many parts of the world include nonstationarity components such as shifts in the mean value, apparent or hidden periodicities. On the other hand, many lake level modeling techniques have a stationarity assumption. The main purpose of this work is to develop a cluster regression model for dealing with nonstationarity especially in the form of shifting means. The basis of this model is the combination of transition probability and classical regression technique. Both parts of the model are applied to monthly level fluctuations of Lake Van in eastern Turkey. It is observed that the cluster regression procedure does preserve the statistical properties and the transitional probabilities that are indistinguishable from the original data. Key words: Hydrology (hydrologic budget; stochastic processes) · Meteorology and atmospheric dynamics (ocean-atmosphere interactions)
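The transition-probability half of such a model can be illustrated with a short sketch. It assumes that each monthly level has already been assigned to a level cluster (for instance "low" = 0 and "high" = 1) and simply estimates how often the series moves from one cluster to another between consecutive months. This is not the author's cluster regression procedure, only the counting step it would rest on; the labels are invented and `transition_matrix` is a hypothetical name.

```python
def transition_matrix(labels, n_clusters):
    """Estimate P[i][j] = probability of moving from cluster i to cluster j."""
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    # Count transitions between consecutive observations.
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    probabilities = []
    for row in counts:
        total = sum(row)
        # Rows with no observed departures are left as all zeros.
        probabilities.append([c / total if total else 0.0 for c in row])
    return probabilities


# Invented cluster labels for six consecutive months (0 = low, 1 = high).
monthly_clusters = [0, 0, 1, 1, 1, 0]
for row in transition_matrix(monthly_clusters, 2):
    print(row)
```

In a full model, a separate regression would be fitted within each cluster and the estimated matrix would govern the jumps between clusters, which is how shifts in the mean level can be reproduced.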
Microscopic cluster model analysis of 14O+p elastic scattering

International Nuclear Information System (INIS)

Baye, D.; Descouvemont, P.; Leo, F.

2005-01-01

The 14O+p elastic scattering is discussed in detail in a fully microscopic cluster model. The 14O cluster is described by a closed p shell for protons and a closed p3/2 subshell for neutrons in the translation-invariant harmonic-oscillator model. The exchange and spin-orbit parameters of the effective forces are tuned on the energy levels of the 15C mirror system. With the generator-coordinate and microscopic R-matrix methods, phase shifts and cross sections are calculated for the 14O+p elastic scattering. Excellent agreement is found with recent experimental data. A comparison is performed with phenomenological R-matrix fits. Resonance properties in 15F are discussed.
Group analyses of connectivity-based cortical parcellation using repeated k-means clustering

NARCIS (Netherlands)

Nanetti, Luca; Cerliani, Leonardo; Gazzola, Valeria; Renken, Remco; Keysers, Christian

2009-01-01

K-means clustering has become a popular tool for connectivity-based cortical segmentation using Diffusion Weighted Imaging (DWI) data. A sometimes ignored issue is, however, that the output of the algorithm depends on the initial placement of starting points, and that different sets of starting

Identification of flooded area from satellite images using Hybrid Kohonen Fuzzy C-Means sigma classifier

Directory of Open Access Journals (Sweden)

Krishna Kant Singh

2017-06-01

Full Text Available A novel neuro-fuzzy classifier, Hybrid Kohonen Fuzzy C-Means-σ (HKFCM-σ), is proposed in this paper. The proposed classifier is a hybridization of the Kohonen Clustering Network (KCN) with the FCM-σ clustering algorithm. The network architecture of HKFCM-σ is similar to a simple KCN network, having only two layers, i.e., an input and an output layer. However, the winner neuron is selected using the FCM-σ algorithm, thus embedding the features of both a neural network and a fuzzy clustering algorithm in the classifier. This hybridization results in a more efficient, less complex and faster classifier for classifying satellite images. HKFCM-σ is used to identify the flooding that occurred in the Kashmir area in September 2014. The HKFCM-σ classifier is applied to pre- and post-flooding Landsat 8 OLI images of Kashmir to detect the areas that were flooded due to the heavy rainfall of September 2014. The classifier is trained using the mean values of various spectral indices (NDVI, NDWI, NDBI) and the first component of Principal Component Analysis. The error matrix was computed to test the performance of the method. The method yields high producer's accuracy, consumer's accuracy and kappa coefficient values, indicating that the proposed classifier is highly effective and efficient.
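The spectral indices named in the abstract above are all normalized band differences. The following is a minimal sketch of the commonly used definitions, not the authors' code: the band names, the sample reflectances and the choice of the green/NIR form of NDWI are assumptions, since the abstract does not say which variant was used.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir):
    """Compute NDVI, NDWI and NDBI for one pixel from band reflectances."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation
        "NDWI": normalized_difference(green, nir),  # open water (green/NIR form, assumed)
        "NDBI": normalized_difference(swir, nir),   # built-up surfaces
    }


# Hypothetical reflectances for a water pixel (illustrative values only)
water = spectral_indices(green=0.08, red=0.05, nir=0.02, swir=0.01)
print(water)
```

With these illustrative values the water pixel has a positive NDWI and a negative NDVI, which is the kind of contrast a flood classifier trained on index means can exploit.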
Enhanced K-means clustering with encryption on cloud

Science.gov (United States)

Singh, Iqjot; Dwivedi, Prerna; Gupta, Taru; Shynu, P. G.

2017-11-01

This paper tries to solve the problem of storing and managing big files in the cloud by implementing hashing on Hadoop for big data, and to ensure security while uploading and downloading files. Cloud computing is a term that emphasizes sharing data and facilitates the sharing of infrastructure and resources. [10] Hadoop is open source software that lets us store and manage big files in the cloud according to our needs. The K-means clustering algorithm is used to calculate the distance between the centroid of a cluster and the data points. Hashing is an algorithm for storing and retrieving data with hash keys. The hashing algorithm is called a hash function; it is used to represent the original data and later to fetch the data stored at a specific key. [17] Encryption is the process of transforming electronic data into a non-readable form known as cipher text. Decryption is the opposite process: it transforms the cipher text into plain text that the end user can read and understand. For encryption and decryption we use a symmetric key cryptographic algorithm, specifically the DES algorithm, for secure storage of the files. [3
Topics in modelling of clustered data

CERN Document Server

Aerts, Marc; Ryan, Louise M; Geys, Helena

2002-01-01

Many methods for analyzing clustered data exist, all with advantages and limitations in particular applications. Compiled from the contributions of leading specialists in the field, Topics in Modelling of Clustered Data describes the tools and techniques for modelling the clustered data often encountered in medical, biological, environmental, and social science studies. It focuses on providing a comprehensive treatment of marginal, conditional, and random effects models using, among others, likelihood, pseudo-likelihood, and generalized estimating equations methods. The authors motivate and illustrate all aspects of these models in a variety of real applications. They discuss several variations and extensions, including individual-level covariates and combined continuous and discrete outcomes. Flexible modelling with fractional and local polynomials, omnibus lack-of-fit tests, robustification against misspecification, exact, and bootstrap inferential procedures all receive extensive treatment. The application...
Mean Occupation Function of High-redshift Quasars from the Planck Cluster Catalog

Science.gov (United States)

Chakraborty, Priyanka; Chatterjee, Suchetana; Dutta, Alankar; Myers, Adam D.

2018-06-01

We characterize the distribution of quasars within dark matter halos using a direct measurement technique for the first time at redshifts as high as z ∼ 1. Using the Planck Sunyaev-Zeldovich (SZ) catalog for galaxy groups and the Sloan Digital Sky Survey (SDSS) DR12 quasar data set, we assign host clusters/groups to the quasars and make a measurement of the mean number of quasars within dark matter halos as a function of halo mass. We find that a simple power-law fit of $\log \langle N \rangle = (2.11 \pm 0.01)\,\log(M) - (32.77 \pm 0.11)$ can be used to model the quasar fraction in dark matter halos. This suggests that the quasar fraction increases monotonically as a function of halo mass even to redshifts as high as z ∼ 1.
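To make the quoted power-law fit concrete, the sketch below evaluates its central values at two halo masses. It assumes base-10 logarithms and masses expressed in solar masses; the abstract states neither, and the function name is hypothetical.

```python
import math

SLOPE = 2.11        # fitted slope quoted in the abstract
INTERCEPT = -32.77  # fitted intercept quoted in the abstract


def mean_quasar_number(halo_mass):
    """Evaluate log10(N) = SLOPE * log10(M) + INTERCEPT and return N."""
    return 10 ** (SLOPE * math.log10(halo_mass) + INTERCEPT)


# Halo masses in solar masses (assumed unit)
for mass in (1e14, 1e15):
    print(f"M = {mass:.0e}: mean quasar number = {mean_quasar_number(mass):.4f}")
```

Because the slope is about 2.1, a tenfold increase in halo mass raises the fitted mean number by a factor of roughly 10^2.11, which is the monotonic increase the abstract describes.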
Clustering Dycom

KAUST Repository

Minku, Leandro L.; Hou, Siqing

2017-01-01

baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number
Estimation of breast percent density in raw and processed full field digital mammography images via adaptive fuzzy c-means clustering and support vector machine segmentation

International Nuclear Information System (INIS)

Keller, Brad M.; Nathan, Diane L.; Wang Yan; Zheng Yuanjie; Gee, James C.; Conant, Emily F.; Kontos, Despina

2012-01-01

Purpose: The amount of fibroglandular tissue content in the breast as estimated mammographically, commonly referred to as breast percent density (PD%), is one of the most significant risk factors for developing breast cancer. Approaches to quantify breast density commonly focus on either semiautomated methods or visual assessment, both of which are highly subjective. Furthermore, most studies published to date investigating computer-aided assessment of breast PD% have been performed using digitized screen-film mammograms, while digital mammography is increasingly replacing screen-film mammography in breast cancer screening protocols. Digital mammography imaging generates two types of images for analysis, raw (i.e., “FOR PROCESSING”) and vendor postprocessed (i.e., “FOR PRESENTATION”), of which postprocessed images are commonly used in clinical practice. Development of an algorithm which effectively estimates breast PD% in both raw and postprocessed digital mammography images would be beneficial in terms of direct clinical application and retrospective analysis. Methods: This work proposes a new algorithm for fully automated quantification of breast PD% based on adaptive multiclass fuzzy c-means (FCM) clustering and support vector machine (SVM) classification, optimized for the imaging characteristics of both raw and processed digital mammography images as well as for individual patient and image characteristics. Our algorithm first delineates the breast region within the mammogram via an automated thresholding scheme to identify background air followed by a straight line Hough transform to extract the pectoral muscle region. The algorithm then applies adaptive FCM clustering based on an optimal number of clusters derived from image properties of the specific mammogram to subdivide the breast into regions of similar gray-level intensity. Finally, a SVM classifier is trained to identify which clusters within the breast tissue are likely fibroglandular, which
Prioritizing the risk of plant pests by clustering methods; self-organising maps, k-means and hierarchical clustering

Directory of Open Access Journals (Sweden)

Susan Worner

2013-09-01

Full Text Available For greater preparedness, pest risk assessors are required to prioritise long lists of pest species with potential to establish and cause significant impact in an endangered area. Such prioritization is often qualitative, subjective, and sometimes biased, relying mostly on expert and stakeholder consultation. In recent years, cluster-based analyses have been used to investigate regional pest species assemblages or pest profiles to indicate the risk of new organism establishment. Such an approach is based on the premise that the co-occurrence of well-known global invasive pest species in a region is not random, and that the pest species profile or assemblage integrates complex functional relationships that are difficult to tease apart. In other words, the assemblage can help identify and prioritise species that pose a threat in a target region. A computational intelligence method called a Kohonen self-organizing map (SOM), a type of artificial neural network, was the first clustering method applied to analyse assemblages of invasive pests. The SOM is a well-known dimension reduction and visualization method, especially useful for high-dimensional data that more conventional clustering methods may not analyse suitably. Like all clustering algorithms, the SOM can give details of clusters that identify regions with similar pest assemblages, and possible donor and recipient regions. More importantly, however, the SOM connection weights that result from the analysis can be used to rank the strength of association of each species within each regional assemblage. Species with high weights that are not already established in the target region are identified as high risk. However, the SOM analysis is only the first step in a process to assess risk, to be used alongside or incorporated within other measures. Here we illustrate the application of SOM analyses in a range of contexts in invasive species risk assessment, and discuss other clustering methods such as k-means
Cysteine 295 indirectly affects Ni coordination of carbon monoxide dehydrogenase-II C-cluster

Energy Technology Data Exchange (ETDEWEB)

Inoue, Takahiro; Takao, Kyosuke; Yoshida, Takashi [Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502 (Japan); Wada, Kei [Organization for Promotion of Tenure Track, University of Miyazaki, Miyazaki 889-1692 (Japan); Daifuku, Takashi; Yoneda, Yasuko [Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502 (Japan); Fukuyama, Keiichi [Department of Biological Sciences, Graduate School of Science, Osaka University, Toyonaka, Osaka 560-0043 (Japan); Sako, Yoshihiko, E-mail: sako@kais.kyoto-u.ac.jp [Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502 (Japan)

2013-11-08

Highlights: • CODH-II harbors a unique [Ni-Fe-S] cluster. • We substituted the ligand residues Cys295 and His261. • Dramatic decreases in Ni content upon substitution were observed. • None of the substitutions affected Fe-S cluster assembly. • CO oxidation activity was decreased by the substitutions. -- Abstract: A unique [Ni–Fe–S] cluster (C-cluster) constitutes the active center of Ni-containing carbon monoxide dehydrogenases (CODHs). His261, which coordinates one of the Fe atoms with Cys295, is suggested to be the only residue required for Ni coordination in the C-cluster. To evaluate the role of Cys295, we constructed CODH-II variants. Substitution of Ala for Cys295 resulted in a decrease in Ni content and did not result in a major change in Fe content. In addition, the substitution had no effect on the ability to assemble a full complement of [Fe–S] clusters. This strongly suggests that Cys295, acting indirectly, and His261 together affect Ni coordination in the C-cluster.
Cysteine 295 indirectly affects Ni coordination of carbon monoxide dehydrogenase-II C-cluster

International Nuclear Information System (INIS)

Inoue, Takahiro; Takao, Kyosuke; Yoshida, Takashi; Wada, Kei; Daifuku, Takashi; Yoneda, Yasuko; Fukuyama, Keiichi; Sako, Yoshihiko

2013-01-01

Highlights: • CODH-II harbors a unique [Ni-Fe-S] cluster. • We substituted the ligand residues Cys295 and His261. • Dramatic decreases in Ni content upon substitution were observed. • None of the substitutions affected Fe-S cluster assembly. • CO oxidation activity was decreased by the substitutions. -- Abstract: A unique [Ni–Fe–S] cluster (C-cluster) constitutes the active center of Ni-containing carbon monoxide dehydrogenases (CODHs). His261, which coordinates one of the Fe atoms with Cys295, is suggested to be the only residue required for Ni coordination in the C-cluster. To evaluate the role of Cys295, we constructed CODH-II variants. Substitution of Ala for Cys295 resulted in a decrease in Ni content and did not result in a major change in Fe content. In addition, the substitution had no effect on the ability to assemble a full complement of [Fe–S] clusters. This strongly suggests that Cys295, acting indirectly, and His261 together affect Ni coordination in the C-cluster.
A Novel Grouping Method for Lithium Iron Phosphate Batteries Based on a Fractional Joint Kalman Filter and a New Modified K-Means Clustering Algorithm

Directory of Open Access Journals (Sweden)

Xiaoyu Li

2015-07-01

Full Text Available This paper presents a novel grouping method for lithium iron phosphate batteries. In this method, a simplified electrochemical impedance spectroscopy (EIS) model is utilized to describe the battery characteristics. A dynamic stress test (DST) and a fractional joint Kalman filter (FJKF) are used to extract battery model parameters. In order to realize equal-number grouping of batteries, a new modified K-means clustering algorithm is proposed. Two rules are designed to equalize the numbers of elements in each group and exchange samples among groups. In this paper, the principles of battery model selection, the physical meaning and identification method of the model parameters, data preprocessing and the equal-number clustering method for battery grouping are comprehensively described. Additionally, experiments for battery grouping and method validation are designed. This method is meaningful to applications involving the grouping of fresh batteries for electric vehicles (EVs) and the screening of aged batteries for recycling.
GLOBAL CLASSIFICATION OF DERMATITIS DISEASE WITH K-MEANS CLUSTERING IMAGE SEGMENTATION METHODS

OpenAIRE

Prafulla N. Aerkewar; Dr. G. H. Agrawal

2018-01-01

The objective of this paper is to present a global technique for classifying different dermatitis disease lesions using k-means clustering image segmentation. The word "global" is used because all dermatitis diseases with skin lesions on the body are classified into four categories using k-means image segmentation and the nntool of Matlab. Through the image segmentation technique and nntool, one can analyze and study the segmentation properties of skin lesions that occur in...
A spatial hazard model for cluster detection on continuous indicators of disease: application to somatic cell score.

Science.gov (United States)

Gay, Emilie; Senoussi, Rachid; Barnouin, Jacques

2007-01-01

Methods for spatial cluster detection dealing with diseases quantified by continuous variables are few, whereas several diseases are better approached by continuous indicators. For example, subclinical mastitis of the dairy cow is evaluated using a continuous marker of udder inflammation, the somatic cell score (SCS). Consequently, this study proposed to analyze spatialized risk and cluster components of herd SCS through a new method based on a spatial hazard model. The dataset included annual SCS for 34 142 French dairy herds for the year 2000, and important SCS risk factors: mean parity, percentage of winter and spring calvings, and herd size. The model allowed the simultaneous estimation of the effects of known risk factors and of potential spatial clusters on SCS, and the mapping of the estimated clusters and their range. Mean parity and winter and spring calvings were significantly associated with subclinical mastitis risk. The model with the presence of 3 clusters was highly significant, and the 3 clusters were attractive, i.e. closeness to cluster center increased the occurrence of high SCS. The three localizations were the following: close to the city of Troyes in the northeast of France; around the city of Limoges in the center-west; and in the southwest close to the city of Tarbes. The semi-parametric method based on spatial hazard modeling applies to continuous variables, and takes account of both risk factors and potential heterogeneity of the background population. This tool allows a quantitative detection but assumes a spatially specified form for clusters.
Comparison of five cluster validity indices performance in brain [18F]FET-PET image segmentation using k-means.

Science.gov (United States)

Abualhaj, Bedor; Weng, Guoyang; Ong, Melissa; Attarwala, Ali Asgar; Molina, Flavia; Büsing, Karen; Glatting, Gerhard

2017-01-01

Dynamic [18F]fluoro-ethyl-L-tyrosine positron emission tomography ([18F]FET-PET) is used to identify tumor lesions for radiotherapy treatment planning, to differentiate glioma recurrence from radiation necrosis and to grade gliomas. To segment different regions in the brain, k-means cluster analysis can be used. The main disadvantage of k-means is that the number of clusters must be pre-defined. In this study, we therefore compared different cluster validity indices for automated and reproducible determination of the optimal number of clusters based on the dynamic PET data. The k-means algorithm was applied to dynamic [18F]FET-PET images of 8 patients. Akaike information criterion (AIC), WB, I, modified Dunn's and Silhouette indices were compared on their ability to determine the optimal number of clusters based on requirements for an adequate cluster validity index. To check the reproducibility of k-means, the coefficients of variation (CVs) of the objective function values (OFVs; sum of squared Euclidean distances within each cluster) were calculated using 100 random centroid initialization replications (RCI100) for 2 to 50 clusters. k-means was performed independently on three neighboring slices containing tumor for each patient to investigate the stability of the optimal number of clusters within them. To check the independence of the validity indices from the number of voxels, cluster analysis was applied after duplication of a slice selected from each patient. CVs of index values were calculated at the optimal number of clusters using RCI100 to investigate the reproducibility of the validity indices. To check whether the indices have a single extremum, visual inspection was performed on the replication with minimum OFV from RCI100. The maximum CV of OFVs was 2.7 × 10⁻² across all patients. The optimal number of clusters given by modified Dunn's and Silhouette indices was 2 or 3, leading to a very poor segmentation. WB and I indices suggested in
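The reproducibility check described in the abstract above (coefficient of variation of the k-means objective over 100 random centroid initializations) can be sketched as follows. This is an illustration on synthetic 1-D data, not the study's pipeline: the data, the seed, the choice of k = 3 and the function names are all assumptions.

```python
import random
import statistics


def kmeans_ofv(points, k, rng, iterations=50):
    """Run Lloyd's k-means on 1-D points and return the final objective value."""
    centroids = rng.sample(points, k)  # random centroid initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty
        new_centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Objective: sum of squared distances from each point to its nearest centroid
    return sum(min((p - c) ** 2 for c in centroids) for p in points)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


rng = random.Random(0)
# Synthetic 1-D "intensities" drawn from three groups (illustrative only)
data = [rng.gauss(mu, 0.5) for mu in (0.0, 5.0, 10.0) for _ in range(40)]
ofvs = [kmeans_ofv(data, k=3, rng=rng) for _ in range(100)]
cv = coefficient_of_variation(ofvs)
print(f"CV of objective over 100 initializations: {cv:.3f}")
```

A small CV indicates that the different initializations converge to objective values that are close to one another; the replication with the minimum objective is the natural one to inspect further, as the abstract does.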
Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine.

Science.gov (United States)

Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang

2017-01-01

Clustering algorithms, as a basis of data analysis, are widely used in analysis systems. However, with high-dimensional data, a clustering algorithm may overlook the business relations between dimensions, especially in medical fields. As a result, the clustering result often does not meet the users' business goals. If the clustering process can incorporate the users' knowledge, that is, the doctor's knowledge or the analysis intent, the clustering result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve the user's satisfaction with the result. The core of this method is to obtain the user's feedback on the clustering result and use it to optimize that result. A particle swarm optimization algorithm is then used to optimize the parameters, especially the weight settings in the clustering algorithm, so that they reflect the user's business preference as far as possible. With this parameter optimization and adjustment, the clustering result can be closer to the user's requirement. Finally, we test our method on a breast cancer example. The experiments show that our algorithm performs better.
Molecular Polarizability of Sc and C (Fullerene and Graphite) Clusters

Directory of Open Access Journals (Sweden)

Francisco Torrens

2001-05-01

Full Text Available A method (POLAR) for the calculation of the molecular polarizability is presented. It uses the interacting induced dipoles polarization model. As an example, the method is applied to Scn and Cn (fullerene and one-shell graphite) model clusters. On varying the number of atoms, the clusters show numbers indicative of particularly polarizable structures. The polarizabilities are compared with reference calculations (PAPID). In general, the Scn calculated (POLAR) and Cn computed (POLAR and PAPID) are less polarizable than what is inferred from the bulk. However, the Scn calculated (PAPID) are more polarizable than what is inferred. Moreover, previous theoretical work yielded the same trend for Sin, Gen and GanAsm small clusters. The high polarizability of the Scn clusters (PAPID) is attributed to dangling bonds at the surface of the cluster.
A Hybrid Double-Layer Master-Slave Model For Multicore-Node Clusters

International Nuclear Information System (INIS)

Liu Gang; Schmider, Hartmut; Edgecombe, Kenneth E

2012-01-01

The Double-Layer Master-Slave Model (DMSM) is a suitable hybrid model for executing a workload that consists of multiple independent tasks of varying length on a cluster consisting of multicore nodes. In this model, groups of individual tasks are first deployed to the cluster nodes through an MPI based Master-Slave model. Then, each group is processed by multiple threads on the node through an OpenMP based All-Slave approach. The lack of thread safety of most MPI libraries has to be addressed by a judicious use of OpenMP critical regions and locks. The HPCVL DMSM Library implements this model in Fortran and C. It requires a minimum of user input to set up the framework for the model and to define the individual tasks. Optionally, it supports the dynamic distribution of task-related data and the collection of results at runtime. This library is freely available as source code. Here, we outline the working principles of the library and on a few examples demonstrate its capability to efficiently distribute a workload on a distributed-memory cluster with shared-memory nodes.
A user credit assessment model based on clustering ensemble for broadband network new media service supervision

Science.gov (United States)

Liu, Fang; Cao, San-xing; Lu, Rui

2012-04-01

This paper proposes a user credit assessment model based on a clustering ensemble, aiming to solve the problem of users illegally spreading pirated and pornographic media content on self-service broadband network new media platforms. The idea is to assess new media user credit by establishing an index system based on user credit behaviors; illegal users can then be found from the credit assessment results, curbing the bad video and audio transmitted on the network. The proposed model combines two advantages: swarm intelligence clustering is suitable for user credit behavior analysis, and K-means clustering can eliminate the scattered users left in the result of swarm intelligence clustering, so that all users' credit is classified automatically. Verification experiments are carried out on the standard credit application dataset in the UCI machine learning repository. The statistical results of a comparative experiment with a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in finding the user clusters with the best and worst credit, which will help operators take incentive or punitive measures accurately. In addition, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is robust and has better prediction accuracy.
The rotation of galaxy clusters

International Nuclear Information System (INIS)

Tovmassian, H.M.

2015-01-01

A method is proposed for detecting galaxy cluster rotation, based on how member galaxies with velocities lower and higher than the cluster mean velocity are distributed over the cluster image. The search for rotation is made for flat clusters with a/b > 1.8 and for BMI-type clusters, which are expected to be rotating. For comparison, round clusters and clusters of NBMI type, in which the second-brightest galaxy does not differ significantly from the cluster cD galaxy, were also studied. Seventeen of the 65 clusters studied are found to be rotating. The detection rate is high for flat clusters, over 60 per cent, and for BMI-type clusters with a dominant cD galaxy, ≈ 35 per cent. The results show that clusters formed from huge primordial gas clouds and preserved the rotation of those clouds, unless they merged with other clusters and groups of galaxies, as a result of which the rotation was suppressed.
Worst-case and smoothed analysis of $k$-means clustering with Bregman divergences

NARCIS (Netherlands)

Manthey, Bodo; Röglin, Heiko; Dong, Yingfei; Du, Dingzhu; Ibarra, Oscar

2009-01-01

The $k$-means algorithm is the method of choice for clustering large-scale data sets and it performs exceedingly well in practice. Most of the theoretical work is restricted to the case that squared Euclidean distances are used as similarity measure. In many applications, however, data is to be
Mean-cluster approach indicates cell sorting time scales are determined by collective dynamics

Science.gov (United States)

Beatrici, Carine P.; de Almeida, Rita M. C.; Brunnet, Leonardo G.

2017-03-01

Cell migration is essential to cell segregation, playing a central role in tissue formation, wound healing, and tumor evolution. Considering random mixtures of two cell types, it is still not clear which cell characteristics define clustering time scales. The mass of diffusing clusters merging with one another is expected to grow as $t^{d/(d+2)}$ when the diffusion constant scales with the inverse of the cluster mass. Cell segregation experiments deviate from that behavior. Explanations for that could arise from specific microscopic mechanisms or from collective effects, typical of active matter. Here we consider a power law connecting diffusion constant and cluster mass to propose an analytic approach to model cell segregation where we explicitly take into account finite-size corrections. The results are compared with active matter model simulations and experiments available in the literature. To investigate the role played by different mechanisms we considered different hypotheses describing cell-cell interaction: differential adhesion hypothesis and different velocities hypothesis. We find that the simulations yield normal diffusion for long time intervals. Analytic and simulation results show that (i) cluster evolution clearly tends to a scaling regime, disrupted only at finite-size limits; (ii) cluster diffusion is greatly enhanced by cell collective behavior, such that for high enough tendency to follow the neighbors, cluster diffusion may become independent of cluster size; (iii) the scaling exponent for cluster growth depends only on the mass-diffusion relation, not on the detailed local segregation mechanism. These results apply for active matter systems in general and, in particular, the mechanisms found underlying the increase in cell sorting speed certainly have deep implications in biological evolution as a selection mechanism.
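The growth law quoted in the abstract above follows from a short scaling argument, sketched here under stated assumptions rather than taken from the paper: clusters are compact in $d$ dimensions, the typical distance between clusters is comparable to their radius $R$, and the diffusion constant follows a power law $D \sim M^{-\gamma}$, where the exponent $\gamma$ is a symbol introduced only for this illustration.

```latex
R \sim M^{1/d}, \qquad D \sim M^{-\gamma}, \qquad
\tau \sim \frac{R^{2}}{D} \sim M^{\,2/d+\gamma}
\;\Longrightarrow\;
\frac{dM}{dt} \sim \frac{M}{\tau}
\;\Longrightarrow\;
M(t) \sim t^{\,d/(2+\gamma d)} .
```

For $\gamma = 1$ this gives the exponent $d/(d+2)$. If cluster diffusion becomes independent of cluster size ($\gamma = 0$), the same sketch gives the faster growth $M \sim t^{d/2}$, which is consistent with an exponent that depends only on the mass-diffusion relation.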

Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation.

Science.gov (United States)

Saatchi, Mahdi; McClure, Mathew C; McKay, Stephanie D; Rolf, Megan M; Kim, JaeWoo; Decker, Jared E; Taxis, Tasia M; Chapple, Richard H; Ramey, Holly R; Northcutt, Sally L; Bauck, Stewart; Woodward, Brent; Dekkers, Jack C M; Fernando, Rohan L; Schnabel, Robert D; Garrick, Dorian J; Taylor, Jeremy F

2011-11-28

Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but
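The cross-validation design in the abstract above (train on four K-means groups, validate on the fifth, for all five combinations) amounts to leave-one-group-out splitting. The sketch below shows only the splitting step; the animal identifiers, the group labels and the function name are hypothetical, and the clustering that produces the labels is not shown.

```python
def leave_one_group_out(group_of):
    """Yield (held_out_group, training_ids, validation_ids) for each group.

    group_of maps an animal id to its cluster label.
    """
    groups = sorted(set(group_of.values()))
    for held_out in groups:
        train = [a for a, g in group_of.items() if g != held_out]
        valid = [a for a, g in group_of.items() if g == held_out]
        yield held_out, train, valid


# Hypothetical cluster labels for ten animals in five groups
labels = {f"sire{i}": i % 5 for i in range(10)}
folds = list(leave_one_group_out(labels))
for held_out, train, valid in folds:
    print(held_out, len(train), len(valid))
```

Because the groups are built to minimize relationships between them, each validation fold contains animals that are only distantly related to the training set, which is why accuracies from this design are lower than those from random allocation.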
Mean field theory of nuclei and shell model. Present status and future outlook

International Nuclear Information System (INIS)

Nakada, Hitoshi

2003-01-01

Many recent topics in nuclear structure are concerned with the problems of unstable nuclei. It has been revealed experimentally that nuclear halos and neutron skins, as well as cluster structures or molecule-like structures, can be present in unstable nuclei, and that the magic numbers well established in stable nuclei occasionally disappear while new ones appear. The shell model based on the mean field approximation has been successfully applied to stable nuclei to explain nuclear structure quantitatively as a finite many-body system, and it is considered the standard model at present. Whether unstable nuclei can be understood on the same model basis is a matter related to the fundamental principles of nuclear structure theories. In this lecture, the fundamental concept and the framework of the theory of nuclear structure based on the mean field theory and the shell model are presented to make clear the problems and to suggest directions for future research. First, fundamental properties of nuclei are described under the subtitles: saturation and magic numbers, nuclear force and effective interactions, nuclear matter, and LS splitting. Then the mean field theory is presented under the subtitles: the potential model, the mean field theory, Hartree-Fock approximation for nuclear matter, density dependent force, semiclassical mean field theory, mean field theory and symmetry, Skyrme interaction and density functional, density matrix expansion, finite range interactions, effective masses, and motion of center of mass. The subsequent section is devoted to the shell model with the subtitles: beyond the mean field approximation, core polarization, effective interaction of shell model, one-particle wave function, nuclear deformation and shell model, and shell model of cross shell. Finally structure of unstable nuclei is discussed with the subtitles: general remark on the study of unstable nuclear structure, asymptotic behavior of wave
π plasmon modes in C60 clusters

International Nuclear Information System (INIS)

Nguyen Van Giai; Lipparini, E.

1992-07-01

RPA correlations and collective excitations of π electrons in the C60 cluster (the fullerene molecule) are studied using the sum rule approach and linear response theory. The results for the excitation spectrum are discussed in relation to experimental data and to other theoretical approaches. (K.A.) 17 refs.; 4 figs
Possible world based consistency learning model for clustering and classifying uncertain data.

Science.gov (United States)

Liu, Han; Zhang, Xianchao; Zhang, Xiaotong

2018-06-01

Possible world has been shown to be effective for handling various types of data uncertainty in uncertain data management. However, few uncertain data clustering and classification algorithms based on possible world have been proposed. Moreover, existing possible world based algorithms suffer from the following issues: (1) they deal with each possible world independently and ignore the consistency principle across different possible worlds; (2) they require an extra post-processing procedure to obtain the final result, so that effectiveness relies heavily on the post-processing method and efficiency is also poor. In this paper, we propose a novel possible world based consistency learning model for uncertain data, which can be extended to both clustering and classifying uncertain data. This model utilizes the consistency principle to learn a consensus affinity matrix for uncertain data, which can make full use of the information across different possible worlds and thereby improve the clustering and classification performance. Meanwhile, this model imposes a new rank constraint on the Laplacian matrix of the consensus affinity matrix, thereby ensuring that the number of connected components in the consensus affinity matrix is exactly equal to the number of classes. This also means that the clustering and classification results can be obtained directly without any post-processing procedure. Furthermore, we derive efficient optimization methods to solve the proposed model for the clustering and the classification task, respectively. Experimental results on real benchmark datasets and real world uncertain datasets show that the proposed model outperforms the state-of-the-art uncertain data clustering and classification algorithms in effectiveness and performs competitively in efficiency. Copyright © 2018 Elsevier Ltd. All rights reserved.
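The rank constraint mentioned in this abstract rests on a standard graph-theory fact: for a nonnegative symmetric affinity matrix W with n rows, the Laplacian L = D - W has rank n - c exactly when the graph has c connected components. The sketch below only illustrates that fact on a toy matrix; it is not the authors' optimization method, and the function names and the example matrix are assumptions made for illustration.

```python
from fractions import Fraction


def laplacian(affinity):
    """Unnormalized graph Laplacian L = D - W of a symmetric affinity matrix."""
    n = len(affinity)
    return [[(sum(affinity[i]) if i == j else 0) - affinity[i][j]
             for j in range(n)] for i in range(n)]


def matrix_rank(matrix):
    """Rank computed by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def count_components(affinity):
    """Connected components of the graph whose edges are the nonzero affinities."""
    n = len(affinity)
    seen, components = set(), 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if affinity[i][j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
    return components


# Hypothetical consensus affinity matrix with two blocks: {0, 1, 2} and {3, 4}.
W = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 2, 0],
]
n_classes = len(W) - matrix_rank(laplacian(W))
print(n_classes, count_components(W))
```

When the rank condition holds, each connected component can be read off as one cluster, which is why no separate post-processing step such as K-means on an embedding is needed.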
Time series modelling of global mean temperature for managerial decision-making.

Science.gov (United States)

Romilly, Peter

2005-07-01

Climate change has important implications for business and economic activity. Effective management of climate change impacts will depend on the availability of accurate and cost-effective forecasts. This paper uses univariate time series techniques to model the properties of a global mean temperature dataset in order to develop a parsimonious forecasting model for managerial decision-making over the short-term horizon. Although the model is estimated on global temperature data, the methodology could also be applied to temperature data at more localised levels. The statistical techniques include seasonal and non-seasonal unit root testing with and without structural breaks, as well as ARIMA and GARCH modelling. A forecasting evaluation shows that the chosen model performs well against rival models. The estimation results confirm the findings of a number of previous studies, namely that global mean temperatures increased significantly throughout the 20th century. The use of GARCH modelling also shows the presence of volatility clustering in the temperature data, and a positive association between volatility and global mean temperature.
Maximum-entropy clustering algorithm and its global convergence analysis

Institute of Scientific and Technical Information of China (English)

(none listed)

2001-01-01

By constructing a family of differentiable entropy functions that uniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called the maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
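To make the phrase "soft generalization of the hard C-means algorithm" concrete, the sketch below uses the common Gibbs (softmax) form of maximum-entropy memberships, u_j proportional to exp(-d_j^2 / T). This form and the `temperature` parameter are assumptions for illustration; the abstract does not state the exact entropy functions used in the paper. As the temperature goes to zero, the memberships collapse to the hard nearest-center assignment.

```python
import math


def soft_memberships(point, centers, temperature):
    """Gibbs memberships: weight of each center decays with squared distance."""
    sq = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    shift = min(sq)  # subtract the smallest distance for numerical stability
    weights = [math.exp(-(d - shift) / temperature) for d in sq]
    total = sum(weights)
    return [w / total for w in weights]


def update_centers(points, centers, temperature):
    """One soft C-means step: each center becomes a membership-weighted mean."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    mass = [0.0] * k
    for x in points:
        u = soft_memberships(x, centers, temperature)
        for j in range(k):
            mass[j] += u[j]
            for d in range(dim):
                sums[j][d] += u[j] * x[d]
    # A center that receives no mass is left where it was.
    return [[s / mass[j] for s in sums[j]] if mass[j] > 0 else list(centers[j])
            for j in range(k)]


points = [[0.0], [1.0], [10.0], [11.0]]
centers = [[0.0], [11.0]]
cold = update_centers(points, centers, temperature=0.01)  # behaves like hard C-means
warm = update_centers(points, centers, temperature=1e6)   # nearly uniform memberships
print(cold, warm)
```

At a very high temperature every point contributes almost equally to every center, so both centers move toward the global mean; at a very low temperature the update coincides with the hard C-means centroid step.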
Quark cluster model in the three-nucleon system

International Nuclear Information System (INIS)

Osman, A.

1986-11-01

The quark cluster model is used to investigate the structure of the three-nucleon systems. The nucleon-nucleon interaction is proposed considering the colour-nucleon clusters and incorporating the quark degrees of freedom. The quark-quark potential in the quark compound bag model agrees with the central force potentials. The confinement potential reduces the short-range repulsion. The colour van der Waals force is determined. Then, the probability of quark clusters in the three-nucleon bound state systems is numerically calculated using realistic nuclear wave functions. The results of the present calculations show that quarks cluster themselves in three-quark systems building the quark cluster model for the trinucleon system. (author)
Modeling the formation of globular cluster systems in the Virgo cluster

International Nuclear Information System (INIS)

Li, Hui; Gnedin, Oleg Y.

2014-01-01

The mass distribution and chemical composition of globular cluster (GC) systems preserve a fossil record of the early stages of galaxy formation. The observed distribution of GC colors within massive early-type galaxies in the ACS Virgo Cluster Survey (ACSVCS) reveals a multi-modal shape, which likely corresponds to a multi-modal metallicity distribution. We present a simple model for the formation and disruption of GCs that aims to match the ACSVCS data. This model tests the hypothesis that GCs are formed during major mergers of gas-rich galaxies and inherit the metallicity of their hosts. To trace merger events, we use halo merger trees extracted from a large cosmological N-body simulation. We select 20 halos in the mass range of 2 × 10^12 to 7 × 10^13 M☉ and match them to 19 Virgo galaxies with K-band luminosity between 3 × 10^10 and 3 × 10^11 L☉. To set the [Fe/H] abundances, we use an empirical galaxy mass-metallicity relation. We find that a minimal merger ratio of 1:3 best matches the observed cluster metallicity distribution. A characteristic bimodal shape appears because metal-rich GCs are produced by late mergers between massive halos, while metal-poor GCs are produced by collective merger activities of less massive hosts at early times. The model outcome is robust to alternative prescriptions for cluster formation rate throughout cosmic time, but a gradual evolution of the mass-metallicity relation with redshift appears to be necessary to match the observed cluster metallicities. We also affirm the age-metallicity relation, predicted by an earlier model, in which metal-rich clusters are systematically several billion years younger than their metal-poor counterparts.
Spin dynamics study of magnetic molecular clusters by means of Moessbauer spectroscopy

International Nuclear Information System (INIS)

Cianchi, L.; Del Giallo, F.; Spina, G.; Reiff, W.; Caneschi, A.

2002-01-01

Spin dynamics of the two magnetic molecular clusters Fe4 and Fe8, with four and eight Fe(III) ions, respectively, was studied by means of Moessbauer spectroscopy. The transition probabilities W's between the spin states of the ground multiplet were obtained from the fitting of the spectra. For the Fe4 cluster we found that, in the range from 1.38 to 77 K, the trend of W's versus the temperature corresponds to an Orbach process involving an excited state with energy of about 160 K. For the Fe8, which, due to the presence of a low-energy excited state, could not be studied at temperatures greater than 20 K, the trend of W's in the range from 4 to 18 K seems to correspond to a direct process. The correlation functions of the magnetization were then calculated in terms of the W's. They have an exponential trend for the Fe4 cluster, while a small oscillating component is also present for the Fe8 cluster. For the first of the clusters, τ vs T (τ is the decay time of the magnetization) has a trend which, at low temperatures (T 15 K, τ follows the trend of W^-1. For the Fe8, τ follows an Arrhenius law, but with a prefactor which is smaller than the one obtained from susceptibility measurements
Homicidal behaviour among people with avoidant, dependent and obsessive-compulsive (cluster C) personality disorder.

Science.gov (United States)

Laajasalo, Taina; Ylipekka, Mikko; Häkkänen-Nyholm, Helinä

2013-02-01

Despite a growing forensic psychiatry literature, no previous study has examined in detail homicidal behaviour among offenders with cluster C personality disorders - the avoidant, dependent or obsessional personality disorders. This study aims to compare homicide offenders with cluster C personality disorders with those with other personality disorders on criminal history, offender-victim relationship and post-offence reaction variables. The sample was drawn from all Finnish homicide cases of 1996-2004 for whom a forensic psychiatric evaluation had been conducted. Data were extracted from forensic psychiatric and crime reports. In a nationwide sample of 593 homicide offenders, 21 had at least one cluster C personality disorder. These offenders had significantly shorter criminal histories than the others. Offender-victim relationship did not differ between the groups, but confession to the crime and feelings of remorse were more common among people with cluster C disorders. In addition, compared with other personality disorder clusters, co-morbid depression was more common. Cluster C personality disorders are rare, but not nonexistent, among homicide offenders. Observed differences in their backgrounds and post-offence behaviours indicate that they may have special needs. Copyright © 2012 John Wiley & Sons, Ltd.
IP2P K-means: an efficient method for data clustering on sensor networks

Directory of Open Access Journals (Sweden)

Peyman Mirhadi

2013-03-01

Full Text Available Many wireless sensor network applications require data gathering as one of the most important parts of their operations. There are increasing demands for innovative methods to improve energy efficiency and to prolong the network lifetime. Clustering is considered an efficient topology control method in wireless sensor networks, which can increase network scalability and lifetime. This paper presents a method, IP2P K-means (Improved P2P K-means), which uses efficient leveling in the clustering approach, reduces false labeling and restricts the necessary communication among sensors, which saves energy. The proposed method is examined in Network Simulator Ver.2 (NS2) and the preliminary results show that the algorithm works effectively and relatively more precisely.
[Automatic Sleep Stage Classification Based on an Improved K-means Clustering Algorithm].

Science.gov (United States)

Xiao, Shuyuan; Wang, Bei; Zhang, Jian; Zhang, Qunfeng; Zou, Junzhong

2016-10-01

Sleep stage scoring is a hotspot in the field of medicine and neuroscience. Visual inspection of sleep is laborious and the results may be subjective to different clinicians. Automatic sleep stage classification algorithm can be used to reduce the manual workload. However, there are still limitations when it encounters complicated and changeable clinical cases. The purpose of this paper is to develop an automatic sleep staging algorithm based on the characteristics of actual sleep data. In the proposed improved K-means clustering algorithm, points were selected as the initial centers by using a concept of density to avoid the randomness of the original K-means algorithm. Meanwhile, the cluster centers were updated according to the 'Three-Sigma Rule' during the iteration to abate the influence of the outliers. The proposed method was tested and analyzed on the overnight sleep data of the healthy persons and patients with sleep disorders after continuous positive airway pressure (CPAP) treatment. The automatic sleep stage classification results were compared with the visual inspection by qualified clinicians and the averaged accuracy reached 76%. With the analysis of morphological diversity of sleep data, it was proved that the proposed improved K-means algorithm was feasible and valid for clinical practice.
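The abstract does not spell out how the 'Three-Sigma Rule' enters the centroid update, so the following is only one plausible reading, stated as an assumption: when recomputing a cluster center, members lying more than three standard deviations from the plain mean in any feature are excluded. The function name and the toy data are hypothetical.

```python
import statistics


def three_sigma_center(members):
    """Cluster center as the mean of members within 3 standard deviations
    of the plain per-feature mean (population standard deviation)."""
    dim = len(members[0])
    mean = [statistics.fmean([x[d] for x in members]) for d in range(dim)]
    std = [statistics.pstdev([x[d] for x in members]) for d in range(dim)]
    kept = [x for x in members
            if all(abs(x[d] - mean[d]) <= 3 * std[d] for d in range(dim))]
    return [statistics.fmean([x[d] for x in kept]) for d in range(dim)]


# Twenty well-behaved one-dimensional members plus a single gross outlier.
members = [[0.9], [1.1]] * 10 + [[100.0]]
plain = [statistics.fmean([x[0] for x in members])]
robust = three_sigma_center(members)
print(plain, robust)
```

Note that with very small clusters a single outlier cannot exceed three population standard deviations, so this kind of filter only has an effect once a cluster holds roughly a dozen points or more.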
Cluster Dynamics Modeling with Bubble Nucleation, Growth and Coalescence

Energy Technology Data Exchange (ETDEWEB)

de Almeida, Valmor F. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Blondel, Sophie [Univ. of Tennessee, Knoxville, TN (United States); Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Wirth, Brian D. [Univ. of Tennessee, Knoxville, TN (United States)

2017-06-01

The topic of this communication pertains to defect formation in irradiated solids such as plasma-facing tungsten submitted to helium implantation in fusion reactor components, and nuclear fuel (metal and oxides) submitted to volatile fission product generation in nuclear reactors. The purpose of this progress report is to describe efforts towards addressing the prediction of long-time evolution of defects via continuum cluster dynamics simulation. The difficulties are twofold. First, realistic, long-time dynamics in reactor conditions leads to a non-dilute diffusion regime which is not accommodated by the prevailing dilute, stressless cluster dynamics theory. Second, long-time dynamics calls for a large set of species (ideally an infinite set) to capture all possible emerging defects, and this represents a computational bottleneck. Extensions beyond the dilute limit is a significant undertaking since no model has been advanced to extend cluster dynamics to non-dilute, deformable conditions. Here our proposed approach to model the non-dilute limit is to monitor the appearance of a spatially localized void volume fraction in the solid matrix with a bell shape profile and insert an explicit geometrical bubble onto the support of the bell function. The newly created internal moving boundary provides the means to account for the interfacial flux of mobile species into the bubble, and the growth of bubbles allows for coalescence phenomena which captures highly non-dilute interactions. We present a preliminary interfacial kinematic model with associated interfacial diffusion transport to follow the evolution of the bubble in any number of spatial dimensions and any number of bubbles, which can be further extended to include a deformation theory. Finally we comment on a computational front-tracking method to be used in conjunction with conventional cluster dynamics simulations in the non-dilute model proposed.
Direct observation of interfacial C60 cluster formation in polystyrene-C60 nanocomposite films

International Nuclear Information System (INIS)

Han, Joong Tark; Lee, Geon-Woong; Kim, Sangcheol; Lee, Hae-Jeong; Douglas, Jack F; Karim, Alamgir

2009-01-01

Large interfacial C60 clusters were directly imaged at the supporting film-substrate interface in physically detached polystyrene-C60 nanocomposite films by atomic force microscopy, confirming the stabilizing mechanism previously hypothesized for thin polymer films. Additionally, we found that the C60 additive influences basic thermodynamic film properties such as the interfacial energy and the film thermal expansion coefficient.
Two generalizations of Kohonen clustering

Science.gov (United States)

Bezdek, James C.; Pal, Nikhil R.; Tsao, Eric C. K.

1993-01-01

The relationship between the sequential hard c-means (SHCM), learning vector quantization (LVQ), and fuzzy c-means (FCM) clustering algorithms is discussed. LVQ and SHCM suffer from several major problems. For example, they depend heavily on initialization. If the initial values of the cluster centers are outside the convex hull of the input data, such algorithms, even if they terminate, may not produce meaningful results in terms of prototypes for cluster representation. This is due in part to the fact that they update only the winning prototype for every input vector. The impact and interaction of these two families with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but which often lends ideas to clustering algorithms, are discussed. Then two generalizations of LVQ that are explicitly designed as clustering algorithms are presented; these algorithms are referred to as generalized LVQ (GLVQ) and fuzzy LVQ (FLVQ). Learning rules are derived to optimize an objective function whose goal is to produce 'good clusters'. GLVQ/FLVQ (may) update every node in the clustering net for each input vector. Neither GLVQ nor FLVQ depends upon a choice for the update neighborhood or learning rate distribution - these are taken care of automatically. Segmentation of a gray tone image is used as a typical application of these algorithms to illustrate the performance of GLVQ/FLVQ.
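The initialization problem described in this abstract can be reproduced with a few lines. The sketch below implements a generic winner-take-all update of the kind used by sequential hard c-means and unsupervised LVQ (only the nearest prototype moves toward each input); the learning rate, data, and starting prototypes are hypothetical values chosen to show a prototype that starts outside the convex hull of the data and therefore never wins.

```python
def nearest(prototypes, x):
    """Index of the prototype with the smallest squared distance to x."""
    return min(range(len(prototypes)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(prototypes[j], x)))


def winner_take_all_step(prototypes, x, learning_rate):
    """Move only the winning prototype toward x; all others stay put."""
    winner = nearest(prototypes, x)
    prototypes[winner] = [p + learning_rate * (a - p)
                          for p, a in zip(prototypes[winner], x)]
    return winner


data = [[0.1], [0.2], [0.8], [0.9]]
prototypes = [[0.5], [100.0]]  # the second prototype starts far outside the data range
for _ in range(20):
    for x in data:
        winner_take_all_step(prototypes, x, learning_rate=0.1)

# prototypes[1] is still [100.0]: it never wins, so it is never updated.
print(prototypes)
```

Updating every prototype for each input, as the abstract says GLVQ/FLVQ may do, is one way to avoid such "dead" prototypes.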
Application of fuzzy C-Means Algorithm for Determining Field of Interest in Information System Study STTH Medan

Science.gov (United States)

Rahman Syahputra, Edy; Agustina Dalimunthe, Yulia; Irvan

2017-12-01

Many students are confused when choosing a field of specialization and end up in fields that do not suit them, for reasons such as following a friend or picking a popular field without knowing whether they have the competencies it requires. This research applies clustering with the Fuzzy C-Means algorithm to group students by field of interest. Fuzzy C-Means is one of the simplest and most frequently used data grouping algorithms because it makes efficient estimates and does not require many parameters. Several studies have concluded that the Fuzzy C-Means algorithm can be used to group data based on certain attributes. In this research, the Fuzzy C-Means algorithm is used to group student data based on grades in core subjects when selecting a specialization field. The study also tests the accuracy of the Fuzzy C-Means algorithm in determining the field of interest. The study was conducted in the Information System Study Program of STT-Harapan Medan, and the object of research is the grades of all students of the program in 2012. The research is expected to identify the specialization field that matches each student's ability, based on grades in the prerequisite core subjects.
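As a rough illustration of the grouping step described here, the sketch below implements the textbook Fuzzy C-Means updates (membership of a point in a cluster inversely related to its relative squared distance, centers as membership-weighted means with fuzzifier m). The student scores, the two subject columns, and the starting centers are invented for the example and are not data from the study.

```python
def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM membership update using squared Euclidean distances."""
    power = 1.0 / (m - 1.0)
    memberships = []
    for x in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
        if any(d == 0.0 for d in d2):
            # The point coincides with one or more centers: share membership among them.
            zeros = [d == 0.0 for d in d2]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            row = [1.0 / sum((di / dj) ** power for dj in d2) for di in d2]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Centers as means weighted by memberships raised to the fuzzifier m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        w = [row[j] ** m for row in memberships]
        total = sum(w)
        centers.append([sum(wi * x[d] for wi, x in zip(w, points)) / total
                        for d in range(dim)])
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, memberships, m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical scores in two core subjects for six students.
scores = [[85.0, 60.0], [90.0, 55.0], [88.0, 65.0],
          [55.0, 90.0], [60.0, 85.0], [50.0, 88.0]]
centers, memberships = fuzzy_c_means(scores, centers=[[80.0, 60.0], [60.0, 80.0]])
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Each row of `memberships` sums to one, so a student can be reported with a degree of fit to every specialization rather than a single hard label; taking the largest membership gives the hard assignment.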
Artificial Bee Colony Algorithm Based on K-Means Clustering for Multiobjective Optimal Power Flow Problem

Directory of Open Access Journals (Sweden)

Liling Sun

2015-01-01

Full Text Available An improved multiobjective ABC algorithm based on K-means clustering, called CMOABC, is proposed. To speed up the convergence of the canonical MOABC, the way information is communicated in the employed bees' phase is modified. To maintain population diversity, a multiswarm technique based on K-means clustering is employed to decompose the population into many clusters. Because each subcomponent evolves separately, the population is reclustered after a specified number of iterations to facilitate information exchange among different clusters. Application of the new CMOABC on several multiobjective benchmark functions shows a marked improvement in performance over the fast nondominated sorting genetic algorithm (NSGA-II), the multiobjective particle swarm optimizer (MOPSO), and the multiobjective ABC (MOABC). Finally, the CMOABC is applied to solve the real-world optimal power flow (OPF) problem that considers the cost, loss, and emission impacts as the objective functions. The 30-bus IEEE test system is presented to illustrate the application of the proposed algorithm. The simulation results demonstrate that, compared to NSGA-II, MOPSO, and MOABC, the proposed CMOABC is superior for solving the OPF problem in terms of optimization accuracy.
Cluster dynamics modeling and experimental investigation of the effect of injected interstitials

Science.gov (United States)

Michaut, B.; Jourdan, T.; Malaplate, J.; Renault-Laborne, A.; Sefta, F.; Décamps, B.

2017-12-01

The effect of injected interstitials on loop and cavity microstructures is investigated experimentally and numerically for 304L austenitic stainless steel irradiated at 450 °C with 10 MeV Fe5+ ions up to about 100 dpa. A cluster dynamics model is parametrized on experimental results obtained by transmission electron microscopy (TEM) in a region where injected interstitials can be safely neglected. It is then used to model the damage profile and study the impact of self-ion injection. Results are compared to TEM observations on cross-sections of specimens. It is shown that injected interstitials have a significant effect on cavity density and mean size, even in the sink-dominated regime. To quantitatively match the experimental data in the self-ions injected area, a variation of some parameters is necessary. We propose that the fraction of freely migrating species may vary as a function of depth. Finally, we show that simple rate theory considerations do not seem to be valid for these experimental conditions.
Extension of K-Means Algorithm for clustering mixed data | Onuodu ...

African Journals Online (AJOL)

Also proposed is a new dissimilarity measure that uses relative cumulative frequency-based method in clustering objects with mixed values. The dissimilarity model developed could serve as a predictive tool for identifying attributes of objects in mixed datasets. It has been implemented using JAVA programming language ...
An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

Science.gov (United States)

Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

2017-12-01

Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
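The key idea stated in this abstract (initial centroids taken from dense regions and adequately separated) can be sketched as follows. This is not the authors' procedure, whose details are not given here; it is a minimal deterministic stand-in under stated assumptions: density is the number of points within a fixed `radius`, candidates are visited from densest to sparsest with ties broken by index, and a candidate is accepted only if it lies at least `min_separation` from every seed already chosen. Both parameters and the data are hypothetical.

```python
import math


def density_seeds(points, k, radius, min_separation):
    """Pick up to k initial centroids: dense points that are mutually far apart."""
    density = [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]
    # Densest first; the index breaks ties so the result is fully deterministic.
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    seeds = []
    for i in order:
        if all(math.dist(points[i], points[s]) >= min_separation for s in seeds):
            seeds.append(i)
            if len(seeds) == k:
                break
    return [points[i] for i in seeds]


points = [[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],   # dense group
          [5, 5], [5.1, 5], [5, 5.1],               # second, smaller group
          [9, 0]]                                   # isolated point
seeds = density_seeds(points, k=2, radius=0.2, min_separation=2.0)
print(seeds)
```

Because nothing in the selection is random, running K-Means from these seeds gives the same partition on every execution, which is the property the abstract argues is needed for comparing algorithms.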

blockcluster: An R Package for Model-Based Co-Clustering

Directory of Open Access Journals (Sweden)

Parmeet Singh Bhatia

2017-02-01

Full Text Available Simultaneous clustering of rows and columns, usually designated by bi-clustering, co-clustering or block clustering, is an important technique in two-way data analysis. A new standard and efficient approach has recently been proposed based on the latent block model (Govaert and Nadif 2003), which takes into account the block clustering problem on both the individual and variable sets. This article presents our R package blockcluster for co-clustering of binary, contingency and continuous data based on these very models. In this document, we will give a brief review of the model-based block clustering methods, and we will show how the R package blockcluster can be used for co-clustering.
Modeling and clustering water demand patterns from real-world smart meter data

Directory of Open Access Journals (Sweden)

N. Cheifetz

2017-08-01

Full Text Available Nowadays, drinking water utilities need an acute comprehension of the water demand on their distribution network, in order to efficiently operate the optimization of resources, manage billing and propose new customer services. With the emergence of smart grids, based on automated meter reading (AMR), a better understanding of the consumption modes is now accessible for smart cities with more granularities. In this context, this paper evaluates a novel methodology for identifying relevant usage profiles from the water consumption data produced by smart meters. The methodology is fully data-driven using the consumption time series which are seen as functions or curves observed with an hourly time step. First, a Fourier-based additive time series decomposition model is introduced to extract seasonal patterns from time series. These patterns are intended to represent the customer habits in terms of water consumption. Two functional clustering approaches are then used to classify the extracted seasonal patterns: the functional version of K-means, and the Fourier REgression Mixture (FReMix) model. The K-means approach produces a hard segmentation and K representative prototypes. On the other hand, the FReMix is a generative model and also produces K profiles as well as a soft segmentation based on the posterior probabilities. The proposed approach is applied to a smart grid deployed on the largest water distribution network (WDN) in France. The two clustering strategies are evaluated and compared. Finally, a realistic interpretation of the consumption habits is given for each cluster. The extensive experiments and the qualitative interpretation of the resulting clusters allow one to highlight the effectiveness of the proposed methodology.
Modeling and clustering water demand patterns from real-world smart meter data

Science.gov (United States)

Cheifetz, Nicolas; Noumir, Zineb; Samé, Allou; Sandraz, Anne-Claire; Féliers, Cédric; Heim, Véronique

2017-08-01

Nowadays, drinking water utilities need an acute comprehension of the water demand on their distribution network, in order to efficiently operate the optimization of resources, manage billing and propose new customer services. With the emergence of smart grids, based on automated meter reading (AMR), a better understanding of the consumption modes is now accessible for smart cities with more granularities. In this context, this paper evaluates a novel methodology for identifying relevant usage profiles from the water consumption data produced by smart meters. The methodology is fully data-driven using the consumption time series which are seen as functions or curves observed with an hourly time step. First, a Fourier-based additive time series decomposition model is introduced to extract seasonal patterns from time series. These patterns are intended to represent the customer habits in terms of water consumption. Two functional clustering approaches are then used to classify the extracted seasonal patterns: the functional version of K-means, and the Fourier REgression Mixture (FReMix) model. The K-means approach produces a hard segmentation and K representative prototypes. On the other hand, the FReMix is a generative model and also produces K profiles as well as a soft segmentation based on the posterior probabilities. The proposed approach is applied to a smart grid deployed on the largest water distribution network (WDN) in France. The two clustering strategies are evaluated and compared. Finally, a realistic interpretation of the consumption habits is given for each cluster. The extensive experiments and the qualitative interpretation of the resulting clusters allow one to highlight the effectiveness of the proposed methodology.
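The first step described above, extracting a seasonal pattern with a Fourier basis, can be illustrated with a plain projection onto sine and cosine terms. This is a minimal sketch and not the paper's decomposition model: it assumes an hourly series covering a whole number of periods, in which case the projection coincides with the least-squares fit, and the synthetic series and function name are invented for the example.

```python
import math


def fourier_coefficients(series, period, harmonics):
    """Mean level and (cosine, sine) amplitudes of the first few harmonics.

    Exact when len(series) is a multiple of `period` and harmonics < period / 2.
    """
    n = len(series)
    mean = sum(series) / n
    coeffs = []
    for k in range(1, harmonics + 1):
        a = 2.0 / n * sum(x * math.cos(2 * math.pi * k * t / period)
                          for t, x in enumerate(series))
        b = 2.0 / n * sum(x * math.sin(2 * math.pi * k * t / period)
                          for t, x in enumerate(series))
        coeffs.append((a, b))
    return mean, coeffs


# Two days of synthetic hourly consumption with a single daily cycle.
series = [10.0 + 3.0 * math.cos(2 * math.pi * t / 24) for t in range(48)]
mean, coeffs = fourier_coefficients(series, period=24, harmonics=2)
print(mean, coeffs)
```

The coefficient vector of each meter is a compact description of its daily habit, and such vectors are the kind of feature that a hard method like K-means or a soft mixture model can then group into usage profiles.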
Modeling sports highlights using a time-series clustering framework and model interpretation

Science.gov (United States)

Radhakrishnan, Regunathan; Otsuka, Isao; Xiong, Ziyou; Divakaran, Ajay

2005-01-01

In our past work on sports highlights extraction, we have shown the utility of detecting audience reaction using an audio classification framework. The audio classes in the framework were chosen based on intuition. In this paper, we present a systematic way of identifying the key audio classes for sports highlights extraction using a time series clustering framework. We treat the low-level audio features as a time series and model the highlight segments as "unusual" events in a background of a "usual" process. The set of audio classes to characterize the sports domain is then identified by analyzing the consistent patterns in each of the clusters output from the time series clustering framework. The distribution of features from the training data so obtained for each of the key audio classes is parameterized by a Minimum Description Length Gaussian Mixture Model (MDL-GMM). We also interpret the meaning of each of the mixture components of the MDL-GMM for the key audio class (the "highlight" class) that is correlated with highlight moments. Our results show that the "highlight" class is a mixture of audience cheering and commentator's excited speech. Furthermore, we show that the precision-recall performance for highlights extraction based on this "highlight" class is better than that of our previous approach which uses only audience cheering as the key highlight class.
Clustering of European winter storms: A multi-model perspective

Science.gov (United States)

Renggli, Dominik; Buettner, Annemarie; Scherb, Anke; Straub, Daniel; Zimmerli, Peter

2016-04-01

The storm series over Europe in 1990 (Daria, Vivian, Wiebke, Herta) and 1999 (Anatol, Lothar, Martin) are very well known. Such clusters of severe events strongly affect the seasonally accumulated damage statistics. The (re)insurance industry has quantified clustering by using distribution assumptions deduced from the historical storm activity of the last 30 to 40 years. The use of storm series simulated by climate models has only started recently. Climate model runs can potentially represent 100s to 1000s of years, allowing a more detailed quantification of clustering than the history of the last few decades. However, it is unknown how sensitive the representation of clustering is to systematic biases. Using a multi-model ensemble allows quantifying that uncertainty. This work uses CMIP5 decadal ensemble hindcasts to study clustering of European winter storms from a multi-model perspective. An objective identification algorithm extracts winter storms (September to April) in the gridded 6-hourly wind data. Since the skill of European storm predictions is very limited on the decadal scale, the different hindcast runs are interpreted as independent realizations. As a consequence, the available hindcast ensemble represents several 1000 simulated storm seasons. The seasonal clustering of winter storms is quantified using the dispersion coefficient. The benchmark for the decadal prediction models is the 20th Century Reanalysis. The decadal prediction models are able to reproduce typical features of the clustering characteristics observed in the reanalysis data. Clustering occurs in all analyzed models over the North Atlantic and European region, in particular over Great Britain and Scandinavia as well as over Iberia (i.e. the exit regions of the North Atlantic storm track). Clustering is generally weaker in the models compared to reanalysis, although the differences between different models are substantial. In contrast to existing studies, clustering is driven by weak
DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

Science.gov (United States)

Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei

2018-01-01

Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifiers (UMI). Despite these technological advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. In particular, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available at www.pitt.edu/~wec47/singlecell.html. Contact: wei.chen@chp.edu or hum@ccf.org. Supplementary data are available at Bioinformatics online.
Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

Science.gov (United States)

Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

2017-11-01

The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by the São Paulo State Police, Brazil. We employ the K-means clustering algorithm along with the C4.5 decision tree to help us interpret the clustering results. We found that two clusters best describe the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures took place. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy under leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples. © 2017 American Academy of Forensic Sciences.
Validating clustering of molecular dynamics simulations using polymer models

Directory of Open Access Journals (Sweden)

Phillips Joshua L

2011-11-01

Background: Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results: We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions: We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. To our
On the Merging Cluster Abell 578 and Its Central Radio Galaxy 4C+67.13

Science.gov (United States)

Hagino, K.; Stawarz, Ł.; Siemiginowska, A.; Cheung, C. C.; Kozieł-Wierzbowska, D.; Szostek, A.; Madejski, G.; Harris, D. E.; Simionescu, A.; Takahashi, T.

2015-06-01

Here we analyze radio, optical, and X-ray data for the peculiar cluster Abell 578. This cluster is not fully relaxed and consists of two merging sub-systems. The brightest cluster galaxy (BCG), CGPG 0719.8+6704, is a pair of interacting ellipticals with projected separation ~10 kpc, the brighter of which hosts the radio source 4C+67.13. The Fanaroff-Riley type-II radio morphology of 4C+67.13 is unusual for central radio galaxies in local Abell clusters. Our new optical spectroscopy revealed that both nuclei of the CGPG 0719.8+6704 pair are active, albeit at low accretion rates corresponding to the Eddington ratio ~$10^{-4}$ (for the estimated black hole masses of ~$3\times10^{8}\,M_\odot$ and ~$10^{9}\,M_\odot$). The gathered X-ray (Chandra) data allowed us to confirm and to quantify robustly the previously noted elongation of the gaseous atmosphere in the dominant sub-cluster, as well as a large spatial offset (~60 kpc projected) between the position of the BCG and the cluster center inferred from the modeling of the X-ray surface brightness distribution. Detailed analysis of the brightness profiles and temperature revealed also that the cluster gas in the vicinity of 4C+67.13 is compressed (by a factor of about 1.4) and heated (from ≃2.0 keV up to 2.7 keV), consistent with the presence of a weak shock (Mach number ~1.3) driven by the expanding jet cocoon. This would then require the jet kinetic power of the order of ~$10^{45}$ erg s$^{-1}$, implying either a very high efficiency of the jet production for the current accretion rate, or a highly modulated jet/accretion activity in the system. Based on service observations made with the WHT operated on the island of La Palma by the Isaac Newton Group in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias.
A stepwise-cluster microbial biomass inference model in food waste composting

International Nuclear Information System (INIS)

Sun Wei; Huang, Guo H.; Zeng Guangming; Qin Xiaosheng; Sun Xueling

2009-01-01

A stepwise-cluster microbial biomass inference (SMI) model was developed by introducing stepwise-cluster analysis (SCA) into composting process modeling to tackle the nonlinear relationships among state variables and microbial activities. The essence of SCA is to form a classification tree based on a series of cutting or merging processes according to given statistical criteria. Eight runs of designed experiments in bench-scale reactors in a laboratory were conducted to demonstrate the feasibility of the proposed method. The results indicated that SMI could help establish a statistical relationship between state variables and composting microbial characteristics, where discrete and nonlinear complexities exist. Significance levels of cutting/merging were provided such that the accuracies of the developed forecasting trees were controllable. Through an attempted definition of input effects on the output in SMI, the effects of the state variables on thermophilic bacteria were ranked in descending order as: Time (day) > moisture content (%) > ash content (%, dry) > Lower Temperature (°C) > pH > NH4+-N (mg/kg, dry) > Total N (%, dry) > Total C (%, dry); the effects on mesophilic bacteria were ordered as: Time > Upper Temperature (°C) > Total N > moisture content > NH4+-N > Total C > pH. This study made the first attempt at applying SCA to mapping the nonlinear and discrete relationships in composting processes.
A physical analogy to fuzzy clustering

DEFF Research Database (Denmark)

Jantzen, Jan

2004-01-01

This tutorial paper provides an interpretation of the membership assignment in the fuzzy clustering algorithm fuzzy c-means. The membership of a data point to several clusters is shown to be analogous to the gravitational forces between bodies of mass. This provides an alternative way to explain...
A novel intrusion detection method based on OCSVM and K-means recursive clustering

Directory of Open Access Journals (Sweden)

Leandros A. Maglaras

2015-01-01

In this paper we present an intrusion detection module capable of detecting malicious network traffic in a SCADA (Supervisory Control and Data Acquisition) system, based on the combination of a One-Class Support Vector Machine (OCSVM) with RBF kernel and recursive k-means clustering. Important parameters of OCSVM, such as the Gaussian width σ and the parameter ν, affect the performance of the classifier. Tuning of these parameters is of great importance in order to avoid false positives and overfitting. The combination of OCSVM with recursive k-means clustering leads the proposed intrusion detection module to distinguish real alarms from possible attacks regardless of the values of parameters σ and ν, making it ideal for real-time intrusion detection mechanisms for SCADA systems. Extensive simulations have been conducted with datasets extracted from small and medium sized HTB SCADA testbeds, in order to compare the accuracy, false alarm rate and execution time against the baseline OCSVM method.
Alpha cluster model and spectrum of 16O

International Nuclear Information System (INIS)

Bauhoff, W.; Schultheis, H.; Schultheis, R.

1983-01-01

The structure of 16O is studied in the alpha cluster model with parity and angular-momentum projection for several nucleon-nucleon interactions. The method differs from previous studies in that the states of positive and negative parity are determined without the customary restriction of the variational space to cluster positions with certain assumed symmetries. It is demonstrated that the alpha cluster model of 16O is capable of explaining most of the experimental T = 0 levels up to about 15 MeV excitation. A shell-model analysis of the excited cluster-model states shows the necessity of including a very large number of shells. The evidence for the recently proposed tetrahedral symmetry of some excited states is also discussed.
Is the non-isothermal double β-model incompatible with no time evolution of galaxy cluster gas mass fraction?

Science.gov (United States)

Holanda, R. F. L.

2018-05-01

In this paper, we propose a new method to obtain the depletion factor γ(z), the ratio by which the measured baryon fraction in galaxy clusters is depleted with respect to the universal mean. We use exclusively galaxy cluster data, namely, X-ray gas mass fraction (fgas) and angular diameter distance measurements from Sunyaev-Zel'dovich effect plus X-ray observations. The galaxy clusters are the same in both data sets and the non-isothermal spherical double β-model was used to describe their electron density and temperature profiles. In order to compare our results with those from recent cosmological hydrodynamical simulations, we suppose a possible time evolution for γ(z) of the form γ(z) = γ0(1 + γ1 z). Our main conclusions are as follows: the γ0 value is in full agreement with the simulations. On the other hand, although the γ1 value found in our analysis is compatible with γ1 = 0 within 2σ c.l., our results show a non-negligible time evolution for the depletion factor, unlike the results of the simulations. However, we also put constraints on γ(z) by using the fgas measurements and angular diameter distances obtained from the flat ΛCDM model (Planck results) and from a sample of galaxy clusters described by an elliptical profile. For these cases no significant time evolution for γ(z) was found. Then, if a constant depletion factor is an inherent characteristic of these structures, our results show that the spherical double β-model used to describe the galaxy clusters considered does not affect the quality of their fgas measurements.
Investigation of $^{16}$O+$^{12}$C refractive elastic scattering using the α-cluster model potential

Energy Technology Data Exchange (ETDEWEB)

Hassanain, Mahmoud A. [King Khalid University, Department of Physics, Abha (Saudi Arabia); Assiut University, Department of Physics, New-Valley Faculty of Science, Assiut (Egypt)

2016-01-15

The differential cross-section of $^{16}$O+$^{12}$C elastic scattering at $E_{\rm lab}$ = 132, 181, 200, 260, 300, 608 and 1503 MeV has been reanalyzed in the framework of the double-folding cluster (DFC1) potential over a wide angular range that covers both diffractive and refractive regions. Based upon the α-cluster structure of both colliding nuclei, the real DFC1 optical potential has been generated by using the α-α effective interaction, and a new cluster modified Gaussian (CMGD) of target and projectile has also been extracted. Successful descriptions of the data were obtained over the full measured angular range at all considered energies. The results have been compared with the findings obtained by using the phenomenological approach as well as experimental data. Furthermore, the consistency between the real and imaginary volume integrals is checked by the dispersion relation and the total reaction cross-section has also been investigated. (orig.)
Order-Constrained Solutions in K-Means Clustering: Even Better than Being Globally Optimal

Science.gov (United States)

Steinley, Douglas; Hubert, Lawrence

2008-01-01

This paper proposes an order-constrained K-means cluster analysis strategy, and implements that strategy through an auxiliary quadratic assignment optimization heuristic that identifies an initial object order. A subsequent dynamic programming recursion is applied to optimally subdivide the object set subject to the order constraint. We show that…
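The dynamic programming step described above, which splits an ordered object set into contiguous clusters, can be sketched as follows. This is a minimal illustration assuming one-dimensional values that are already in the desired order; the quadratic assignment heuristic that produces the order is not shown, and the function name `ordered_kmeans` is hypothetical rather than the authors' implementation.

```python
def ordered_kmeans(values, k):
    """Split an ordered sequence into k contiguous clusters with minimum total SSE.

    Returns (total_sse, bounds) where bounds is a list of (start, end) index
    pairs, end exclusive, one per cluster.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")
    # Prefix sums give the SSE of any contiguous block in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def sse(i, j):
        """Sum of squared deviations from the mean for values[i:j], j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimum SSE for splitting values[:j] into c contiguous blocks.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = best[c - 1][i] + sse(i, j)
                if cost < best[c][j]:
                    best[c][j] = cost
                    cut[c][j] = i
    # Backtrack through the stored cut points to recover the blocks.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds

total, blocks = ordered_kmeans([1.0, 1.2, 0.9, 5.0, 5.1, 9.8, 10.0, 10.2], 3)
```

Because every cluster must be a contiguous run of the order, the recursion examines O(k n^2) candidate splits and returns the exact optimum under that constraint, in contrast to the local optima of ordinary K-means iterations.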
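Performance Comparison of the Ward and K-means Methods in Determining Clusters of Scholarship-Applicant Student Data (Case Study: STMIK Pringsewu) [Perbandingan Kinerja Metode Ward Dan K-means Dalam Menentukan Cluster Data Mahasiswa Pemohon Beasiswa (Studi Kasus : STMIK Pringsewu)]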

OpenAIRE

Satria, Fiqih; Aziz, R Z Abdul

2016-01-01

This research aims to describe the steps of cluster analysis with the Ward method and the K-means method, and to compare the results of the two methods for clustering student data in support of decisions on which students are eligible to receive a Peningkatan Prestasi Akademik (PPA) scholarship or a Bantuan Biaya Akademik (BBA) scholarship at STMIK Pringsewu. Cluster analysis was performed using IBM SPSS Version 23. The cluster analysis results of both methods were compared using ...
STAR CLUSTERS IN M31. V. EVIDENCE FOR SELF-ENRICHMENT IN OLD M31 CLUSTERS FROM INTEGRATED SPECTROSCOPY

International Nuclear Information System (INIS)

Schiavon, Ricardo P.; Caldwell, Nelson; Conroy, Charlie; Graves, Genevieve J.; Strader, Jay; MacArthur, Lauren A.; Courteau, Stéphane; Harding, Paul

2013-01-01

In the past decade, the notion that globular clusters (GCs) are composed of coeval stars with homogeneous initial chemical compositions has been challenged by growing evidence that they host an intricate stellar population mix, likely indicative of a complex history of star formation and chemical enrichment. Several models have been proposed to explain the existence of multiple stellar populations in GCs, but no single model provides a fully satisfactory match to existing data. Correlations between chemistry and global parameters such as cluster mass or luminosity are fundamental clues to the physics of GC formation. In this Letter, we present an analysis of the mean abundances of Fe, Mg, C, N, and Ca for 72 old GCs from the Andromeda galaxy. We show for the first time that there is a correlation between the masses of GCs and the mean stellar abundances of nitrogen, spanning almost two decades in mass. This result sheds new light on the formation of GCs, providing important constraints on their internal chemical evolution and mass loss history
Quark cluster model and confinement

International Nuclear Information System (INIS)

Koike, Yuji; Yazaki, Koichi

2000-01-01

How confinement of quarks is implemented for multi-hadron systems in the quark cluster model is reviewed. In order to learn the nature of the confining interaction for fermions we first study 1+1 dimensional QED and QCD, in which the gauge field can be eliminated exactly and generates linear interaction of fermions. Then, we compare the two-body potential model, the flip-flop model and the Born-Oppenheimer approach in the strong coupling lattice QCD for the meson-meson system. Having shown how the long-range attraction between hadrons, van der Waals interaction, shows up in the two-body potential model, we discuss two distinct attempts beyond the two-body potential model: one is a many-body potential model, the flip-flop model, and the other is the Born-Oppenheimer approach in the strong coupling lattice QCD. We explain how the emergence of the long-range attraction is avoided in these attempts. Finally, we present the results of the application of the flip-flop model to the baryon-baryon scattering in the quark cluster model. (author)
Surface EMG decomposition based on K-means clustering and convolution kernel compensation.

Science.gov (United States)

Ning, Yong; Zhu, Xiangjun; Zhu, Shanan; Zhang, Yingchun

2015-03-01

A new approach has been developed by combining the K-means clustering (KMC) method and a modified convolution kernel compensation (CKC) method for multichannel surface electromyogram (EMG) decomposition. The KMC method was first utilized to cluster vectors of observations at different time instants and then estimate the initial innervation pulse train (IPT). The CKC method, modified with a novel multistep iterative process, was conducted to update the estimated IPT. The performance of the proposed K-means clustering-Modified CKC (KmCKC) approach was evaluated by reconstructing IPTs from both simulated and experimental surface EMG signals. The KmCKC approach successfully reconstructed all 10 IPTs from the simulated surface EMG signals with true positive rates (TPR) of over 90% with a low signal-to-noise ratio (SNR) of -10 dB. More than 10 motor units were also successfully extracted from the 64-channel experimental surface EMG signals of the first dorsal interosseous (FDI) muscles when a contraction force was held at 8 N by using the KmCKC approach. A "two-source" test was further conducted with 64-channel surface EMG signals. The high percentage of common MUs and common pulses (over 92% at all force levels) between the IPTs reconstructed from the two independent groups of surface EMG signals demonstrates the reliability and capability of the proposed KmCKC approach in multichannel surface EMG decomposition. Results from both simulated and experimental data are consistent and confirm that the proposed KmCKC approach can successfully reconstruct IPTs with high accuracy at different levels of contraction.

Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation

Directory of Open Access Journals (Sweden)

Saatchi Mahdi

2011-11-01

Background: Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Methods: Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Results: Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. Conclusions: These results suggest that genomic estimates
The random cluster model and a new integration identity

International Nuclear Information System (INIS)

Chen, L C; Wu, F Y

2005-01-01

We evaluate the free energy of the random cluster model at its critical point for 0 -1 (√q/2) is a rational number. As a by-product, our consideration leads to a closed-form evaluation of the integral $\frac{1}{4\pi^2}\int_0^{2\pi} d\Theta \int_0^{2\pi} d\Phi\, \ln[A+B+C - A\cos\Theta - B\cos\Phi - C\cos(\Theta+\Phi)] = -\ln(2S) + \frac{2}{\pi}\left[\mathrm{Ti}_2(AS) + \mathrm{Ti}_2(BS) + \mathrm{Ti}_2(CS)\right]$, which arises in lattice statistics, where $A, B, C \ge 0$ and $S = 1/\sqrt{AB + BC + CA}$.
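The integral identity quoted in the abstract above can be checked numerically. The sketch below assumes that Ti 2 denotes the inverse tangent integral, Ti_2(x) = ∫_0^x arctan(t)/t dt (the abstract does not define it), and uses a plain midpoint rule for both sides; the function names and grid sizes are illustrative choices.

```python
import math

def ti2(x, steps=2000):
    """Inverse tangent integral Ti_2(x) = int_0^x arctan(t)/t dt (midpoint rule)."""
    if x == 0:
        return 0.0
    h = x / steps
    return h * sum(math.atan((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps))

def lhs(a, b, c, n=200):
    """Midpoint-rule estimate of the double integral over [0, 2*pi]^2.

    Midpoints never hit Theta = Phi = 0, where the logarithm diverges
    (the singularity is integrable).
    """
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        for j in range(n):
            phi = (j + 0.5) * h
            total += math.log(a + b + c - a * math.cos(theta)
                              - b * math.cos(phi) - c * math.cos(theta + phi))
    return total * h * h / (4 * math.pi ** 2)

def rhs(a, b, c):
    """Closed form: -ln(2S) + (2/pi) [Ti_2(AS) + Ti_2(BS) + Ti_2(CS)]."""
    s = 1 / math.sqrt(a * b + b * c + c * a)
    return -math.log(2 * s) + (2 / math.pi) * (ti2(a * s) + ti2(b * s) + ti2(c * s))

difference = abs(lhs(1.0, 1.0, 1.0) - rhs(1.0, 1.0, 1.0))
```

Under these assumptions the two sides agree to well within the quadrature error for the symmetric case A = B = C = 1, and also for the square-lattice limit C = 0.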
Genetic k-means clustering approach for mapping human vulnerability to chemical hazards in the industrialized city: a case study of Shanghai, China.

Science.gov (United States)

Shi, Weifang; Zeng, Weihua

2013-06-20

Reducing human vulnerability to chemical hazards in the industrialized city is a matter of great urgency. Vulnerability mapping is an alternative approach for providing vulnerability-reducing interventions in a region. This study presents a method for mapping human vulnerability to chemical hazards by using clustering analysis for effective vulnerability reduction. Taking the city of Shanghai as the study area, we measure human exposure to chemical hazards by using the proximity model while additionally considering the toxicity of hazardous substances, and capture the sensitivity and coping capacity with corresponding indicators. We perform an improved k-means clustering approach based on a genetic algorithm, using a 500 m × 500 m geographical grid as the basic spatial unit. The sum of squared errors and the silhouette coefficient are combined to measure the quality of clustering and to determine the optimal number of clusters. The clustering result reveals a set of six typical human vulnerability patterns that show distinct combinations of vulnerability dimensions. The vulnerability mapping of the study area reflects cluster-specific vulnerability characteristics and their spatial distribution. Finally, we suggest specific points that can provide new insights into rationally allocating the limited funds for the vulnerability reduction of each cluster.
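The use of the sum of squared errors (SSE) and the silhouette coefficient to choose the number of clusters, as described above, can be sketched in a few lines. This is a minimal illustration with plain Lloyd-style k-means and deterministic farthest-point seeding, which stands in for the paper's genetic-algorithm search; the toy points and all function names are assumptions made for the example.

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, k, iters=100):
    """Plain k-means; returns (labels, centers, sse)."""
    # Farthest-point seeding (an assumption; the paper uses a genetic algorithm).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    sse = sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return labels, centers, sse

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    k = max(labels) + 1
    scores = []
    for i, p in enumerate(points):
        mean_d = []
        for c in range(k):
            ds = [dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_d.append(sum(ds) / len(ds) if ds else None)
        a = mean_d[labels[i]]
        others = [d for c, d in enumerate(mean_d) if c != labels[i] and d is not None]
        if a is None or not others or max(a, min(others)) == 0:
            scores.append(0.0)
            continue
        b = min(others)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy grid-cell indicator vectors (illustrative data only).
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
for k in (2, 3, 4):
    labels, centers, sse = kmeans(cells, k)
    print(k, round(sse, 3), round(silhouette(cells, labels), 3))
```

SSE always decreases as k grows, so it cannot pick k on its own; pairing it with the silhouette coefficient, which peaks when clusters are compact and well separated, is one common way to settle on a cluster count.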
Parameters of oscillation generation regions in open star cluster models

Science.gov (United States)

Danilov, V. M.; Putkov, S. I.

2017-07-01

We determine the masses and radii of central regions of open star cluster (OCL) models with small or zero entropy production and estimate the masses of oscillation generation regions in cluster models based on the data of the phase-space coordinates of stars. The radii of such regions are close to the core radii of the OCL models. We develop a new method for estimating the total OCL masses based on the cluster core mass, the cluster and cluster core radii, and radial distribution of stars. This method yields estimates of dynamical masses of Pleiades, Praesepe, and M67, which agree well with the estimates of the total masses of the corresponding clusters based on proper motions and spectroscopic data for cluster stars. We construct the spectra and dispersion curves of the oscillations of the field of azimuthal velocities v_φ in OCL models. Weak, low-amplitude unstable oscillations of v_φ develop in cluster models near the cluster core boundary, and weak damped oscillations of v_φ often develop at frequencies close to the frequencies of more powerful oscillations, which may reduce the non-stationarity degree in OCL models. We determine the number and parameters of such oscillations near the core boundaries of cluster models. Such oscillations point to the possible role that gradient instability near the core of cluster models plays in the decrease of the mass of the oscillation generation regions and production of entropy in the cores of OCL models with massive extended cores.
A fuzzy clustering technique for calorimetric data reconstruction

International Nuclear Information System (INIS)

Sandhir, Radha Pyari; Muhuri, Sanjib; Nayak, Tapan K.

2010-01-01

In high energy physics experiments, electromagnetic calorimeters are used to measure shower particles produced in p-p or heavy-ion collisions. In order to extract information and reconstruct the characteristics of the various incoming particles, clustering is required to be performed on each of the calorimeter planes. Hard clustering techniques such as Local Maxima Search, Connected-cell Search and K-means Clustering simply assign a data point to a cluster. A data point either lies in a cluster or it does not, and so, overlapping clusters are hardly distinguishable. Fuzzy c-means clustering is a version of the k-means algorithm that incorporates fuzzy logic, so that each point has a weak or strong association to the cluster, determined by the inverse distance to the center of the cluster. The term fuzzy is used because an observation may in fact lie in more than one cluster simultaneously, though to different degrees called 'memberships', as is the case with many high energy physics applications. The centers obtained using the FCM algorithm are based on the geometric locations of the data points
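The contrast between hard assignment and fuzzy memberships described above can be made concrete with a small sketch of the standard fuzzy c-means updates. It assumes one-dimensional hit positions, a fuzzifier m = 2, and hand-picked starting centers; the names `fcm_memberships` and `fcm` are hypothetical, and this is not the reconstruction code used in the paper.

```python
def fcm_memberships(x, centers, m=2.0):
    """Membership of each point in each cluster, from inverse distances.

    u[i][k] is proportional to d(x_i, c_k) ** (-2 / (m - 1)); each row sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    rows = []
    for xi in x:
        d = [abs(xi - c) for c in centers]
        if 0.0 in d:
            # The point coincides with one or more centers: share full membership there.
            hits = d.count(0.0)
            rows.append([1.0 / hits if dj == 0.0 else 0.0 for dj in d])
            continue
        inv = [dj ** -exponent for dj in d]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows

def fcm(x, centers, m=2.0, iters=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    centers = list(centers)
    for _ in range(iters):
        u = fcm_memberships(x, centers, m)
        new_centers = []
        for k in range(len(centers)):
            w = [row[k] ** m for row in u]  # membership weights raised to m
            new_centers.append(sum(wi * xi for wi, xi in zip(w, x)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(x, centers, m)

# Two overlapping showers along one axis, with a hit halfway between them.
hits = [0.8, 1.0, 1.2, 3.0, 4.8, 5.0, 5.2]
centers, memberships = fcm(hits, [0.0, 6.0])
```

In this toy layout the hit halfway between the two groups ends up with roughly equal membership in both clusters, which is the behaviour that lets overlapping clusters share a data point instead of forcing it into one of them.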
k-Means Clustering with Hölder Divergences

KAUST Repository

Nielsen, Frank

2017-10-24

We introduced two novel classes of Hölder divergences and Hölder pseudo-divergences that are both invariant to rescaling, and that both encapsulate the Cauchy-Schwarz divergence and the skew Bhattacharyya divergences. We review the elementary concepts of those parametric divergences, and perform a clustering analysis on two synthetic datasets. It is shown experimentally that the symmetrized Hölder divergences consistently and significantly outperform the Cauchy-Schwarz divergence in clustering tasks.
k-Means Clustering with Hölder Divergences

KAUST Repository

Nielsen, Frank; Sun, Ke; Marchand-Maillet, Stéphane

2017-01-01

We introduced two novel classes of Hölder divergences and Hölder pseudo-divergences that are both invariant to rescaling, and that both encapsulate the Cauchy-Schwarz divergence and the skew Bhattacharyya divergences. We review the elementary concepts of those parametric divergences, and perform a clustering analysis on two synthetic datasets. It is shown experimentally that the symmetrized Hölder divergences consistently outperform significantly the Cauchy-Schwarz divergence in clustering tasks.
MOCK OBSERVATIONS OF BLUE STRAGGLERS IN GLOBULAR CLUSTER MODELS

International Nuclear Information System (INIS)

Sills, Alison; Glebbeek, Evert; Chatterjee, Sourav; Rasio, Frederic A.

2013-01-01

We created artificial color-magnitude diagrams of Monte Carlo dynamical models of globular clusters and then used observational methods to determine the number of blue stragglers in those clusters. We compared these blue stragglers to various cluster properties, mimicking work that has been done for blue stragglers in Milky Way globular clusters to determine the dominant formation mechanism(s) of this unusual stellar population. We find that a mass-based prescription for selecting blue stragglers will select approximately twice as many blue stragglers as a selection criterion that was developed for observations of real clusters. However, the two numbers of blue stragglers are well-correlated, so either selection criterion can be used to characterize the blue straggler population of a cluster. We confirm previous results that the simplified prescription for the evolution of a collision or merger product in the BSE code overestimates their lifetimes. We show that our model blue stragglers follow similar trends with cluster properties (core mass, binary fraction, total mass, collision rate) as the true Milky Way blue stragglers as long as we restrict ourselves to model clusters with an initial binary fraction higher than 5%. We also show that, in contrast to earlier work, the number of blue stragglers in the cluster core does have a weak dependence on the collisional parameter Γ in both our models and in Milky Way globular clusters
Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models

KAUST Repository

Elsheikh, Ahmed H.

2013-05-01

A novel multi-modal parameter estimation algorithm is introduced. Parameter estimation is an ill-posed inverse problem that might admit many different solutions. This is attributed to the limited amount of measured data used to constrain the inverse problem. The proposed multi-modal model calibration algorithm uses an iterative stochastic ensemble method (ISEM) for parameter estimation. ISEM employs an ensemble of directional derivatives within a Gauss-Newton iteration for nonlinear parameter estimation. ISEM is augmented with a clustering step based on k-means algorithm to form sub-ensembles. These sub-ensembles are used to explore different parts of the search space. Clusters are updated at regular intervals of the algorithm to allow merging of close clusters approaching the same local minima. Numerical testing demonstrates the potential of the proposed algorithm in dealing with multi-modal nonlinear parameter estimation for subsurface flow models. © 2013 Elsevier B.V.
Stroke localization and classification using microwave tomography with k-means clustering and support vector machine.

Science.gov (United States)

Guo, Lei; Abbosh, Amin

2018-05-01

For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc.
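For readers unfamiliar with the three figures reported above, the sketch below shows how accuracy, sensitivity, and specificity follow from the counts of a binary confusion matrix. The function name and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true positives that were detected
    specificity = tn / (tn + fp)   # share of true negatives that were rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec, acc = classification_metrics(tp=40, fn=10, tn=45, fp=5)
```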
Representing Degree Distributions, Clustering, and Homophily in Social Networks With Latent Cluster Random Effects Models.

Science.gov (United States)

Krivitsky, Pavel N; Handcock, Mark S; Raftery, Adrian E; Hoff, Peter D

2009-07-01

Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does.
Automatic video shot boundary detection using k-means clustering and improved adaptive dual threshold comparison

Science.gov (United States)

Sa, Qila; Wang, Zhihui

2018-03-01

At present, content-based video retrieval (CBVR) is the mainstream video retrieval method; it uses the features of the video itself to perform automatic identification and retrieval. A key technology in this method is shot segmentation. In this paper, a method of automatic video shot boundary detection using K-means clustering and an improved adaptive dual threshold comparison is proposed. First, the visual features of every frame are extracted and divided into two categories using the K-means clustering algorithm: one with significant change and one with no significant change. Then, based on the classification results, the improved adaptive dual threshold comparison method is used to determine both abrupt and gradual shot boundaries. Finally, an automatic video shot boundary detection system is implemented.
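The first step, splitting frames into a "significant change" group and a "no significant change" group, can be illustrated with a one-dimensional two-cluster k-means. This is a sketch under assumptions: it treats each frame as a single hypothetical frame-difference score, and it does not reproduce the paper's feature extraction or its adaptive dual threshold comparison.

```python
def two_means_1d(values, n_iter=100):
    """Split scalar scores into a low group (label 0) and a high group (label 1)."""
    lo, hi = min(values), max(values)  # start the two centers at the extremes
    for _ in range(n_iter):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low_group) / len(low_group)
        # If every score is identical the high group is empty; keep the old center.
        new_hi = sum(high_group) / len(high_group) if high_group else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
    return labels, (lo, hi)

# Hypothetical frame-difference scores; the two large values mimic abrupt cuts.
labels, centers = two_means_1d([0.1, 0.2, 0.15, 5.0, 0.1, 4.8, 0.2])
```

Frames labelled 1 would then be the candidates passed on to a threshold-based boundary test.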
Covering Image Segmentation via Matrix X-means and J-means Clustering

Directory of Open Access Journals (Sweden)

Volodymyr MASHTALIR

2015-12-01

Full Text Available To provide tools for image understanding, the non-trivial task of image segmentation is now put on a new semantic level of object detection. Internal, external and contextual region properties can often adequately represent image content, but field-of-view coverings arise due to shape ambiguities in blurred images. Truthful image interpretation strictly depends on a valid number of regions. The goal is to solve the image clustering problem under fuzzy conditions of overlapping classes; more specifically, to estimate the number of meaningful regions and then refine the fuzzy clustering data in matrix form.
Investigation of the alpha cluster model and the density matrix expansion in ion-ion collision

International Nuclear Information System (INIS)

Rashdan, M.B.M.

1986-01-01

This thesis deals with the investigation of the alpha cluster model (ACM) of Brink and studies of the accuracy of the density matrix expansion (DME) approximation in deriving the real part of the ion-ion optical potential. The ACM is applied to calculate the inelastic $0_1^+ \rightarrow 2_1^+$ charge form factor for electron scattering by $^{12}$C to investigate the validity of this model for the $^{12}$C nucleus. It is found that the experimental curve can be fitted over the entire range of the momentum transfer by a generator-coordinate state for the $2_1^+$ state that consists of a superposition of two triangular ACM states with two different cluster separations and the same oscillator parameter
Fuzzy C-Means Algorithm for Segmentation of Aerial Photography Data Obtained Using Unmanned Aerial Vehicle

Science.gov (United States)

Akinin, M. V.; Akinina, N. V.; Klochkov, A. Y.; Nikiforov, M. B.; Sokolova, A. V.

2015-05-01

The report reviews the fuzzy c-means algorithm as applied to image segmentation, estimates the quality of its results using the Xie-Beni criterion, and presents the results of experimental studies of the algorithm in the context of drawing up detailed two-dimensional maps using unmanned aerial vehicles. The experimental results support the possibility of applying the algorithm to the interpretation of images obtained by aerial photography. The algorithm can split the original image into a set of segments (clusters) in a relatively short period of time, which is achieved by modifying the original k-means algorithm to work in a fuzzy setting.
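The Xie-Beni criterion mentioned above is the ratio of fuzzy within-cluster scatter to the smallest squared distance between cluster centers, so lower values indicate more compact and better separated clusters. The sketch below uses the commonly cited generalized form with fuzzifier `m`; the function name and argument layout are assumptions, and the report's exact variant is not given in the abstract.

```python
import math

def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index; memberships[i][j] is point i's degree in cluster j."""
    n = len(points)
    n_clusters = len(centers)
    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        (memberships[i][j] ** m) * math.dist(points[i], centers[j]) ** 2
        for i in range(n) for j in range(n_clusters))
    # Separation: smallest squared distance between two distinct centers.
    separation = min(
        math.dist(centers[j], centers[k]) ** 2
        for j in range(n_clusters) for k in range(j + 1, n_clusters))
    return compactness / (n * separation)
```

In practice the index is evaluated for several candidate cluster counts and the count with the lowest value is preferred.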
Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor)

Science.gov (United States)

Zhang, Jingjing; Dennis, Todd E.

2015-01-01

We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known ‘artificial behaviours’ comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified. PMID:25922935
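Step (3) of the framework, assigning bouts to behavioural modes with k-means, can be sketched as plain Lloyd iterations. This is an illustration only: the two-dimensional bout features in the usage lines are hypothetical, and the naive initialization from the first `k` points is an assumption rather than the authors' procedure.

```python
import math

def k_means(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centers)."""
    centers = [points[i] for i in range(k)]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical per-bout features, e.g. (mean speed, mean turning angle).
bouts = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (0.2, 0.1), (5.1, 4.9), (9.8, 0.2)]
labels, centers = k_means(bouts, k=3)
```

Here `k` would come from step (2), the hierarchical analysis that estimates the number of behavioural states.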
Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor).

Science.gov (United States)

Zhang, Jingjing; O'Reilly, Kathleen M; Perry, George L W; Taylor, Graeme A; Dennis, Todd E

2015-01-01

We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.
Extending the Functionality of Behavioural Change-Point Analysis with k-Means Clustering: A Case Study with the Little Penguin (Eudyptula minor).

Directory of Open Access Journals (Sweden)

Jingjing Zhang

Full Text Available We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.
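IMPLEMENTASI ALGORITMA K-MEANS CLUSTERING UNTUK MENENTUKAN STRATEGI MARKETING PRESIDENT UNIVERSITY [Implementation of the k-means clustering algorithm to determine the marketing strategy of President University]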

Directory of Open Access Journals (Sweden)

Johan Oscar Ong

2013-06-01

Full Text Available Information technology is advancing very rapidly, generating thousands or even millions of data records from various aspects of life. However, what can be done with that much data? In this research, we start from a data set of students who have graduated from President University and apply the k-means clustering algorithm, classifying the student data into several clusters based on the characteristics of the data in order to discover the information hidden in it. The attributes used in this study are hometown, major and GPA. The purpose of this study is to help President University's marketing department determine which promotion strategies to undertake in cities in Indonesia. The information gained in this study can be used as a reference in determining the proper strategy for the marketing team's promotion activities in cities in Indonesia so that the campaign will be more effective and efficient.

Dynamic Trajectory Extraction from Stereo Vision Using Fuzzy Clustering

Science.gov (United States)

Onishi, Masaki; Yoda, Ikushi

In recent years, many human-tracking methods have been proposed in order to analyze dynamic human trajectories. These are general technologies applicable to various fields, such as customer purchase analysis in a shopping environment and safety control at a (railroad) crossing. In this paper, we present a new approach for tracking human positions using stereo images. We use a two-step clustering framework, combining the k-means method and fuzzy clustering, to detect human regions. In the initial clustering, the k-means method forms intermediate clusters at high speed from object features extracted by stereo vision. In the final clustering, the fuzzy c-means method groups the intermediate clusters into human regions based on their attributes. Because fuzzy clustering expresses ambiguity, our proposed method can cluster correctly even when many people are close to each other. The validity of our technique was evaluated in an experiment extracting the trajectories of doctors and nurses in an emergency room of a hospital.
ADPROCLUS: a graphical user interface for fitting additive profile clustering models to object by variable data matrices.

Science.gov (United States)

Wilderjans, Tom F; Ceulemans, Eva; Van Mechelen, Iven; Depril, Dirk

2011-03-01

In many areas of psychology, one is interested in disclosing the underlying structural mechanisms that generated an object by variable data set. Often, based on theoretical or empirical arguments, it may be expected that these underlying mechanisms imply that the objects are grouped into clusters that are allowed to overlap (i.e., an object may belong to more than one cluster). In such cases, analyzing the data with Mirkin's additive profile clustering model may be appropriate. In this model: (1) each object may belong to no, one or several clusters, (2) there is a specific variable profile associated with each cluster, and (3) the scores of the objects on the variables can be reconstructed by adding the cluster-specific variable profiles of the clusters the object in question belongs to. Until now, however, no software program has been publicly available to perform an additive profile clustering analysis. For this purpose, in this article, the ADPROCLUS program, steered by a graphical user interface, is presented. We further illustrate its use by means of the analysis of a patient by symptom data matrix.
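Point (3) of the model description, reconstructing an object's scores by adding the profiles of the clusters it belongs to, amounts to multiplying a binary membership matrix by a profile matrix. The sketch below illustrates only that reconstruction and the squared-error loss one would minimize; it is not the ADPROCLUS fitting algorithm, and the function names and example matrices are hypothetical.

```python
def reconstruct_scores(membership, profiles):
    """Model scores: for each object, the sum of the profiles of its clusters.

    membership[i][k] is 1 if object i belongs to cluster k, else 0.
    profiles[k][v] is the value of variable v in the profile of cluster k.
    """
    n_vars = len(profiles[0])
    return [
        [sum(a * profile[v] for a, profile in zip(row, profiles))
         for v in range(n_vars)]
        for row in membership
    ]

def squared_error(data, model):
    """Sum of squared differences between observed and reconstructed scores."""
    return sum((d - m) ** 2
               for d_row, m_row in zip(data, model)
               for d, m in zip(d_row, m_row))

# Three objects: one in cluster 0 only, one in both clusters, one in no cluster.
membership = [[1, 0], [1, 1], [0, 0]]
profiles = [[2.0, 0.0, 1.0], [0.5, 3.0, 0.0]]
model = reconstruct_scores(membership, profiles)
# model == [[2.0, 0.0, 1.0], [2.5, 3.0, 1.0], [0.0, 0.0, 0.0]]
```

The second object shows the overlap: its reconstructed scores are the element-wise sum of both cluster profiles.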
RELICS: Strong Lens Models for Five Galaxy Clusters from the Reionization Lensing Cluster Survey

Science.gov (United States)

Cerny, Catherine; Sharon, Keren; Andrade-Santos, Felipe; Avila, Roberto J.; Bradač, Maruša; Bradley, Larry D.; Carrasco, Daniela; Coe, Dan; Czakon, Nicole G.; Dawson, William A.; Frye, Brenda L.; Hoag, Austin; Huang, Kuang-Han; Johnson, Traci L.; Jones, Christine; Lam, Daniel; Lovisari, Lorenzo; Mainali, Ramesh; Oesch, Pascal A.; Ogaz, Sara; Past, Matthew; Paterno-Mahler, Rachel; Peterson, Avery; Riess, Adam G.; Rodney, Steven A.; Ryan, Russell E.; Salmon, Brett; Sendra-Server, Irene; Stark, Daniel P.; Strolger, Louis-Gregory; Trenti, Michele; Umetsu, Keiichi; Vulcani, Benedetta; Zitrin, Adi

2018-06-01

Strong gravitational lensing by galaxy clusters magnifies background galaxies, enhancing our ability to discover statistically significant samples of galaxies at z > 6, in order to constrain the high-redshift galaxy luminosity functions. Here, we present the first five lens models out of the Reionization Lensing Cluster Survey (RELICS) Hubble Treasury Program, based on new HST WFC3/IR and ACS imaging of the clusters RXC J0142.9+4438, Abell 2537, Abell 2163, RXC J2211.7–0349, and ACT-CLJ0102–49151. The derived lensing magnification is essential for estimating the intrinsic properties of high-redshift galaxy candidates, and properly accounting for the survey volume. We report on new spectroscopic redshifts of multiply imaged lensed galaxies behind these clusters, which are used as constraints, and detail our strategy to reduce systematic uncertainties due to lack of spectroscopic information. In addition, we quantify the uncertainty on the lensing magnification due to statistical and systematic errors related to the lens modeling process, and find that in all but one cluster, the magnification is constrained to better than 20% in at least 80% of the field of view, including statistical and systematic uncertainties. The five clusters presented in this paper span the range of masses and redshifts of the clusters in the RELICS program. We find that they exhibit similar strong lensing efficiencies to the clusters targeted by the Hubble Frontier Fields within the WFC3/IR field of view. Outputs of the lens models are made available to the community through the Mikulski Archive for Space Telescopes.
Photochemical reactivity of aqueous fullerene clusters: C₆₀ versus C₇₀

Energy Technology Data Exchange (ETDEWEB)

Hou, Wen-Che, E-mail: whou@mail.ncku.edu.tw; Huang, Shih-Hong

2017-01-15

Highlights: • Aqueous C₆₀ and C₇₀ clusters (nC₆₀ and nC₇₀) formed through direct mixing with water adopted a face-centered cubic crystal structure. • The AQYs of nC₆₀ were greater than those of nC₇₀. • Both nC₆₀ and nC₇₀ lost considerable organic carbon contents (>80%) after ∼8 months of outdoor sunlight irradiation. • The intermediate photoproducts of nC₆₀ and nC₇₀ exhibited an increased content of oxygen-containing functionalities. - Abstract: Over the past few years, there has been a strong interest in exploring the potential impact of fullerenes in the environment. Despite that both C₆₀ and C₇₀ have been detected in environmental matrices, the research on the impact of higher fullerenes, such as C₇₀, has been largely missing. This study evaluated and compared the phototransformation of aqueous C₆₀ and C₇₀ clusters (nC₆₀ and nC₇₀) and their ¹O₂ production under sunlight and lamp light irradiation (315 nm, 360 nm and 420 nm). The nC₆₀ and nC₇₀ samples formed by direct mixing with water adopted a face-centered cubic (FCC) crystal structure. The apparent quantum yields (AQYs) of fullerene phototransformed were relatively constant over the examined wavelengths, while ¹O₂ production AQYs decreased with increased wavelengths. The long-term fate studies with outdoor sunlight indicated that both nC₆₀ and nC₇₀ lost considerable organic carbon contents (>80%) in water after ∼8 months of irradiation and that the intermediate photoproducts of nC₆₀ and nC₇₀ exhibited a progressively increased level of oxygen-containing functionalities. Overall, the study indicates that nC₇₀ can be photochemically removed under sunlight conditions and that the photoreactivity of nC₆₀ based on AQYs is greater than that of nC₇₀.
Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

Science.gov (United States)

Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

2017-12-01

Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
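Two of the log-ratio transformations compared in this study have compact definitions, sketched below for a single sample of strictly positive concentrations. The function names and the choice of the last component as the default ALT divisor are assumptions; the abstract does not say which component the authors used, and the isometric log-ratio transformation is omitted because it needs a choice of basis.

```python
import math

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def alr(composition, ref=-1):
    """Additive log-ratio: log of each part divided by a reference part."""
    ref_index = ref % len(composition)
    denom = composition[ref_index]
    return [math.log(x / denom)
            for i, x in enumerate(composition) if i != ref_index]

# Hypothetical three-element sample; rescaling it leaves both results unchanged.
sample = [1.0, 2.0, 4.0]
clr_values = clr(sample)
alr_values = alr(sample)
```

Both transformations depend only on ratios between components, which is why they suit closed compositional data where raw concentrations are constrained to a constant sum.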
Clustering and training set selection methods for improving the accuracy of quantitative laser induced breakdown spectroscopy

International Nuclear Information System (INIS)

Anderson, Ryan B.; Bell, James F.; Wiens, Roger C.; Morris, Richard V.; Clegg, Samuel M.

2012-01-01

We investigated five clustering and training set selection methods to improve the accuracy of quantitative chemical analysis of geologic samples by laser induced breakdown spectroscopy (LIBS) using partial least squares (PLS) regression. The LIBS spectra were previously acquired for 195 rock slabs and 31 pressed powder geostandards under 7 Torr CO₂ at a stand-off distance of 7 m at 17 mJ per pulse to simulate the operational conditions of the ChemCam LIBS instrument on the Mars Science Laboratory Curiosity rover. The clustering and training set selection methods, which do not require prior knowledge of the chemical composition of the test-set samples, are based on grouping similar spectra and selecting appropriate training spectra for the partial least squares (PLS2) model. These methods were: (1) hierarchical clustering of the full set of training spectra and selection of a subset for use in training; (2) k-means clustering of all spectra and generation of PLS2 models based on the training samples within each cluster; (3) iterative use of PLS2 to predict sample composition and k-means clustering of the predicted compositions to subdivide the groups of spectra; (4) soft independent modeling of class analogy (SIMCA) classification of spectra, and generation of PLS2 models based on the training samples within each class; (5) use of Bayesian information criteria (BIC) to determine an optimal number of clusters and generation of PLS2 models based on the training samples within each cluster. The iterative method and the k-means method using 5 clusters showed the best performance, improving the absolute quadrature root mean squared error (RMSE) by ∼ 3 wt.%. The statistical significance of these improvements was ∼ 85%. Our results show that although clustering methods can modestly improve results, a large and diverse training set is the most reliable way to improve the accuracy of quantitative LIBS. In particular, additional sulfate standards and specifically
A model-based clustering method to detect infectious disease transmission outbreaks from sequence variation.

Directory of Open Access Journals (Sweden)

Rosemary M McCloskey

2017-11-01

Full Text Available Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis (where individuals are sampled sooner post-infection) rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85%) and specificity (91%) than the nonparametric methods. When we applied these clustering methods to published sequences from a study of HIV-1 genetic clusters in Seattle, USA, we found that the MMPP method categorized about half (46%) as many individuals to clusters compared to the other methods. Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using MMPP, but not by other methods. We determined that the computing time for the MMPP method scaled linearly with the size of trees, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer. This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where
Predictive coupled-cluster isomer orderings for some SiₙCₘ (m, n ≤ 12) clusters: A pragmatic comparison between DFT and complete basis limit coupled-cluster benchmarks

Energy Technology Data Exchange (ETDEWEB)

Byrd, Jason N., E-mail: byrd.jason@ensco.com [Quantum Theory Project, University of Florida, Gainesville, Florida 32611 (United States); ENSCO, Inc., 4849 North Wickham Road, Melbourne, Florida 32940 (United States); Lutz, Jesse J., E-mail: jesse.lutz.ctr@afit.edu; Jin, Yifan; Ranasinghe, Duminda S.; Perera, Ajith; Bartlett, Rodney J., E-mail: rodbartl@ufl.edu [Quantum Theory Project, University of Florida, Gainesville, Florida 32611 (United States); Montgomery, John A. [Department of Physics, University of Connecticut, Storrs, Connecticut 06269 (United States); Duan, Xiaofeng F. [Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio 45433 (United States); Air Force Research Laboratory DoD Supercomputing Resource Center, Wright-Patterson Air Force Base, Ohio 45433 (United States); Burggraf, Larry W. [Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio 45433 (United States); Sanders, Beverly A. [Quantum Theory Project, University of Florida, Gainesville, Florida 32611 (United States); Department of Computer and Information Science and Engineering, University of Florida, Gainesville, Florida 32611 (United States)

2016-07-14

The accurate determination of the preferred Si₁₂C₁₂ isomer is important to guide experimental efforts directed towards synthesizing SiC nano-wires and related polymer structures which are anticipated to be highly efficient exciton materials for opto-electronic devices. In order to definitively identify preferred isomeric structures for silicon carbon nano-clusters, highly accurate geometries, energies, and harmonic zero point energies have been computed using coupled-cluster theory with systematic extrapolation to the complete basis limit for a set of silicon carbon clusters ranging in size from SiC₃ to Si₁₂C₁₂. It is found that post-MBPT(2) correlation energy plays a significant role in obtaining converged relative isomer energies, suggesting that predictions using low rung density functional methods will not have adequate accuracy. Utilizing the best composite coupled-cluster energy that is still computationally feasible, entailing a 3-4 SCF and coupled-cluster theory with singles and doubles extrapolation with triple-ζ (T) correlation, the closo Si₁₂C₁₂ isomer is identified to be the preferred isomer, in support of previous calculations [X. F. Duan and L. W. Burggraf, J. Chem. Phys. 142, 034303 (2015)]. Additionally we have investigated more pragmatic approaches to obtaining accurate silicon carbide isomer energies, including the use of frozen natural orbital coupled-cluster theory and several rungs of standard and double-hybrid density functional theory. Frozen natural orbitals as a way to compute post-MBPT(2) correlation energy are found to be an excellent balance between efficiency and accuracy.
Cost-effectiveness of gammaCore (non-invasive vagus nerve stimulation) for acute treatment of episodic cluster headache.

Science.gov (United States)

Mwamburi, Mkaya; Liebler, Eric J; Tenaglia, Andrew T

2017-11-01

Cluster headache is a debilitating disease characterized by excruciatingly painful attacks that affects 0.15% to 0.4% of the US population. Episodic cluster headache manifests as circadian and circannual seasonal bouts of attacks, each lasting 15 to 180 minutes, with periods of remission. In chronic cluster headache, the attacks occur throughout the year with no periods of remission. While existing treatments are effective for some patients, many patients continue to suffer. There are only 2 FDA-approved medications for episodic cluster headache in the United States, while others, such as high-flow oxygen, are used off-label. Episodic cluster headache is associated with comorbidities and affects work, productivity, and daily functioning. The economic burden of episodic cluster headache is considerable, costing more than twice that of nonheadache patients. gammaCore adjunct to standard of care (SoC) was found to have superior efficacy in treatment of acute episodic cluster headaches compared with sham-gammaCore used with SoC in the ACT1 and ACT2 trials. However, the economic impact has not been characterized for this indication. We conducted a cost-effectiveness analysis of gammaCore adjunct to SoC compared with SoC alone for the treatment of acute pain associated with episodic cluster headache attacks. The model structure was based on treatment of acute attacks with 3 outcomes: failures, nonresponders, and responders. The time horizon of the model is 1 year using a payer perspective with uncertainty incorporated. Parameter inputs were derived from primary data from the randomized controlled trials for gammaCore. The mean annual costs associated with the gammaCore-plus-SoC arm were $9510, and mean costs for the SoC-alone arm were $10,040. The mean quality-adjusted life years for the gammaCore-plus-SoC arm were 0.83, and for the SoC-alone arm, they were 0.74. The gammaCore-plus-SoC arm was dominant over SoC alone. All 1-way and multiway sensitivity analyses were cost
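The dominance claim can be checked by hand from the four means quoted in the abstract. The sketch below is only an illustration of that arithmetic, not the authors' model; the function name `compare_arms` and the verdict labels are assumptions introduced here.

```python
def compare_arms(cost_new, qaly_new, cost_ref, qaly_ref):
    """Return (incremental cost, incremental QALYs, verdict) for new vs. reference."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost < 0 and d_qaly > 0:
        # Cheaper and more effective: no cost-effectiveness ratio is needed.
        return d_cost, d_qaly, "dominant"
    if d_cost > 0 and d_qaly < 0:
        return d_cost, d_qaly, "dominated"
    if d_qaly == 0:
        return d_cost, d_qaly, "no QALY difference"
    # Otherwise report the incremental cost-effectiveness ratio (ICER).
    return d_cost, d_qaly, f"ICER = {d_cost / d_qaly:.0f} per QALY"


# Means quoted in the abstract: gammaCore-plus-SoC vs. SoC alone.
d_cost, d_qaly, verdict = compare_arms(9510, 0.83, 10040, 0.74)
print(d_cost, round(d_qaly, 2), verdict)
```

With these inputs the new arm costs 530 dollars less per year and gains about 0.09 quality-adjusted life years, which is what "dominant" means in this setting.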
Impact of Personality Disorder Cluster on Depression Outcomes Within Collaborative Care Management Model of Care.

Science.gov (United States)

George, Merit P; Garrison, Gregory M; Merten, Zachary; Heredia, Dagoberto; Gonzales, Cesar; Angstman, Kurt B

2018-01-01

Previous studies have suggested that having a comorbid personality disorder (PD) along with major depression is associated with poorer depression outcomes relative to those without comorbid PD. However, few studies have examined the influence of specific PD cluster types. The purpose of the current study is to compare depression outcomes between cluster A, cluster B, and cluster C PD patients treated within a collaborative care management (CCM) model, relative to CCM patients without a PD diagnosis. The overarching goal was to identify cluster types that might confer a worse clinical prognosis. This retrospective chart review study examined 2826 adult patients with depression enrolled in CCM. The cohort was divided into 4 groups based on the presence of a comorbid PD diagnosis (cluster A/nonspecified, cluster B, cluster C, or no PD). Baseline clinical and demographic variables, along with 6-month follow-up Patient Health Questionnaire-9 (PHQ-9) scores, were obtained for all groups. Depression remission was defined as a PHQ-9 score cluster A or nonspecified PD diagnosis, 122 patients (4.3%) had a cluster B diagnosis, 35 patients (1.2%) had a cluster C diagnosis, and 2610 patients (92.4%) did not have any PD diagnosis. The presence of a cluster A/nonspecified PD diagnosis was associated with a 62% lower likelihood of remission at 6 months (AOR = 0.38; 95% CI 0.20-0.70). The presence of a cluster B PD diagnosis was associated with a 71% lower likelihood of remission at 6 months (AOR = 0.29; 95% CI 0.18-0.47). Conversely, having a cluster C diagnosis was not associated with a significantly lower likelihood of remission at 6 months (AOR = 0.83; 95% CI 0.42-1.65). Increased odds of having PDS at 6-month follow-up were seen with cluster A/nonspecified PD patients (AOR = 3.35; 95% CI 1.92-5.84) as well as cluster B patients (AOR = 3.66; 95% CI 2.45-5.47). However, cluster C patients did not have significantly increased odds of experiencing persistent depressive symptoms at 6-month
K-means clustering for optimal partitioning and dynamic load balancing of parallel hierarchical N-body simulations

International Nuclear Information System (INIS)

Marzouk, Youssef M.; Ghoniem, Ahmed F.

2005-01-01

A number of complex physical problems can be approached through N-body simulation, from fluid flow at high Reynolds number to gravitational astrophysics and molecular dynamics. In all these applications, direct summation is prohibitively expensive for large N and thus hierarchical methods are employed for fast summation. This work introduces new algorithms, based on k-means clustering, for partitioning parallel hierarchical N-body interactions. We demonstrate that the number of particle-cluster interactions and the order at which they are performed are directly affected by partition geometry. Weighted k-means partitions minimize the sum of clusters' second moments and create well-localized domains, and thus reduce the computational cost of N-body approximations by enabling the use of lower-order approximations and fewer cells. We also introduce compatible techniques for dynamic load balancing, including adaptive scaling of cluster volumes and adaptive redistribution of cluster centroids. We demonstrate the performance of these algorithms by constructing a parallel treecode for vortex particle simulations, based on the serial variable-order Cartesian code developed by Lindsay and Krasny [Journal of Computational Physics 172 (2) (2001) 879-907]. The method is applied to vortex simulations of a transverse jet. Results show outstanding parallel efficiencies even at high concurrencies, with velocity evaluation errors maintained at or below their serial values; on a realistic distribution of 1.2 million vortex particles, we observe a parallel efficiency of 98% on 1024 processors. Excellent load balance is achieved even in the face of several obstacles, such as an irregular, time-evolving particle distribution containing a range of length scales and the continual introduction of new vortex particles throughout the domain. Moreover, results suggest that k-means yields a more efficient partition of the domain than a global oct-tree
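The core idea, weighted k-means partitions that minimize the sum of the clusters' weighted second moments, can be sketched in a few lines. This is a plain Lloyd-style iteration under assumptions of my own: the function name, the use of per-particle weights as a stand-in for computational cost, and the synthetic two-blob data are all hypothetical. The paper's load-balancing extensions (adaptive scaling of cluster volumes, centroid redistribution) are not reproduced.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def weighted_kmeans(points, weights, k, iters=50, seed=0):
    """Lloyd iteration where each centroid is the weighted mean of its members."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each particle goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sqdist(p, centroids[j])) for p in points]
        # Update step: weighted mean minimizes the weighted second moment.
        new_centroids = []
        for j in range(k):
            members = [(p, w) for p, w, lab in zip(points, weights, labels) if lab == j]
            if not members:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
                continue
            total = sum(w for _, w in members)
            dim = len(members[0][0])
            new_centroids.append(
                tuple(sum(w * p[d] for p, w in members) / total for d in range(dim))
            )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical particle set: two well-separated groups with different weights.
rng = random.Random(1)
points = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(100)]
points += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(100)]
weights = [1.0] * 100 + [2.0] * 100
centroids, labels = weighted_kmeans(points, weights, k=2)
print(sorted(centroids))
```

Each resulting cluster would correspond to one processor's domain in the partitioning described above; compact, well-localized clusters are what allow lower-order approximations and fewer cells.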
Performance Analysis of Combined Methods of Genetic Algorithm and K-Means Clustering in Determining the Value of Centroid

Science.gov (United States)

Adya Zizwan, Putra; Zarlis, Muhammad; Budhiarti Nababan, Erna

2017-12-01

The determination of centroids in the K-Means algorithm directly affects the quality of the clustering results. Determining centroids with random numbers has many weaknesses. The GenClust algorithm, which combines Genetic Algorithms and K-Means, uses a genetic algorithm to determine the centroid of each cluster. In GenClust, 50% of the chromosomes are obtained through deterministic calculations and 50% are generated from random numbers. This study modifies GenClust so that 100% of the chromosomes are obtained through deterministic calculations. The study yields performance comparisons, expressed as Mean Square Error, of how centroid determination affects K-Means when using the GenClust method, the modified GenClust method, and classic K-Means.
Cluster correlation effects in 12C+12C and 14N+10B fusion-evaporation reactions

Directory of Open Access Journals (Sweden)

Morelli L.

2015-01-01

Full Text Available The decay of highly excited states of 24Mg is studied in fusion evaporation events completely detected in charge in the reactions 12C+12C and 14N+10B at 95 and 80 MeV incident energy respectively. The comparison of light charged particles measured spectra with statistical model predictions suggests that the dominant reaction mechanism is compound nucleus (CN) formation and decay. However, in both reactions, a discrepancy with statistical expectations is found for α particles detected in coincidence with Carbon, Oxygen and Neon residues. The comparison between the two reactions shows that this discrepancy is only partly explained by an entrance channel effect. Evidence for cluster correlations in excited 24Mg CN is suggested by the comparison between the measured and calculated branching ratios for the channels involving α particles.
Clustering disaggregated load profiles using a Dirichlet process mixture model

International Nuclear Information System (INIS)

Granell, Ramon; Axon, Colin J.; Wallom, David C.H.

2015-01-01

Highlights: • We show that the Dirichlet process mixture model is scalable. • Our model does not require the number of clusters as an input. • Our model creates clusters only by the features of the demand profiles. • We have used both residential and commercial data sets. - Abstract: The increasing availability of substantial quantities of power-use data in both the residential and commercial sectors raises the possibility of mining the data to the advantage of both consumers and network operations. We present a Bayesian non-parametric model to cluster load profiles from households and business premises. Evaluators show that our model performs as well as other popular clustering methods, but unlike most other methods it does not require the number of clusters to be predetermined by the user. We used the so-called ‘Chinese restaurant process’ method to solve the model, making use of the Dirichlet-multinomial distribution. The number of clusters grew logarithmically with the quantity of data, making the technique suitable for scaling to large data sets. We were able to show that the model could distinguish features such as the nationality, household size, and type of dwelling between the cluster memberships
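The logarithmic growth in the number of clusters is a general property of the Chinese restaurant process prior and can be seen in a short simulation. The sketch below draws from the prior only: it ignores the load-profile likelihood that the authors' model would combine with it, and the function name and the concentration parameter `alpha` are assumptions made for illustration.

```python
import random


def chinese_restaurant_process(n, alpha=1.0, seed=0):
    """Seat n customers; return the list of table (cluster) sizes."""
    rng = random.Random(seed)
    table_sizes = []
    for i in range(n):
        # Customer i sees i seated customers. An existing table is chosen with
        # probability size / (i + alpha), a new one with alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for t, size in enumerate(table_sizes):
            acc += size
            if r < acc:
                table_sizes[t] += 1
                break
        else:
            table_sizes.append(1)  # open a new table (new cluster)
    return table_sizes


for n in (100, 1000, 10000):
    print(n, len(chinese_restaurant_process(n)))
```

Under this prior the expected number of tables grows roughly like alpha times the logarithm of n, so multiplying the data volume by ten adds only a handful of clusters.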
Normalized mutual information based PET-MR registration using K-Means clustering and shading correction

NARCIS (Netherlands)

Knops, Z.F.; Maintz, J.B.A.; Viergever, M.A.; Pluim, J.P.W.; Gee, J.C.; Maintz, J.B.A.; Vannier, M.W.

2003-01-01

A method for the efficient re-binning and shading based correction of intensity distributions of the images prior to normalized mutual information based registration is presented. Our intensity distribution re-binning method is based on the K-means clustering algorithm as opposed to the generally
Properties of proton clusters in inelastic CC interactions accompanied by the production of Λ and K0 particles at p = 4.2 GeV/c per nucleon

International Nuclear Information System (INIS)

Bekmirzaev, R.N.; Shukurov, E.Kh.; Kuznetsov, A.A.; Yuldashev, B.S.

2004-01-01

Within a new relativistically invariant approach, the properties of proton clusters that are formed together with Λ and K0 particles in inelastic CC interactions at p = 4.2 GeV/c per nucleon are investigated in the space of relative 4-velocities. The observed proton clusters are shown to be characterized by a high mean kinetic energy of the protons in the cluster rest frame, 100 ± 2 MeV
An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.

Science.gov (United States)

Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei

2013-05-01

Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.
Simulating star clusters with the AMUSE software framework. I. Dependence of cluster lifetimes on model assumptions and cluster dissolution modes

International Nuclear Information System (INIS)

Whitehead, Alfred J.; McMillan, Stephen L. W.; Vesperini, Enrico; Portegies Zwart, Simon

2013-01-01

We perform a series of simulations of evolving star clusters using the Astrophysical Multipurpose Software Environment (AMUSE), a new community-based multi-physics simulation package, and compare our results to existing work. These simulations model a star cluster beginning with a King model distribution and a selection of power-law initial mass functions and contain a tidal cutoff. They are evolved using collisional stellar dynamics and include mass loss due to stellar evolution. After establishing that the differences between AMUSE results and results from previous studies are understood, we explored the variation in cluster lifetimes due to the random realization noise introduced by transforming a King model to specific initial conditions. This random realization noise can affect the lifetime of a simulated star cluster by up to 30%. Two modes of star cluster dissolution were identified: a mass evolution curve that contains a runaway cluster dissolution with a sudden loss of mass, and a dissolution mode that does not contain this feature. We refer to these dissolution modes as 'dynamical' and 'relaxation' dominated, respectively. For Salpeter-like initial mass functions, we determined the boundary between these two modes in terms of the dynamical and relaxation timescales.
Service-Aware Clustering: An Energy-Efficient Model for the Internet-of-Things.

Science.gov (United States)

Bagula, Antoine; Abidoye, Ademola Philip; Zodi, Guy-Alain Lusilao

2015-12-23

Current generation wireless sensor routing algorithms and protocols have been designed based on a myopic routing approach, where the motes are assumed to have the same sensing and communication capabilities. Myopic routing is not a natural fit for the IoT, as it may lead to energy imbalance and subsequent short-lived sensor networks, routing the sensor readings over the most service-intensive sensor nodes, while leaving the least active nodes idle. This paper revisits the issue of energy efficiency in sensor networks to propose a clustering model where sensor devices' service delivery is mapped into an energy awareness model, used to design a clustering algorithm that finds service-aware clustering (SAC) configurations in IoT settings. The performance evaluation reveals the relative energy efficiency of the proposed SAC algorithm compared to related routing algorithms in terms of energy consumption, the sensor nodes' life span and its traffic engineering efficiency in terms of throughput and delay. These include the well-known low energy adaptive clustering hierarchy (LEACH) and LEACH-centralized (LEACH-C) algorithms, as well as the most recent algorithms, such as DECSA and MOCRN.
Service-Aware Clustering: An Energy-Efficient Model for the Internet-of-Things

Directory of Open Access Journals (Sweden)

Antoine Bagula

2015-12-01

Full Text Available Current generation wireless sensor routing algorithms and protocols have been designed based on a myopic routing approach, where the motes are assumed to have the same sensing and communication capabilities. Myopic routing is not a natural fit for the IoT, as it may lead to energy imbalance and subsequent short-lived sensor networks, routing the sensor readings over the most service-intensive sensor nodes, while leaving the least active nodes idle. This paper revisits the issue of energy efficiency in sensor networks to propose a clustering model where sensor devices’ service delivery is mapped into an energy awareness model, used to design a clustering algorithm that finds service-aware clustering (SAC) configurations in IoT settings. The performance evaluation reveals the relative energy efficiency of the proposed SAC algorithm compared to related routing algorithms in terms of energy consumption, the sensor nodes’ life span and its traffic engineering efficiency in terms of throughput and delay. These include the well-known low energy adaptive clustering hierarchy (LEACH) and LEACH-centralized (LEACH-C) algorithms, as well as the most recent algorithms, such as DECSA and MOCRN.

Dynamical aspects of galaxy clustering

International Nuclear Information System (INIS)

Fall, S.M.

1980-01-01

Some recent work on the origin and evolution of galaxy clustering is reviewed, particularly within the context of the gravitational instability theory and the hot big-bang cosmological model. Statistical measures of clustering, including correlation functions and multiplicity functions, are explained and discussed. The close connection between galaxy formation and clustering is emphasized. Additional topics include the dependence of galaxy clustering on the spectrum of primordial density fluctuations and the mean mass density of the Universe. (author)
Structures of $p$-shell double-$\Lambda$ hypernuclei studied with microscopic cluster models

OpenAIRE

Kanada-En'yo, Yoshiko

2018-01-01

$0s$-orbit $\Lambda$ states in $p$-shell double-$\Lambda$ hypernuclei ($^{\ \,A}_{\Lambda\Lambda}Z$), $^{\ \,8}_{\Lambda\Lambda}\textrm{Li}$, $^{\ \,9}_{\Lambda\Lambda}\textrm{Li}$, $^{10,11,12}_{\ \ \ \ \ \Lambda\Lambda}\textrm{Be}$, $^{12,13}_{\ \ \Lambda\Lambda}\textrm{B}$, and $^{\,14}_{\Lambda\Lambda}\textrm{C}$ are investigated. Microscopic cluster models are applied to the core nuclear part and a potential model is adopted for $\Lambda$ particles. The $\Lambda$-core potential is a folding ...
Infinite number of solvable generalizations of XY-chain, with cluster state, and with central charge c = m/2

Science.gov (United States)

Minami, Kazuhiko

2017-12-01

An infinite number of spin chains are solved and it is derived that the ground-state phase transitions belong to the universality classes with central charge c = m / 2, where m is an integer. The models are diagonalized by automatically obtained transformations, many of which are different from the Jordan-Wigner transformation. The free energies, correlation functions, string order parameters, exponents, central charges, and the phase diagram are obtained. Most of the examples consist of the stabilizers of the cluster state. A unified structure of the one-dimensional XY and cluster-type spin chains is revealed, and other series of solvable models can be obtained through this formula.
An algebraic model for three-cluster giant molecules

International Nuclear Information System (INIS)

Hess, P.O.; Bijker, R.; Misicu, S.

2001-01-01

After an introduction to the algebraic U(7) model for three bodies, we relate a geometrical description of a three-cluster molecule to the algebraic U(7) model. Stiffness parameters of oscillations between each pair of clusters are calculated and translated to the parameter values of the algebraic model. The model is applied to the trinuclear system 132Sn + α + 116Pd, which occurs in the ternary cold fission of 252Cf. (Author)
Structure of small TiCn clusters: A theoretical study

International Nuclear Information System (INIS)

Largo, Laura; Cimas, Alvaro; Redondo, Pilar; Rayon, Victor M.; Barrientos, Carmen

2006-01-01

A theoretical study of the TiCn (n = 1-8) clusters has been carried out at the B3LYP/6-311+G(d) level. Molecular properties for three different isomers, namely linear, cyclic, and fan species, have been determined. The fan isomers, where the titanium atom is essentially side-bonded to the entire Cn unit, are predicted to be more stable than both linear and cyclic isomers. Only for the largest studied species, TiC8, is the cyclic isomer located lower in energy. An even-odd parity effect in the incremental binding energies is observed for the three isomers, n-even species being in general more stable for linear and fan isomers, whereas for the cyclic species n-odd clusters are favoured. A topological analysis of the electronic charge density shows that all cyclic isomers correspond to true monocyclic rings, whereas for the fan species a variety of different connectivities has been observed
Comprehensive sulfation model verified for T-T sorbent clusters during flue gas desulfurization at moderate temperatures

Energy Technology Data Exchange (ETDEWEB)

Yuran Li; Haiying Qi; Changfu You; Lizhai Yang [Tsinghua University, Beijing (China). Key Laboratory for Thermal Science and Power Engineering of Ministry of Education

2010-08-15

An empirical sulfation model for T-T sorbent clusters was developed based on amassed experimental results under moderate temperatures (300-800 °C). In the model, the reaction rate is a function of cluster mass, SO2 concentration, CO2 concentration, calcium conversion and temperature. The smaller pore volume partly results in a lower reaction rate at lower temperatures. The exponent on SO2 concentration is 0.88 in the rapid reaction stage and then decreases gradually as reaction progresses. The exponent on the fraction of the unreacted calcium is 1/3 in the first stage and then increases significantly in the second stage. The CO2 concentration has a negative influence on SO2 removal, especially for the temperature range of 400-650 °C, which should be avoided to achieve a high effective calcium conversion. The sulfation model has been verified for the T-T sorbent clusters and has also been applied to CaO particles. Over extensive reaction conditions, the predictions agree well with experimental data. 17 refs., 10 figs., 2 tabs.
Exactly soluble models for surface partition of large clusters

International Nuclear Information System (INIS)

Bugaev, K.A.; Bugaev, K.A.; Elliott, J.B.

2007-01-01

The surface partition of large clusters is studied analytically within a framework of the 'Hills and Dales Model'. Three formulations are solved exactly by using the Laplace-Fourier transformation method. In the limit of small amplitude deformations, the 'Hills and Dales Model' gives the upper and lower bounds for the surface entropy coefficient of large clusters. The found surface entropy coefficients are compared with those of large clusters within the 2- and 3-dimensional Ising models
Cluster form factor calculation in the ab initio no-core shell model

International Nuclear Information System (INIS)

Navratil, Petr

2004-01-01

We derive expressions for cluster overlap integrals or channel cluster form factors for ab initio no-core shell model (NCSM) wave functions. These are used to obtain the spectroscopic factors and can serve as a starting point for the description of low-energy nuclear reactions. We consider the composite system and the target nucleus to be described in the Slater determinant (SD) harmonic oscillator (HO) basis while the projectile eigenstate is expanded in the Jacobi coordinate HO basis. This is the most practical case. The spurious center of mass components present in the SD bases are removed exactly. The calculated cluster overlap integrals are translationally invariant. As an illustration, we present results of cluster form factor calculations for ⟨5He|4He+n⟩, ⟨5He|3H+d⟩, ⟨6Li|4He+d⟩, ⟨6Be|3He+3He⟩, ⟨7Li|4He+3H⟩, ⟨7Li|6Li+n⟩, ⟨8Be|6Li+d⟩, ⟨8Be|7Li+p⟩, ⟨9Li|8Li+n⟩, and ⟨13C|12C+n⟩, with all the nuclei described by multi-ℏΩ NCSM wave functions
An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks

Directory of Open Access Journals (Sweden)

Yihang Yin

2015-08-01

Full Text Available Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

Science.gov (United States)

Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

2015-08-07

Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
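The last step, PCA compression subject to an error bound, can be illustrated with a generic sketch: keep the smallest number of principal components whose discarded variance stays below a chosen fraction. This is not the authors' algorithm; the function name, the `max_rel_error` parameter, and the synthetic readings are assumptions, and the clustering and cluster-head selection steps are omitted.

```python
import numpy as np


def pca_compress(X, max_rel_error=0.05):
    """Compress X (samples x sensors) so the relative squared error <= max_rel_error.

    Returns the column means, the retained components (k x sensors), and the
    per-sample scores (samples x k) that a cluster head would transmit.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        k = 1  # constant data: a single component is already exact
    else:
        retained = np.cumsum(energy) / total
        # Smallest k whose retained variance fraction reaches 1 - max_rel_error.
        k = min(int(np.searchsorted(retained, 1.0 - max_rel_error)) + 1, len(s))
    components = Vt[:k]
    scores = Xc @ components.T
    return mean, components, scores


# Hypothetical readings: 8 correlated sensors driven by 2 latent signals.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 8))

mean, components, scores = pca_compress(X, max_rel_error=0.05)
X_hat = scores @ components + mean
rel_err = np.sum((X - X_hat) ** 2) / np.sum((X - mean) ** 2)
print(components.shape[0], rel_err)
```

Transmitting `scores` plus the small `components` matrix instead of the raw readings is where the communication saving would come from when sensors in a cluster are strongly correlated.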
Cluster-surface collisions: Characteristics of Xe55- and C20 - Si[111] surface bombardment

International Nuclear Information System (INIS)

Cheng, H.

1999-01-01

Molecular dynamics (MD) simulations are performed to study the cluster-surface collision processes. Two types of clusters, Xe55 and C20, are used as case studies of materials with very different properties. In studies of Xe55 - Si[111] surface bombardment, two initial velocities, 5.0 and 10.0 km/s (normal to the surface), are chosen to investigate the dynamical consequences of the initial energy or velocity in the cluster-surface impact. A transition in the speed of kinetic energy propagation, from subsonic velocities to supersonic velocities, is observed. Energy transfer, from cluster translational motion to the substrate, occurs at an extremely fast rate that increases as the incident velocity increases. Local melting and amorphous layer formation in the surfaces are found via energetic analysis of individual silicon atoms. For C20, the initial velocity ranges from 10 to 100 km/s. The clusters are damaged immediately upon impact. Similar to Xe55, the increase in the potential energy is larger than the increase in internal kinetic energy. However, the patterns of energy distribution are different for the two types of clusters. The energy transfer from the carbon clusters to the Si(111) surface is found to be slower than that found in the Xe clusters. Fragmentation of the carbon cluster occurs when the initial velocity is greater than 30 km/s. At 10 km/s, the clusters show recrystallization at later times. The average penetration depth displays a nonlinear dependence on the initial velocity. Disturbance in the surface caused by C20 is discussed and compared to the damage caused by Xe55. Energetics, structures, and dynamics of these systems are fully analyzed and characterized. Copyright 1999 American Institute of Physics
Conveyor Performance based on Motor DC 12 Volt Eg-530ad-2f using K-Means Clustering

Science.gov (United States)

Arifin, Zaenal; Artini, Sri DP; Much Ibnu Subroto, Imam

2017-04-01

Producing goods in industry requires controlled tools that improve production. Separation has become part of the production process and is carried out according to certain criteria to obtain an optimum result. Knowing the performance characteristics of a controlled tool in the separation process also makes an optimum result possible. Clustering analysis is a popular method for dividing data into smaller segments: it splits a group of objects into k groups whose members have homogeneous or similar values, with similarity defined by certain criteria. The work in this paper uses the K-Means method to cluster the loading performance of a conveyor driven by a 12 volt DC motor, type eg-530-2f. This technique gives complete clustering data for a prototype conveyor, driven by a DC motor, that separates goods by height. The parameters involved are voltage, current, and travelling time. These parameters give two clusters: an optimal cluster with center 10.50 volt, 0.3 ampere, 10.58 second, and an unoptimal cluster with center 10.88 volt, 0.28 ampere, 40.43 second.
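Once the two cluster centers are known, a new conveyor run can be labeled by its nearest center. The sketch below uses the centers quoted in the abstract; the sample measurements are hypothetical, and the use of unscaled Euclidean distance (which lets travelling time dominate) is an assumption, since the abstract does not say whether the features were normalized.

```python
import math

# Cluster centers from the abstract: (voltage in V, current in A, travel time in s).
CENTERS = {
    "optimal": (10.50, 0.30, 10.58),
    "unoptimal": (10.88, 0.28, 40.43),
}


def assign_cluster(sample, centers=CENTERS):
    """Return the name of the center closest to the sample (Euclidean distance)."""
    return min(centers, key=lambda name: math.dist(sample, centers[name]))


# Hypothetical measurements, not taken from the paper.
print(assign_cluster((10.6, 0.29, 12.0)))
print(assign_cluster((10.9, 0.28, 35.0)))
```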
Eigenspace-based fuzzy c-means for sensing trending topics in Twitter

Science.gov (United States)

Muliawati, T.; Murfi, H.

2017-07-01

As information and communication technology develops, information needs can be met through social media such as Twitter. The enormous number of internet users has triggered fast and large data flows, making manual analysis difficult or even impossible. Automated methods for data analysis are needed, one of which is topic detection and tracking. An alternative to latent Dirichlet allocation (LDA) is a soft clustering approach using Fuzzy C-Means (FCM). FCM meets the assumption that a document may consist of several topics. However, FCM works well on low-dimensional data but fails on high-dimensional data. Therefore, we propose an approach where FCM works on low-dimensional data obtained by reducing the data using singular value decomposition (SVD). Our simulations show that this approach gives better accuracy in terms of topic recall than LDA for sensing trending topics on Twitter about an event.
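The two-stage idea, reduce with SVD and then run FCM in the reduced space, can be sketched with NumPy. This is a textbook FCM with fuzzifier m = 2 applied to a toy document-term matrix; the function names, the toy counts, and the choice of two retained dimensions are assumptions, and the authors' preprocessing and evaluation are not reproduced.

```python
import numpy as np


def truncated_svd(X, r):
    """Project the rows of X onto the top-r singular directions."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r] * s[:r]


def fuzzy_c_means(Z, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: returns cluster centers and the membership matrix U (n x c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each document sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero at a center
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Toy document-term counts: documents 0-2 and 3-5 use mostly different terms.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [4, 1, 0, 0],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
    [0, 0, 4, 1],
], dtype=float)

Z = truncated_svd(X, r=2)
centers, U = fuzzy_c_means(Z, c=2)
print(np.round(U, 2))
```

Each row of `U` gives a document's degree of membership in every topic, which is what allows a single tweet to belong partly to several topics.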
A cluster expansion model for predicting activation barrier of atomic processes

International Nuclear Information System (INIS)

Rehman, Tafizur; Jaipal, M.; Chatterjee, Abhijit

2013-01-01

We introduce a procedure based on cluster expansion models for predicting the activation barrier of atomic processes encountered while studying the dynamics of a material system using the kinetic Monte Carlo (KMC) method. Starting with an interatomic potential description, a mathematical derivation is presented to show that the local environment dependence of the activation barrier can be captured using cluster interaction models. Next, we develop a systematic procedure for training the cluster interaction model on-the-fly, which involves: (i) obtaining activation barriers for a handful of local environments using nudged elastic band (NEB) calculations, (ii) identifying the local environment by analyzing the NEB results, and (iii) estimating the cluster interaction model parameters from the activation barrier data. Once a cluster expansion model has been trained, it is used to predict activation barriers without requiring any additional NEB calculations. Numerical studies are performed to validate the cluster expansion model by studying hop processes in Ag/Ag(100). We show that the use of a cluster expansion model with KMC enables efficient generation of an accurate process rate catalog
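Step (iii), estimating interaction parameters from barrier data, can be pictured as a linear least-squares fit. The sketch below assumes a barrier that is linear in site occupations plus one pair term; that functional form, the four neighbor sites, and the coefficient values (in eV) are hypothetical, and the synthetic barriers merely stand in for NEB results.

```python
from itertools import product

import numpy as np


def fit_cluster_expansion(features, barriers):
    """Least-squares fit of barrier = J0 + sum_i J_i * feature_i; returns [J0, J_1, ...]."""
    A = np.hstack([np.ones((features.shape[0], 1)), features])
    coeffs, *_ = np.linalg.lstsq(A, barriers, rcond=None)
    return coeffs


def predict_barrier(coeffs, feature_row):
    """Predict a barrier for one local environment without a new NEB run."""
    return coeffs[0] + feature_row @ coeffs[1:]


# All 16 occupations of four hypothetical neighbor sites, plus one pair feature.
occ = np.array(list(product([0, 1], repeat=4)), dtype=float)
features = np.column_stack([occ, occ[:, 0] * occ[:, 1]])

# Hypothetical "true" interactions used only to generate synthetic barriers.
true_coeffs = np.array([0.50, 0.05, 0.05, -0.02, -0.02, 0.03])
barriers = true_coeffs[0] + features @ true_coeffs[1:]

coeffs = fit_cluster_expansion(features, barriers)
print(np.round(coeffs, 3))
print(predict_barrier(coeffs, features[5]))
```

In an on-the-fly setting the training rows would accumulate as new local environments are encountered, and the fit would be repeated whenever fresh NEB barriers become available.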
Estimation of breast percent density in raw and processed full field digital mammography images via adaptive fuzzy c-means clustering and support vector machine segmentation

Energy Technology Data Exchange (ETDEWEB)

Keller, Brad M.; Nathan, Diane L.; Wang Yan; Zheng Yuanjie; Gee, James C.; Conant, Emily F.; Kontos, Despina [Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania 19104 (United States); Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104 (United States); Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania 19104 (United States)

2012-08-15

Purpose: The amount of fibroglandular tissue content in the breast as estimated mammographically, commonly referred to as breast percent density (PD%), is one of the most significant risk factors for developing breast cancer. Approaches to quantify breast density commonly focus on either semiautomated methods or visual assessment, both of which are highly subjective. Furthermore, most studies published to date investigating computer-aided assessment of breast PD% have been performed using digitized screen-film mammograms, while digital mammography is increasingly replacing screen-film mammography in breast cancer screening protocols. Digital mammography imaging generates two types of images for analysis, raw (i.e., 'FOR PROCESSING') and vendor postprocessed (i.e., 'FOR PRESENTATION'), of which postprocessed images are commonly used in clinical practice. Development of an algorithm which effectively estimates breast PD% in both raw and postprocessed digital mammography images would be beneficial in terms of direct clinical application and retrospective analysis. Methods: This work proposes a new algorithm for fully automated quantification of breast PD% based on adaptive multiclass fuzzy c-means (FCM) clustering and support vector machine (SVM) classification, optimized for the imaging characteristics of both raw and processed digital mammography images as well as for individual patient and image characteristics. Our algorithm first delineates the breast region within the mammogram via an automated thresholding scheme to identify background air followed by a straight line Hough transform to extract the pectoral muscle region. The algorithm then applies adaptive FCM clustering based on an optimal number of clusters derived from image properties of the specific mammogram to subdivide the breast into regions of similar gray-level intensity. Finally, a SVM classifier is trained to identify which clusters within the breast tissue are likely
Dynamical transitions in large systems of mean field-coupled Landau-Stuart oscillators: Extensive chaos and cluster states.

Science.gov (United States)

Ku, Wai Lim; Girvan, Michelle; Ott, Edward

2015-12-01

In this paper, we study dynamical systems in which a large number N of identical Landau-Stuart oscillators are globally coupled via a mean-field. Previously, it has been observed that this type of system can exhibit a variety of different dynamical behaviors. These behaviors include time periodic cluster states in which each oscillator is in one of a small number of groups for which all oscillators in each group have the same state which is different from group to group, as well as a behavior in which all oscillators have different states and the macroscopic dynamics of the mean field is chaotic. We argue that this second type of behavior is "extensive" in the sense that the chaotic attractor in the full phase space of the system has a fractal dimension that scales linearly with N and that the number of positive Lyapunov exponents of the attractor also scales linearly with N. An important focus of this paper is the transition between cluster states and extensive chaos as the system is subjected to slow adiabatic parameter change. We observe discontinuous transitions between the cluster states (which correspond to low dimensional dynamics) and the extensively chaotic states. Furthermore, examining the cluster state, as the system approaches the discontinuous transition to extensive chaos, we find that the oscillator population distribution between the clusters continually evolves so that the cluster state is always marginally stable. This behavior is used to reveal the mechanism of the discontinuous transition. We also apply the Kaplan-Yorke formula to study the fractal structure of the extensively chaotic attractors.
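The Kaplan-Yorke formula mentioned in this abstract estimates an attractor's fractal dimension from its Lyapunov exponents sorted in decreasing order: D = k + (λ_1 + ... + λ_k) / |λ_{k+1}|, where k is the largest index for which the partial sum is still non-negative. A minimal sketch follows; the function name and the sample exponent values are illustrative assumptions and are not taken from the paper.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension from a list of Lyapunov exponents."""
    lam = sorted(exponents, reverse=True)
    partial = 0.0
    for k, value in enumerate(lam):
        if partial + value < 0:
            if k == 0:
                return 0.0                     # every exponent is negative
            return k + partial / abs(value)
        partial += value
    return float(len(lam))                     # partial sums never go negative

# Illustrative spectrum: one positive, one zero, one strongly negative exponent.
example_dimension = kaplan_yorke_dimension([0.9, 0.0, -14.5])
```

In the sense used by the abstract, chaos is "extensive" when this dimension and the count of positive exponents both grow linearly with the number of oscillators N.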
Graphene synthesis on SiC: Reduced graphitization temperature by C-cluster and Ar-ion implantation

International Nuclear Information System (INIS)

Zhang, R.; Li, H.; Zhang, Z.D.; Wang, Z.S.; Zhou, S.Y.; Wang, Z.; Li, T.C.; Liu, J.R.; Fu, D.J.

2015-01-01

Thermal decomposition of SiC is a promising method for high quality production of wafer-scale graphene layers, provided that the high decomposition temperature of SiC is substantially reduced. The high decomposition temperature of SiC, around 1400 °C, is a technical obstacle. In this work, we report on graphene synthesis on 6H–SiC with reduced graphitization temperature via ion implantation. When energetic Ar, C1, and C6-cluster ions are implanted into 6H–SiC substrates, some of the Si–C bonds are broken by electronic and nuclear collisions. Owing to the bond breaking induced by radiation damage, and to the implanted C atoms acting as an additional C source, the graphitization temperature was reduced by up to 200 °C.
Clustering and training set selection methods for improving the accuracy of quantitative laser induced breakdown spectroscopy

Energy Technology Data Exchange (ETDEWEB)

Anderson, Ryan B., E-mail: randerson@astro.cornell.edu [Cornell University Department of Astronomy, 406 Space Sciences Building, Ithaca, NY 14853 (United States); Bell, James F., E-mail: Jim.Bell@asu.edu [Arizona State University School of Earth and Space Exploration, Bldg.: INTDS-A, Room: 115B, Box 871404, Tempe, AZ 85287 (United States); Wiens, Roger C., E-mail: rwiens@lanl.gov [Los Alamos National Laboratory, P.O. Box 1663 MS J565, Los Alamos, NM 87545 (United States); Morris, Richard V., E-mail: richard.v.morris@nasa.gov [NASA Johnson Space Center, 2101 NASA Parkway, Houston, TX 77058 (United States); Clegg, Samuel M., E-mail: sclegg@lanl.gov [Los Alamos National Laboratory, P.O. Box 1663 MS J565, Los Alamos, NM 87545 (United States)

2012-04-15

We investigated five clustering and training set selection methods to improve the accuracy of quantitative chemical analysis of geologic samples by laser induced breakdown spectroscopy (LIBS) using partial least squares (PLS) regression. The LIBS spectra were previously acquired for 195 rock slabs and 31 pressed powder geostandards under 7 Torr CO2 at a stand-off distance of 7 m at 17 mJ per pulse to simulate the operational conditions of the ChemCam LIBS instrument on the Mars Science Laboratory Curiosity rover. The clustering and training set selection methods, which do not require prior knowledge of the chemical composition of the test-set samples, are based on grouping similar spectra and selecting appropriate training spectra for the partial least squares (PLS2) model. These methods were: (1) hierarchical clustering of the full set of training spectra and selection of a subset for use in training; (2) k-means clustering of all spectra and generation of PLS2 models based on the training samples within each cluster; (3) iterative use of PLS2 to predict sample composition and k-means clustering of the predicted compositions to subdivide the groups of spectra; (4) soft independent modeling of class analogy (SIMCA) classification of spectra, and generation of PLS2 models based on the training samples within each class; (5) use of Bayesian information criteria (BIC) to determine an optimal number of clusters and generation of PLS2 models based on the training samples within each cluster. The iterative method and the k-means method using 5 clusters showed the best performance, improving the absolute quadrature root mean squared error (RMSE) by approximately 3 wt.%. The statistical significance of these improvements was approximately 85%. Our results show that although clustering methods can modestly improve results, a large and diverse training set is the most reliable way to improve the accuracy of quantitative LIBS. In particular, additional sulfate standards and
^{16}C-elastic scattering examined using several models at different energies

Science.gov (United States)

El-hammamy, M. N.; Attia, A.

2018-05-01

In the present paper, the first results concerning the theoretical analysis of the ^{16}C + p reaction by investigating two elastic scattering angular distributions measured at high energy compared to low energy for this system are reported. Several models for the real part of the nuclear potential are tested within the optical model formalism. The imaginary potential has a Woods-Saxon shape with three free parameters. Two types of density distribution and three different cluster structures for ^{16}C are assumed in the analysis. The results are compared with each other as well as with the experimental data to give evidence of the importance of these studied items.
The Parental Environment Cluster Model of Child Neglect: An Integrative Conceptual Model.

Science.gov (United States)

Burke, Judith; Chandy, Joseph; Dannerbeck, Anne; Watt, J. Wilson

1998-01-01

Presents Parental Environment Cluster model of child neglect which identifies three clusters of factors involved in parents' neglectful behavior: (1) parenting skills and functions; (2) development and use of positive social support; and (3) resource availability and management skills. Model offers a focal theory for research, structure for…

Application of k-means clustering algorithm in grouping the DNA sequences of hepatitis B virus (HBV)

Science.gov (United States)

Bustamam, A.; Tasman, H.; Yuniarti, N.; Frisca, Mursidah, I.

2017-07-01

Based on WHO data, an estimated 15 million people worldwide who are infected with hepatitis B (HBsAg+), which is caused by the HBV virus, are also infected with hepatitis D, which is caused by the HDV virus. Hepatitis D infection can occur simultaneously with hepatitis B (co-infection) or after a person is exposed to chronic hepatitis B (super-infection). Since HDV cannot live without HBV, HDV infection is closely related to HBV infection; hence every effort to prevent hepatitis B can indirectly prevent hepatitis D. This paper presents clustering of HBV DNA sequences using the k-means clustering algorithm and R programming. The clustering process starts with collecting HBV DNA sequences from GenBank, then extracting features from the sequences using n-mer frequencies. The extraction results are collected as a matrix and normalized using min-max normalization to the interval [0, 1], and the normalized matrix is used as the input data. The number of clusters is two, and the initial centroid of each cluster is chosen randomly. In each iteration, the distance from every object to each centroid is calculated using the Euclidean distance, and the minimum distance determines cluster membership, until two convergent clusters are created. As a result, the HBV viruses in the first cluster are more virulent than those in the second cluster, so the HBV viruses in the first cluster can potentially evolve with the HDV viruses that cause hepatitis D.
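The steps in this abstract (n-mer frequency extraction, min-max normalization to [0, 1], then k-means with two randomly initialized centroids and Euclidean distance) can be sketched as below. The paper reports using R; this sketch uses Python with NumPy instead, and the sequences, the choice n = 2, and all function names are hypothetical stand-ins rather than real HBV data or the authors' code.

```python
from itertools import product
import numpy as np

def nmer_frequencies(seq, n=2, alphabet="ACGT"):
    """Relative frequency of every length-n word over `alphabet` in `seq`."""
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    counts = dict.fromkeys(words, 0)
    for i in range(len(seq) - n + 1):
        word = seq[i:i + n]
        if word in counts:
            counts[word] += 1
    total = max(sum(counts.values()), 1)
    return np.array([counts[w] / total for w in words])

def min_max_normalize(X):
    """Rescale each column to [0, 1]; constant columns map to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def k_means(X, k=2, n_iter=100, seed=0):
    """Plain k-means with random initial centroids and Euclidean distance."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)           # nearest centroid wins
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Made-up toy sequences, not GenBank records.
sequences = ["AAAAAAAAT", "AAAAAAATT", "GCGCGCGCG", "GCGCGCGCC"]
features = min_max_normalize(np.array([nmer_frequencies(s) for s in sequences]))
labels, centroids = k_means(features, k=2)
```

Because the initial centroids are random, cluster numbering is arbitrary; only the grouping of sequences is meaningful.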
Clusters in nuclei

CERN Document Server

Following the pioneering discovery of alpha clustering and of molecular resonances, the field of nuclear clustering is today one of those domains of heavy-ion nuclear physics that faces the greatest challenges, yet also contains the greatest opportunities. After many summer schools and workshops, in particular over the last decade, the community of nuclear molecular physicists has decided to collaborate in producing a comprehensive collection of lectures and tutorial reviews covering the field. This third volume follows the successful Lect. Notes Phys. 818 (Vol. 1) and 848 (Vol. 2), and comprises six extensive lectures covering the following topics: - Gamma Rays and Molecular Structure - Faddeev Equation Approach for Three Cluster Nuclear Reactions - Tomography of the Cluster Structure of Light Nuclei Via Relativistic Dissociation - Clustering Effects Within the Dinuclear Model : From Light to Hyper-heavy Molecules in Dynamical Mean-field Approach - Clusterization in Ternary Fission - Clusters in Light N...
STUDI SIMULASI MENGGUNAKAN FUZZY C-MEANS DALAM MENGKLASIFIKASI KONSTRUK TES (Simulation study using fuzzy c-means for classifying test constructs)

Directory of Open Access Journals (Sweden)

Rukli Rukli

2013-01-01

Full Text Available This paper introduces the fuzzy c-means method for classifying test constructs. Verifying the unidimensionality of a test usually relies on factor analysis, a parametric statistical technique with several strict requirements, whereas fuzzy c-means is a heuristic method that does not impose strict requirements. The simulation study used two methods: factor analysis using the SPSS program and fuzzy c-means using the Matlab program. The simulated data were dichotomous and polytomous, generated in Microsoft Office Excel, with two designs: 3 test items with 10 test takers, and 30 test items with 100 test takers. The simulation results show that the fuzzy c-means method gives a more descriptive and dynamic picture of the grouping across all of the designs when verifying the unidimensionality of a test. Keywords: fuzzy c-means, factor analysis, unidimensional
Determining the Number of Instars in Simulium quinquestriatum (Diptera: Simuliidae) Using k-Means Clustering via the Canberra Distance.

Science.gov (United States)

Yang, Yao Ming; Jia, Ruo; Xun, Hui; Yang, Jie; Chen, Qiang; Zeng, Xiang Guang; Yang, Ming

2018-02-21

Simulium quinquestriatum Shiraki (Diptera: Simuliidae), a human-biting fly that is distributed widely across Asia, is a vector for multiple pathogens. However, the larval development of this species is poorly understood. In this study, we determined the number of instars in this pest using three batches of field-collected larvae from Guiyang, Guizhou, China. The postgenal length, head capsule width, mandibular phragma length, and body length of 773 individuals were measured, and k-means clustering was used for instar grouping. Four distance measures (Manhattan, Euclidean, Chebyshev, and Canberra) were determined. The reported instar numbers, ranging from 4 to 11, were set as initial cluster centers for k-means clustering. The Canberra distance yielded reliable instar grouping, which was consistent with the first instar, as characterized by egg bursters, and with prepupae, as characterized by dark histoblasts. Females and males of the last cluster of larvae were identified using Feulgen-stained gonads. Morphometric differences between the two sexes were not significant. Validation was performed using the Brooks-Dyar and Crosby rules, revealing that the larval stage of S. quinquestriatum is composed of eight instars.
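The Canberra distance used in this abstract sums, over all measured traits, the absolute difference divided by the sum of absolute values, so each trait contributes on a relative scale. The sketch below shows the distance and the assignment step of a k-means iteration that uses it. The measurement values, the centers, and the function names are hypothetical and are not data from the study.

```python
import numpy as np

def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|), with 0/0 taken as 0."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return terms.sum()

def assign_instars(measurements, centers):
    """Assign each larva (row) to the cluster center nearest in Canberra distance."""
    return np.array([
        int(np.argmin([canberra(row, c) for c in centers]))
        for row in measurements
    ])

# Hypothetical (head capsule width, postgenal length) pairs in mm.
larvae = np.array([[0.10, 0.12], [0.11, 0.13], [0.40, 0.45]])
centers = np.array([[0.10, 0.12], [0.40, 0.45]])
groups = assign_instars(larvae, centers)
```

Since every term is a ratio, a 0.01 mm difference between two small early-instar larvae counts for more than the same difference between two large late-instar larvae, which may be one reason a relative measure suits size data spanning many instars.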
An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

Science.gov (United States)

Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E

2017-04-12

Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
Merging symmetry projection methods with coupled cluster theory: Lessons from the Lipkin model Hamiltonian

Energy Technology Data Exchange (ETDEWEB)

Wahlen-Strothman, J. M. [Rice Univ., Houston, TX (United States); Henderson, T. H. [Rice Univ., Houston, TX (United States); Hermes, M. R. [Rice Univ., Houston, TX (United States); Degroote, M. [Rice Univ., Houston, TX (United States); Qiu, Y. [Rice Univ., Houston, TX (United States); Zhao, J. [Rice Univ., Houston, TX (United States); Dukelsky, J. [Consejo Superior de Investigaciones Cientificas (CSIC), Madrid (Spain). Inst. de Estructura de la Materia; Scuseria, G. E. [Rice Univ., Houston, TX (United States)

2018-01-03

Coupled cluster and symmetry projected Hartree-Fock are two central paradigms in electronic structure theory. However, they are very different. Single reference coupled cluster is highly successful for treating weakly correlated systems, but fails under strong correlation unless one sacrifices good quantum numbers and works with broken-symmetry wave functions, which is unphysical for finite systems. Symmetry projection is effective for the treatment of strong correlation at the mean-field level through multireference non-orthogonal configuration interaction wavefunctions, but unlike coupled cluster, it is neither size extensive nor ideal for treating dynamic correlation. We here examine different scenarios for merging these two dissimilar theories. We carry out this exercise over the integrable Lipkin model Hamiltonian, which despite its simplicity, encompasses non-trivial physics for degenerate systems and can be solved via diagonalization for a very large number of particles. We show how symmetry projection and coupled cluster doubles individually fail in different correlation limits, whereas models that merge these two theories are highly successful over the entire phase diagram. Despite the simplicity of the Lipkin Hamiltonian, the lessons learned in this work will be useful for building an ab initio symmetry projected coupled cluster theory that we expect to be accurate in the weakly and strongly correlated limits, as well as the recoupling regime.
Optimal Cluster Mill Pass Scheduling With an Accurate and Rapid New Strip Crown Model

International Nuclear Information System (INIS)

Malik, Arif S.; Grandhi, Ramana V.; Zipf, Mark E.

2007-01-01

Besides the requirement to roll coiled sheet at high levels of productivity, the optimal pass scheduling of cluster-type reversing cold mills presents the added challenge of assigning mill parameters that facilitate the best possible strip flatness. The pressures of intense global competition, and the requirements for increasingly thinner, higher quality specialty sheet products that are more difficult to roll, continue to force metal producers to commission innovative flatness-control technologies. This means that during the on-line computerized set-up of rolling mills, the mathematical model should not only determine the minimum total number of passes and maximum rolling speed, it should simultaneously optimize the pass-schedule so that desired flatness is assured, either by manual or automated means. In many cases today, however, on-line prediction of strip crown and corresponding flatness for the complex cluster-type rolling mills is typically addressed either by trial and error, by approximate deflection models for equivalent vertical roll-stacks, or by non-physical pattern recognition style models. The abundance of the aforementioned methods is largely due to the complexity of cluster-type mill configurations and the lack of deflection models with sufficient accuracy and speed for on-line use. Without adequate assignment of the pass-schedule set-up parameters, it may be difficult or impossible to achieve the required strip flatness. In this paper, we demonstrate optimization of cluster mill pass-schedules using a new accurate and rapid strip crown model. This pass-schedule optimization includes computations of the predicted strip thickness profile to validate mathematical constraints. In contrast to many of the existing methods for on-line prediction of strip crown and flatness on cluster mills, the demonstrated method requires minimal prior tuning and no extensive training with collected mill data. To rapidly and accurately solve the multi-contact problem
Fracture enhancement based on artificial ants and fuzzy c-means clustering (FCMC) in Dezful Embayment of Iran

International Nuclear Information System (INIS)

Nasseri, Aynur; Mohammadzadeh, Mohammad Jafar; Tabatabaei Raeisi, S Hashem

2015-01-01

This paper deals with the application of the ant colony algorithm (AC) to a seismic dataset from Dezful Embayment in the southwest region of Iran. The objective of the approach is to generate an accurate representation of faults and discontinuities to assist in pertinent matters such as well planning and field optimization. The AC analyzed all spatial discontinuities in the seismic attributes from which features were extracted. True fault information from the attributes was detected by many artificial ants, whereas noise and the remains of the reflectors were eliminated. Furthermore, the fracture enhancement procedure was conducted by three steps on seismic data of the area. In the first step several attributes such as chaos, variance/coherence and dip deviation were taken into account; the resulting maps indicate high-resolution contrast for the variance attribute. Subsequently, the enhancement of spatial discontinuities was performed and finally elimination of the noise and remains of non-faulting events was carried out by simulating the behavior of ant colonies. After considering stepwise attribute optimization, focusing on chaos and variance in particular, an attribute fusion was generated and used in the ant colony algorithm. The resulting map displayed the highest performance in feature detection along the main structural feature trend, confined to a NW–SE direction. Thus, the optimized attribute fusion might be used with greater confidence to map the structural feature network with more accuracy and resolution. In order to assess the performance of the AC in feature detection, and cross validate the reliability of the method used, fuzzy c-means clustering (FCMC) was employed for the same dataset. Comparing the maps illustrates the effectiveness and preference of the AC approach due to its high resolution contrast for structural feature detection compared to the FCMC method. Accordingly, 3D planes of discontinuity determined spatial distribution of
Modeling blue stragglers in young clusters

International Nuclear Information System (INIS)

Lu Pin; Deng Licai; Zhang Xiaobin

2011-01-01

A grid of binary evolution models is calculated for the study of a blue straggler (BS) population in intermediate age (log Age = 7.85–8.95) star clusters. The BS formation via mass transfer and merging is studied systematically using our models. Both Case A and B close binary evolutionary tracks are calculated for a large range of parameters. The results show that BSs formed via Case B are generally bluer and even more luminous than those produced by Case A. Furthermore, the larger range in orbital separations of Case B models provides a probability of producing more BSs than in Case A. Based on the grid of models, several Monte-Carlo simulations of BS populations in clusters within this age range are carried out. The results show that BSs formed via different channels populate different areas in the color magnitude diagram (CMD). The locations of BSs in the CMD for a number of clusters are compared to our simulations as well. In order to investigate the influence of mass transfer efficiency in the models and simulations, a set of models is also calculated by implementing a constant mass transfer efficiency, β = 0.5, during Roche lobe overflow (Case A binary evolution excluded). The result shows BSs can be formed via mass transfer at any given age in both cases. However, the distributions of the BS populations on the CMD are different.
Relationship between optical coherence tomography sector peripapillary angioflow-density and Octopus visual field cluster mean defect values.

Directory of Open Access Journals (Sweden)

Gábor Holló

Full Text Available To compare the relationship of Octopus perimeter cluster mean-defect (cluster MD) values with the spatially corresponding optical coherence tomography (OCT) sector peripapillary angioflow vessel-density (PAFD) and sector retinal nerve fiber layer thickness (RNFLT) values. High quality PAFD and RNFLT images acquired on the same day with the Angiovue/RTVue-XR Avanti OCT (Optovue Inc., Fremont, USA) on 1 eye of 27 stable early-to-moderate glaucoma, 22 medically controlled ocular hypertensive and 13 healthy participants were analyzed. Octopus G2 normal visual field test was made within 3 months from the imaging. Total peripapillary PAFD and RNFLT showed similar strong positive correlation with global mean sensitivity (r-values: 0.6710 and 0.6088, P<0.0001), and similar (P = 0.9614) strong negative correlation (r-values: -0.4462 and -0.4412, P≤0.004) with global MD. Both inferotemporal and superotemporal sector PAFD were significantly (P≤0.039) lower in glaucoma than in the other groups. No significant difference between the corresponding inferotemporal and superotemporal parameters was seen. The coefficient of determination (R2) calculated for the relationship between inferotemporal sector PAFD and superotemporal cluster MD (0.5141, P<0.0001) was significantly greater than that between inferotemporal sector RNFLT and superotemporal cluster MD (0.2546, P = 0.0001). The R2 values calculated for the relationships between superotemporal sector PAFD and RNFLT, and inferotemporal cluster MD were similar (0.3747 and 0.4037, respectively, P<0.0001). In the current population the relationship between inferotemporal sector PAFD and superotemporal cluster MD was strong. It was stronger than that between inferotemporal sector RNFLT and superotemporal cluster MD. Further investigations are necessary to clarify if our results are valid for other populations and can be usefully applied for glaucoma research.
Cluster emission at pre-equilibrium stage in Heavy Nuclear Reactions. A Model considering the Thermodynamics of Small Systems

International Nuclear Information System (INIS)

Bermudez Martinez, A.; Damiani, D.; Guzman Martinez, F.; Rodriguez Hoyos, O.; Rodriguez Manso, A.

2015-01-01

Cluster emission at the pre-equilibrium stage in heavy ion fusion reactions of ^{12}C and ^{16}O nuclei with ^{116}Sn, ^{208}Pb, and ^{238}U is studied. The energy of the projectile nuclei was chosen as 0.25 GeV, 0.5 GeV, and 1 GeV. A cluster formation model is developed in order to calculate the cluster size. Thermodynamics of small systems was used to examine the cluster behavior inside the nuclear medium. This model considers two phases inside the compound nucleus: the nuclear medium phase and the cluster itself. The cluster acts like an instability inside the compound nucleus, provoking an exchange of nucleons with the nuclear medium through its surface. The processes were simulated using Monte Carlo methods. We found that the cluster emission probability depends strongly on the cluster size. This project aims to implement cluster emission processes, during the pre-equilibrium stage, in the framework of the CRISP code (Collaboration Rio-Sao Paulo). (Author)
Modeling and analysis of the spectrum of the globular cluster NGC 2419

Science.gov (United States)

Sharina, M. E.; Shimansky, V. V.; Davoust, E.

2013-06-01

The properties of the stellar population of the unusual object NGC 2419 are studied; this is the most distant high-mass globular cluster of the Galaxy's outer halo, and a spectrum taken with the 1.93-m telescope of the Haute Provence Observatory displays elemental abundance anomalies. Since traditional high-resolution spectroscopic methods are applicable to bright stars only, spectroscopic information for the cluster's stellar population as a whole, integrated along the spectrograph slit placed in various positions, is used. Population synthesis is carried out for the spectrum of NGC 2419 using synthetic spectra calculated from a grid of stellar model atmospheres, based on the theoretical isochrone from the literature that best fits the color-magnitude diagram of the cluster. The derived age (12.6 billion years), metallicity ([Fe/H] = -2.25 dex), and abundances of helium (Y = 0.26) and other chemical elements (a total of 14) are in good qualitative agreement with estimates from the literature made from high-resolution spectra of eight red giants in the cluster. The influence on the spectrum of deviations from local thermodynamic equilibrium is considered for several elements. The derived abundance of α-elements ([α/Fe] = 0.13 dex, as the mean of [O/Fe], [Mg/Fe], and [Ca/Fe]) differs from the mean value in the literature ([α/Fe] = 0.4 for the eight brightest red giants) and may be explained by the large [α/Fe] dispersion recently discovered in NGC 2419. Further studies of the integrated properties of the stellar population in NGC 2419 using higher-resolution spectrographs in various wavelength ranges should help improve our understanding of the cluster's chemical anomalies.
Infinite number of solvable generalizations of XY-chain, with cluster state, and with central charge c=m/2

Directory of Open Access Journals (Sweden)

Kazuhiko Minami

2017-12-01

Full Text Available An infinite number of spin chains are solved and it is derived that the ground-state phase transitions belong to the universality classes with central charge c=m/2, where m is an integer. The models are diagonalized by automatically obtained transformations, many of which are different from the Jordan–Wigner transformation. The free energies, correlation functions, string order parameters, exponents, central charges, and the phase diagram are obtained. Most of the examples consist of the stabilizers of the cluster state. A unified structure of the one-dimensional XY and cluster-type spin chains is revealed, and other series of solvable models can be obtained through this formula.
Automated spike sorting algorithm based on Laplacian eigenmaps and k-means clustering.

Science.gov (United States)

Chah, E; Hok, V; Della-Chiesa, A; Miller, J J H; O'Mara, S M; Reilly, R B

2011-02-01

This study presents a new automatic spike sorting method based on feature extraction by Laplacian eigenmaps combined with k-means clustering. The performance of the proposed method was compared against previously reported algorithms such as principal component analysis (PCA) and amplitude-based feature extraction. Two types of classifier (namely k-means and classification expectation-maximization) were incorporated within the spike sorting algorithms, in order to find a suitable classifier for the feature sets. Simulated data sets and in-vivo tetrode multichannel recordings were employed to assess the performance of the spike sorting algorithms. The results show that the proposed algorithm yields significantly improved performance, with a mean sorting accuracy of 73% and a sorting error of 10%, compared to PCA combined with k-means, which had a sorting accuracy of 58% and a sorting error of 10%. A correction was made to this article on 22 February 2011. The spacing of the title was amended on the abstract page. No changes were made to the article PDF and the print version was unaffected.
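The pipeline described in this abstract (a Laplacian-eigenmap embedding followed by k-means) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the Gaussian affinity, the dense graph, the restriction to two units and a one-dimensional embedding, and the use of power iteration in place of a full eigen-solver are all choices made here, and the function names are hypothetical.

```python
import math

def gaussian_affinity(points, sigma):
    """Dense affinity matrix W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)), zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w

def fiedler_embedding(w, iters=500):
    """One-dimensional Laplacian-eigenmap coordinate per sample.

    Power iteration on (M + I) / 2 with M = D^-1/2 W D^-1/2, after projecting
    out the trivial eigenvector D^1/2 * 1, yields the second eigenvector of M.
    """
    n = len(w)
    deg = [sum(row) for row in w]
    root = [math.sqrt(d) for d in deg]
    norm0 = math.sqrt(sum(deg))
    trivial = [r / norm0 for r in root]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector (assumption)
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, trivial))
        v = [a - dot * b for a, b in zip(v, trivial)]
        mv = [sum(w[i][j] * v[j] / (root[i] * root[j]) for j in range(n))
              for i in range(n)]
        v = [0.5 * (a + b) for a, b in zip(mv, v)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Map back to the generalized eigenvector y = D^-1/2 v.
    return [a / r for a, r in zip(v, root)]

def kmeans_1d(values, iters=50):
    """Two-cluster k-means on scalars, initialised at the minimum and maximum."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(x - lo) <= abs(x - hi) else 1 for x in values]
        g0 = [x for x, l in zip(values, labels) if l == 0]
        g1 = [x for x, l in zip(values, labels) if l == 1]
        if g0:
            lo = sum(g0) / len(g0)
        if g1:
            hi = sum(g1) / len(g1)
    return labels

def sort_spikes(features, sigma):
    """Embed spike feature vectors, then cluster the embedding into two units."""
    return kmeans_1d(fiedler_embedding(gaussian_affinity(features, sigma)))

# Hypothetical (amplitude, width) features for two well-separated units.
unit_a = [(1.0, 0.50), (1.1, 0.55), (0.9, 0.45), (1.05, 0.52), (0.95, 0.48)]
unit_b = [(3.0, 1.50), (3.1, 1.55), (2.9, 1.45), (3.05, 1.52), (2.95, 1.48)]
labels = sort_spikes(unit_a + unit_b, sigma=0.5)
```

Which unit receives label 0 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful. A real recording would need more embedding dimensions and more than two clusters.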
Ion collision-induced chemistry in pure and mixed loosely bound clusters of coronene and C60 molecules.

Science.gov (United States)

Domaracka, Alicja; Delaunay, Rudy; Mika, Arkadiusz; Gatchell, Michael; Zettergren, Henning; Cederquist, Henrik; Rousseau, Patrick; Huber, Bernd A

2018-05-23

Ionization, fragmentation and molecular growth have been studied in collisions of 22.5 keV He2+- or 3 keV Ar+-projectiles with pure loosely bound clusters of coronene (C24H12) molecules or with loosely bound mixed C60-C24H12 clusters by using mass spectrometry. The heavier and slower Ar+ projectiles induce prompt knockout-fragmentation - C- and/or H-losses - from individual molecules and highly efficient secondary molecular growth reactions before the clusters disintegrate on picosecond timescales. The lighter and faster He2+ projectiles have a higher charge and the main reactions are then ionization by ions that are not penetrating the clusters. This leads mostly to cluster fragmentation without molecular growth. However, here penetrating collisions may also lead to molecular growth but to a much smaller extent than with 3 keV Ar+. Here we present fragmentation and molecular growth mass distributions with 1 mass unit resolution, which reveals that the same numbers of C- and H-atoms often participate in the formation and breaking of covalent bonds inside the clusters. We find that masses close to those with integer numbers of intact coronene molecules, or with integer numbers of both intact coronene and C60 molecules, are formed where often one or several H-atoms are missing or have been added on. We also find that super-hydrogenated coronene is formed inside the clusters.
Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

Directory of Open Access Journals (Sweden)

Ma Jinhui

2013-01-01

Full Text Available Abstract. Background: The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE)) and cluster-specific (i.e. random-effects logistic regression (RELR)) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses. Methods: In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, the intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate-dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI) and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability. Results: GEE performs well on all four measures, provided the downward bias of the standard error (when the number of clusters per arm is small) is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with variance inflation factor (VIF 50. RELR performs well only when a small amount of data was missing and complete case analysis was applied. Conclusion: GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.
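The data-generating step described in the Methods can be illustrated as follows. For a beta-binomial model with cluster probabilities drawn from Beta(a, b), the intra-cluster correlation is 1/(a + b + 1), which fixes a and b once the prevalence and ICC are chosen. The sketch below assumes equal cluster sizes and, unlike the study, uses missingness completely at random rather than covariate-dependent missingness. The function names are hypothetical, and the VIF formula is the standard design effect 1 + (m - 1)·ICC, which the study is assumed (not confirmed) to use.

```python
import random

def beta_params(prevalence, icc):
    """Beta(a, b) parameters giving the requested mean and ICC = 1 / (a + b + 1)."""
    scale = (1.0 - icc) / icc  # equals a + b
    return prevalence * scale, (1.0 - prevalence) * scale

def simulate_arm(n_clusters, cluster_size, prevalence, icc, missing_rate, rng):
    """One trial arm: a list of clusters, each a list of 0/1 outcomes or None (missing)."""
    a, b = beta_params(prevalence, icc)
    arm = []
    for _ in range(n_clusters):
        p_cluster = rng.betavariate(a, b)  # cluster-specific success probability
        outcomes = []
        for _ in range(cluster_size):
            y = 1 if rng.random() < p_cluster else 0
            # Simplification: missing completely at random.
            outcomes.append(None if rng.random() < missing_rate else y)
        arm.append(outcomes)
    return arm

def variance_inflation_factor(cluster_size, icc):
    """Standard design effect for equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc

rng = random.Random(2013)
arm = simulate_arm(n_clusters=10, cluster_size=20, prevalence=0.3,
                   icc=0.05, missing_rate=0.15, rng=rng)
a, b = beta_params(0.3, 0.05)
vif = variance_inflation_factor(20, 0.05)
```

With a prevalence of 0.3 and an ICC of 0.05 this gives a = 5.7 and b = 13.3, and a VIF of 1.95 for clusters of 20 subjects.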
Hubble Space Telescope Observations of cD Galaxies and Their Globular Cluster Systems

Science.gov (United States)

Jordán, Andrés; Côté, Patrick; West, Michael J.; Marzke, Ronald O.; Minniti, Dante; Rejkuba, Marina

2004-01-01

We have used WFPC2 on the Hubble Space Telescope (HST) to obtain F450W and F814W images of four cD galaxies (NGC 541 in Abell 194, NGC 2832 in Abell 779, NGC 4839 in Abell 1656, and NGC 7768 in Abell 2666) in the range 5400 km s-1cluster (GC) systems reveals no anomalies in terms of specific frequencies, metallicity gradients, average metallicities, or the metallicity offset between the globular clusters and the host galaxy. We show that the latter offset appears roughly constant at Δ[Fe/H]~0.8 dex for early-type galaxies spanning a luminosity range of roughly 4 orders of magnitude. We combine the globular cluster metallicity distributions with an empirical technique described in a series of earlier papers to investigate the form of the protogalactic mass spectrum in these cD galaxies. We find that the observed GC metallicity distributions are consistent with those expected if cD galaxies form through the cannibalism of numerous galaxies and protogalactic fragments that formed their stars and globular clusters before capture and disruption. However, the properties of their GC systems suggest that dynamical friction is not the primary mechanism by which these galaxies are assembled. We argue that cD's instead form rapidly, via hierarchical merging, prior to cluster virialization. Based on observations with the NASA/ESA Hubble Space Telescope obtained at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. Based in part on observations obtained at the European Southern Observatory, for VLT program 68.D-0130(A).
The Business Cluster's Distribution e-Channels

OpenAIRE

Milan Davidovic

2011-01-01

The cooperative potential of a business cluster and the improvement of its business capability depend on e-business implementation and on the dynamics of business model change in the cluster and its members, based on new and existing distribution channels, customer relationship management and supply chain integration. This work analyses the cluster's e-business models, e-commerce forms and distribution e-channels for three business cases: when cluster members are oriented on their own business, on the cooperative's project or c...
Technical Note: Using k-means clustering to determine the number and position of isocenters in MLC-based multiple target intracranial radiosurgery.

Science.gov (United States)

Yock, Adam D; Kim, Gwe-Ya

2017-09-01

To present the k-means clustering algorithm as a tool to address treatment planning considerations characteristic of stereotactic radiosurgery using a single isocenter for multiple targets. For 30 patients treated with stereotactic radiosurgery for multiple brain metastases, the geometric centroids and radii of each met were determined from the treatment planning system. In-house software used this as well as weighted and unweighted versions of the k-means clustering algorithm to group the targets to be treated with a single isocenter, and to position each isocenter. The algorithm results were evaluated using within-cluster sum of squares as well as a minimum target coverage metric that considered the effect of target size. Both versions of the algorithm were applied to an example patient to demonstrate the prospective determination of the appropriate number and location of isocenters. Both weighted and unweighted versions of the k-means algorithm were applied successfully to determine the number and position of isocenters. Comparing the two, both the within-cluster sum of squares metric and the minimum target coverage metric resulting from the unweighted version were less than those from the weighted version. The average magnitudes of the differences were small (-0.2 cm² and 0.1% for the within-cluster sum of squares and minimum target coverage, respectively) but statistically significant (Wilcoxon signed-rank test, P k-means clustering algorithm represented an advantage of the unweighted version for the within-cluster sum of squares metric, and an advantage of the weighted version for the minimum target coverage metric. While additional treatment planning considerations have a large influence on the final treatment plan quality, both versions of the k-means algorithm provide automatic, consistent, quantitative, and objective solutions to the tasks associated with SRS treatment planning using a single isocenter for multiple targets. © 2017 The Authors. Journal
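A minimal sketch of the weighted and unweighted k-means comparison described above. The abstract does not state how the weights were defined, so weighting each target by its radius is an assumption made here, as are the function names and the deterministic farthest-point initialisation. In both cases the within-cluster sum of squares (WCSS) is computed from unweighted distances.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def weighted_kmeans(points, k, weights=None, iters=100):
    """Lloyd's algorithm with optional per-point weights in the centroid update.

    Returns (labels, centers, wcss).
    """
    n, dim = len(points), len(points[0])
    weights = weights or [1.0] * n
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [tuple(points[0])]
    while len(centers) < k:
        centers.append(tuple(max(points,
                                 key=lambda p: min(dist2(p, c) for c in centers))))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if not members:
                continue  # keep the previous center for an empty cluster
            total = sum(weights[i] for i in members)
            centers[j] = tuple(
                sum(weights[i] * points[i][d] for i in members) / total
                for d in range(dim))
    wcss = sum(dist2(points[i], centers[labels[i]]) for i in range(n))
    return labels, centers, wcss

# Hypothetical target centroids (cm) and radii used as weights.
targets = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
radii = [1.0, 3.0, 1.0, 1.0]

labels_u, isocenters_u, wcss_u = weighted_kmeans(targets, k=2)
labels_w, isocenters_w, wcss_w = weighted_kmeans(targets, k=2, weights=radii)
```

In this toy case the unweighted run places the first isocenter at x = 1.0 (WCSS 4.0), while the weighted run pulls it to x = 1.5, toward the larger target (WCSS 4.5). This matches the direction of the trade-off reported in the abstract, where the unweighted version gave the smaller within-cluster sum of squares.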
Structure and clusters of light unstable nuclei

International Nuclear Information System (INIS)

En'yo, Yoshiko

2010-01-01

As is well known, cluster structures are often observed in light nuclei. With recent progress in research on unstable nuclei (nuclei with unbalanced numbers of neutrons and protons), further new types of clusters are being revealed. In this report, structures of light unstable nuclei and some of the theoretical models used to describe them are reviewed. The following topics are covered: 1. Cluster structure and theoretical models. 2. Cluster structure of unstable nuclei (low excited states). 3. Cluster structure of neutron-excess beryllium isotopes. 4. Cluster-gas-like states in C isotopes. 5. Dineutron structure of He isotopes. A number of unusual nuclear structures of light nuclei are illustrated. Antisymmetrized molecular dynamics (AMD) is a recently developed theoretical framework which has been successfully used in heavy ion reactions and nuclear structure studies. Successful applications of AMD to the isotopes of Be, B and C are illustrated. (S. Funahashi)

NUCORE - A system for nuclear structure calculations with cluster-core models

International Nuclear Information System (INIS)

Heras, C.A.; Abecasis, S.M.

1982-01-01

Calculation of nuclear energy levels and their electromagnetic properties, modelling the nucleus as a cluster of a few particles and/or holes interacting with a core which in turn is modelled as a quadrupole vibrator (cluster-phonon model). The members of the cluster interact via quadrupole-quadrupole and pairing forces. (orig.)
The existence of a plastic phase and a solid-liquid dynamical bistability region in small fullerene cluster (C60)7: molecular dynamics simulation

International Nuclear Information System (INIS)

Piatek, A; Dawid, A; Gburski, Z

2006-01-01

We have simulated, by the molecular dynamics (MD) method, the dynamics of fullerenes (C60) in an extremely small cluster composed of only seven C60 molecules. The interaction is taken to be the full 60-site pairwise additive Lennard-Jones (LJ) potential, which generates both translational and anisotropic rotational motions of each molecule. Our atomically detailed MD simulations reveal the plastic phase (no translations but active reorientations of fullerenes) at low energies (temperatures) of the (C60)7 cluster. We provide in-depth evidence of the dynamical solid-liquid bistability region in the investigated cluster. Moreover, we confirm the existence of the liquid phase in (C60)7, the finding of Gallego et al (1999 Phys. Rev. Lett. 83 5258) obtained earlier on the basis of Girifalco's model, which assumes a single-site, spherically symmetric interaction between C60 molecules. We have calculated the translational and angular velocity autocorrelation functions and estimated the diffusion coefficient of fullerene in the liquid phase.
Fine structure of cluster decays

International Nuclear Information System (INIS)

Dumitrescu, O.

1993-07-01

Within the one-level R-matrix approach, the hindrance factors of radioactive decays in which α particles and ¹⁴C nuclei are emitted are calculated. The generalization to radioactive decays in which heavier clusters are emitted, e.g. ²⁰O, ²⁴Ne, ²⁵Ne, ²⁸Mg, ³⁰Mg, ³²Si and ³⁴Si, is straightforward. The interior wave functions are supposed to be given by the shell model with effective residual interactions (e.g. the large-scale shell model code OXBASH, in the Michigan State University version, for nearly spherical nuclei, or the enlarged superfluid model (ESM) recently proposed for deformed nuclei). The exterior wave functions are calculated from a cluster-nucleus double-folding model potential obtained with the M3Y interaction. As examples of cluster decay fine structure we analyzed the particular cases of the α decay of ²⁴¹Am and the ¹⁴C decay of ²³³Ra. Good agreement with the experimental data is obtained. (author). 78 refs, 2 figs, 6 tabs
Binary model for the coma cluster of galaxies

International Nuclear Information System (INIS)

Valtonen, M.J.; Byrd, G.G.

1979-01-01

We study the dynamics of galaxies in the Coma cluster and find that the cluster is probably dominated by a central binary of galaxies, NGC 4874–NGC 4889. We estimate their total mass to be about 3 × 10^14 M_sun by two independent methods (assuming a Hubble constant of 100 km s^-1 Mpc^-1). This binary is efficient in dynamically ejecting smaller galaxies, some of which are seen in projection against the inner 3° radius of the cluster and which, if erroneously considered as bound members, cause a serious overestimate of the mass of the entire cluster. Taking account of the ejected galaxies, we estimate the total cluster mass to be 4–9 × 10^14 M_sun, with a corresponding mass-to-light ratio for a typical galaxy in the range of 20–120 solar units. The origin of the secondary maximum observed in the radial surface density profile is studied. We consider it to be a remnant of a shell of galaxies which formed around the central binary. This shell expanded, then collapsed into the binary, and is now reexpanding. This is supported by the coincidence of the minimum in the cluster eccentricity and radial velocity dispersion at the same radial distance as the secondary maximum. Numerical simulations of a cluster model with a massive central binary and a spherical shell of test particles are performed, and they reproduce the observed shape, galaxy density, and radial velocity distributions in the Coma cluster fairly well. Consequences of extending the model to other clusters are discussed.
Labelling IDS clusters by means of the silhouette index

OpenAIRE

Petrovic, Slovodan; Álvarez, Gonzalo; Orfila, Agustín; Carbó, Javier

2006-01-01

Proceeding of: IX Reunión Española sobre Criptología y Seguridad de la Información. Barcelona, 2006. One of the most difficult problems in the design of an anomaly-based intrusion detection system (IDS) that uses clustering is that of labelling the obtained clusters, i.e. determining which of them correspond to "good" behaviour on the network/host and which to "bad" behaviour. In this paper, a new clusters' labelling strategy, which makes use of the Silhouette clustering quality index is ...
Maximum Throughput in a C-RAN Cluster with Limited Fronthaul Capacity

OpenAIRE

Duan, Jialong; Lagrange, Xavier; Guilloud, Frédéric

2016-01-01

Centralized/Cloud Radio Access Network (C-RAN) is a promising future mobile network architecture which can ease the cooperation between different cells to manage interference. However, the feasibility of C-RAN is limited by the large bit rate requirement in the fronthaul. This paper studies the maximum throughput of different transmission strategies in a C-RAN cluster with transmission power constraints and fronthaul capacity constraints. Both transmission strategies wit...
Unsupervised Approach Data Analysis Based on Fuzzy Possibilistic Clustering: Application to Medical Image MRI

Directory of Open Access Journals (Sweden)

Nour-Eddine El Harchaoui

2013-01-01

Full Text Available The analysis and processing of large data are a challenge for researchers. Several approaches have been used to model these complex data, and they are based on several mathematical theories: fuzzy, probabilistic, possibilistic, and evidence theories. In this work, we propose a new unsupervised classification approach that combines the fuzzy and possibilistic theories; our purpose is to overcome the problems of uncertain data in complex systems. We used the membership function of fuzzy c-means (FCM) to initialize the parameters of possibilistic c-means (PCM), in order to solve the problem of coinciding clusters that are generated by PCM and also to overcome the sensitivity of FCM to noise. To validate our approach, we used several validity indexes and we compared them with other conventional classification algorithms: fuzzy c-means, possibilistic c-means, and possibilistic fuzzy c-means. The experiments were carried out on different synthetic data sets and real brain MR images.
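The idea of seeding PCM from an FCM solution can be sketched as follows. This is not the authors' algorithm, whose details the abstract does not give. It uses the textbook FCM membership update and the Krishnapuram-Keller form of PCM, in which each cluster's scale parameter eta is estimated from the FCM memberships. Scalar data (for example grey levels), the fuzzifier m = 2 and all function names are assumptions of this sketch.

```python
EPS = 1e-12  # guards against a sample coinciding exactly with a center

def fcm_memberships(data, centers, m):
    """u[i][k]: membership of sample k in cluster i; each column sums to 1."""
    u = [[0.0] * len(data) for _ in centers]
    for k, x in enumerate(data):
        inv = [((x - c) ** 2 + EPS) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        for i in range(len(centers)):
            u[i][k] = inv[i] / total
    return u

def update_centers(data, w, m):
    """Centers as means of the data weighted by w[i][k] ** m."""
    return [sum(wi[k] ** m * x for k, x in enumerate(data)) / sum(v ** m for v in wi)
            for wi in w]

def fcm(data, centers, m=2.0, iters=100):
    """Plain fuzzy c-means from the given initial centers."""
    for _ in range(iters):
        centers = update_centers(data, fcm_memberships(data, centers, m), m)
    return centers, fcm_memberships(data, centers, m)

def typicalities(data, centers, eta, m):
    """PCM typicality t[i][k] = 1 / (1 + (d^2 / eta_i)^(1/(m-1)))."""
    return [[1.0 / (1.0 + ((x - c) ** 2 / e) ** (1.0 / (m - 1.0))) for x in data]
            for c, e in zip(centers, eta)]

def pcm_from_fcm(data, centers, u, m=2.0, iters=100, scale=1.0):
    """Possibilistic c-means whose centers and eta are initialised from FCM output."""
    eta = [scale * sum(ui[k] ** m * (x - c) ** 2 for k, x in enumerate(data))
           / sum(v ** m for v in ui)
           for ui, c in zip(u, centers)]
    for _ in range(iters):
        centers = update_centers(data, typicalities(data, centers, eta, m), m)
    return centers, typicalities(data, centers, eta, m)

# Two hypothetical intensity groups plus one noise sample at 12.0.
data = [0.0, 0.2, 0.4, 0.6, 0.8, 5.0, 5.2, 5.4, 5.6, 5.8, 12.0]
fcm_centers, u = fcm(data, [0.4, 5.4])
pcm_centers, t = pcm_from_fcm(data, fcm_centers, u)
```

Because FCM memberships must sum to one, the noise sample is still assigned mostly to one cluster, whereas its PCM typicality is low for both clusters. That is the noise-robustness argument made in the abstract.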
Variance-Based Cluster Selection Criteria in a K-Means Framework for One-Mode Dissimilarity Data.

Science.gov (United States)

Vera, J Fernando; Macías, Rodrigo

2017-06-01

One of the main problems in cluster analysis is that of determining the number of groups in the data. In general, the approach taken depends on the cluster method used. For K-means, some of the most widely employed criteria are formulated in terms of the decomposition of the total point scatter, regarding a two-mode data set of N points in p dimensions, which are optimally arranged into K classes. This paper addresses the formulation of criteria to determine the number of clusters, in the general situation in which the available information for clustering is a one-mode [Formula: see text] dissimilarity matrix describing the objects. In this framework, p and the coordinates of points are usually unknown, and the application of criteria originally formulated for two-mode data sets is dependent on their possible reformulation in the one-mode situation. The decomposition of the variability of the clustered objects is proposed in terms of the corresponding block-shaped partition of the dissimilarity matrix. Within-block and between-block dispersion values for the partitioned dissimilarity matrix are derived, and variance-based criteria are subsequently formulated in order to determine the number of groups in the data. A Monte Carlo experiment was carried out to study the performance of the proposed criteria. For simulated clustered points in p dimensions, greater efficiency in recovering the number of clusters is obtained when the criteria are calculated from the related Euclidean distances instead of the known two-mode data set, in general, for unequal-sized clusters and for low dimensionality situations. For simulated dissimilarity data sets, the proposed criteria always outperform the results obtained when these criteria are calculated from their original formulation, using dissimilarities instead of distances.
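The block decomposition mentioned above rests on a classical identity: for Euclidean distances, the sum of squared deviations from a group mean equals the sum of squared pairwise distances within the group divided by the group size, so the coordinates themselves are not needed. The sketch below applies that identity to a dissimilarity matrix and evaluates the Calinski-Harabasz ratio as one example of a variance-based criterion. The paper's own criteria are not specified in the abstract, and the function names here are hypothetical.

```python
def dispersion_from_dissimilarities(d, labels):
    """Total, within-block and between-block dispersion from a dissimilarity matrix.

    Uses sum_i ||x_i - mean||^2 = (1 / n) * sum_{i<j} d_ij^2 within each block.
    """
    n = len(d)
    total = sum(d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    within = 0.0
    for g in set(labels):
        members = [i for i in range(n) if labels[i] == g]
        pair_sum = sum(d[i][j] ** 2
                       for pos, i in enumerate(members)
                       for j in members[pos + 1:])
        within += pair_sum / len(members)
    return total, within, total - within

def calinski_harabasz(d, labels):
    """Variance-ratio criterion computed without point coordinates."""
    n, k = len(d), len(set(labels))
    _, within, between = dispersion_from_dissimilarities(d, labels)
    return (between / (k - 1)) / (within / (n - k))

# Four objects on a line at 0, 1, 10, 11, given only as pairwise distances.
coords = [0.0, 1.0, 10.0, 11.0]
d = [[abs(a - b) for b in coords] for a in coords]
labels = [0, 0, 1, 1]

total, within, between = dispersion_from_dissimilarities(d, labels)
ch = calinski_harabasz(d, labels)
```

For this example the total dispersion is 101, the within-block part is 1 and the between-block part is 100, giving a ratio of 200. Evaluating the ratio for several candidate partitions and keeping the largest is one common way to choose the number of groups.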
Old star clusters: Bench tests of low mass stellar models

Directory of Open Access Journals (Sweden)

Salaris M.

2013-03-01

Full Text Available Old star clusters in the Milky Way and external galaxies have been (and still are) traditionally used to constrain the age of the universe and the timescales of galaxy formation. A parallel avenue of old star cluster research considers these objects as bench tests of low-mass stellar models. This short review will highlight some recent tests of stellar evolution models that make use of photometric and spectroscopic observations of resolved old star clusters. In some cases these tests have pointed to additional physical processes efficient in low-mass stars that are not routinely included in model computations. Moreover, recent results from the Kepler mission about the old open cluster NGC 6791 are adding new tight constraints to the models.
Sensitivity evaluation of dynamic speckle activity measurements using clustering methods

International Nuclear Information System (INIS)

Etchepareborda, Pablo; Federico, Alejandro; Kaufmann, Guillermo H.

2010-01-01

We evaluate and compare the use of competitive neural networks, self-organizing maps, the expectation-maximization algorithm, K-means, and fuzzy C-means techniques as partitional clustering methods, when the sensitivity of the activity measurement of dynamic speckle images needs to be improved. The temporal history of the acquired intensity generated by each pixel is analyzed in a wavelet decomposition framework, and it is shown that the mean energy of its corresponding wavelet coefficients provides a suited feature space for clustering purposes. The sensitivity obtained by using the evaluated clustering techniques is also compared with the well-known methods of Konishi-Fujii, weighted generalized differences, and wavelet entropy. The performance of the partitional clustering approach is evaluated using simulated dynamic speckle patterns and also experimental data.
A grand unified model for liganded gold clusters

Science.gov (United States)

Xu, Wen Wu; Zhu, Beien; Zeng, Xiao Cheng; Gao, Yi

2016-12-01

A grand unified model (GUM) is developed to achieve fundamental understanding of rich structures of all 71 liganded gold clusters reported to date. Inspired by the quark model by which composite particles (for example, protons and neutrons) are formed by combining three quarks (or flavours), here gold atoms are assigned three 'flavours' (namely, bottom, middle and top) to represent three possible valence states. The 'composite particles' in GUM are categorized into two groups: variants of triangular elementary block Au3(2e) and tetrahedral elementary block Au4(2e), all satisfying the duet rule (2e) of the valence shell, akin to the octet rule in general chemistry. The elementary blocks, when packed together, form the cores of liganded gold clusters. With the GUM, structures of 71 liganded gold clusters and their growth mechanism can be deciphered altogether. Although GUM is a predictive heuristic and may not be necessarily reflective of the actual electronic structure, several highly stable liganded gold clusters are predicted, thereby offering GUM-guided synthesis of liganded gold clusters by design.
Aerosol cluster impact and break-up: model and implementation

International Nuclear Information System (INIS)

Lechman, Jeremy B.

2010-01-01

In this report a model for simulating aerosol cluster impact with rigid walls is presented. The model is based on JKR adhesion theory and is implemented as an enhancement to the granular (DEM) package within the LAMMPS code. The theory behind the model is outlined and preliminary results are shown. Modeling the interactions of small particles is relevant to a number of applications (e.g., soils, powders, colloidal suspensions, etc.). Modeling the behavior of aerosol particles during agglomeration and cluster dynamics upon impact with a wall is of particular interest. In this report we describe preliminary efforts to develop and implement physical models for aerosol particle interactions. Future work will consist of deploying these models to simulate aerosol cluster behavior upon impact with a rigid wall for the purpose of developing relationships for impact speed and probability of stick/bounce/break-up as well as to assess the distribution of cluster sizes if break-up occurs. These relationships will be developed consistent with the need for inputs into system-level codes. Section 2 gives background and details on the physical model as well as implementation issues. Section 3 presents some preliminary results which lead to discussion in Section 4 of future plans.
Calculations of light scattering matrices for stochastic ensembles of nanosphere clusters

International Nuclear Information System (INIS)

Bunkin, N.F.; Shkirin, A.V.; Suyazov, N.V.; Starosvetskiy, A.V.

2013-01-01

Results of the calculation of the light scattering matrices for systems of stochastic nanosphere clusters are presented. A mathematical model of spherical particle clustering with allowance for cluster–cluster aggregation is used. The fractal properties of cluster structures are explored at different values of the model parameter that governs cluster–cluster interaction. General properties of the light scattering matrices of nanosphere-cluster ensembles as dependent on their mean fractal dimension have been found. The scattering-matrix calculations were performed for finite samples of 10³ random clusters, made up of polydisperse spherical nanoparticles, having lognormal size distribution with the effective radius 50 nm and effective variance 0.02; the mean number of monomers in a cluster and its standard deviation were set to 500 and 70, respectively. The implemented computation environment, modeling the scattering matrices for overall sequences of clusters, is based upon T-matrix program code for a given single cluster of spheres, which was developed in [1]. The ensemble-averaged results have been compared with orientation-averaged ones calculated for individual clusters. Highlights: We suggested a hierarchical model of cluster growth allowing for cluster–cluster aggregation. We analyzed the light scattering by whole ensembles of nanosphere clusters. We studied the evolution of the light scattering matrix when changing the fractal dimension.
Volatility Modeling, Seasonality and Risk-Return Relationship in GARCH-in-Mean Framework: The Case of Indian Stock and Commodity Markets

OpenAIRE

Brajesh Kumar; Singh, Priyanka

2008-01-01

This paper is based on an empirical study of volatility, risk premium and seasonality in the risk-return relation of the Indian stock and commodity markets. This investigation is conducted by means of the Generalized Autoregressive Conditional Heteroscedasticity in mean (GARCH-in-Mean) model introduced by Engle et al. (1987). A systematic approach to modelling volatility in returns is presented. Volatility clustering and the asymmetric nature of volatility are examined for Indian stock and commodity markets. The risk-r...
Identifying Clusters with Mixture Models that Include Radial Velocity Observations

Science.gov (United States)

Czarnatowicz, Alexis; Ybarra, Jason E.

2018-01-01

The study of stellar clusters plays an integral role in the study of star formation. We present a cluster mixture model that considers radial velocity data in addition to spatial data. Maximum likelihood estimation through the Expectation-Maximization (EM) algorithm is used for parameter estimation. Our mixture model analysis can be used to distinguish adjacent or overlapping clusters, and estimate properties for each cluster. Work supported by awards from the Virginia Foundation for Independent Colleges (VFIC) Undergraduate Science Research Fellowship and The Research Experience @Bridgewater (TREB).
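To illustrate how adding a radial-velocity dimension lets EM separate spatially overlapping clusters, here is a minimal sketch of a Gaussian mixture fitted to (x, y, v) triples. The diagonal-covariance Gaussian components, the caller-supplied initial means and the synthetic data are assumptions made for the example. The abstract does not specify the form of the authors' model.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))

def em_mixture(data, means, iters=50, min_var=1e-6):
    """EM for a diagonal-covariance Gaussian mixture; returns weights, means,
    variances and the per-point responsibilities."""
    n, dim, k = len(data), len(data[0]), len(means)
    centre = [sum(p[d] for p in data) / n for d in range(dim)]
    spread = [sum((p[d] - centre[d]) ** 2 for p in data) / n for d in range(dim)]
    variances = [list(spread) for _ in range(k)]  # start from the overall variance
    weights = [1.0 / k] * k
    means = [list(mu) for mu in means]
    resp = []
    for _ in range(iters):
        # E-step: responsibilities via a numerically stable log-sum-exp.
        resp = []
        for p in data:
            logs = [math.log(weights[j]) + log_gauss_diag(p, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            probs = [math.exp(l - top) for l in logs]
            total = sum(probs)
            resp.append([q / total for q in probs])
        # M-step: weighted updates of the mixing weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = [sum(r[j] * p[d] for r, p in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(min_var,
                                sum(r[j] * (p[d] - means[j][d]) ** 2
                                    for r, p in zip(resp, data)) / nj)
                            for d in range(dim)]
    return weights, means, variances, resp

# Synthetic example: two clusters that overlap on the sky but differ in velocity.
rng = random.Random(0)
cluster_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(-10.0, 1.0))
             for _ in range(100)]
cluster_b = [(rng.gauss(0.5, 1.0), rng.gauss(0.5, 1.0), rng.gauss(10.0, 1.0))
             for _ in range(100)]
stars = cluster_a + cluster_b

# Rough initial guesses for the component means (x, y, v).
weights, means, variances, resp = em_mixture(stars, [(0.0, 0.0, -5.0),
                                                     (0.0, 0.0, 5.0)])
```

With positions alone the two synthetic groups would be nearly indistinguishable. The velocity coordinate drives the separation, and each star's responsibilities give its membership probability for each cluster.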
Globular cluster metallicity scale: evidence from stellar models

International Nuclear Information System (INIS)

Demarque, P.; King, C.R.; Diaz, A.

1982-01-01

Theoretical giant branches have been constructed to determine their relative positions for metallicities in the range -2.3 0 )/sub 0,g/ based on these models is presented which yields good agreement over the observed range of metallicities for galactic globular clusters and old disk clusters. The metallicity of 47 Tuc and M71 given by this calibration is about -0.8 dex. Subject headings: clusters, globular: stars: abundances: stars: interiors
Modeling the pinning of Au and Ni clusters on graphite

NARCIS (Netherlands)

Smith, R.; Nock, C.; Kenny, S.D.; Belbruno, J.J.; Di Vece, M.; Paloma, S.; Palmer, R.E.

2006-01-01

The pinning of size-selected AuN and NiN clusters on graphite, for N=7–100, is investigated by means of molecular dynamics simulations and the results are compared to experiment and previous work with Ag clusters. Ab initio calculations of the binding of the metal adatom and dimers on a graphite
A Clustered Extragalactic Foreground Model for the EoR

Science.gov (United States)

Murray, S. G.; Trott, C. M.; Jordan, C. H.

2018-05-01

We review an improved statistical model of extra-galactic point-source foregrounds first introduced in Murray et al. (2017), in the context of the Epoch of Reionization. This model extends the instrumentally-convolved foreground covariance used in inverse-covariance foreground mitigation schemes, by considering the cosmological clustering of the sources. In this short work, we show that over scales of k ~ (0.6, 40) h Mpc⁻¹, ignoring source clustering is a valid approximation. This is in contrast to Murray et al. (2017), who found a possibility of false detection if the clustering was ignored. The dominant cause for this change is the introduction of a Galactic synchrotron component which shadows the clustering of sources.
A diffuse neutron scattering study of clustering in copper-nickel alloys

International Nuclear Information System (INIS)

Vrijen, J.

1977-01-01

The amount of clustering in Cu-Ni alloys in thermal equilibrium at several temperatures between 400 °C and 700 °C and ranging in composition between 20 and 80 atomic percent Ni has been determined by means of diffuse neutron scattering. A rough calculation of the excess elastic energy due to alloying Cu with Ni shows that the contribution of size effects to the configurational energy is asymmetric in the composition, with its maximum located between 60 and 70 atomic percent Ni. This asymmetry is caused by different elastic constants for Cu and Ni and it might explain part of the asymmetry of clustering in Cu-Ni and its temperature dependence. With the help of the measured cluster parameters, the magnetic diffuse neutron scattering cross-sections of several differently clustered compositions in Cu-Ni could be interpreted, both well inside the ferromagnetic phase and in the transition region between ferromagnetism and superparamagnetism. Giant moments have been observed. Non-equilibrium distributions and their changes during relaxation towards equilibrium have been investigated by measuring the time evolution of the diffuse scattering. The relaxation of the null matrix (composition without Bragg reflections for neutron scattering) has been measured at five temperatures between 320 °C and 450 °C. The results of these relaxations were compared with a few available kinetic models.
Structure-function relationship between the octopus perimeter cluster mean sensitivity and sector retinal nerve fiber layer thickness measured with the RTVue optical coherence tomography and scanning laser polarimetry.

Science.gov (United States)

Naghizadeh, Farzaneh; Garas, Anita; Vargha, Péter; Holló, Gábor

2014-01-01

To determine the structure-function relationship between each of 16 Octopus perimeter G2 program clusters and the corresponding 16 peripapillary sector retinal nerve fiber layer thickness (RNFLT) values measured with the RTVue-100 Fourier-domain optical coherence tomography (RTVue OCT) and scanning laser polarimetry with variable corneal compensation (GDx-VCC) and enhanced corneal compensation (GDx-ECC). One eye of each of 110 white patients (15 healthy, 20 ocular hypertensive, and 75 glaucoma eyes) was investigated. The Akaike information criterion and the F test were used to identify the best fitting model. A parabolic relationship between logarithmic cluster mean sensitivity and linear sector RNFLT values provided the best fit. For RTVue OCT, significant (P0.05) was found for the control eyes. Mean sensitivity of the Octopus visual field clusters showed a significant parabolic relationship with the corresponding peripapillary RNFLT sectors. The relationship was more general with the RTVue OCT than GDx-VCC or GDx-ECC. The results show that visual field clusters of the Octopus G program can be applied for detailed structure-function research.

Molecular dynamics modelling of EGCG clusters on ceramide bilayers

Energy Technology Data Exchange (ETDEWEB)

Yeo, Jingjie; Cheng, Yuan; Li, Weifeng; Zhang, Yong-Wei [Institute of High Performance Computing, A*STAR, 138632 (Singapore)

2015-12-31

A novel method of atomistic modelling and characterization of both pure ceramide and mixed lipid bilayers is being developed, using only the General Amber ForceField. Lipid bilayers modelled as pure ceramides adopt hexagonal packing after equilibration, and the area per lipid and bilayer thickness are consistent with previously reported theoretical results. Mixed lipid bilayers are modelled as a combination of ceramides, cholesterol, and free fatty acids. This model is shown to be stable after equilibration. Green tea extract, also known as epigallocatechin-3-gallate, is introduced as a spherical cluster on the surface of the mixed lipid bilayer. It is demonstrated that the cluster is able to bind to the bilayers as a cluster without diffusing into the surrounding water.
COCOA code for creating mock observations of star cluster models

Science.gov (United States)

Askar, Abbas; Giersz, Mirek; Pych, Wojciech; Dalessandro, Emanuele

2018-04-01

We introduce and present results from the COCOA (Cluster simulatiOn Comparison with ObservAtions) code that has been developed to create idealized mock photometric observations using results from numerical simulations of star cluster evolution. COCOA is able to present the output of realistic numerical simulations of star clusters carried out using Monte Carlo or N-body codes in a way that is useful for direct comparison with photometric observations. In this paper, we describe the COCOA code and demonstrate its different applications by utilizing globular cluster (GC) models simulated with the MOCCA (MOnte Carlo Cluster simulAtor) code. COCOA is used to synthetically observe these different GC models with optical telescopes, perform point spread function photometry, and subsequently produce observed colour-magnitude diagrams. We also use COCOA to compare the results from synthetic observations of a cluster model that has the same age and metallicity as the Galactic GC NGC 2808 with observations of the same cluster carried out with a 2.2 m optical telescope. We find that COCOA can effectively simulate realistic observations and recover photometric data. COCOA has numerous scientific applications that may be helpful for both theoreticians and observers who work on star clusters. Plans for further improving and developing the code are also discussed in this paper.
Cluster models, factors and characteristics for the competitive advantage of Lithuanian Maritime sector

OpenAIRE

Viederytė, Rasa; Didžiokas, Rimantas

2014-01-01

The paper analyses several cluster models on the basis of competitiveness: the Nine-factor model, the Double diamond model, the Funnel model of cluster determinants, and the Destination Competitiveness and sustainability models, all of which are related to Porter’s Diamond model. It then concentrates on the classical one, adapting M. Porter’s Diamond model methodology to the evaluation of the Lithuanian Maritime sector’s clustering on the basis of competitiveness. Despite the advances in cluster research, this model remains a complex ...
Crouch gait patterns defined using k-means cluster analysis are related to underlying clinical pathology.

Science.gov (United States)

Rozumalski, Adam; Schwartz, Michael H

2009-08-01

In this study a gait classification method was developed and applied to subjects with cerebral palsy who walk with excessive knee flexion at initial contact. Sagittal plane gait data, simplified using the gait features method, are used as input into a k-means cluster analysis to determine homogeneous groups. Several clinical domains were explored to determine if the clusters are related to underlying pathology. These domains included age, joint range-of-motion, strength, selective motor control, and spasticity. Principal component analysis is used to determine one overall score for each of the multi-joint domains (strength, selective motor control, and spasticity). The current study shows that there are five clusters among children with excessive knee flexion at initial contact. These clusters were labeled, in order of increasing gait pathology: (1) mild crouch with mild equinus, (2) moderate crouch, (3) moderate crouch with anterior pelvic tilt, (4) moderate crouch with equinus, and (5) severe crouch. Further analysis showed that age, range-of-motion, strength, selective motor control, and spasticity were significantly different between the clusters (p<0.001). The general tendency was for the clinical domains to worsen as gait pathology increased. This new classification tool can be used to define homogeneous groups of subjects in crouch gait, which can help guide treatment decisions and outcomes assessment.
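As an illustration of the k-means step described in the abstract above, the following is a minimal sketch of Lloyd's algorithm using only the Python standard library. It is not the authors' implementation: the function name `kmeans`, the random choice of initial centers, and the representation of each subject as a tuple of numeric features are assumptions made for this example.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length feature tuples.

    Hypothetical helper for illustration; initial centers are k data
    points drawn at random (an assumption, not taken from the study).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are final
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels
```

Calling `kmeans(features, 5)` on a list of per-subject feature tuples would return five centers and one cluster label per subject; which features to use and how to scale them is left open here.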
A Distributed Agent Implementation of Multiple Species Flocking Model for Document Partitioning Clustering

Energy Technology Data Exchange (ETDEWEB)

Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL

2006-01-01

The Flocking model, first proposed by Craig Reynolds, is one of the first bio-inspired computational collective behavior models that has many popular applications, such as animation. Our early research has resulted in a flock clustering algorithm that can achieve better performance than the K-means or the Ant clustering algorithms for data clustering. This algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for efficient clustering result retrieval and visualization. In this paper, we propose a bio-inspired clustering model, the Multiple Species Flocking clustering model (MSF), and present a distributed multi-agent MSF approach for document clustering.
AUTOMATIC EXTRACTION OF ROCK JOINTS FROM LASER SCANNED DATA BY MOVING LEAST SQUARES METHOD AND FUZZY K-MEANS CLUSTERING

Directory of Open Access Journals (Sweden)

S. Oh

2012-09-01

Full Text Available Recent development of laser scanning devices has increased the capability of representing rock outcrops at very high resolution. An accurate 3D point cloud model with rock joint information can help geologists to estimate the stability of a rock slope on-site or off-site. An automatic plane extraction method was developed by computing normal directions and grouping them by similar direction. Point normals were calculated by the moving least squares (MLS) method, considering every point within a given distance to minimize the error to the fitting plane. Normal directions were classified into a number of dominating clusters by fuzzy K-means clustering. A region growing approach was exploited to discriminate joints in a point cloud. The overall procedure was applied to a point cloud with about 120,000 points, and successfully extracted joints with joint information. The extraction procedure was implemented to minimize the number of input parameters and to construct plane information into the existing point cloud for less redundancy and high usability of the point cloud itself.
Alloy design as an inverse problem of cluster expansion models

DEFF Research Database (Denmark)

Larsen, Peter Mahler; Kalidindi, Arvind R.; Schmidt, Søren

2017-01-01

Central to a lattice model of an alloy system is the description of the energy of a given atomic configuration, which can be conveniently developed through a cluster expansion. Given a specific cluster expansion, the ground state of the lattice model at 0 K can be solved by finding the configurat...... the inverse problem in terms of energetically distinct configurations, using a constraint satisfaction model to identify constructible configurations, and show that a convex hull can be used to identify ground states. To demonstrate the approach, we solve for all ground states for a binary alloy in a 2D...
Electromagnetic properties of 6Li in a cluster model with breathing clusters

International Nuclear Information System (INIS)

Kruppa, A.T.; Beck, R.; Dickmann, F.

1987-01-01

Electromagnetic properties of 6Li are studied using a microscopic (α+δ) cluster model. In addition to the ground state of the clusters, their breathing excited states are included in the wave function in order to take into account the distortion of the clusters. The elastic charge form factor is in good agreement with experiment up to a momentum transfer of 8 fm⁻². The ground state magnetic form factor and the inelastic charge form factor are also well described. The effect of the breathing states of α on the form factors proves to be negligible except at high momentum transfer. The ground-state charge density, rms charge radius, the magnetic dipole moment and a reduced transition strength are also obtained in fair agreement with experiment. (author)
Fuzzy Rules for Ant Based Clustering Algorithm

Directory of Open Access Journals (Sweden)

Amira Hamdi

2016-01-01

Full Text Available This paper provides a new intelligent technique for the semisupervised data clustering problem that combines the Ant System (AS) algorithm with the fuzzy c-means (FCM) clustering algorithm. Our proposed approach, called the F-ASClass algorithm, is a distributed algorithm inspired by foraging behavior observed in ant colonies. The ability of ants to find the shortest path forms the basis of our proposed approach. In the first step, several colonies of cooperating entities, called artificial ants, are used to find shortest paths in a complete graph that we call graph-data. The number of colonies used in F-ASClass is equal to the number of clusters in the dataset. The partition matrix of the dataset found by the artificial ants is then given, in the second step, to the fuzzy c-means technique in order to assign the unclassified objects generated in the first step. The proposed approach is tested on artificial and real datasets, and its performance is compared with those of the K-means, K-medoid, and FCM algorithms. The experimental section shows that F-ASClass performs better according to the error rate classification, accuracy, and separation index.
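The fuzzy c-means step mentioned in the abstract above can be sketched as follows. This is a minimal, standard FCM iteration in standard-library Python, not the F-ASClass algorithm: the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the random initial membership matrix (standing in for the partition matrix that the paper obtains from the artificial ants) are all assumptions of this example.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix).

    Hypothetical helper: memberships start from random rows that sum to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / w_sum
                for d in range(dim)
            ))
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            new_row = [
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for dj in dists
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break  # memberships have stabilised
    return centers, u
```

Each row of the returned matrix holds one object's degrees of membership in the `c` clusters; taking the largest entry per row gives a hard assignment when one is needed.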
Supplier Risk Assessment Based on Best-Worst Method and K-Means Clustering: A Case Study

Directory of Open Access Journals (Sweden)

Merve Er Kara

2018-04-01

Full Text Available Supplier evaluation and selection is one of the most critical strategic decisions for developing a competitive and sustainable organization. Companies have to consider supplier related risks and threats in their purchasing decisions. In today’s competitive and risky business environment, it is very important to work with reliable suppliers. This study proposes a clustering based approach to group suppliers based on their risk profile. Suppliers of a company in the heavy-machinery sector are assessed based on 17 qualitative and quantitative risk types. The weights of the criteria are determined by using the Best-Worst method. Four factors are extracted by applying Factor Analysis to the supplier risk data. Then k-means clustering algorithm is applied to group core suppliers of the company based on the four risk factors. Three clusters are created with different risk exposure levels. The interpretation of the results provides insights for risk management actions and supplier development programs to mitigate supplier risk.
Cluster Analysis of Customer Reviews Extracted from Web Pages

Directory of Open Access Journals (Sweden)

S. Shivashankar

2010-01-01

Full Text Available As e-commerce is gaining popularity day by day, the web has become an excellent source for market researchers to gather customer reviews and opinions. The number of customer reviews that a product receives is growing at a very fast rate (it could be in the hundreds or thousands). Customer reviews posted on websites vary greatly in quality. The potential customer necessarily has to read all the reviews, irrespective of their quality, to decide whether or not to purchase the product. In this paper, we make an attempt to assess a review based on its quality, to help the customer make a proper buying decision. The quality of a customer review is assessed as most significant, more significant, significant or insignificant. A novel and effective web mining technique is proposed for assessing a customer review of a particular product based on feature clustering techniques, namely, the k-means method and the fuzzy c-means method. This is performed in three steps: (1) identify review regions and extract reviews from them, (2) extract and cluster the features of the reviews by a clustering technique and then assign weights to the features belonging to each of the clusters (groups), and (3) assess the review by considering the feature weights and group belongingness. The k-means and fuzzy c-means clustering techniques are implemented and tested on customer reviews extracted from web pages. The performance of these techniques is analyzed.
The global Minmax k-means algorithm.

Science.gov (United States)

Wang, Xiaoyan; Bai, Yanping

2016-01-01

The global k-means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k-means to minimize the sum of the intra-cluster variances. However, the global k-means algorithm sometimes produces singleton clusters, and the initial positions are sometimes bad; after a bad initialization, the k-means algorithm can easily converge to a poor local optimum. In this paper, we first modify the global k-means algorithm to eliminate the singleton clusters, and then apply the MinMax k-means clustering error method to the global k-means algorithm to overcome the effect of bad initialization, yielding the proposed global Minmax k-means algorithm. The proposed clustering method is tested on some popular data sets and compared to the k-means algorithm, the global k-means algorithm and the MinMax k-means algorithm. The experimental results show that our proposed algorithm outperforms the other algorithms mentioned in the paper.
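The incremental search that the abstract above attributes to the baseline global k-means algorithm can be sketched as below. This shows only the plain global k-means idea (add one center at a time, trying every data point as the starting position of the new center); it does not include the singleton elimination or the MinMax error criterion proposed in the paper. The function names `lloyd` and `global_kmeans` are hypothetical.

```python
import math


def lloyd(points, centers, iters=100):
    """Run k-means from the given initial centers; return (centers, error)."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Empty groups keep their previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Clustering error: sum of squared distances to the nearest center.
    error = sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)
    return centers, error


def global_kmeans(points, k):
    """Grow the solution from 1 to k centers, one center at a time."""
    # With a single cluster, the best center is the centroid of the data.
    centers = [tuple(sum(col) / len(points) for col in zip(*points))]
    for _ in range(1, k):
        # Try every data point as the initial position of the new center
        # and keep the run with the lowest clustering error.
        trials = [lloyd(points, centers + [p]) for p in points]
        centers, _ = min(trials, key=lambda result: result[1])
    return centers
```

The search is deterministic but runs k-means once per data point for each added center, so this sketch is only practical for small data sets.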
Emergence of clustering in an acquaintance model without homophily

International Nuclear Information System (INIS)

Bhat, Uttam; Krapivsky, P L; Redner, S

2014-01-01

We introduce an agent-based acquaintance model in which social links are created by processes in which there is no explicit homophily. In spite of the homogeneous nature of the social interactions, highly-clustered social networks can arise. The crucial feature of our model is that of variable transitive interactions. Namely, when an agent introduces two unconnected friends, the rate at which a connection actually occurs between them depends on the number of their mutual acquaintances. As this transitive interaction rate is varied, the social network undergoes a dramatic clustering transition. Close to the transition, the network consists of a collection of well-defined communities. As a function of time, the network can also undergo an incomplete gelation transition, in which the gel, or giant cluster, does not constitute the entire network, even at infinite time. Some of the clustering properties of our model also arise, but in a more gradual manner, in Facebook networks. Finally, we discuss a more realistic variant of our original model in which network realizations can be constructed that quantitatively match Facebook networks. (paper)
Emergence of clustering in an acquaintance model without homophily

Science.gov (United States)

Bhat, Uttam; Krapivsky, P. L.; Redner, S.

2014-11-01

We introduce an agent-based acquaintance model in which social links are created by processes in which there is no explicit homophily. In spite of the homogeneous nature of the social interactions, highly-clustered social networks can arise. The crucial feature of our model is that of variable transitive interactions. Namely, when an agent introduces two unconnected friends, the rate at which a connection actually occurs between them depends on the number of their mutual acquaintances. As this transitive interaction rate is varied, the social network undergoes a dramatic clustering transition. Close to the transition, the network consists of a collection of well-defined communities. As a function of time, the network can also undergo an incomplete gelation transition, in which the gel, or giant cluster, does not constitute the entire network, even at infinite time. Some of the clustering properties of our model also arise, but in a more gradual manner, in Facebook networks. Finally, we discuss a more realistic variant of our original model in which network realizations can be constructed that quantitatively match Facebook networks.
Cluster state generation in one-dimensional Kitaev honeycomb model via shortcut to adiabaticity

Science.gov (United States)

Kyaw, Thi Ha; Kwek, Leong-Chuan

2018-04-01

We propose a means to obtain computationally useful resource states, also known as cluster states, for measurement-based quantum computation, via a transitionless quantum driving algorithm. The idea is to cool the system to its unique ground state and tune some control parameters to arrive at a computationally useful resource state, which is one of the degenerate ground states. Even though there is a set of conserved quantities already present in the model Hamiltonian, which prevents the instantaneous state from going to any other eigenstate subspaces, one cannot quench the control parameters to get the desired state. In that case, the state will not evolve. With the involvement of the shortcut Hamiltonian, we obtain cluster states in a fast-forward manner. We elaborate our proposal in the one-dimensional Kitaev honeycomb model, and show that the auxiliary Hamiltonian needed for the counterdiabatic driving is of M-body interaction.
Estimating overall exposure effects for the clustered and censored outcome using random effect Tobit regression models.

Science.gov (United States)

Wang, Wei; Griswold, Michael E

2016-11-30

The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd.
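To make the quadrature step in the abstract above concrete, the sketch below integrates a function of a normally distributed random intercept with a three-point Gauss-Hermite rule. It is an illustration under stated assumptions, not the authors' estimator: the names `marginal_mean` and `censored_mean` are hypothetical, only left-censoring at zero is considered, and three nodes are used purely to keep the rule hard-coded (real analyses typically use more).

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x**2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def marginal_mean(g, linear_predictor, sigma):
    """Approximate E[g(linear_predictor + b)] for b ~ N(0, sigma**2)."""
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        b = math.sqrt(2.0) * sigma * x  # change of variables b = sqrt(2)*sigma*x
        total += w * g(linear_predictor + b)
    return total / math.sqrt(math.pi)


def censored_mean(mu, s):
    """E[max(0, Y)] for Y ~ N(mu, s**2): mean of an outcome left-censored at 0."""
    z = mu / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return mu * cdf + s * pdf
```

For example, `marginal_mean(lambda v: censored_mean(v, 1.0), 0.5, 0.8)` approximates the population-level mean of a zero-censored outcome whose latent value has linear predictor 0.5, residual standard deviation 1.0 and random-intercept standard deviation 0.8. In this special case a closed form exists, because the latent outcome is marginally normal with variance 0.8² + 1.0², which makes the approximation easy to check.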
Quantitative properties of clustering within modern microscopic nuclear models

International Nuclear Information System (INIS)

Volya, A.; Tchuvil’sky, Yu. M.

2016-01-01

A method for studying cluster spectroscopic properties of nuclear fragmentation, such as spectroscopic amplitudes, cluster form factors, and spectroscopic factors, is developed on the basis of modern precision nuclear models that take into account the mixing of large-scale shell-model configurations. Alpha-cluster channels are considered as an example. A mathematical proof of the need for taking into account the channel-wave-function renormalization generated by exchange terms of the antisymmetrization operator (Fliessbach effect) is given. Examples where this effect is confirmed by a high quality of the description of experimental data are presented. By and large, the method in question extends substantially the possibilities for studying clustering phenomena in nuclei and for improving the quality of their description.
Hyperon-nucleon and hyperon-hyperon interaction in the quark cluster model

International Nuclear Information System (INIS)

Straub, U.

1988-01-01

The nonrelativistic quark cluster model is used for the description of the hyperon-nucleon and hyperon-hyperon interaction. The different mass of the quarks is consistently regarded in the Hamiltonian and in the shape of the spatial wave functions of the quarks. The six-quark wave function is completely antisymmetrized. By means of the resonating-group method the dynamic equations for the determination of the binding and scattering states of the six-quark problem are formulated. The corresponding resonating-group kernels are explicitly given. We calculate the lambda-nucleon and sigma-nucleon interaction. The sigma-nucleon scattering in the isospin (T=3/2) channel can be treated in a one-channel calculation. The sigma-nucleon (T=1/2) interaction and the lambda-nucleon interaction are studied in a coupled two-channel calculation. From a fit of the experimental lambda-nucleon interaction cross section the strength of the sigma-meson exchange is determined. The calculation of the sigma-nucleon scattering then follows without any free parameters. The agreement of the theory with the experiment is good. Subsequently the cluster model with this parameter is applied to the dihyperon, which is a possibly bound state of two up quarks, two down quarks, and two strange quarks. For this we solve a coupled three-channel calculation. The cluster model presented here gives a binding energy of the dihyperon of (20±5) MeV below the lambda-lambda threshold. The mass of the dihyperon is thus predicted as (2211±5) MeV. (orig.) [de
Extension of the Si:C Stressor Thickness by Using Multiple ClusterCarbon Species

International Nuclear Information System (INIS)

Sekar, Karuppanan; Krull, Wade

2011-01-01

ClusterCarbon implantation is now well established as an attractive alternative for producing stress in advanced NMOS devices. ClusterCarbon has the advantage over monomer carbon implant in its self-amorphization feature, eliminating the need for PAI implantation while producing highly substitutional carbon incorporation. To date, the limitation of this approach has been the high energy limit, due to the extraction limit of the available production tools for the preferred carbon species, which has been the C7Hx molecule. It is noted that the C7 species is produced by the breakup of the parent C14H14 molecule in the ion source. It is further noted that the preferred method of producing the Si:C stress layer is a multiple implant sequence with ClusterCarbon implants at various energies and doses designed to produce a carbon profile which is constant in depth. The stressor thickness limit using C7 is known to be about 40 nm, which is less than the stressor thickness used in the conventional SiGe process for PMOS. In this work, it is shown that utilizing the C5 molecule, which is also available from the breakup of C14H14, enables the stressor layer thickness to be extended to at least 60 nm, which is consistent with the conventional SiGe process. It will be shown that one additional C5 implant, performed after a standard C7 multiple implant sequence, can produce the extension of the stressor thickness while maintaining the flat depth profile. A detailed process characterization will be shown for this new process sequence.
High-conductance states in a mean-field cortical network model

CERN Document Server

Lerchner, A; Hertz, J

2004-01-01

Measured responses from visual cortical neurons show that spike times tend to be correlated rather than exactly Poisson distributed. Fano factors vary and are usually greater than 1 due to the tendency of spikes being clustered into bursts. We show that this behavior emerges naturally in a balanced cortical network model with random connectivity and conductance-based synapses. We employ mean field theory with correctly colored noise to describe temporal correlations in the neuronal activity. Our results illuminate the connection between two independent experimental findings: high conductance states of cortical neurons in their natural environment, and variable non-Poissonian spike statistics with Fano factors greater than 1.
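The Fano factor referred to in the abstract above is the variance of the spike count divided by its mean, computed over counting windows; a Poisson process gives a value of 1 and bursty firing gives larger values. A minimal standard-library sketch follows. The function name `fano_factor` and the choice of equal, non-overlapping windows are assumptions of this example, not details taken from the paper.

```python
def fano_factor(spike_times, window, duration):
    """Variance-to-mean ratio of spike counts in non-overlapping windows.

    Hypothetical helper: times and window length share one unit, and only
    whole windows inside [0, duration) are counted.
    """
    n_bins = int(duration // window)
    if n_bins < 1:
        raise ValueError("duration must cover at least one window")
    counts = [0] * n_bins
    for t in spike_times:
        idx = int(t // window)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    mean = sum(counts) / n_bins
    if mean == 0:
        raise ValueError("no spikes fall inside the counting windows")
    variance = sum((c - mean) ** 2 for c in counts) / n_bins
    return variance / mean
```

A perfectly regular train with the same count in every window gives 0, while a train that fires in bursts in some windows and is silent in others gives a value above 1, which is the regime the abstract describes.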

Optimized data fusion for K-means Laplacian clustering

Science.gov (United States)

Yu, Shi; Liu, Xinhai; Tranchevent, Léon-Charles; Glänzel, Wolfgang; Suykens, Johan A. K.; De Moor, Bart; Moreau, Yves

2011-01-01

Motivation: We propose a novel algorithm to combine multiple kernels and Laplacians for clustering analysis. The new algorithm is formulated on a Rayleigh quotient objective function and is solved as a bi-level alternating minimization procedure. Using the proposed algorithm, the coefficients of kernels and Laplacians can be optimized automatically. Results: Three variants of the algorithm are proposed. The performance is systematically validated on two real-life data fusion applications. The proposed Optimized Kernel Laplacian Clustering (OKLC) algorithms perform significantly better than other methods. Moreover, the coefficients of kernels and Laplacians optimized by OKLC show some correlation with the rank of performance of individual data source. Though in our evaluation the K values are predefined, in practical studies, the optimal cluster number can be consistently estimated from the eigenspectrum of the combined kernel Laplacian matrix. Availability: The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/oklc.html. Contact: shiyu@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20980271
A density functional theory study on structures, stabilities, and electronic and magnetic properties of Au{sub n}C (n = 1–9) clusters

Energy Technology Data Exchange (ETDEWEB)

Hou, Xiao-Fei; Yan, Li-Li; Huang, Teng; Hong, Yu; Miao, Shou-Kui [Laboratory of Atmospheric Physico-Chemistry, Anhui Institute of Optics & Fine Mechanics, Chinese Academy of Sciences, Hefei, Anhui 230031 (China); Peng, Xiu-Qiu [School of Environmental Science & Optoelectronic Technology, University of Science and Technology of China, Hefei, Anhui 230026 (China); Liu, Yi-Rong, E-mail: liuyirong@aiofm.ac.cn [Laboratory of Atmospheric Physico-Chemistry, Anhui Institute of Optics & Fine Mechanics, Chinese Academy of Sciences, Hefei, Anhui 230031 (China); Huang, Wei, E-mail: huangwei6@ustc.edu.cn [Laboratory of Atmospheric Physico-Chemistry, Anhui Institute of Optics & Fine Mechanics, Chinese Academy of Sciences, Hefei, Anhui 230031 (China); School of Environmental Science & Optoelectronic Technology, University of Science and Technology of China, Hefei, Anhui 230026 (China)

2016-06-15

The equilibrium geometric structures, relative stabilities, electronic stabilities, and electronic and magnetic properties of the Au{sub n}C and Au{sub n+1} (n = 1–9) clusters are systematically investigated using density functional theory (DFT) with hyper-generalized gradient approximation (GGA). The optimized geometries show that one Au atom added to the Au{sub n−1}C cluster is the dominant growth pattern for the Au{sub n}C clusters. In contrast to the pure gold clusters, the Au{sub n}C clusters are most stable in a quasi-planar or three-dimensional (3D) structure because the C dopant induces the local non-planarity, with exceptions of the Au{sub 6,8}C clusters, which have 2D structures. The analysis of the relative and electronic stabilities reveals that the Au{sub 4}C and Au{sub 6} clusters are the most stable in the series of studied clusters, respectively. In addition, a natural bond orbital (NBO) analysis shows that the charges in the Au{sub n}C clusters transfer from the Au{sub n} host to the C atom. Moreover, the Au and C atoms interact with each other mostly via covalent bond rather than ionic bond, which can be confirmed through the average ionic character of the Au–C bond. Meanwhile, the charges mainly transfer between 2s and 2p orbitals within the C atom, and among 5d, 6s, and 6p orbitals within the Au atom for the Au{sub n}C clusters. As for the magnetic properties of the Au{sub n}C clusters, the total magnetic moments are 1 μ{sub B} for n = odd clusters, with the total magnetic moments mainly locating on the C atoms for Au{sub 1,3,9}C and on the Au{sub n} host for Au{sub 5,7}C clusters. However, the total magnetic moments of the Au{sub n}C clusters are zero for n = even clusters. Simultaneously, the magnetic moments mainly locate on the 2p orbital within the C atom and on the 5d, 6s orbitals within the Au atom.
Comprehensive cluster-theory analysis of the magnetic structures and excitations in CoCl2·2H2O

DEFF Research Database (Denmark)

Jensen, Jens; Larsen, Jacob; Hansen, Ursula B.

2018-01-01

The magnetic properties of CoCl2·2H2O are analyzed in the mean-field/random-phase approximation using a basis of clusters with four spins along the c-axis chains of Co ions. The model gives a unifying account of the bulk properties, the spin waves, and the higher-order cluster-spin excitations...... at a transverse field of 160 kOe and is found to agree closely with their observations....
Fuzzy C-means classification for corrosion evolution of steel images

Science.gov (United States)

Trujillo, Maite; Sadki, Mustapha

2004-05-01

An unavoidable problem of metal structures is their exposure to rust degradation during their operational life. Thus, the surfaces need to be assessed in order to avoid potential catastrophes. There is considerable interest in the use of patch repair strategies which minimize the project costs. However, to operate such strategies with confidence in the long useful life of the repair, it is essential that the condition of the existing coatings and the steel substrate can be accurately quantified and classified. This paper describes the application of fuzzy set theory for steel surface classification according to the steel rust time. We propose a semi-automatic technique to obtain image clustering using the Fuzzy C-means (FCM) algorithm and we analyze two kinds of data to study the classification performance. Firstly, we investigate the use of raw image pixels without any pre-processing methods and neighborhood pixels. Secondly, we apply Gaussian noise to the images with different standard deviations to study the tolerance of the FCM method to Gaussian noise. The noisy images simulate the possible perturbations of the images due to the weather or rust deposits on the steel surfaces during typical on-site acquisition procedures.
On tidal radius determination for a globular cluster

International Nuclear Information System (INIS)

Ninkovic, S.

1985-01-01

A tidal radius determination for a globular cluster based on its density minimum, which is caused by the galactic tidal forces and derivable from a model of the Galaxy, is proposed. Results obtained on the basis of the Schmidt model for two clusters are in a satisfactory agreement with those obtained earlier by means of other methods. A mass determination for the clusters through the tidal radius, when the latter one is identified with the cluster perigalactic distance, yields unusually large mass values. Probably, the tidal radius should be identified with the instantaneous galactocentric distance. Use of models more recent than the Schmidt one indicates that a globular cluster may contain a significant portion of an invisible interstellar matter. (author)
ClusterTAD: an unsupervised machine learning approach to detecting topologically associated domains of chromosomes from Hi-C data.

Science.gov (United States)

Oluwadare, Oluwatosin; Cheng, Jianlin

2017-11-14

With the development of chromosomal conformation capturing techniques, particularly the Hi-C technique, the study of the spatial conformation of a genome is becoming an important topic in bioinformatics and computational biology. The Hi-C technique can generate genome-wide chromosomal interaction (contact) data, which can be used to investigate the higher-level organization of chromosomes, such as Topologically Associated Domains (TAD), i.e., locally packed chromosome regions bound together by intra-chromosomal contacts. The identification of the TADs for a genome is useful for studying gene regulation, genomic interaction, and genome function. Here, we formulate the TAD identification problem as an unsupervised machine learning (clustering) problem, and develop a new TAD identification method called ClusterTAD. We introduce a novel method to represent chromosomal contacts as features to be used by the clustering algorithm. Our results show that ClusterTAD can accurately predict the TADs on simulated Hi-C data. Our method is also largely complementary and consistent with existing methods on the real Hi-C datasets of two mouse cells. The validation with the chromatin immunoprecipitation sequencing (ChIP-Seq) data shows that the domain boundaries identified by ClusterTAD have a high enrichment of CTCF binding sites, promoter-related marks, and enhancer-related histone modifications. As ClusterTAD is based on a proven clustering approach, it opens a new avenue to apply a large array of clustering methods developed in the machine learning field to the TAD identification problem. The source code, the results, and the TADs generated for the simulated and real Hi-C datasets are available at https://github.com/BDM-Lab/ClusterTAD.
White blood cell segmentation by color-space-based k-means clustering.

Science.gov (United States)

Zhang, Congcong; Xiao, Xiaoyan; Li, Xiaomei; Chen, Ying-Jie; Zhen, Wu; Chang, Jun; Zheng, Chengyun; Liu, Zhi

2014-09-01

White blood cell (WBC) segmentation, which is important for cytometry, is a challenging issue because of the morphological diversity of WBCs and the complex and uncertain background of blood smear images. This paper proposes a novel method for the nucleus and cytoplasm segmentation of WBCs for cytometry. A color adjustment step was also introduced before segmentation. Color space decomposition and k-means clustering were combined for segmentation. A database of 300 microscopic blood smear images was used to evaluate the performance of our method. The proposed segmentation method achieves 95.7% and 91.3% overall accuracy for nucleus segmentation and cytoplasm segmentation, respectively. Experimental results demonstrate that the proposed method can segment WBCs effectively with high accuracy.
Cardiometabolic risk clustering in spinal cord injury: results of exploratory factor analysis.

Science.gov (United States)

Libin, Alexander; Tinsley, Emily A; Nash, Mark S; Mendez, Armando J; Burns, Patricia; Elrod, Matt; Hamm, Larry F; Groah, Suzanne L

2013-01-01

Evidence suggests an elevated prevalence of cardiometabolic risks among persons with spinal cord injury (SCI); however, the unique clustering of risk factors in this population has not been fully explored. The purpose of this study was to describe unique clustering of cardiometabolic risk factors differentiated by level of injury. One hundred twenty-one subjects (mean 37 ± 12 years; range, 18-73) with chronic C5 to T12 motor complete SCI were studied. Assessments included medical histories, anthropometrics and blood pressure, and fasting serum lipids, glucose, insulin, and hemoglobin A1c (HbA1c). The most common cardiometabolic risk factors were overweight/obesity, high levels of low-density lipoprotein (LDL-C), and low levels of high-density lipoprotein (HDL-C). Risk clustering was found in 76.9% of the population. Exploratory principal component factor analysis using varimax rotation revealed a 3-factor model in persons with paraplegia (65.4% variance) and a 4-factor solution in persons with tetraplegia (73.3% variance). The differences between groups were emphasized by the varied composition of the extracted factors: Lipid Profile A (total cholesterol [TC] and LDL-C), Body Mass-Hypertension Profile (body mass index [BMI], systolic blood pressure [SBP], and fasting insulin [FI]); Glycemic Profile (fasting glucose and HbA1c), and Lipid Profile B (TG and HDL-C). BMI and SBP formed a separate factor only in persons with tetraplegia. Although the majority of the population with SCI has risk clustering, the composition of the risk clusters may be dependent on level of injury, based on a factor analysis group comparison. This is clinically plausible and relevant as tetraplegics tend to be hypo- to normotensive and more sedentary, resulting in lower HDL-C and a greater propensity toward impaired carbohydrate metabolism.
Application of Data Mining to the Broiler Chicken Meat Population in Indonesia by Province Using K-Means Clustering

Directory of Open Access Journals (Sweden)

Mhd Gading Sadewo

2017-09-01

Full Text Available Chicken is a familiar food for the people of Indonesia and is very easy to find in everyday life. However, chicken meat consumption in Indonesia is still relatively low compared with neighboring countries. This study discusses the application of data mining to the broiler chicken meat population in Indonesia by province using K-Means clustering. The data were collected from documents on the chicken meat population produced by the national statistics agency (Badan Pusat Statistik). The data cover the years 2009-2016 and comprise 34 provinces. The variable used is (1) the population count for 2009-2016. The data are processed by clustering into 3 clusters: high, medium and low population level. The centroid for the high population cluster is 4711403141, the centroid for the medium population cluster is 304240647, and the centroid for the low population cluster is 554200. The resulting assessment, based on the chicken meat population index, places 1 province in the high population level (Jawa Barat), 6 provinces in the medium population level (Sumatera Utara, Jawa Tengah, Jawa Timur, Banten, Kalimantan Selatan and Kalimantan Timur), and the remaining 27 provinces in the low population level. This can serve as input to the government on which provinces deserve more attention with respect to the chicken meat population, based on the clustering performed.
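The grouping of a single numeric variable into low, medium and high clusters can be sketched as follows. This is only an illustration of one-dimensional k-means (Lloyd's algorithm): the function `kmeans_1d`, the quantile-based initialization and the eight population values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def kmeans_1d(values, k=3, n_iter=100):
    """Lloyd's algorithm on a 1-D array; returns (centroids, labels)."""
    x = np.asarray(values, dtype=float)
    # Deterministic start (an assumption of this sketch): k evenly spaced quantiles.
    centroids = np.quantile(x, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        # Assign each value to its nearest centroid.
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        # Recompute each centroid as the mean of its members (keep it if empty).
        new = np.array([x[labels == j].mean() if np.any(labels == j) else centroids[j]
                        for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Hypothetical population counts (arbitrary units) for eight regions.
population = [5, 8, 12, 7, 300, 320, 280, 4700]
centroids, labels = kmeans_1d(population, k=3)
# Label 0 = low, 1 = medium, 2 = high, because the initial centroids are ordered.
```

With strongly skewed data such as this, a single extreme value tends to form its own cluster, which is consistent with the abstract reporting only one province in the high group.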
Comparison of Intra-cluster and M87 Halo Orphan Globular Clusters in the Virgo Cluster

Science.gov (United States)

Louie, Tiffany Kaye; Tuan, Jin Zong; Martellini, Adhara; Guhathakurta, Puragra; Toloba, Elisa; Peng, Eric; Longobardi, Alessia; Lim, Sungsoon

2018-01-01

We present a study of "orphan" globular clusters (GCs), i.e. GCs with no identifiable nearby host galaxy, discovered in NGVS, a 104 deg² CFHT/MegaCam imaging survey. At the distance of the Virgo cluster, GCs are bright enough to make good spectroscopic targets and many are barely resolved in good ground-based seeing. Our orphan GC sample is derived from a subset of NGVS-selected GC candidates that were followed up with Keck/DEIMOS spectroscopy. While our primary spectroscopic targets were candidate GC satellites of Virgo dwarf elliptical and ultra-diffuse galaxies, many objects turned out to be non-satellites based on a radial velocity mismatch with the Virgo galaxy they are projected close to. Using a combination of spectral characteristics (e.g., absorption vs. emission), Gaussian mixture modeling of radial velocity and positions, and extreme deconvolution analysis of ugrizk photometry and image morphology, these non-satellites were classified into: (1) intra-cluster GCs (ICGCs) in the Virgo cluster, (2) GCs in the outer halo of M87, (3) foreground Milky Way stars, and (4) background galaxies. The statistical distinction between ICGCs and M87 halo GCs is based on velocity distributions (mean of 1100 vs. 1300 km/s and dispersions of 700 vs. 400 km/s, respectively) and radial distribution (diffuse vs. centrally concentrated, respectively). We used coaddition to increase the spectral SNR for the two classes of orphan GCs and measured the equivalent widths (EWs) of the Mg b and H-beta absorption lines. These EWs were compared to single stellar population models to obtain mean age and metallicity estimates. The ICGCs and M87 halo GCs have mean metallicities of –0.6 ± 0.3 and –0.4 ± 0.3 dex, respectively, and mean ages of ≳5 and ≳10 Gyr, respectively. This suggests the M87 halo GCs formed in relatively high-mass galaxies that avoided being tidally disrupted by M87 until they were close to the cluster center, while ICGCs formed in relatively low-mass galaxies that were
Bagged K-means clustering of metabolome data

NARCIS (Netherlands)

Hageman, J. A.; van den Berg, R. A.; Westerhuis, J. A.; Hoefsloot, H. C. J.; Smilde, A. K.

2006-01-01

Clustering of metabolomics data can be hampered by noise originating from biological variation, physical sampling error and analytical error. Using data analysis methods which are not specifically suited for dealing with noisy data will yield suboptimal solutions. Bootstrap aggregating (bagging) is a
Application of Fuzzy C-Means to Determine the Single Tuition Fee (Uang Kuliah Tunggal) Amount for New Students

Directory of Open Access Journals (Sweden)

Ariyady Kurniawan Muchsin

2015-12-01

Full Text Available In accordance with the mandate of Article 31 of the 1945 Constitution concerning education, the authorities have issued various policies to make education cheaper and affordable to all. One of these is the UKT (Uang Kuliah Tunggal, Single Tuition Fee) system, under which a portion of the Single Tuition Cost (BKT) is passed on to each student according to their economic capacity. At Udayana University the UKT grouping is still done manually, so prospective new students are not yet treated equitably with respect to their economic capacity. A mechanism for submitting data and determining UKT online is therefore needed to improve efficiency and effectiveness. The proposed solution is a classification technique using Fuzzy C-Means (FCM) and the Xie-Beni index to determine the optimum clusters when assigning the UKT category, so as to meet the requirement of fairness for prospective new students.
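For readers unfamiliar with the Xie-Beni index, the sketch below computes it for a given fuzzy partition: the membership-weighted within-cluster scatter divided by the number of samples times the smallest squared distance between two cluster centres, so lower values indicate compact, well-separated clusters. The function name `xie_beni`, the generalized exponent `m` and the four-point toy data are assumptions for illustration, not material from the paper.

```python
import numpy as np

def xie_beni(X, centers, U, m=2.0):
    """Xie-Beni validity index; lower values indicate a better fuzzy partition."""
    # Squared distance from every sample to every centre, shape (n, c).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    # Smallest squared distance between two distinct centres.
    sep = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)
    return compactness / (X.shape[0] * sep.min())

# Toy data: two obvious groups on a line.
X = np.array([[0.0], [1.0], [9.0], [10.0]])
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

xb_good = xie_beni(X, np.array([[0.5], [9.5]]), U)   # centres on the groups
xb_bad = xie_beni(X, np.array([[4.5], [5.5]]), U)    # centres crowded in the middle
```

In practice the index would be evaluated on FCM results for several candidate cluster counts, and the count with the lowest value kept.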
Impact of Mean Cell Hemoglobin on Hb A1c-Defined Glycemia Status.

Science.gov (United States)

Rodriguez-Segade, Santiago; Garcia, Javier Rodriguez; García-López, José M; Gude, Francisco; Casanueva, Felipe F; Rs-Alonso, Santiago; Camiña, Félix

2016-12-01

Several hematological alterations are associated with altered hemoglobin A1c (Hb A1c). However, there have been no reports of their influence on the rates of exceeding standard Hb A1c thresholds by patients for whom Hb A1c determination is requested in clinical practice. The initial data set included the first profiles (complete blood counts, Hb A1c, fasting glucose, and renal and hepatic parameters) of all adult patients for whom such a profile was requested between 2008 and 2013 inclusive. After appropriate exclusions, 21844 patients remained in the study. Linear and logistic regression models were adjusted for demographic, hematological, and biochemical variables excluded from the predictors. Mean corpuscular hemoglobin (MCH) and mean corpuscular volume (MCV) correlated negatively with Hb A1c. Fasting glucose, MCH, and age emerged as predictors of Hb A1c in a stepwise regression that discarded sex, hemoglobin, MCV, mean corpuscular hemoglobin concentration (MCHC), serum creatinine, and liver disease. Mean Hb A1c in MCH interdecile intervals fell from 6.8% (51 mmol/mol) in the lowest (≤27.5 pg) to 6.0% (43 mmol/mol) in the highest (>32.5 pg), with similar results for MCV. After adjustment for fasting glucose and other correlates of Hb A1c, a 1 pg increase in MCH reduced the odds of Hb A1c-defined dysglycemia, diabetes and poor glycemia control by 10%-14%. For at least 25% of patients, low or high MCH or MCV levels are associated with increased risk of an erroneous Hb A1c-based identification of glycemia status. Although causality has not been demonstrated, these parameters should be taken into account in interpreting Hb A1c levels in clinical practice. © 2016 American Association for Clinical Chemistry.
Synthetic properties of models of globular clusters

Energy Technology Data Exchange (ETDEWEB)

Angeletti, L; Dolcetta, R; Giannone, P. (Rome Univ. (Italy). Osservatorio Astronomico)

1980-05-01

Synthetic and projected properties of models of globular clusters have been computed on the basis of stellar evolution and time changes of the dynamical cluster structure. Clusters with five and eight stellar groups (each group consisting of stars with the same mass) were studied. Mass loss from evolved stars was taken into account. Observational features were obtained at ages of 10-19 × 10^9 yr. The basic importance of the horizontal- and asymptotic-branch stars was pointed out. A comparison of the results with observed data of M3 is discussed with the purpose of obtaining general indications rather than a specific fit.
Synthetic properties of models of globular clusters

International Nuclear Information System (INIS)

Angeletti, L.; Dolcetta, R.; Giannone, P.

1980-01-01

Synthetic and projected properties of models of globular clusters have been computed on the basis of stellar evolution and time changes of the dynamical cluster structure. Clusters with five and eight stellar groups (each group consisting of stars with the same mass) were studied. Mass loss from evolved stars was taken into account. Observational features were obtained at ages of 10-19 × 10^9 yr. The basic importance of the horizontal- and asymptotic-branch stars was pointed out. A comparison of the results with observed data of M3 is discussed with the purpose of obtaining general indications rather than a specific fit. (orig.)
Towards Accurate Modelling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model

Science.gov (United States)

Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.

2018-04-01

Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter halos. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the "accurate" regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard ΛCDM + halo model against the clustering of SDSS DR7 galaxies. Specifically, we use the projected correlation function, group multiplicity function and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir halos) matches the clustering of low luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the "standard" halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.
A COMPARISON OF TWO FUZZY CLUSTERING TECHNIQUES

Directory of Open Access Journals (Sweden)

Samarjit Das

2013-10-01

Full Text Available In fuzzy clustering, unlike hard clustering, a single object may, depending on its membership values, belong exactly to one cluster or partially to more than one cluster. Among the many fuzzy clustering techniques, Bezdek's Fuzzy C-Means and the Gustafson-Kessel technique are well known; they use Euclidean distance and Mahalanobis distance, respectively, as the measure of similarity. We applied these two fuzzy clustering techniques to a dataset of individual differences consisting of fifty feature vectors of dimension three (three features). Based on some validity measures, we examined the performance of the two techniques from three different aspects: first, by initializing the membership values of the feature vectors using the values of the three features separately, one at a time; second, by changing the number of predefined clusters; and third, by changing the size of the dataset.
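The difference between the two similarity measures named in this abstract can be seen in a short sketch. The helper names, the diagonal covariance matrix and the two test points below are illustrative assumptions. The Gustafson-Kessel algorithm itself estimates a fuzzy covariance per cluster and normalizes it, which is not reproduced here.

```python
import numpy as np

def euclidean_sq(x, center):
    """Squared Euclidean distance, the measure used by Fuzzy C-Means."""
    diff = x - center
    return float(diff @ diff)

def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance with respect to a cluster covariance matrix."""
    diff = x - center
    return float(diff @ np.linalg.solve(cov, diff))

# Hypothetical elongated cluster: variance 9 along x, variance 1 along y.
center = np.zeros(2)
cov = np.array([[9.0, 0.0], [0.0, 1.0]])
a = np.array([3.0, 0.0])   # lies along the long axis of the cluster
b = np.array([0.0, 3.0])   # lies along the short axis of the cluster

eu_a, eu_b = euclidean_sq(a, center), euclidean_sq(b, center)
ma_a, ma_b = mahalanobis_sq(a, center, cov), mahalanobis_sq(b, center, cov)
```

Both points are equally far from the centre in the Euclidean sense, but the Mahalanobis distance treats the point along the long axis as much closer, which is why a covariance-aware measure can follow elongated clusters that a spherical one splits.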
APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

DEFF Research Database (Denmark)

Zong, Yu; Xu, Guandong; Jin, Pin

2011-01-01

algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone...... as the initial tag clustering result and then assign the rest tags into the corresponding clusters based on the similarity. Experimental results on three real world datasets namely MedWorm, MovieLens and Dmoz demonstrate the effectiveness and the superiority of the proposed method against the traditional...... Agglomerative Clustering on tagging data, which possess the inherent drawbacks, such as the sensitivity of initialization. In this paper, we instead make use of the approximate backbone of tag clustering results to find out better tag clusters. In particular, we propose an APProximate backbonE-based Clustering...
Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks

Directory of Open Access Journals (Sweden)

Gulnaz Ahmed

2017-02-01

Full Text Available A longer network lifetime for Wireless Sensor Networks (WSNs) is a goal that is directly related to energy consumption. This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. The hierarchical clustering architecture is the best choice for this kind of issue. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs) selection for WSNs. In our proposed model, we introduce a simple strategy for selecting the optimal number of cluster heads to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulations to analyze the proposed model using five quality measures, namely: the lifetime of the network, the stable and unstable regions in the lifetime of the network, the throughput of the network, the number of cluster heads in the network, and the transmission time of the network. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED) clustering, Artificial Bee Colony (ABC), Zone Based Routing (ZBR), and Centralized Energy Efficient Clustering (CEEC) using the above-discussed quality metrics and find that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps) greater than that of SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and network throughput.
COCOA Code for Creating Mock Observations of Star Cluster Models

OpenAIRE

Askar, Abbas; Giersz, Mirek; Pych, Wojciech; Dalessandro, Emanuele

2017-01-01

We introduce and present results from the COCOA (Cluster simulatiOn Comparison with ObservAtions) code that has been developed to create idealized mock photometric observations using results from numerical simulations of star cluster evolution. COCOA is able to present the output of realistic numerical simulations of star clusters carried out using Monte Carlo or N-body codes in a way that is useful for direct comparison with photometric observations. In this paper, we describe the C...
