Everitt, Brian S; Leese, Morven; Stahl, Daniel
2011-01-01
Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics.This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data.Real life examples are used throughout to demons
Cluster-based analysis of multi-model climate ensembles
Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.
2018-06-01
Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) has yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ˜ 20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ˜ 62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and
International Nuclear Information System (INIS)
Romli
1997-01-01
Cluster analysis is the name of group of multivariate techniques whose principal purpose is to distinguish similar entities from the characteristics they process.To study this analysis, there are several algorithms that can be used. Therefore, this topic focuses to discuss the algorithms, such as, similarity measures, and hierarchical clustering which includes single linkage, complete linkage and average linkage method. also, non-hierarchical clustering method, which is popular name K -mean method ' will be discussed. Finally, this paper will be described the advantages and disadvantages of every methods
Analysis of the dynamical cluster approximation for the Hubbard model
Aryanpour, K.; Hettler, M. H.; Jarrell, M.
2002-01-01
We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...
Mucha, Hans-Joachim; Sofyan, Hizir
2000-01-01
As an explorative technique, duster analysis provides a description or a reduction in the dimension of the data. It classifies a set of observations into two or more mutually exclusive unknown groups based on combinations of many variables. Its aim is to construct groups in such a way that the profiles of objects in the same groups are relatively homogenous whereas the profiles of objects in different groups are relatively heterogeneous. Clustering is distinct from classification techniques, ...
Topic modeling for cluster analysis of large biological and medical datasets.
Zhao, Weizhong; Zou, Wen; Chen, James J
2014-01-01
The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting
A comparison of heuristic and model-based clustering methods for dietary pattern analysis.
Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia
2016-02-01
Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.
Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.
Li, Zhaonan; Xu, Xinyi; Shen, Junshan
2017-11-10
In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.
Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis
Directory of Open Access Journals (Sweden)
Chao Zhang
2017-09-01
Full Text Available A wireless-powered sensor network (WPSN consisting of one hybrid access point (HAP, a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Though the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.
Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.
Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan
2017-09-27
A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Though the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.
3D Building Models Segmentation Based on K-Means++ Cluster Analysis
Zhang, C.; Mao, B.
2016-10-01
3D mesh model segmentation is drawing increasing attentions from digital geometry processing field in recent years. The original 3D mesh model need to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compressing, texture mapping, model retrieval and etc. Therefore, segmentation is a key problem for 3D mesh model segmentation. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on randomized initial seed points (i.e., centroid) and different randomized centroid can get quite different results. Therefore, we improved the existing method and used K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieve good and meaningful results.
3D BUILDING MODELS SEGMENTATION BASED ON K-MEANS++ CLUSTER ANALYSIS
Directory of Open Access Journals (Sweden)
C. Zhang
2016-10-01
Full Text Available 3D mesh model segmentation is drawing increasing attentions from digital geometry processing field in recent years. The original 3D mesh model need to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compressing, texture mapping, model retrieval and etc. Therefore, segmentation is a key problem for 3D mesh model segmentation. In this paper, we propose a method to segment Collada (a type of mesh model 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on randomized initial seed points (i.e., centroid and different randomized centroid can get quite different results. Therefore, we improved the existing method and used K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieve good and meaningful results.
Li, Guohui; Zhang, Songling; Yang, Hong
2017-01-01
Aiming at the irregularity of nonlinear signal and its predicting difficulty, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain the finite number of intrinsic mode functions (IMFs) and residuals. Secondly, the fuzzy c-means is used to cluster the decomposed components, and then the deep belief network (DBN) is used to predict it. Finally, the reconstructed ...
Cluster analysis for applications
Anderberg, Michael R
1973-01-01
Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis.Comprised of 10 chapters, this book begins with an introduction to the subject o
The dynamics of cyclone clustering in re-analysis and a high-resolution climate model
Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len
2017-04-01
Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.
A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.
Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia
2017-10-15
In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.
Marketing research cluster analysis
Directory of Open Access Journals (Sweden)
Marić Nebojša
2002-01-01
Full Text Available One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.
Marketing research cluster analysis
Marić Nebojša
2002-01-01
One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.
Lawson, Andrew B
2002-01-01
Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...
Directory of Open Access Journals (Sweden)
Lingli Jiang
2011-01-01
Full Text Available This paper proposes a new approach combining autoregressive (AR model and fuzzy cluster analysis for bearing fault diagnosis and degradation assessment. AR model is an effective approach to extract the fault feature, and is generally applied to stationary signals. However, the fault vibration signals of a roller bearing are non-stationary and non-Gaussian. Aiming at this problem, the set of parameters of the AR model is estimated based on higher-order cumulants. Consequently, the AR parameters are taken as the feature vectors, and fuzzy cluster analysis is applied to perform classification and pattern recognition. Experiments analysis results show that the proposed method can be used to identify various types and severities of fault bearings. This study is significant for non-stationary and non-Gaussian signal analysis, fault diagnosis and degradation assessment.
Nuclear clustering - a cluster core model study
International Nuclear Information System (INIS)
Paul Selvi, G.; Nandhini, N.; Balasubramaniam, M.
2015-01-01
Nuclear clustering, similar to other clustering phenomenon in nature is a much warranted study, since it would help us in understanding the nature of binding of the nucleons inside the nucleus, closed shell behaviour when the system is highly deformed, dynamics and structure at extremes. Several models account for the clustering phenomenon of nuclei. We present in this work, a cluster core model study of nuclear clustering in light mass nuclei
A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.
Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G
2018-01-01
Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improve on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Supercomputer and cluster performance modeling and analysis efforts:2004-2006.
Energy Technology Data Exchange (ETDEWEB)
Sturtevant, Judith E.; Ganti, Anand; Meyer, Harold (Hal) Edward; Stevenson, Joel O.; Benner, Robert E., Jr. (.,; .); Goudy, Susan Phelps; Doerfler, Douglas W.; Domino, Stefan Paul; Taylor, Mark A.; Malins, Robert Joseph; Scott, Ryan T.; Barnette, Daniel Wayne; Rajan, Mahesh; Ang, James Alfred; Black, Amalia Rebecca; Laub, Thomas William; Vaughan, Courtenay Thomas; Franke, Brian Claude
2007-02-01
This report describes efforts by the Performance Modeling and Analysis Team to investigate performance characteristics of Sandia's engineering and scientific applications on the ASC capability and advanced architecture supercomputers, and Sandia's capacity Linux clusters. Efforts to model various aspects of these computers are also discussed. The goals of these efforts are to quantify and compare Sandia's supercomputer and cluster performance characteristics; to reveal strengths and weaknesses in such systems; and to predict performance characteristics of, and provide guidelines for, future acquisitions and follow-on systems. Described herein are the results obtained from running benchmarks and applications to extract performance characteristics and comparisons, as well as modeling efforts, obtained during the time period 2004-2006. The format of the report, with hypertext links to numerous additional documents, purposefully minimizes the document size needed to disseminate the extensive results from our research.
Modeling, Stability Analysis and Active Stabilization of Multiple DC-Microgrids Clusters
DEFF Research Database (Denmark)
Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos
2014-01-01
), and more especially during interconnection with other MGs, creating dc MG clusters. This paper develops a small signal model for dc MGs from the control point of view, in order to study stability analysis and investigate effects of CPLs and line impedances between the MGs on stability of these systems....... This model can be also used to synthesis and study dynamics of control loops in dc MGs and also dc MG clusters. An active stabilization method is proposed to be implemented as a dc active power filter (APF) inside the MGs in order to not only increase damping of dc MGs at the presence of CPLs but also...... to improve their stability while connecting to the other MGs. Simulation results are provided to evaluate the developed models and demonstrate the effectiveness of proposed active stabilization technique....
A Novel Clustering Model Based on Set Pair Analysis for the Energy Consumption Forecast in China
Directory of Open Access Journals (Sweden)
Mingwu Wang
2014-01-01
Full Text Available The energy consumption forecast is important for the decision-making of national economic and energy policies. But it is a complex and uncertainty system problem affected by the outer environment and various uncertainty factors. Herein, a novel clustering model based on set pair analysis (SPA was introduced to analyze and predict energy consumption. The annual dynamic relative indicator (DRI of historical energy consumption was adopted to conduct a cluster analysis with Fisher’s optimal partition method. Combined with indicator weights, group centroids of DRIs for influence factors were transferred into aggregating connection numbers in order to interpret uncertainty by identity-discrepancy-contrary (IDC analysis. Moreover, a forecasting model based on similarity to group centroid was discussed to forecast energy consumption of a certain year on the basis of measured values of influence factors. Finally, a case study predicting China’s future energy consumption as well as comparison with the grey method was conducted to confirm the reliability and validity of the model. The results indicate that the method presented here is more feasible and easier to use and can interpret certainty and uncertainty of development speed of energy consumption and influence factors as a whole.
Directory of Open Access Journals (Sweden)
Ma Jinhui
2013-01-01
Full Text Available Abstracts Background The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE and cluster-specific (i.e. random-effects logistic regression (RELR models for analyzing data from cluster randomized trials (CRTs with missing binary responses. Methods In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE, and coverage probability. Results GEE performs well on all four measures — provided the downward bias of the standard error (when the number of clusters per arm is small is adjusted appropriately — under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with variance inflation factor (VIF 50. RELR performs well only when a small amount of data was missing, and complete case analysis was applied. Conclusion GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.
Firdausiah Mansur, Andi Besse; Yusof, Norazah
2013-01-01
Clustering on Social Learning Network still not explored widely, especially when the network focuses on e-learning system. Any conventional methods are not really suitable for the e-learning data. SNA requires content analysis, which involves human intervention and need to be carried out manually. Some of the previous clustering techniques need…
Effect of Policy Analysis on Indonesia’s Maritime Cluster Development Using System Dynamics Modeling
Nursyamsi, A.; Moeis, A. O.; Komarudin
2018-03-01
As an archipelago with two third of its territory consist of water, Indonesia should address more attention to its maritime industry development. One of the catalyst to fasten the maritime industry growth is by developing a maritime cluster. The purpose of this research is to gain understanding of the effect if Indonesia implement maritime cluster policy to the growth of maritime economic and its role to enhance the maritime cluster performance, hence enhancing Indonesia’s maritime industry as well. The result of the constructed system dynamic model simulation shows that with the effect of maritime cluster, the growth of employment rate and maritime economic is much bigger that the business as usual case exponentially. The result implies that the government should act fast to form a legitimate cluster maritime organizer institution so that there will be a synergize, sustainable, and positive maritime cluster environment that will benefit the performance of Indonesia’s maritime industry.
Comprehensive cluster analysis with Transitivity Clustering.
Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan
2011-03-01
Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
FORMATION OF A INNOVATION REGIONAL CLUSTER MODEL
Directory of Open Access Journals (Sweden)
G. S. Merzlikina
2015-01-01
Full Text Available Summary. As a result of investigation of science and methodical approaches related problems of building and development of innovation clusters there were some issues in functional assignments of innovation and production clusters. Because of those issues, article’s authors differ conceptions of innovation cluster and production cluster, as they explain notion of innovation-production cluster. The main goal of this article is to reveal existing organizational issues in cluster building and its successful development. Based on regional clusters building analysis carried out there was typical practical structure of cluster members interaction revealed. This structure also have its cons, as following: absence cluster orientation to marketing environment, lack of members’ prolonged relations’ building and development system, along with ineffective management of information, financial and material streams within cluster, narrow competence difference and responsibility zones between cluster members, lack of transparence of cluster’s action, low environment changes adaptivity, hard to use cluster members’ intellectual property, and commercialization of hi-tech products. When all those issues listed above come together, it reduces life activity of existing models of innovative cluster-building along with practical opportunity of cluster realization. Because of that, authors offer an upgraded innovative-productive cluster building model with more efficient business processes management system, which includes advanced innovative cluster structure, competence matrix and subcluster responsibility zone. Suggested model differs from other ones by using unified innovative product development control center, which also controls production and marketing realization.
DEFF Research Database (Denmark)
Thomsen, C E; Rosenfalck, A; Nørregaard Christensen, K
1991-01-01
The brain activity electroencephalogram (EEG) was recorded from 30 healthy women scheduled for hysterectomy. The patients were anaesthetized with isoflurane, halothane or etomidate/fentanyl. A multiparametric method was used for extraction of amplitude and frequency information from the EEG....... The method applied autoregressive modelling of the signal, segmented in 2 s fixed intervals. The features from the EEG segments were used for learning and for classification. The learning process was unsupervised and hierarchical clustering analysis was used to construct a learning set of EEG amplitude......-frequency patterns for each of the three anaesthetic drugs. These EEG patterns were assigned to a colour code corresponding to similar clinical states. A common learning set could be used for all patients anaesthetized with the same drug. The classification process could be performed on-line and the results were...
International Nuclear Information System (INIS)
Horiuchi, H.; Ikeda, K.
1986-01-01
This article reviews the development of the cluster model study. The stress is put on two points; one is how the cluster structure has come to be regarded as a fundamental structure in light nuclei together with the shell-model structure, and the other is how at present the cluster model is extended to and connected with the studies of the various subjects many of which are in the neighbouring fields. The authors the present the main theme with detailed explanations of the fundamentals of the microscopic cluster model which have promoted the development of the cluster mode. Examples of the microscopic cluster model study of light nuclear structure are given
[Cluster analysis in biomedical researches].
Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D
2013-01-01
Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. The cluster analysis reveals the internal structure of the data, group the separate observations on the degree of their similarity. The review provides a definition of the basic concepts of cluster analysis, and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, Kohonen networks algorithms. Examples are the use of these algorithms in biomedical research.
Potts Model with Invisible Colors : Random-Cluster Representation and Pirogov–Sinai Analysis
Enter, Aernout C.D. van; Iacobelli, Giulio; Taati, Siamak
We study a recently introduced variant of the ferromagnetic Potts model consisting of a ferromagnetic interaction among q “visible” colors along with the presence of r non-interacting “invisible” colors. We introduce a random-cluster representation for the model, for which we prove the existence of
On the Modeling and Analysis of Heterogeneous Radio Access Networks using a Poisson Cluster Process
DEFF Research Database (Denmark)
Suryaprakash, Vinay; Møller, Jesper; Fettweis, Gerhard P.
processes, some of which are alluded to (later) in this paper. We model a heterogeneous network consisting of two types of base stations by using a particular Poisson cluster process model. The main contributions are two-fold. First, a complete description of the interference in heterogeneous networks...
Wu, Hsin-Hung; Lin, Shih-Yen; Liu, Chih-Wei
2014-01-01
This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs.
Lin, Shih-Yen; Liu, Chih-Wei
2014-01-01
This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs. PMID:25045741
Directory of Open Access Journals (Sweden)
Hsin-Hung Wu
2014-01-01
Full Text Available This study combines cluster analysis and LRFM (length, recency, frequency, and monetary model in a pediatric dental clinic in Taiwan to analyze patients’ values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients’ needs.
Directory of Open Access Journals (Sweden)
Yihang Yin
2015-08-01
Full Text Available Wireless sensor networks (WSNs have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA. First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong
2015-08-07
Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
The covariance matrix of the Potts model: A random cluster analysis
International Nuclear Information System (INIS)
Borgs, C.; Chayes, J.T.
1996-01-01
We consider the covariance matrix, G mn = q 2 x ,m); δ(σ y ,n)>, of the d-dimensional q-states Potts model, rewriting it in the random cluster representation of Fortuin and Kasteleyn. In many of the q ordered phases, we identify the eigenvalues of this matrix both in terms of representations of the unbroken symmetry group of the model and in terms of random cluster connectivities and covariances, thereby attributing algebraic significance to these stochastic geometric quantities. We also show that the correlation length and the correlation length corresponding to the decay rate of one on the eigenvalues in the same as the inverse decay rate of the diameter of finite clusters. For dimension of d=2, we show that this correlation length and the correlation length of two-point function with free boundary conditions at the corresponding dual temperature are equal up to a factor of two. For systems with first-order transitions, this relation helps to resolve certain inconsistencies between recent exact and numerical work on correlation lengths at the self-dual point β o . For systems with second order transitions, this relation implies the equality of the correlation length exponents from above below threshold, as well as an amplitude ratio of two. In the course of proving the above results, we establish several properties of independent interest, including left continuity of the inverse correlation length with free boundary conditions and upper semicontinuity of the decay rate for finite clusters in all dimensions, and left continuity of the two-dimensional free boundary condition percolation probability at β o . We also introduce DLR equations for the random cluster model and use them to establish ergodicity of the free measure. In order to prove these results, we introduce a new class of events which we call decoupling events and two inequalities for these events
Evaluating Mixture Modeling for Clustering: Recommendations and Cautions
Steinley, Douglas; Brusco, Michael J.
2011-01-01
This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magdison,…
Microscopic cluster model analysis of 14O+p elastic scattering
International Nuclear Information System (INIS)
Baye, D.; Descouvemont, P.; Leo, F.
2005-01-01
The 14 O+p elastic scattering is discussed in detail in a fully microscopic cluster model. The 14 O cluster is described by a closed p shell for protons and a closed p3/2 subshell for neutrons in the translation-invariant harmonic-oscillator model. The exchange and spin-orbit parameters of the effective forces are tuned on the energy levels of the 15 C mirror system. With the generator-coordinate and microscopic R-matrix methods, phase shifts and cross sections are calculated for the 14 O+p elastic scattering. An excellent agreement is found with recent experimental data. A comparison is performed with phenomenological R-matrix fits. Resonances properties in 15 F are discussed
Directory of Open Access Journals (Sweden)
Fateh Nassim Melzi
2017-09-01
Full Text Available The large amount of data collected by smart meters is a valuable resource that can be used to better understand consumer behavior and optimize electricity consumption in cities. This paper presents an unsupervised classification approach for extracting typical consumption patterns from data generated by smart electric meters. The proposed approach is based on a constrained Gaussian mixture model whose parameters vary according to the day type (weekday, Saturday or Sunday. The proposed methodology is applied to a real dataset of Irish households collected by smart meters over one year. For each cluster, the model provides three consumption profiles that depend on the day type. In the first instance, the model is applied on the electricity consumption of users during one month to extract groups of consumers who exhibit similar consumption behaviors. The clustering results are then crossed with contextual variables available for the households to show the close links between electricity consumption and household socio-economic characteristics. At the second instance, the evolution of the consumer behavior from one month to another is assessed through variations of cluster sizes over time. The results show that the consumer behavior evolves over time depending on the contextual variables such as temperature fluctuations and calendar events.
Modeling and analysis of the spectrum of the globular cluster NGC 2419
Sharina, M. E.; Shimansky, V. V.; Davoust, E.
2013-06-01
The properties of the stellar population of the unusual object NGC 2419 are studied; this is the most distant high-mass globular cluster of the Galaxy's outer halo, and a spectrum taken with the 1.93-m telescope of the Haute Provence Observatory displays elemental abundance anomalies. Since traditional high-resolution spectroscopicmethods are applicable to bright stars only, spectroscopic information for the cluster's stellar population as a whole, integrated along the spectrograph slit placed in various positions, is used. Population synthesis is carried out for the spectrum of NGC 2419 using synthetic spectra calculated from a grid of stellar model atmospheres, based on the theoretical isochrone from the literature that best fits the color-magnitude diagram of the cluster. The derived age (12.6 billion years), metallicity ([Fe/H] = -2.25 dex), and abundances of helium ( Y = 0.26) and other chemical elements (a total of 14) are in a good qualitative agreement with estimates from the literature made from high-resolution spectra of eight red giants in the cluster. The influence on the spectrum of deviations from local thermodynamic equilibrium is considered for several elements. The derived abundance of α-elements ([ α/Fe] = 0.13 dex, as the mean of [O/Fe], [Mg/Fe], and [Ca/Fe]) differs from the mean value in the literature ([ α/Fe] = 0.4 for the eight brightest red giants) and may be explained by recently discovered in NGC2419 large [a/Fe] dispersion. Further studies of the integrated properties of the stellar population in NGC 2419 using higher-resolution spectrographs in various wavelength ranges should help improve our understanding of the cluster's chemical anomalies.
Cluster Based Text Classification Model
DEFF Research Database (Denmark)
Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock
2011-01-01
We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases...... the accuracy at the same time. The test example is classified using simpler and smaller model. The training examples in a particular cluster share the common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created......, the classifier is trained on each cluster having reduced dimensionality and less number of examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...
Zhang, Juping; Yang, Chan; Jin, Zhen; Li, Jia
2018-07-14
In this paper, the correlation coefficients between nodes in states are used as dynamic variables, and we construct SIR epidemic dynamic models with correlation coefficients by using the pair approximation method in static networks and dynamic networks, respectively. Considering the clustering coefficient of the network, we analytically investigate the existence and the local asymptotic stability of each equilibrium of these models and derive threshold values for the prevalence of diseases. Additionally, we obtain two equivalent epidemic thresholds in dynamic networks, which are compared with the results of the mean field equations. Copyright © 2018 Elsevier Ltd. All rights reserved.
Tsui, Sharon; Denison, Julie A; Kennedy, Caitlin E; Chang, Larry W; Koole, Olivier; Torpey, Kwasi; Van Praag, Eric; Farley, Jason; Ford, Nathan; Stuart, Leine; Wabwire-Mangen, Fred
2017-12-06
Organization of HIV care and treatment services, including clinic staffing and services, may shape clinical and financial outcomes, yet there has been little attempt to describe different models of HIV care in sub-Saharan Africa (SSA). Information about the relative benefits and drawbacks of different models could inform the scale-up of antiretroviral therapy (ART) and associated services in resource-limited settings (RLS), especially in light of expanded client populations with country adoption of WHO's test and treat recommendation. We characterized task-shifting/task-sharing practices in 19 diverse ART clinics in Tanzania, Uganda, and Zambia and used cluster analysis to identify unique models of service provision. We ran descriptive statistics to explore how the clusters varied by environmental factors and programmatic characteristics. Finally, we employed the Delphi Method to make systematic use of expert opinions to ensure that the cluster variables were meaningful in the context of actual task-shifting of ART services in SSA. The cluster analysis identified three task-shifting/task-sharing models. The main differences across models were the availability of medical doctors, the scope of clinical responsibility assigned to nurses, and the use of lay health care workers. Patterns of healthcare staffing in HIV service delivery were associated with different environmental factors (e.g., health facility levels, urban vs. rural settings) and programme characteristics (e.g., community ART distribution or integrated tuberculosis treatment on-site). Understanding the relative advantages and disadvantages of different models of care can help national programmes adapt to increased client load, select optimal adherence strategies within decentralized models of care, and identify differentiated models of care for clients to meet the growing needs of long-term ART patients who require more complicated treatment management.
Gooya, Ali; Lekadir, Karim; Alba, Xenia; Swift, Andrew J; Wild, Jim M; Frangi, Alejandro F
2015-01-01
Construction of Statistical Shape Models (SSMs) from arbitrary point sets is a challenging problem due to significant shape variation and lack of explicit point correspondence across the training data set. In medical imaging, point sets can generally represent different shape classes that span healthy and pathological exemplars. In such cases, the constructed SSM may not generalize well, largely because the probability density function (pdf) of the point sets deviates from the underlying assumption of Gaussian statistics. To this end, we propose a generative model for unsupervised learning of the pdf of point sets as a mixture of distinctive classes. A Variational Bayesian (VB) method is proposed for making joint inferences on the labels of point sets, and the principal modes of variations in each cluster. The method provides a flexible framework to handle point sets with no explicit point-to-point correspondences. We also show that by maximizing the marginalized likelihood of the model, the optimal number of clusters of point sets can be determined. We illustrate this work in the context of understanding the anatomical phenotype of the left and right ventricles in heart. To this end, we use a database containing hearts of healthy subjects, patients with Pulmonary Hypertension (PH), and patients with Hypertrophic Cardiomyopathy (HCM). We demonstrate that our method can outperform traditional PCA in both generalization and specificity measures.
Integrative cluster analysis in bioinformatics
Abu-Jamous, Basel; Nandi, Asoke K
2015-01-01
Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o
Directory of Open Access Journals (Sweden)
Laurent Berge
2012-01-01
Full Text Available This paper presents the R package HDclassif which is devoted to the clustering and the discriminant analysis of high-dimensional data. The classification methods proposed in the package result from a new parametrization of the Gaussian mixture model which combines the idea of dimension reduction and model constraints on the covariance matrices. The supervised classification method using this parametrization is called high dimensional discriminant analysis (HDDA. In a similar manner, the associated clustering method iscalled high dimensional data clustering (HDDC and uses the expectation-maximization algorithm for inference. In order to correctly t the data, both methods estimate the specific subspace and the intrinsic dimension of the groups. Due to the constraints on the covariance matrices, the number of parameters to estimate is significantly lower than other model-based methods and this allows the methods to be stable and efficient in high dimensions. Two introductory examples illustrated with R codes allow the user to discover the hdda and hddc functions. Experiments on simulated and real datasets also compare HDDC and HDDA with existing classification methods on high-dimensional datasets. HDclassif is a free software and distributed under the general public license, as part of the R software project.
Cluster Correlation in Mixed Models
Gardini, A.; Bonometto, S. A.; Murante, G.; Yepes, G.
2000-10-01
We evaluate the dependence of the cluster correlation length, rc, on the mean intercluster separation, Dc, for three models with critical matter density, vanishing vacuum energy (Λ=0), and COBE normalization: a tilted cold dark matter (tCDM) model (n=0.8) and two blue mixed models with two light massive neutrinos, yielding Ωh=0.26 and 0.14 (MDM1 and MDM2, respectively). All models approach the observational value of σ8 (and hence the observed cluster abundance) and are consistent with the observed abundance of damped Lyα systems. Mixed models have a motivation in recent results of neutrino physics; they also agree with the observed value of the ratio σ8/σ25, yielding the spectral slope parameter Γ, and nicely fit Las Campanas Redshift Survey (LCRS) reconstructed spectra. We use parallel AP3M simulations, performed in a wide box (of side 360 h-1 Mpc) and with high mass and distance resolution, enabling us to build artificial samples of clusters, whose total number and mass range allow us to cover the same Dc interval inspected through Automatic Plate Measuring Facility (APM) and Abell cluster clustering data. We find that the tCDM model performs substantially better than n=1 critical density CDM models. Our main finding, however, is that mixed models provide a surprisingly good fit to cluster clustering data.
Cluster analysis of track structure
International Nuclear Information System (INIS)
Michalik, V.
1991-01-01
One of the possibilities of classifying track structures is application of conventional partition techniques of analysis of multidimensional data to the track structure. Using these cluster algorithms this paper attempts to find characteristics of radiation reflecting the spatial distribution of ionizations in the primary particle track. An absolute frequency distribution of clusters of ionizations giving the mean number of clusters produced by radiation per unit of deposited energy can serve as this characteristic. General computation techniques used as well as methods of calculations of distributions of clusters for different radiations are discussed. 8 refs.; 5 figs
Directory of Open Access Journals (Sweden)
Utro Filippo
2008-10-01
Full Text Available Abstract Background Inferring cluster structure in microarray datasets is a fundamental task for the so-called -omic sciences. It is also a fundamental question in Statistics, Data Analysis and Classification, in particular with regard to the prediction of the number of clusters in a dataset, usually established via internal validation measures. Despite the wealth of internal measures available in the literature, new ones have been recently proposed, some of them specifically for microarray data. Results We consider five such measures: Clest, Consensus (Consensus Clustering, FOM (Figure of Merit, Gap (Gap Statistics and ME (Model Explorer, in addition to the classic WCSS (Within Cluster Sum-of-Squares and KL (Krzanowski and Lai index. We perform extensive experiments on six benchmark microarray datasets, using both Hierarchical and K-means clustering algorithms, and we provide an analysis assessing both the intrinsic ability of a measure to predict the correct number of clusters in a dataset and its merit relative to the other measures. We pay particular attention both to precision and speed. Moreover, we also provide various fast approximation algorithms for the computation of Gap, FOM and WCSS. The main result is a hierarchy of those measures in terms of precision and speed, highlighting some of their merits and limitations not reported before in the literature. Conclusion Based on our analysis, we draw several conclusions for the use of those internal measures on microarray data. We report the main ones. Consensus is by far the best performer in terms of predictive power and remarkably algorithm-independent. Unfortunately, on large datasets, it may be of no use because of its non-trivial computer time demand (weeks on a state of the art PC. FOM is the second best performer although, quite surprisingly, it may not be competitive in this scenario: it has essentially the same predictive power of WCSS but it is from 6 to 100 times slower in time
Haghighi, Babak; Choi, Jiwoong; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long
2017-11-01
Accurate modeling of small airway diameters in patients with chronic obstructive pulmonary disease (COPD) is a crucial step toward patient-specific CFD simulations of regional airflow and particle transport. We proposed to use computed tomography (CT) imaging-based cluster membership to identify structural characteristics of airways in each cluster and use them to develop cluster-specific airway diameter models. We analyzed 284 COPD smokers with airflow limitation, and 69 healthy controls. We used multiscale imaging-based cluster analysis (MICA) to classify smokers into 4 clusters. With representative cluster patients and healthy controls, we performed multiple regressions to quantify variation of airway diameters by generation as well as by cluster. The cluster 2 and 4 showed more diameter decrease as generation increases than other clusters. The cluster 4 had more rapid decreases of airway diameters in the upper lobes, while cluster 2 in the lower lobes. We then used these regression models to estimate airway diameters in CT unresolved regions to obtain pressure-volume hysteresis curves using a 1D resistance model. These 1D flow solutions can be used to provide the patient-specific boundary conditions for 3D CFD simulations in COPD patients. Support for this study was provided, in part, by NIH Grants U01-HL114494, R01-HL112986 and S10-RR022421.
Linear regression models and k-means clustering for statistical analysis of fNIRS data.
Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro
2015-02-01
We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.
Boser, Quinn A; Valevicius, Aïda M; Lavoie, Ewen B; Chapman, Craig S; Pilarski, Patrick M; Hebert, Jacqueline S; Vette, Albert H
2018-04-27
Quantifying angular joint kinematics of the upper body is a useful method for assessing upper limb function. Joint angles are commonly obtained via motion capture, tracking markers placed on anatomical landmarks. This method is associated with limitations including administrative burden, soft tissue artifacts, and intra- and inter-tester variability. An alternative method involves the tracking of rigid marker clusters affixed to body segments, calibrated relative to anatomical landmarks or known joint angles. The accuracy and reliability of applying this cluster method to the upper body has, however, not been comprehensively explored. Our objective was to compare three different upper body cluster models with an anatomical model, with respect to joint angles and reliability. Non-disabled participants performed two standardized functional upper limb tasks with anatomical and cluster markers applied concurrently. Joint angle curves obtained via the marker clusters with three different calibration methods were compared to those from an anatomical model, and between-session reliability was assessed for all models. The cluster models produced joint angle curves which were comparable to and highly correlated with those from the anatomical model, but exhibited notable offsets and differences in sensitivity for some degrees of freedom. Between-session reliability was comparable between all models, and good for most degrees of freedom. Overall, the cluster models produced reliable joint angles that, however, cannot be used interchangeably with anatomical model outputs to calculate kinematic metrics. Cluster models appear to be an adequate, and possibly advantageous alternative to anatomical models when the objective is to assess trends in movement behavior. Copyright © 2018 Elsevier Ltd. All rights reserved.
Single-cluster dynamics for the random-cluster model
Deng, Y.; Qian, X.; Blöte, H.W.J.
2009-01-01
We formulate a single-cluster Monte Carlo algorithm for the simulation of the random-cluster model. This algorithm is a generalization of the Wolff single-cluster method for the q-state Potts model to noninteger values q>1. Its results for static quantities are in a satisfactory agreement with those
Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.
Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K
2013-03-01
Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
Ranciati, Saverio; Viroli, Cinzia; Wit, Ernst C.
2017-01-01
Model-based clustering is a technique widely used to group a collection of units into mutually exclusive groups. There are, however, situations in which an observation could in principle belong to more than one cluster. In the context of next-generation sequencing (NGS) experiments, for example, the
Energy Technology Data Exchange (ETDEWEB)
Hasler, Nicole; Bulbul, Esra; Bonamente, Massimiliano; Landry, David [Department of Physics, University of Alabama, Huntsville, AL 35899 (United States); Carlstrom, John E.; Culverhouse, Thomas L.; Gralla, Megan; Greer, Christopher; Hennessy, Ryan; Leitch, Erik M.; Mantz, Adam; Marrone, Daniel P.; Plagge, Thomas [Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637 (United States); Hawkins, David; Lamb, James W.; Muchovej, Stephen [Owens Valley Radio Observatory, California Institute of Technology, Big Pine, CA 93513 (United States); Joy, Marshall; Kolodziejczak, Jeffery [Space Science-VP62, NASA Marshall Space Flight Center, Huntsville, AL 35812 (United States); Miller, Amber [Columbia Astrophysics Laboratory, Columbia University, New York, NY 10027 (United States); Mroczkowski, Tony [Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104 (United States); and others
2012-04-01
We perform a joint analysis of X-ray and Sunyaev-Zel'dovich effect data using an analytic model that describes the gas properties of galaxy clusters. The joint analysis allows the measurement of the cluster gas mass fraction profile and Hubble constant independent of cosmological parameters. Weak cosmological priors are used to calculate the overdensity radius within which the gas mass fractions are reported. Such an analysis can provide direct constraints on the evolution of the cluster gas mass fraction with redshift. We validate the model and the joint analysis on high signal-to-noise data from the Chandra X-ray Observatory and the Sunyaev-Zel'dovich Array for two clusters, A2631 and A2204.
International Nuclear Information System (INIS)
Hasler, Nicole; Bulbul, Esra; Bonamente, Massimiliano; Landry, David; Carlstrom, John E.; Culverhouse, Thomas L.; Gralla, Megan; Greer, Christopher; Hennessy, Ryan; Leitch, Erik M.; Mantz, Adam; Marrone, Daniel P.; Plagge, Thomas; Hawkins, David; Lamb, James W.; Muchovej, Stephen; Joy, Marshall; Kolodziejczak, Jeffery; Miller, Amber; Mroczkowski, Tony
2012-01-01
We perform a joint analysis of X-ray and Sunyaev-Zel'dovich effect data using an analytic model that describes the gas properties of galaxy clusters. The joint analysis allows the measurement of the cluster gas mass fraction profile and Hubble constant independent of cosmological parameters. Weak cosmological priors are used to calculate the overdensity radius within which the gas mass fractions are reported. Such an analysis can provide direct constraints on the evolution of the cluster gas mass fraction with redshift. We validate the model and the joint analysis on high signal-to-noise data from the Chandra X-ray Observatory and the Sunyaev-Zel'dovich Array for two clusters, A2631 and A2204.
Modeling and Analysis of a Spectrum of the Globular Cluster NGC 2419
Sharina, M. E.; Shimansky, V. V.; Davoust, E.
2012-01-01
NGC 2419 is the most distant massive globular cluster in the outer Galactic halo. It is unusual also due to the chemical peculiarities found in its red giant stars in recent years. We study the stellar population of this unusual object using spectra obtained at the 1.93-m telescope of the Haute-Provence Observatory. At variance with commonly used methods of high-resolution spectroscopy applicable only to bright stars, we employ spectroscopic information on the integrated light of the cluster....
Directory of Open Access Journals (Sweden)
Ibgtc Bowala
2017-06-01
Full Text Available With the rapid growth of financial markets, analyzers are paying more attention on predictions. Stock data are time series data, with huge amounts. Feasible solution for handling the increasing amount of data is to use a cluster for parallel processing, and Hadoop parallel computing platform is a typical representative. There are various statistical models for forecasting time series data, but accurate clusters are a pre-requirement. Clustering analysis for time series data is one of the main methods for mining time series data for many other analysis processes. However, general clustering algorithms cannot perform clustering for time series data because series data has a special structure and a high dimensionality has highly co-related values due to high noise level. A novel model for time series clustering is presented using BIRCH, based on piecewise SVD, leading to a novel dimension reduction approach. Highly co-related features are handled using SVD with a novel approach for dimensionality reduction in order to keep co-related behavior optimal and then use BIRCH for clustering. The algorithm is a novel model that can handle massive time series data. Finally, this new model is successfully applied to real stock time series data of Yahoo finance with satisfactory results.
Cluster analysis for portfolio optimization
Vincenzo Tola; Fabrizio Lillo; Mauro Gallegati; Rosario N. Mantegna
2005-01-01
We consider the problem of the statistical uncertainty of the correlation matrix in the optimization of a financial portfolio. We show that the use of clustering algorithms can improve the reliability of the portfolio in terms of the ratio between predicted and realized risk. Bootstrap analysis indicates that this improvement is obtained in a wide range of the parameters N (number of assets) and T (investment horizon). The predicted and realized risk level and the relative portfolio compositi...
Quark cluster model and confinement
International Nuclear Information System (INIS)
Koike, Yuji; Yazaki, Koichi
2000-01-01
How confinement of quarks is implemented for multi-hadron systems in the quark cluster model is reviewed. In order to learn the nature of the confining interaction for fermions we first study 1+1 dimensional QED and QCD, in which the gauge field can be eliminated exactly and generates linear interaction of fermions. Then, we compare the two-body potential model, the flip-flop model and the Born-Oppenheimer approach in the strong coupling lattice QCD for the meson-meson system. Having shown how the long-range attraction between hadrons, van der Waals interaction, shows up in the two-body potential model, we discuss two distinct attempts beyond the two-body potential model: one is a many-body potential model, the flip-flop model, and the other is the Born-Oppenheimer approach in the strong coupling lattice QCD. We explain how the emergence of the long-range attraction is avoided in these attempts. Finally, we present the results of the application of the flip-flop model to the baryon-baryon scattering in the quark cluster model. (author)
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis
2015-01-01
A. Porter and my advisor. The text is primarily written by me. Chapter 5 is a version of [46] where my contribution is all of the analytical ...inn Euclidean space, a variational method refers to using calculus of variation techniques to find the minimizer (or maximizer) of a functional (energy... geometric inter- pretation of modularity optimization contrasts with existing interpretations (e.g., probabilistic ones or in terms of the Potts model
Cluster Risk of Walking Scenarios Based on Macroscopic Flow Model and Crowding Force Analysis
Directory of Open Access Journals (Sweden)
Xiaohong Li
2018-02-01
Full Text Available In recent years, accidents always happen in confined space such as metro stations because of congestion. Various researchers investigated the patterns of dense crowd behaviors in different scenarios via simulations or experiments and proposed methods for avoiding accidents. In this study, a classic continuum macroscopic model was applied to simulate the crowded pedestrian flow in typical scenarios such as at bottlenecks or with an obstacle. The Lax–Wendroff finite difference scheme and artificial viscosity filtering method were used to discretize the model to identify high-density risk areas. Furthermore, we introduced a contact crowding force test of the interactions among pedestrians at bottlenecks. Results revealed that in the most dangerous area, the individual on the corner position bears the maximum pressure in such scenarios is 90.2 N, and there is an approximate exponential relationship between crowding force and density indicated by our data. The results and findings presented in this paper can facilitate more reasonable and precise simulation models by utilizing crowding force and crowd density and ensure the safety of pedestrians in high-density scenarios.
Analysis of the crystal lattice instability for cage–cluster systems using the superatom model
Energy Technology Data Exchange (ETDEWEB)
Serebrennikov, D. A., E-mail: dserebrennikov@innopark.kantiana.ru, E-mail: dimafania@mail.ru; Clementyev, E. S. [I. Kant Baltic Federal University, “Functional Nanomaterials” Scientific–Educational Center (Russian Federation); Alekseev, P. A. [“Kurchatov Institute” National Research Center (Russian Federation)
2016-09-15
We have investigated the lattice dynamics for a number of rare-earth hexaborides based on the superatom model within which the boron octahedron is substituted by one superatom with a mass equal to the mass of six boron atoms. Phenomenological models have been constructed for the acoustic and lowenergy optical phonon modes in RB{sub 6} (R = La, Gd, Tb, Dy) compounds. Using DyB{sub 6} as an example, we have studied the anomalous softening of longitudinal acoustic phonons in several crystallographic directions, an effect that is also typical of GdB{sub 6} and TbB{sub 6}. The softening of the acoustic branches is shown to be achieved through the introduction of negative interatomic force constants between rare-earth ions. We discuss the structural instability of hexaborides based on 4f elements, the role of valence instability in the lattice dynamics, and the influence of the number of f electrons on the degree of softening of phonon modes.
Directory of Open Access Journals (Sweden)
Marco Merafina
2014-12-01
Full Text Available We discuss the possibility to analyze the problem of gravothermal catastrophe in a new way, by obtaining thermodynamical equations to apply to a selfgravitating system. By using the King distribution function in the framework of statistical mechanics we treat the globular clusters evolution as a sequence of quasi-equilibrium thermodynamical states.
Co-clustering models, algorithms and applications
Govaert, Gérard
2013-01-01
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixture
Cluster analysis in phenotyping a Portuguese population.
Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J
2015-09-03
Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. To identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with other larger-scale cluster analysis. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.
Factor Analysis for Clustered Observations.
Longford, N. T.; Muthen, B. O.
1992-01-01
A two-level model for factor analysis is defined, and formulas for a scoring algorithm for this model are derived. A simple noniterative method based on decomposition of total sums of the squares and cross-products is discussed and illustrated with simulated data and data from the Second International Mathematics Study. (SLD)
Cluster analysis of activity-time series in motor learning
DEFF Research Database (Denmark)
Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A
2002-01-01
Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...
Identifying Clusters with Mixture Models that Include Radial Velocity Observations
Czarnatowicz, Alexis; Ybarra, Jason E.
2018-01-01
The study of stellar clusters plays an integral role in the study of star formation. We present a cluster mixture model that considers radial velocity data in addition to spatial data. Maximum likelihood estimation through the Expectation-Maximization (EM) algorithm is used for parameter estimation. Our mixture model analysis can be used to distinguish adjacent or overlapping clusters, and estimate properties for each cluster.Work supported by awards from the Virginia Foundation for Independent Colleges (VFIC) Undergraduate Science Research Fellowship and The Research Experience @Bridgewater (TREB).
Elangasinghe, M. A.; Singhal, N.; Dirks, K. N.; Salmond, J. A.; Samarasinghe, S.
2014-09-01
This paper uses artificial neural networks (ANN), combined with k-means clustering, to understand the complex time series of PM10 and PM2.5 concentrations at a coastal location of New Zealand based on data from a single site. Out of available meteorological parameters from the network (wind speed, wind direction, solar radiation, temperature, relative humidity), key factors governing the pattern of the time series concentrations were identified through input sensitivity analysis performed on the trained neural network model. The transport pathways of particulate matter under these key meteorological parameters were further analysed through bivariate concentration polar plots and k-means clustering techniques. The analysis shows that the external sources such as marine aerosols and local sources such as traffic and biomass burning contribute equally to the particulate matter concentrations at the study site. These results are in agreement with the results of receptor modelling by the Auckland Council based on Positive Matrix Factorization (PMF). Our findings also show that contrasting concentration-wind speed relationships exist between marine aerosols and local traffic sources resulting in very noisy and seemingly large random PM10 concentrations. The inclusion of cluster rankings as an input parameter to the ANN model showed a statistically significant (p advanced air dispersion models.
Directory of Open Access Journals (Sweden)
A. V. Brykin
2013-01-01
Full Text Available The cluster principle development in the world of electronics is one of the most effective examples of high-tech industry. The author considers the possibility of using clusters to modernize the Russian economy.
Validating clustering of molecular dynamics simulations using polymer models
Directory of Open Access Journals (Sweden)
Phillips Joshua L
2011-11-01
Full Text Available Abstract Background Molecular dynamics (MD simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. To our
Robust cluster analysis and variable selection
Ritter, Gunter
2014-01-01
Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of bot
Analysis of Aspects of Innovation in a Brazilian Cluster
Directory of Open Access Journals (Sweden)
Adriana Valélia Saraceni
2012-09-01
Full Text Available Innovation through clustering has become very important on the increased significance that interaction represents on innovation and learning process concept. This study aims to identify whereas a case analysis on innovation process in a cluster represents on the learning process. Therefore, this study is developed in two stages. First, we used a preliminary case study verifying a cluster innovation analysis and it Innovation Index, for further, exploring a combined body of theory and practice. Further, the second stage is developed by exploring the learning process concept. Both stages allowed us building a theory model for the learning process development in clusters. The main results of the model development come up with a mechanism of improvement implementation on clusters when case studies are applied.
Exact WKB analysis and cluster algebras
International Nuclear Information System (INIS)
Iwaki, Kohei; Nakanishi, Tomoki
2014-01-01
We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)
From virtual clustering analysis to self-consistent clustering analysis: a mathematical study
Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam
2018-03-01
In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), as well as provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of the Saint-Venant's principle. In contrast, they were not obtained in the original SCA paper, and we discover these terms may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error, and reduce the numerical cost by avoiding an outer loop iteration for attaining the material property consistency in SCA, its efficiency is expected even higher than the recently proposed SCA algorithm.
ANALISIS SEGMENTASI PELANGGAN MENGGUNAKAN KOMBINASI RFM MODEL DAN TEKNIK CLUSTERING
Directory of Open Access Journals (Sweden)
Beta Estri Adiana
2018-04-01
Full Text Available Intense competition in the business field motivates a small and medium enterprises (SMEs to manage customer services to the maximal. Improve of customer royalty by grouping cunstomers into some of groups and determining appropriate and effective marketing strategies for each group. Customer segmentation can be performed by data mining approach with clustering method. The main purpose of this paper is customer segmentation and measure their loyalty to a SME’s product. Using CRISP-DM method which consist of six phases, namely business understanding, data understanding, data preparatuin, modeling, evaluation and deployment. The K-Means algorithm is used for cluster formation and RapidMiner as a tool used to evaluate the result of clusters. Cluster formation is based on RFM (recency, frequency, monetary analysis. Davies Bouldin Index (DBI is used to find the optimal number of clusters (k. The customers are divided into 3 clusters, total of customer in first cluster is 30 customers who entered in typical customer category, the second cluster there are 8 customer whho entered in superstar customer and 89 customers in third cluster is dormant cluster category.
Alpha cluster model and spectrum of 16O
International Nuclear Information System (INIS)
Bauhoff, W.; Schultheis, H.; Schultheis, R.
1983-01-01
The structure of 16 O is studied in the alpha cluster model with parity and angular-momentum projection for several nucleon-nucleon interactions. The method differs from previous studies in that the states of positive and negative parity are determined without the customary restriction of the variational space to cluster positions with certain assumed symmetries. It is demonstrated that the alpha cluster model of 16 O is capable of explaining most of the experimental T = O levels up to about 15 MeV excitation. A shell-model analysis of the excited cluster-model states shows the necessity of including a very large number of shells. The evidence for the recently proposed tetrahedral symmetry of some excited states is also discussed
Cluster analysis of word frequency dynamics
Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.
2015-01-01
This paper describes the analysis and modelling of word usage frequency time series. During one of previous studies, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.
Cluster analysis of word frequency dynamics
International Nuclear Information System (INIS)
Maslennikova, Yu S; Bochkarev, V V; Belashova, I A
2015-01-01
This paper describes the analysis and modelling of word usage frequency time series. During one of previous studies, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations
Synthetic properties of models of globular clusters
Energy Technology Data Exchange (ETDEWEB)
Angeletti, L; Dolcetta, R; Giannone, P. (Rome Univ. (Italy). Osservatorio Astronomico)
1980-05-01
Synthetic and projected properties of models of globular clusters have been computed on the basis of stellar evolution and time changes of the dynamical cluster structure. Clusters with five and eight stellar groups (each group consisting of stars with the same mass) were studied. Mass loss from evolved stars was taken into account. Observational features were obtained at ages of 10-19 x 10/sup 9/ yr. The basic importance of the horizontal- and asymptotic-branch stars was pointed out. A comparison of the results with observed data of M3 is discussed with the purpose of obtaining general indications rather than a specific fit.
Synthetic properties of models of globular clusters
International Nuclear Information System (INIS)
Angeletti, L.; Dolcetta, R.; Giannone, P.
1980-01-01
Synthetic and projected properties of models of globular clusters have been computed on the basis of stellar evolution and time changes of the dynamical cluster structure. Clusters with five and eight stellar groups (each group consisting of stars with the same mass) were studied. Mass loss from evolved stars was taken into account. Observational features were obtained at ages of 10-19 x 10 9 yr. The basic importance of the horizontal- and asymptotic-branch stars was pointed out. A comparison of the results with observed data of M3 is discussed with the purpose of obtaining general indications rather than a specific fit. (orig.)
Topics in modelling of clustered data
Aerts, Marc; Ryan, Louise M; Geys, Helena
2002-01-01
Many methods for analyzing clustered data exist, all with advantages and limitations in particular applications. Compiled from the contributions of leading specialists in the field, Topics in Modelling of Clustered Data describes the tools and techniques for modelling the clustered data often encountered in medical, biological, environmental, and social science studies. It focuses on providing a comprehensive treatment of marginal, conditional, and random effects models using, among others, likelihood, pseudo-likelihood, and generalized estimating equations methods. The authors motivate and illustrate all aspects of these models in a variety of real applications. They discuss several variations and extensions, including individual-level covariates and combined continuous and discrete outcomes. Flexible modelling with fractional and local polynomials, omnibus lack-of-fit tests, robustification against misspecification, exact, and bootstrap inferential procedures all receive extensive treatment. The application...
Cluster analysis of obesity and asthma phenotypes.
Directory of Open Access Journals (Sweden)
E Rand Sutherland
Full Text Available Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC. Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized an hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype.In a cohort of clinical trial participants (n = 250, minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα and induction of MAP kinase phosphatase-1 (MKP-1 expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m(2 and severity of asthma symptoms (AEQ score the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively. Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ and control (ACQ, exhaled nitric oxide concentration (F(ENO and airway hyperresponsiveness (methacholine PC(20 but were similar with regard to measures of lung function (FEV(1 (% and FEV(1/FVC, airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP. Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasoneObesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals
Cluster model in reaction theory
International Nuclear Information System (INIS)
Adhikari, S.K.
1979-01-01
A recent work by Rosenberg on cluster states in reaction theory is reexamined and generalized to include energies above the threshold for breakup into four composite fragments. The problem of elastic scattering between two interacting composite fragments is reduced to an equivalent two-particle problem with an effective potential to be determined by extremum principles. For energies above the threshold for breakup into three or four composite fragments effective few-particle potentials are introduced and the problem is reduced to effective three- and four-particle problems. The equivalent three-particle equation contains effective two- and three-particle potentials. The effective potential in the equivalent four-particle equation has two-, three-, and four-body connected parts and a piece which has two independent two-body connected parts. In the equivalent three-particle problem we show how to include the effect of a weak three-body potential perturbatively. In the equivalent four-body problem an approximate simple calculational scheme is given when one neglects the four-particle potential the effect of which is presumably very small
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of the higher education. Many universities have made some effective achievement about evaluating the teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein
2014-11-01
Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Observations and Modeling of Merging Galaxy Clusters
Golovich, Nathan Ryan
Context: Galaxy clusters grow hierarchically with continuous accretion bookended by major merging events that release immense gravitational potential energy (as much as ˜1065 erg). This energy creates an environment for rich astrophysics. Precise measurements of the dark matter halo, intracluster medium, and galaxy population have resulted in a number of important results including dark matter constraints and explanations of the generation of cosmic rays. However, since the timescale of major mergers (˜several Gyr) relegates observations of individual systems to mere snapshots, these results are difficult to understand under a consistent dynamical framework. While computationally expensive simulations are vital in this regard, the vastness of parameter space has necessitated simulations of idealized mergers that are unlikely to capture the full richness. Merger speeds, geometries, and timescales each have a profound consequential effect, but even these simple dynamical properties of the mergers are often poorly understood. A method to identify and constrain the best systems for probing the rich astrophysics of merging clusters is needed. Such a method could then be utilized to prioritize observational follow up and best inform proper exploration of dynamical phase space. Task: In order to identify and model a large number of systems, in this dissertation, we compile an ensemble of major mergers each containing radio relics. We then complete a pan-chromatic study of these 29 systems including wide field optical photometry, targeted optical spectroscopy of member galaxies, radio, and X-ray observations. We use the optical observations to model the galaxy substructure and estimate line of sight motion. In conjunction with the radio and X-ray data, these substructure models helped elucidate the most likely merger scenario for each system and further constrain the dynamical properties of each system. We demonstrate the power of this technique through detailed analyses
Fuzzy Clustering Methods and their Application to Fuzzy Modeling
DEFF Research Database (Denmark)
Kroszynski, Uri; Zhou, Jianjun
1999-01-01
Fuzzy modeling techniques based upon the analysis of measured input/output data sets result in a set of rules that allow to predict system outputs from given inputs. Fuzzy clustering methods for system modeling and identification result in relatively small rule-bases, allowing fast, yet accurate....... An illustrative synthetic example is analyzed, and prediction accuracy measures are compared between the different variants...
A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis
Directory of Open Access Journals (Sweden)
Shaoning Li
2017-01-01
Full Text Available In the fields of geographic information systems (GIS and remote sensing (RS, the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited due to computational complexity, noise resistant ability and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC, is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters” if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the terminate condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.
Cluster analysis of Southeastern U.S. climate stations
Stooksbury, D. E.; Michaels, P. J.
1991-09-01
A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.
Heileman, K L; Tabrizian, M
2017-05-02
3-Dimensional cell cultures are more representative of the native environment than traditional cell cultures on flat substrates. As a result, 3-dimensional cell cultures have emerged as a very valuable model environment to study tumorigenesis, organogenesis and tissue regeneration. Many of these models encompass the formation of cell aggregates, which mimic the architecture of tumor and organ tissue. Dielectric impedance spectroscopy is a non-invasive, label free and real time technique, overcoming the drawbacks of established techniques to monitor cell aggregates. Here we introduce a platform to monitor cell aggregation in a 3-dimensional extracellular matrix using dielectric spectroscopy. The MCF10A breast epithelial cell line serves as a model for cell aggregation. The platform maintains sterile conditions during the multi-day assay while allowing continuous dielectric spectroscopy measurements. The platform geometry optimizes dielectric measurements by concentrating cells within the electrode sensing region. The cells show a characteristic dielectric response to aggregation which corroborates with finite element analysis computer simulations. By fitting the experimental dielectric spectra to the Cole-Cole equation, we demonstrated that the dispersion intensity Δε and the characteristic frequency f c are related to cell aggregate growth. In addition, microscopy can be performed directly on the platform providing information about cell position, density and morphology. This platform could yield many applications for studying the electrophysiological activity of cell aggregates.
Modelling baryonic effects on galaxy cluster mass profiles
Shirasaki, Masato; Lau, Erwin T.; Nagai, Daisuke
2018-06-01
Gravitational lensing is a powerful probe of the mass distribution of galaxy clusters and cosmology. However, accurate measurements of the cluster mass profiles are limited by uncertainties in cluster astrophysics. In this work, we present a physically motivated model of baryonic effects on the cluster mass profiles, which self-consistently takes into account the impact of baryons on the concentration as well as mass accretion histories of galaxy clusters. We calibrate this model using the Omega500 hydrodynamical cosmological simulations of galaxy clusters with varying baryonic physics. Our model will enable us to simultaneously constrain cluster mass, concentration, and cosmological parameters using stacked weak lensing measurements from upcoming optical cluster surveys.
Modelling Baryonic Effects on Galaxy Cluster Mass Profiles
Shirasaki, Masato; Lau, Erwin T.; Nagai, Daisuke
2018-03-01
Gravitational lensing is a powerful probe of the mass distribution of galaxy clusters and cosmology. However, accurate measurements of the cluster mass profiles are limited by uncertainties in cluster astrophysics. In this work, we present a physically motivated model of baryonic effects on the cluster mass profiles, which self-consistently takes into account the impact of baryons on the concentration as well as mass accretion histories of galaxy clusters. We calibrate this model using the Omega500 hydrodynamical cosmological simulations of galaxy clusters with varying baryonic physics. Our model will enable us to simultaneously constrain cluster mass, concentration, and cosmological parameters using stacked weak lensing measurements from upcoming optical cluster surveys.
Cluster analysis for determining distribution center location
Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian
2017-12-01
Determination of distribution facilities is highly important to survive in the high level of competition in today’s business world. Companies can operate multiple distribution centers to mitigate supply chain risk. Thus, new problems arise, namely how many and where the facilities should be provided. This study examines a fast-food restaurant brand, which located in the Greater Jakarta. This brand is included in the category of top 5 fast food restaurant chain based on retail sales. There were three stages in this study, compiling spatial data, cluster analysis, and network analysis. Cluster analysis results are used to consider the location of the additional distribution center. Network analysis results show a more efficient process referring to a shorter distance to the distribution process.
Complex scaling in the cluster model
International Nuclear Information System (INIS)
Kruppa, A.T.; Lovas, R.G.; Gyarmati, B.
1987-01-01
To find the positions and widths of resonances, a complex scaling of the intercluster relative coordinate is introduced into the resonating-group model. In the generator-coordinate technique used to solve the resonating-group equation the complex scaling requires minor changes in the formulae and code. The finding of the resonances does not need any preliminary guess or explicit reference to any asymptotic prescription. The procedure is applied to the resonances in the relative motion of two ground-state α clusters in 8 Be, but is appropriate for any systems consisting of two clusters. (author) 23 refs.; 5 figs
Cluster infall in the concordance LCDM model
Pivato, Maximiliano C.; Padilla, Nelson D.; Lambas, Diego G.
2005-01-01
We perform statistical analyses of the infall of dark-matter onto clusters in numerical simulations within the concordance LCDM model. By studying the infall profile around clusters of different mass, we find a linear relation between the maximum infall velocity and mass which reach 900km/s for the most massive groups. The maximum infall velocity and the group mass follow a suitable power law fit of the form, V_{inf}^{max} = (M/m_0)^{gamma}. By comparing the measured infall velocity to the li...
Huang, Yangxin; Lu, Xiaosun; Chen, Jiaqing; Liang, Juan; Zangmeister, Miriam
2017-10-27
Longitudinal and time-to-event data are often observed together. Finite mixture models are currently used to analyze nonlinear heterogeneous longitudinal data, which, by releasing the homogeneity restriction of nonlinear mixed-effects (NLME) models, can cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, and be associated with clinically important time-to-event data. This article develops a joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework. The proposed joint models and method are applied to a real AIDS clinical trial data set, followed by simulation studies to assess the performance of the proposed joint model and a naive two-step model, in which finite mixture model and Cox model are fitted separately.
Modeling blue stragglers in young clusters
International Nuclear Information System (INIS)
Lu Pin; Deng Licai; Zhang Xiaobin
2011-01-01
A grid of binary evolution models are calculated for the study of a blue straggler (BS) population in intermediate age (log Age = 7.85–8.95) star clusters. The BS formation via mass transfer and merging is studied systematically using our models. Both Case A and B close binary evolutionary tracks are calculated for a large range of parameters. The results show that BSs formed via Case B are generally bluer and even more luminous than those produced by Case A. Furthermore, the larger range in orbital separations of Case B models provides a probability of producing more BSs than in Case A. Based on the grid of models, several Monte-Carlo simulations of BS populations in the clusters in the age range are carried out. The results show that BSs formed via different channels populate different areas in the color magnitude diagram (CMD). The locations of BSs in CMD for a number of clusters are compared to our simulations as well. In order to investigate the influence of mass transfer efficiency in the models and simulations, a set of models is also calculated by implementing a constant mass transfer efficiency, β = 0.5, during Roche lobe overflow (Case A binary evolution excluded). The result shows BSs can be formed via mass transfer at any given age in both cases. However, the distributions of the BS populations on CMD are different.
2014-01-01
Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations
Semi-supervised consensus clustering for gene expression data analysis
Wang, Yunli; Pan, Youlian
2014-01-01
Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...
MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS
Directory of Open Access Journals (Sweden)
Jana Halčinová
2014-06-01
Full Text Available The aim of the present article is to show the possibility of using the methods of cluster analysis in classification of stocks of finished products. Cluster analysis creates groups (clusters of finished products according to similarity in demand i.e. customer requirements for each product. Manner stocks sorting of finished products by clusters is described a practical example. The resultants clusters are incorporated into the draft layout of the distribution warehouse.
Advanced analysis of forest fire clustering
Kanevski, Mikhail; Pereira, Mario; Golay, Jean
2017-04-01
Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index
Cluster Analysis in Rapeseed (Brassica Napus L.)
International Nuclear Information System (INIS)
Mahasi, J.M
2002-01-01
With widening edible deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc) can be grown in Kenya. But oilseed rape is preferred because it very high yielding (1.5 tons-4.0 tons/ha) with oil content of 42-46%. Other uses include fitting in various cropping systems as; relay/inter crops, rotational crops, trap crops and fodder. It is soft seeded hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities if any among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine Canadian origin and eight of European origin) grown at 4 locations namely Endebess, Njoro, Timau and Mau Narok in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and based on this analysis; only one major group exists within the material. In 1992, varieties 2,3,8 and 9 didn't fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2,3 and 9) were however not classified with the others. In 1994, varieties 10 and 6 didn't fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than 1992 due to favorable weather. It is evident that, genotypes from different geographical
Wang, Yishu; Zhao, Hongyu; Deng, Minghua; Fang, Huaying; Yang, Dejie
2017-08-24
Epistatic miniarrary profile (EMAP) studies have enabled the mapping of large-scale genetic interaction networks and generated large amounts of data in model organisms. It provides an incredible set of molecular tools and advanced technologies that should be efficiently understanding the relationship between the genotypes and phenotypes of individuals. However, the network information gained from EMAP cannot be fully exploited using the traditional statistical network models. Because the genetic network is always heterogeneous, for example, the network structure features for one subset of nodes are different from those of the left nodes. Exponentialfamily random graph models (ERGMs) are a family of statistical models, which provide a principled and flexible way to describe the structural features (e.g. the density, centrality and assortativity) of an observed network. However, the single ERGM is not enough to capture this heterogeneity of networks. In this paper, we consider a mixture ERGM (MixtureEGRM) networks, which model a network with several communities, where each community is described by a single EGRM.
Outcome-Driven Cluster Analysis with Application to Microarray Data.
Directory of Open Access Journals (Sweden)
Jessie J Hsu
Full Text Available One goal of cluster analysis is to sort characteristics into groups (clusters so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes into groups of highly correlated genes that have the same effect on the outcome (recovery. We propose a random effects model where the genes within each group (cluster equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
Tweets clustering using latent semantic analysis
Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul
2017-04-01
Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter users are allowed to broadcast a short message called as `tweet". In this study, we extract tweets related to MH370 for certain of time. In this paper, we present overview of our approach for tweets clustering to analyze the users' responses toward tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets that response to MH370 tragedy which is emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.
Cluster Analysis as an Analytical Tool of Population Policy
Directory of Open Access Journals (Sweden)
Oksana Mikhaylovna Shubat
2017-12-01
Full Text Available The predicted negative trends in Russian demography (falling birth rates, population decline actualize the need to strengthen measures of family and population policy. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance for segmentation of Russian regions. Clustering is based on four variables, which allowed assessing the family institution in the region. The authors used the data of Federal State Statistics Service from 2010 to 2015. Clustering and profiling of each segment has allowed forming a model of Russian regions depending on the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures in each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution in the regions. This suggests that a unified approach to population problems’ solving is far from being effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could develop the adaptation of multidimensional classification methods to the analysis of the population problems in Russian regions. In particular, the algorithms of nonparametric cluster analysis may be of relevance in future studies.
On the shell model connection of the cluster model
International Nuclear Information System (INIS)
Cseh, J.; Levai, G.; Kato, K.
2000-01-01
Complete text of publication follows. The interrelation of basic nuclear structure models is a longstanding problem. The connection between the spherical shell model and the quadrupole collective model has been studied extensively, and symmetry considerations proved to be especially useful in this respect. A collective band was interpreted in the shell model language long ago as a set of states (of the valence nucleons) with a specific SU(3) symmetry. Furthermore, the energies of these rotational states are obtained to a good approximation as eigenvalues of an SU(3) dynamically symmetric shell model Hamiltonian. On the other hand the relation of the shell model and cluster model is less well explored. The connection of the harmonic oscillator (i.e. SU(3)) bases of the two approaches is known, but it was established only for the unrealistic harmonic oscillator interactions. Here we investigate the question: Can an SU(3) dynamically symmetric interaction provide a similar connection between the spherical shell model and the cluster model, like the one between the shell and collective models? In other words: whether or not the energy of the states of the cluster bands, defined by a specific SU(3) symmetries, can be obtained from a shell model Hamiltonian (with SU(3) dynamical symmetry). We carried out calculations within the framework of the semimicroscopic algebraic cluster model, in which not only the cluster model space is obtained from the full shell model space by an SU(3) symmetry-dictated truncation, but SU(3) dynamically symmetric interactions are also applied. Actually, Hamiltonians of this kind proved to be successful in describing the gross features of cluster states in a wide energy range. The novel feature of the present work is that we apply exclusively shell model interactions. The energies obtained from such a Hamiltonian for several bands of the ( 12 C, 14 C, 16 O, 20 Ne, 40 Ca) + α systems turn out to be in good agreement with the experimental
Statistical mechanics of the cluster Ising model
International Nuclear Information System (INIS)
Smacchia, Pietro; Amico, Luigi; Facchi, Paolo; Fazio, Rosario; Florio, Giuseppe; Pascazio, Saverio; Vedral, Vlatko
2011-01-01
We study a Hamiltonian system describing a three-spin-1/2 clusterlike interaction competing with an Ising-like antiferromagnetic interaction. We compute free energy, spin-correlation functions, and entanglement both in the ground and in thermal states. The model undergoes a quantum phase transition between an Ising phase with a nonvanishing magnetization and a cluster phase characterized by a string order. Any two-spin entanglement is found to vanish in both quantum phases because of a nontrivial correlation pattern. Nevertheless, the residual multipartite entanglement is maximal in the cluster phase and dependent on the magnetization in the Ising phase. We study the block entropy at the critical point and calculate the central charge of the system, showing that the criticality of the system is beyond the Ising universality class.
Statistical mechanics of the cluster Ising model
Energy Technology Data Exchange (ETDEWEB)
Smacchia, Pietro [SISSA - via Bonomea 265, I-34136, Trieste (Italy); Amico, Luigi [CNR-MATIS-IMM and Dipartimento di Fisica e Astronomia Universita di Catania, C/O ed. 10, viale Andrea Doria 6, I-95125 Catania (Italy); Facchi, Paolo [Dipartimento di Matematica and MECENAS, Universita di Bari, I-70125 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); Fazio, Rosario [NEST, Scuola Normale Superiore and Istituto Nanoscienze - CNR, 56126 Pisa (Italy); Center for Quantum Technology, National University of Singapore, 117542 Singapore (Singapore); Florio, Giuseppe; Pascazio, Saverio [Dipartimento di Fisica and MECENAS, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); Vedral, Vlatko [Center for Quantum Technology, National University of Singapore, 117542 Singapore (Singapore); Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117542 (Singapore); Department of Physics, University of Oxford, Clarendon Laboratory, Oxford, OX1 3PU (United Kingdom)
2011-08-15
We study a Hamiltonian system describing a three-spin-1/2 clusterlike interaction competing with an Ising-like antiferromagnetic interaction. We compute free energy, spin-correlation functions, and entanglement both in the ground and in thermal states. The model undergoes a quantum phase transition between an Ising phase with a nonvanishing magnetization and a cluster phase characterized by a string order. Any two-spin entanglement is found to vanish in both quantum phases because of a nontrivial correlation pattern. Nevertheless, the residual multipartite entanglement is maximal in the cluster phase and dependent on the magnetization in the Ising phase. We study the block entropy at the critical point and calculate the central charge of the system, showing that the criticality of the system is beyond the Ising universality class.
Clustering properties of dynamical dark energy models
International Nuclear Information System (INIS)
Avelino, P. P.; Beca, L. M. G.; Martins, C. J. A. P.
2008-01-01
We provide a generic but physically clear discussion of the clustering properties of dark energy models. We explicitly show that in quintessence-type models the dark energy fluctuations, on scales smaller than the Hubble radius, are of the order of the perturbations to the Newtonian gravitational potential, hence necessarily small on cosmological scales. Moreover, comparable fluctuations are associated with different gauge choices. We also demonstrate that the often used homogeneous approximation is unrealistic, and that the so-called dark energy mutation is a trivial artifact of an effective, single fluid description. Finally, we discuss the particular case where the dark energy fluid is nonminimally coupled to dark matter
The Quantitative Analysis of Chennai Automotive Industry Cluster
Bhaskaran, Ethirajan
2016-07-01
Chennai, also called as Detroit of India due to presence of Automotive Industry producing over 40 % of the India's vehicle and components. During 2001-2002, the Automotive Component Industries (ACI) in Ambattur, Thirumalizai and Thirumudivakkam Industrial Estate, Chennai has faced problems on infrastructure, technology, procurement, production and marketing. The objective is to study the Quantitative Performance of Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology adopted is collection of primary data from 100 ACI using quantitative questionnaire and analyzing using Correlation Analysis (CA), Regression Analysis (RA), Friedman Test (FMT), and Kruskall Wallis Test (KWT).The CA computed for the different set of variables reveals that there is high degree of relationship between the variables studied. The RA models constructed establish the strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. KWT proves, there is no significant difference between three locations clusters with respect to: Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports that each location has contributed for development of automobile component cluster uniformly. The FMT proves, there is no significant difference between industrial units in respect of cost like Production, Infrastructure, Technology, Marketing and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting CDA and now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of under developed and developing countries for cost reduction and productivity
Multisource Images Analysis Using Collaborative Clustering
Directory of Open Access Journals (Sweden)
Pierre Gançarski
2008-04-01
Full Text Available The development of very high-resolution (VHR satellite imagery has produced a huge amount of data. The multiplication of satellites which embed different types of sensors provides a lot of heterogeneous images. Consequently, the image analyst has often many different images available, representing the same area of the Earth surface. These images can be from different dates, produced by different sensors, or even at different resolutions. The lack of machine learning tools using all these representations in an overall process constraints to a sequential analysis of these various images. In order to use all the information available simultaneously, we propose a framework where different algorithms can use different views of the scene. Each one works on a different remotely sensed image and, thus, produces different and useful information. These algorithms work together in a collaborative way through an automatic and mutual refinement of their results, so that all the results have almost the same number of clusters, which are statistically similar. Finally, a unique result is produced, representing a consensus among the information obtained by each clustering method on its own image. The unified result and the complementarity of the single results (i.e., the agreement between the clustering methods as well as the disagreement lead to a better understanding of the scene. The experiments carried out on multispectral remote sensing images have shown that this method is efficient to extract relevant information and to improve the scene understanding.
Hierarchical modeling of cluster size in wildlife surveys
Royle, J. Andrew
2008-01-01
Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between delectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
Rapidity correlations at fixed multiplicity in cluster emission models
Berger, M C
1975-01-01
Rapidity correlations in the central region among hadrons produced in proton-proton collisions of fixed final state multiplicity n at NAL and ISR energies are investigated in a two-step framework in which clusters of hadrons are emitted essentially independently, via a multiperipheral-like model, and decay isotropically. For n>or approximately=/sup 1///sub 2/(n), these semi-inclusive distributions are controlled by the reaction mechanism which dominates production in the central region. Thus, data offer cleaner insight into the properties of this mechanism than can be obtained from fully inclusive spectra. A method of experimental analysis is suggested to facilitate the extraction of new dynamical information. It is shown that the n independence of the magnitude of semi-inclusive correlation functions reflects directly the structure of the internal cluster multiplicity distribution. This conclusion is independent of certain assumptions concerning the form of the single cluster density in rapidity space. (23 r...
Three-Dimensional Modeling of Fracture Clusters in Geothermal Reservoirs
Energy Technology Data Exchange (ETDEWEB)
Ghassemi, Ahmad [Univ. of Oklahoma, Norman, OK (United States)
2017-08-11
The objective of this is to develop a 3-D numerical model for simulating mode I, II, and III (tensile, shear, and out-of-plane) propagation of multiple fractures and fracture clusters to accurately predict geothermal reservoir stimulation using the virtual multi-dimensional internal bond (VMIB). Effective development of enhanced geothermal systems can significantly benefit from improved modeling of hydraulic fracturing. In geothermal reservoirs, where the temperature can reach or exceed 350oC, thermal and poro-mechanical processes play an important role in fracture initiation and propagation. In this project hydraulic fracturing of hot subsurface rock mass will be numerically modeled by extending the virtual multiple internal bond theory and implementing it in a finite element code, WARP3D, a three-dimensional finite element code for solid mechanics. The new constitutive model along with the poro-thermoelastic computational algorithms will allow modeling the initiation and propagation of clusters of fractures, and extension of pre-existing fractures. The work will enable the industry to realistically model stimulation of geothermal reservoirs. The project addresses the Geothermal Technologies Office objective of accurately predicting geothermal reservoir stimulation (GTO technology priority item). The project goal will be attained by: (i) development of the VMIB method for application to 3D analysis of fracture clusters; (ii) development of poro- and thermoelastic material sub-routines for use in 3D finite element code WARP3D; (iii) implementation of VMIB and the new material routines in WARP3D to enable simulation of clusters of fractures while accounting for the effects of the pore pressure, thermal stress and inelastic deformation; (iv) simulation of 3D fracture propagation and coalescence and formation of clusters, and comparison with laboratory compression tests; and (v) application of the model to interpretation of injection experiments (planned by our
Constructing storyboards based on hierarchical clustering analysis
Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu
2005-07-01
There are growing needs for quick preview of video contents for the purpose of improving accessibility of video archives as well as reducing network traffics. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoid a repetition of computationally-intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
Determining characteristic principal clusters in the “cluster-plus-glue-atom” model
International Nuclear Information System (INIS)
Du, Jinglian; Wen, Bin; 2NeT Lab, Wilfrid Laurier University, Waterloo, 75 University Ave West, Ontario N2L 3C5 (Canada))" data-affiliation=" (M2NeT Lab, Wilfrid Laurier University, Waterloo, 75 University Ave West, Ontario N2L 3C5 (Canada))" >Melnik, Roderick; Kawazoe, Yoshiyuki
2014-01-01
The “cluster-plus-glue-atom” model can easily describe the structure of complex metallic alloy phases. However, the biggest obstacle limiting the application of this model is that it is difficult to determine the characteristic principal cluster. In the case when interatomic force constants (IFCs) inside the cluster lead to stronger interaction than the interaction between the clusters, a new rule for determining the characteristic principal cluster in the “cluster-plus-glue-atom” model has been proposed on the basis of IFCs. To verify this new rule, the alloy phases in Cu–Zr and Al–Ni–Zr systems have been tested, and our results indicate that the present new rule for determining characteristic principal clusters is effective and reliable
Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.
Directory of Open Access Journals (Sweden)
Jeban Ganesalingam
2009-09-01
Full Text Available Amyotrophic lateral sclerosis (ALS is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes.Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method.The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001. Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb and time from symptom onset to diagnosis (p<0.00001.The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and generating more homogeneous disease groups for genetic, proteomic and risk factor research.
Cluster Analysis of Maize Inbred Lines
Directory of Open Access Journals (Sweden)
Jiban Shrestha
2016-12-01
Full Text Available The determination of diversity among inbred lines is important for heterosis breeding. Sixty maize inbred lines were evaluated for their eight agro morphological traits during winter season of 2011 to analyze their genetic diversity. Clustering was done by average linkage method. The inbred lines were grouped into six clusters. Inbred lines grouped into Clusters II had taller plants with maximum number of leaves. The cluster III was characterized with shorter plants with minimum number of leaves. The inbred lines categorized into cluster V had early flowering whereas the group into cluster VI had late flowering time. The inbred lines grouped into the cluster III were characterized by higher value of anthesis silking interval (ASI and those of cluster VI had lower value of ASI. These results showed that the inbred lines having widely divergent clusters can be utilized in hybrid breeding programme.
Latent Clustering Models for Outlier Identification in Telecom Data
Directory of Open Access Journals (Sweden)
Ye Ouyang
2016-01-01
Full Text Available Collected telecom data traffic has boomed in recent years, due to the development of 4G mobile devices and other similar high-speed machines. The ability to quickly identify unexpected traffic data in this stream is critical for mobile carriers, as it can be caused by either fraudulent intrusion or technical problems. Clustering models can help to identify issues by showing patterns in network data, which can quickly catch anomalies and highlight previously unseen outliers. In this article, we develop and compare clustering models for telecom data, focusing on those that include time-stamp information management. Two main models are introduced, solved in detail, and analyzed: Gaussian Probabilistic Latent Semantic Analysis (GPLSA and time-dependent Gaussian Mixture Models (time-GMM. These models are then compared with other different clustering models, such as Gaussian model and GMM (which do not contain time-stamp information. We perform computation on both sample and telecom traffic data to show that the efficiency and robustness of GPLSA make it the superior method to detect outliers and provide results automatically with low tuning parameters or expertise requirement.
Cluster analysis of rural, urban, and curbside atmospheric particle size data.
Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M
2009-07-01
Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from an option of four partitional cluster packages, namelythe following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal-diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according their modal-diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.
On the shell-model-connection of the cluster model
International Nuclear Information System (INIS)
Cseh, J.
2000-01-01
Complete text of publication follows. The interrelation of basic nuclear structure models is a longstanding problem. The connection between the spherical shell model and the quadrupole collective model has been studied extensively, and symmetry considerations proved to be especially useful in this respect. A collective band was interpreted in the shell model language long ago [1] as a set of states (of the valence nucleons) with a specific SU(3) symmetry. Furthermore, the energies of these rotational states are obtained to a good approximation as eigenvalues of an SU(3) dynamically symmetric shell model Hamiltonian. On the other hand the relation of the shell model and cluster model is less well explored. The connection of the harmonic oscillator (i.e. SU(3)) bases of the two approaches is known [2] but it was established only for the unrealistic harmonic oscillator interactions. Here we investigate the question: Can an SU(3) dynamically symmetric interaction provide a similar connection between the spherical shell model and the cluster model, like the one between the shell and collective models? In other words: whether or not the energy of the states of the cluster bands, defined by a specific SU(3) symmetries, can be obtained from a shell model Hamiltonian (with SU(3) dynamical symmetry). We carried out calculations within the framework of the semimicroscopic algebraic cluster model [3,4] in order to find an answer to this question, which seems to be affirmative. In particular, the energies obtained from such a Hamiltonian for several bands of the ( 12 C, 14 C, 16 O, 20 Ne, 40 Ca) + α systems turn out to be in good agreement with the experimental values. The present results show that the simple and transparent SU(3) connection between the spherical shell model and the cluster model is valid not only for the harmonic oscillator interactions, but for much more general (SU(3) dynamically symmetric) Hamiltonians as well, which result in realistic energy spectra. Via
The Productivity Analysis of Chennai Automotive Industry Cluster
Bhaskaran, E.
2014-07-01
Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is using Data Envelopment Analysis of Output Oriented Banker Charnes Cooper model by taking net worth, fixed assets, employment as inputs and gross output as outputs. The non-zero represents the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets, employment and shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.
blockcluster: An R Package for Model-Based Co-Clustering
Directory of Open Access Journals (Sweden)
Parmeet Singh Bhatia
2017-02-01
Full Text Available Simultaneous clustering of rows and columns, usually designated by bi-clustering, coclustering or block clustering, is an important technique in two way data analysis. A new standard and efficient approach has been recently proposed based on the latent block model (Govaert and Nadif 2003 which takes into account the block clustering problem on both the individual and variable sets. This article presents our R package blockcluster for co-clustering of binary, contingency and continuous data based on these very models. In this document, we will give a brief review of the model-based block clustering methods, and we will show how the R package blockcluster can be used for co-clustering.
Collaborative filtering recommendation model based on fuzzy clustering algorithm
Yang, Ye; Zhang, Yunhua
2018-05-01
As one of the most widely used algorithms in recommender systems, collaborative filtering algorithm faces two serious problems, which are the sparsity of data and poor recommendation effect in big data environment. In traditional clustering analysis, the object is strictly divided into several classes and the boundary of this division is very clear. However, for most objects in real life, there is no strict definition of their forms and attributes of their class. Concerning the problems above, this paper proposes to improve the traditional collaborative filtering model through the hybrid optimization of implicit semantic algorithm and fuzzy clustering algorithm, meanwhile, cooperating with collaborative filtering algorithm. In this paper, the fuzzy clustering algorithm is introduced to fuzzy clustering the information of project attribute, which makes the project belong to different project categories with different membership degrees, and increases the density of data, effectively reduces the sparsity of data, and solves the problem of low accuracy which is resulted from the inaccuracy of similarity calculation. Finally, this paper carries out empirical analysis on the MovieLens dataset, and compares it with the traditional user-based collaborative filtering algorithm. The proposed algorithm has greatly improved the recommendation accuracy.
Modelling of heterogeneous clustering in aluminium
International Nuclear Information System (INIS)
Smith, A.E.; Bourgeois, L.; Nie, J.-F.; Muddle, B.C.
2003-01-01
Full text: Ab initio modelling of heterogeneous clustering in aluminium has been carried out in order to study the precipitation hardening of alloys. This process is based on the addition of small amounts of solute element to the pure metal. With increasing computational power, atomic scale effects can now be better simulated to determine the nature of the hardening mechanism. Comparisons are made between results obtained from two computational packages. These are the Linear Augmented Plane Wave WEEN2K and the plane wave pseudopotential density functional theory package fhi98md. The study of the optimal geometry of very small size clusters inside aluminium has begun with the testing of initial convergence conditions by determination of binding energies for a variety of super cell sizes of the aluminium host crystal. These are compared with total energy calculations for small size precipitates of copper and transition metals of fixed geometry. Such local optimal determinations are seen as precursors to full Monte Carlo calculations of the notional best local geometry for larger precipitates
Phase Transitions in Algebraic Cluster Models
International Nuclear Information System (INIS)
Yepez-Martinez, H.; Cseh, J.; Hess, P.O.
2006-01-01
same, and the states are said to form a (soft) band. The phase-transitions, as well as the persistence of the quasidynamical symmetries in the algebraic models of quadrupole collectivity have extensively been studied. In a recent work [1] we have addressed these questions in relation with another important collectivity of nuclei, i.e. clusterization. Two models were considered, a phenomenological one, containing no Pauli-principle, and a semimicroscopic one, which is based on a microscopically determined model space, being free from the Pauli-forbidden states. The interactions were treated in a phenomenologic and algebraic way in both cases. In this respect the two models have a similar group-structure. We have studied the SU(3) - SO(4) phase transition, related to the description of the relative motion in terms of the vibron model (in its simplest form in the phenomenological model and in a properly truncated form in the semimicroscopic description). The analytical study of the large-N limit of both models shows a first order phase transition. We have carried out numerical calculations as well. Three binary cluster systems were chosen, in which the number of open-shell clusters were zero, one and two, respectively. The numerical studies show that the phase transition is smoothed out for finite N systems, but some fingerprints of it still can be seen. The appearance of the quasidynamical SU(3) symmetry has also been studied, when moving away from the limit of the real SU(3) dynamical symmetry. It turned out that in each case, when there is a real dynamical symmetry in the limiting case (in the sense that a well-defined SU(3) quantum number can be associated to a band), this symmetry survives as quasidynamical symmetry at least up to the critical value of the control parameter. (author)
HORIZONTAL BRANCH MORPHOLOGY OF GLOBULAR CLUSTERS: A MULTIVARIATE STATISTICAL ANALYSIS
International Nuclear Information System (INIS)
Jogesh Babu, G.; Chattopadhyay, Tanuka; Chattopadhyay, Asis Kumar; Mondal, Saptarshi
2009-01-01
The proper interpretation of horizontal branch (HB) morphology is crucial to the understanding of the formation history of stellar populations. In the present study a multivariate analysis is used (principal component analysis) for the selection of appropriate HB morphology parameter, which, in our case, is the logarithm of effective temperature extent of the HB (log T effHB ). Then this parameter is expressed in terms of the most significant observed independent parameters of Galactic globular clusters (GGCs) separately for coherent groups, obtained in a previous work, through a stepwise multiple regression technique. It is found that, metallicity ([Fe/H]), central surface brightness (μ v ), and core radius (r c ) are the significant parameters to explain most of the variations in HB morphology (multiple R 2 ∼ 0.86) for GGC elonging to the bulge/disk while metallicity ([Fe/H]) and absolute magnitude (M v ) are responsible for GGC belonging to the inner halo (multiple R 2 ∼ 0.52). The robustness is tested by taking 1000 bootstrap samples. A cluster analysis is performed for the red giant branch (RGB) stars of the GGC belonging to Galactic inner halo (Cluster 2). A multi-episodic star formation is preferred for RGB stars of GGC belonging to this group. It supports the asymptotic giant branch (AGB) model in three episodes instead of two as suggested by Carretta et al. for halo GGC while AGB model is suggested to be revisited for bulge/disk GGC.
Cluster decay analysis and related structure effects of fissionable ...
Indian Academy of Sciences (India)
2015-08-01
Aug 1, 2015 ... Collective clusterization approach of dynamical cluster decay model (DCM) has been ... fusion–fission process resulting in the emission of symmetric and/or ... represents the relative separation distance between two fragments or clusters ... decay constant λ or decay half-life T1/2 is defined as λ = (ln 2/T1/2) ...
Analysis of RXTE data on Clusters of Galaxies
Petrosian, Vahe
2004-01-01
This grant provided support for the reduction, analysis and interpretation of of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJO658--5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the imiverse and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons which are known to be present in the clusters and be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, will have HXR signal comparable to the other nearby ones. The planned observation of a relatively The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthen the case for the Compton scattering model. We intend the data obtained via this observation to be a part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits) on the strength of the hard component. With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instrument
Comparing clustering models in bank customers: Based on Fuzzy relational clustering approach
Directory of Open Access Journals (Sweden)
Ayad Hendalianpour
2016-11-01
Full Text Available Clustering is absolutely useful information to explore data structures and has been employed in many places. It organizes a set of objects into similar groups called clusters, and the objects within one cluster are both highly similar and dissimilar with the objects in other clusters. The K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms are the most popular clustering algorithms for their easy implementation and fast work, but in some cases we cannot use these algorithms. Regarding this, in this paper, a hybrid model for customer clustering is presented that is applicable in five banks of Fars Province, Shiraz, Iran. In this way, the fuzzy relation among customers is defined by using their features described in linguistic and quantitative variables. As follows, the customers of banks are grouped according to K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms and the proposed Fuzzy Relation Clustering (FRC algorithm. The aim of this paper is to show how to choose the best clustering algorithms based on density-based clustering and present a new clustering algorithm for both crisp and fuzzy variables. Finally, we apply the proposed approach to five datasets of customer's segmentation in banks. The result of the FCR shows the accuracy and high performance of FRC compared other clustering methods.
Experimental Tests of the Algebraic Cluster Model
Gai, Moshe
2018-02-01
The Algebraic Cluster Model (ACM) of Bijker and Iachello that was proposed already in 2000 has been recently applied to 12C and 16O with much success. We review the current status in 12C with the outstanding observation of the ground state rotational band composed of the spin-parity states of: 0+, 2+, 3-, 4± and 5-. The observation of the 4± parity doublet is a characteristic of (tri-atomic) molecular configuration where the three alpha- particles are arranged in an equilateral triangular configuration of a symmetric spinning top. We discuss future measurement with electron scattering, 12C(e,e’) to test the predicted B(Eλ) of the ACM.
Parameters of oscillation generation regions in open star cluster models
Danilov, V. M.; Putkov, S. I.
2017-07-01
We determine the masses and radii of central regions of open star cluster (OCL) models with small or zero entropy production and estimate the masses of oscillation generation regions in clustermodels based on the data of the phase-space coordinates of stars. The radii of such regions are close to the core radii of the OCL models. We develop a new method for estimating the total OCL masses based on the cluster core mass, the cluster and cluster core radii, and radial distribution of stars. This method yields estimates of dynamical masses of Pleiades, Praesepe, and M67, which agree well with the estimates of the total masses of the corresponding clusters based on proper motions and spectroscopic data for cluster stars.We construct the spectra and dispersion curves of the oscillations of the field of azimuthal velocities v φ in OCL models. Weak, low-amplitude unstable oscillations of v φ develop in cluster models near the cluster core boundary, and weak damped oscillations of v φ often develop at frequencies close to the frequencies of more powerful oscillations, which may reduce the non-stationarity degree in OCL models. We determine the number and parameters of such oscillations near the cores boundaries of cluster models. Such oscillations points to the possible role that gradient instability near the core of cluster models plays in the decrease of the mass of the oscillation generation regions and production of entropy in the cores of OCL models with massive extended cores.
Exactly soluble models for surface partition of large clusters
International Nuclear Information System (INIS)
Bugaev, K.A.; Bugaev, K.A.; Elliott, J.B.
2007-01-01
The surface partition of large clusters is studied analytically within a framework of the 'Hills and Dales Model'. Three formulations are solved exactly by using the Laplace-Fourier transformation method. In the limit of small amplitude deformations, the 'Hills and Dales Model' gives the upper and lower bounds for the surface entropy coefficient of large clusters. The found surface entropy coefficients are compared with those of large clusters within the 2- and 3-dimensional Ising models
Statistical analysis of the spatial distribution of galaxies and clusters
International Nuclear Information System (INIS)
Cappi, Alberto
1993-01-01
This thesis deals with the analysis of the distribution of galaxies and clusters, describing some observational problems and statistical results. First chapter gives a theoretical introduction, aiming to describe the framework of the formation of structures, tracing the history of the Universe from the Planck time, t_p = 10"-"4"3 sec and temperature corresponding to 10"1"9 GeV, to the present epoch. The most usual statistical tools and models of the galaxy distribution, with their advantages and limitations, are described in chapter two. A study of the main observed properties of galaxy clustering, together with a detailed statistical analysis of the effects of selecting galaxies according to apparent magnitude or diameter, is reported in chapter three. Chapter four delineates some properties of groups of galaxies, explaining the reasons of discrepant results on group distributions. Chapter five is a study of the distribution of galaxy clusters, with different statistical tools, like correlations, percolation, void probability function and counts in cells; it is found the same scaling-invariant behaviour of galaxies. Chapter six describes our finding that rich galaxy clusters too belong to the fundamental plane of elliptical galaxies, and gives a discussion of its possible implications. Finally chapter seven reviews the possibilities offered by multi-slit and multi-fibre spectrographs, and I present some observational work on nearby and distant galaxy clusters. In particular, I show the opportunities offered by ongoing surveys of galaxies coupled with multi-object fibre spectrographs, focusing on the ESO Key Programme A galaxy redshift survey in the south galactic pole region to which I collaborate and on MEFOS, a multi-fibre instrument with automatic positioning. Published papers related to the work described in this thesis are reported in the last appendix. (author) [fr
An analysis of hospital brand mark clusters.
Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan
2010-07-01
This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.
Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario
2011-01-17
Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.
Smartness and Italian Cities. A Cluster Analysis
Directory of Open Access Journals (Sweden)
Flavio Boscacci
2014-05-01
Full Text Available Smart cities have been recently recognized as the most pleasing and attractive places to live in; due to this, both scholars and policy-makers pay close attention to this topic. Specifically, urban “smartness” has been identified by plenty of characteristics that can be grouped into six dimensions (Giffinger et al. 2007: smart Economy (competitiveness, smart People (social and human capital, smart Governance (participation, smart Mobility (both ICTs and transport, smart Environment (natural resources, and smart Living (quality of life. According to this analytical framework, in the present paper the relation between urban attractiveness and the “smart” characteristics has been investigated in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, a descriptive statistics has been followed by a regression analysis (OLS, where the dependent variable measuring the urban attractiveness has been proxied by housing market prices. Besides, a Cluster Analysis (CA has been developed in order to find differences and commonalities among the province capitals.The OLS results indicate that living, people and economy are the key drivers for achieving a better urban attractiveness. Environment, instead, keeps on playing a minor role. Besides, the CA groups the province capitals a
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Cluster-cluster correlations in the two-dimensional stationary Ising-model
International Nuclear Information System (INIS)
Klassmann, A.
1997-01-01
In numerical integration of the Cahn-Hillard equation, which describes Oswald rising in a two-phase matrix, N. Masbaum showed that spatial correlations between clusters scale with respect to the mean cluster size (itself a function of time). T. B. Liverpool showed by Monte Carlo simulations for the Ising model that the analogous correlations have a similar form. Both demonstrated that immediately around each cluster there is some depletion area followed by something like a ring of clusters of the same size as the original one. More precisely, it has been shown that the distribution of clusters around a given cluster looks like a sinus-curve decaying exponentially with respect to the distance to a constant value
Radiobiological analyse based on cell cluster models
International Nuclear Information System (INIS)
Lin Hui; Jing Jia; Meng Damin; Xu Yuanying; Xu Liangfeng
2010-01-01
The influence of cell cluster dimension on EUD and TCP for targeted radionuclide therapy was studied using the radiobiological method. The radiobiological features of tumor with activity-lack in core were evaluated and analyzed by associating EUD, TCP and SF.The results show that EUD will increase with the increase of tumor dimension under the activity homogeneous distribution. If the extra-cellular activity was taken into consideration, the EUD will increase 47%. Under the activity-lack in tumor center and the requirement of TCP=0.90, the α cross-fire influence of 211 At could make up the maximum(48 μm)3 activity-lack for Nucleus source, but(72 μm)3 for Cytoplasm, Cell Surface, Cell and Voxel sources. In clinic,the physician could prefer the suggested dose of Cell Surface source in case of the future of local tumor control for under-dose. Generally TCP could well exhibit the effect difference between under-dose and due-dose, but not between due-dose and over-dose, which makes TCP more suitable for the therapy plan choice. EUD could well exhibit the difference between different models and activity distributions,which makes it more suitable for the research work. When the user uses EUD to study the influence of activity inhomogeneous distribution, one should keep the consistency of the configuration and volume of the former and the latter models. (authors)
Liu, Fang; Cao, San-xing; Lu, Rui
2012-04-01
This paper proposes a user credit assessment model based on clustering ensemble aiming to solve the problem that users illegally spread pirated and pornographic media contents within the user self-service oriented broadband network new media platforms. Its idea is to do the new media user credit assessment by establishing indices system based on user credit behaviors, and the illegal users could be found according to the credit assessment results, thus to curb the bad videos and audios transmitted on the network. The user credit assessment model based on clustering ensemble proposed by this paper which integrates the advantages that swarm intelligence clustering is suitable for user credit behavior analysis and K-means clustering could eliminate the scattered users existed in the result of swarm intelligence clustering, thus to realize all the users' credit classification automatically. The model's effective verification experiments are accomplished which are based on standard credit application dataset in UCI machine learning repository, and the statistical results of a comparative experiment with a single model of swarm intelligence clustering indicates this clustering ensemble model has a stronger creditworthiness distinguishing ability, especially in the aspect of predicting to find user clusters with the best credit and worst credit, which will facilitate the operators to take incentive measures or punitive measures accurately. Besides, compared with the experimental results of Logistic regression based model under the same conditions, this clustering ensemble model is robustness and has better prediction accuracy.
An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.
Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei
2013-05-01
Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.
MOCK OBSERVATIONS OF BLUE STRAGGLERS IN GLOBULAR CLUSTER MODELS
International Nuclear Information System (INIS)
Sills, Alison; Glebbeek, Evert; Chatterjee, Sourav; Rasio, Frederic A.
2013-01-01
We created artificial color-magnitude diagrams of Monte Carlo dynamical models of globular clusters and then used observational methods to determine the number of blue stragglers in those clusters. We compared these blue stragglers to various cluster properties, mimicking work that has been done for blue stragglers in Milky Way globular clusters to determine the dominant formation mechanism(s) of this unusual stellar population. We find that a mass-based prescription for selecting blue stragglers will select approximately twice as many blue stragglers than a selection criterion that was developed for observations of real clusters. However, the two numbers of blue stragglers are well-correlated, so either selection criterion can be used to characterize the blue straggler population of a cluster. We confirm previous results that the simplified prescription for the evolution of a collision or merger product in the BSE code overestimates their lifetimes. We show that our model blue stragglers follow similar trends with cluster properties (core mass, binary fraction, total mass, collision rate) as the true Milky Way blue stragglers as long as we restrict ourselves to model clusters with an initial binary fraction higher than 5%. We also show that, in contrast to earlier work, the number of blue stragglers in the cluster core does have a weak dependence on the collisional parameter Γ in both our models and in Milky Way globular clusters
International Nuclear Information System (INIS)
Whitehead, Alfred J.; McMillan, Stephen L. W.; Vesperini, Enrico; Portegies Zwart, Simon
2013-01-01
We perform a series of simulations of evolving star clusters using the Astrophysical Multipurpose Software Environment (AMUSE), a new community-based multi-physics simulation package, and compare our results to existing work. These simulations model a star cluster beginning with a King model distribution and a selection of power-law initial mass functions and contain a tidal cutoff. They are evolved using collisional stellar dynamics and include mass loss due to stellar evolution. After studying and understanding that the differences between AMUSE results and results from previous studies are understood, we explored the variation in cluster lifetimes due to the random realization noise introduced by transforming a King model to specific initial conditions. This random realization noise can affect the lifetime of a simulated star cluster by up to 30%. Two modes of star cluster dissolution were identified: a mass evolution curve that contains a runaway cluster dissolution with a sudden loss of mass, and a dissolution mode that does not contain this feature. We refer to these dissolution modes as 'dynamical' and 'relaxation' dominated, respectively. For Salpeter-like initial mass functions, we determined the boundary between these two modes in terms of the dynamical and relaxation timescales.
Cluster analysis for DNA methylation profiles having a detection threshold
Directory of Open Access Journals (Sweden)
Siegmund Kimberly D
2006-07-01
Full Text Available Abstract Background DNA methylation, a molecular feature used to investigate tumor heterogeneity, can be measured on many genomic regions using the MethyLight technology. Due to the combination of the underlying biology of DNA methylation and the MethyLight technology, the measurements, while being generated on a continuous scale, have a large number of 0 values. This suggests that conventional clustering methodology may not perform well on this data. Results We compare performance of existing methodology (such as k-means with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. Conclusion The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties.
Taxonomical analysis of the Cancer cluster of galaxies
International Nuclear Information System (INIS)
Perea, J.; Olmo, A. del; Moles, M.
1986-01-01
A description is presented of the Cancer cluster of galaxies, based on a taxonomical analysis in (α,delta, Vsub(r)) space. Earlier results by previous authors on the lack of dynamical entity of the cluster are confirmed. The present analysis points out the existence of a binary structure in the most populated region of the complex. (author)
Using Cluster Analysis for Data Mining in Educational Technology Research
Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.
2012-01-01
Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…
Simultaneous Two-Way Clustering of Multiple Correspondence Analysis
Hwang, Heungsun; Dillon, William R.
2010-01-01
A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…
Spherical collapse and cluster counts in modified gravity models
International Nuclear Information System (INIS)
Martino, Matthew C.; Stabenau, Hans F.; Sheth, Ravi K.
2009-01-01
Modifications to the gravitational potential affect the nonlinear gravitational evolution of large scale structures in the Universe. To illustrate some generic features of such changes, we study the evolution of spherically symmetric perturbations when the modification is of Yukawa type; this is nontrivial, because we should not and do not assume that Birkhoff's theorem applies. We then show how to estimate the abundance of virialized objects in such models. Comparison with numerical simulations shows reasonable agreement: When normalized to have the same fluctuations at early times, weaker large scale gravity produces fewer massive halos. However, the opposite can be true for models that are normalized to have the same linear theory power spectrum today, so the abundance of rich clusters potentially places interesting constraints on such models. Our analysis also indicates that the formation histories and abundances of sufficiently low mass objects are unchanged from standard gravity. This explains why simulations have found that the nonlinear power spectrum at large k is unaffected by such modifications to the gravitational potential. In addition, the most massive objects in models with normalized cosmic microwave background and weaker gravity are expected to be similar to the high-redshift progenitors of the most massive objects in models with stronger gravity. Thus, the difference between the cluster and field galaxy populations is expected to be larger in models with stronger large scale gravity.
KMEANS CLUSTERING FOR HIDDEN MARKOV MODEL
Perrone, M.P.; Connell, S.D.
2004-01-01
An unsupervised kmeans clustering algorithm for hidden Markov models is described and applied to the task of generating subclass models for individual handwritten character classes. The algorithm is compared to a related clustering method and shown to give a relative change in the error rate of as
International Nuclear Information System (INIS)
Singh, BirBikram; Patra, S. K.; Gupta, Raj K.
2010-01-01
We have studied the (ground-state) cluster radioactive decays within the preformed cluster model (PCM) of Gupta and collaborators [R. K. Gupta, in Proceedings of the 5th International Conference on Nuclear Reaction Mechanisms, Varenna, edited by E. Gadioli (Ricerca Scientifica ed Educazione Permanente, Milano, 1988), p. 416; S. S. Malik and R. K. Gupta, Phys. Rev. C 39, 1992 (1989)]. The relativistic mean-field (RMF) theory is used to obtain the nuclear matter densities for the double folding procedure used to construct the cluster-daughter potential with M3Y nucleon-nucleon interaction including exchange effects. Following the PCM approach, we have deduced empirically the preformation probability P 0 emp from the experimental data on both the α- and exotic cluster-decays, specifically of parents in the trans-lead region having doubly magic 208 Pb or its neighboring nuclei as daughters. Interestingly, the RMF-densities-based nuclear potential supports the concept of preformation for both the α and heavier clusters in radioactive nuclei. P 0 α(emp) for α decays is almost constant (∼10 -2 -10 -3 ) for all the parent nuclei considered here, and P 0 c(emp) for cluster decays of the same parents decrease with the size of clusters emitted from different parents. The results obtained for P 0 c(emp) are reasonable and are within two to three orders of magnitude of the well-accepted phenomenological model of Blendowske-Walliser for light clusters.
Cluster analysis of activity-time series in motor learning
DEFF Research Database (Denmark)
Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.
2002-01-01
Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...
Clustering Analysis for Credit Default Probabilities in a Retail Bank Portfolio
Directory of Open Access Journals (Sweden)
Elena ANDREI (DRAGOMIR
2012-08-01
Full Text Available Methods underlying cluster analysis are very useful in data analysis, especially when the processed volume of data is very large, so that it becomes impossible to extract essential information, unless specific instruments are used to summarize and structure the gross information. In this context, cluster analysis techniques are used particularly, for systematic information analysis. The aim of this article is to build an useful model for banking field, based on data mining techniques, by dividing the groups of borrowers into clusters, in order to obtain a profile of the customers (debtors and good payers. We assume that a class is appropriate if it contains members that have a high degree of similarity and the standard method for measuring the similarity within a group shows the lowest variance. After clustering, data mining techniques are implemented on the cluster with bad debtors, reaching a very high accuracy after implementation. The paper is structured as follows: Section 2 describes the model for data analysis based on a specific scoring model that we proposed. In section 3, we present a cluster analysis using K-means algorithm and the DM models are applied on a specific cluster. Section 4 shows the conclusions.
An algebraic model for three-cluster giant molecules
International Nuclear Information System (INIS)
Hess, P.O.; Bijker, R.; Misicu, S.
2001-01-01
After an introduction to the algebraic U(7) model for three bodies, we present a relation of a geometrical description of three-cluster molecule to the algebraic U(7) model. Stiffness parameters of oscillations between each of two clusters are calculated and translated to the model parameter values of the algebraic model. The model is applied to the trinuclear system l32 Sn+ α + ll6 Pd which occurs in the ternary cold fission of 252 Cf. (Author)
Effective action and cluster properties of the abelian Higgs model
Energy Technology Data Exchange (ETDEWEB)
Balaban, T; Imbrie, J Z; Jaffe, A
1988-02-01
We continue our program to establish the Higgs mechanism and mass gap for the abelian Higgs model in two and three dimensions. We develop a multiscale cluster expansion for the high frequency modes of the theory, within a framework of iterated renormalization group transformations. The expansions yield decoupling properties needed for a proof of exponential decay of correlations. The result of this analysis is a gauge invariant unit lattice theory with a deep Higgs potential of the shape required to exhibit the Higgs mechanism.
Alizadeh, Bahram; Najjari, Saeid; Kadkhodaie-Ilkhchi, Ali
2012-08-01
Intelligent and statistical techniques were used to extract the hidden organic facies from well log responses in the Giant South Pars Gas Field, Persian Gulf, Iran. Kazhdomi Formation of Mid-Cretaceous and Kangan-Dalan Formations of Permo-Triassic Data were used for this purpose. Initially GR, SGR, CGR, THOR, POTA, NPHI and DT logs were applied to model the relationship between wireline logs and Total Organic Carbon (TOC) content using Artificial Neural Networks (ANN). The correlation coefficient (R2) between the measured and ANN predicted TOC equals to 89%. The performance of the model is measured by the Mean Squared Error function, which does not exceed 0.0073. Using Cluster Analysis technique and creating a binary hierarchical cluster tree the constructed TOC column of each formation was clustered into 5 organic facies according to their geochemical similarity. Later a second model with the accuracy of 84% was created by ANN to determine the specified clusters (facies) directly from well logs for quick cluster recognition in other wells of the studied field. Each created facies was correlated to its appropriate burial history curve. Hence each and every facies of a formation could be scrutinized separately and directly from its well logs, demonstrating the time and depth of oil or gas generation. Therefore potential production zone of Kazhdomi probable source rock and Kangan- Dalan reservoir formation could be identified while well logging operations (especially in LWD cases) were in progress. This could reduce uncertainty and save plenty of time and cost for oil industries and aid in the successful implementation of exploration and exploitation plans.
Directory of Open Access Journals (Sweden)
Simon Benjaminsson
2010-08-01
Full Text Available Non-parametric data-driven analysis techniques can be used to study datasets with few assumptions about the data and underlying experiment. Variations of Independent Component Analysis (ICA have been the methods mostly used on fMRI data, e.g. in finding resting-state networks thought to reflect the connectivity of the brain. Here we present a novel data analysis technique and demonstrate it on resting-state fMRI data. It is a generic method with few underlying assumptions about the data. The results are built from the statistical relations between all input voxels, resulting in a whole-brain analysis on a voxel level. It has good scalability properties and the parallel implementation is capable of handling large datasets and databases. From the mutual information between the activities of the voxels over time, a distance matrix is created for all voxels in the input space. Multidimensional scaling is used to put the voxels in a lower-dimensional space reflecting the dependency relations based on the distance matrix. By performing clustering in this space we can find the strong statistical regularities in the data, which for the resting-state data turns out to be the resting-state networks. The decomposition is performed in the last step of the algorithm and is computationally simple. This opens up for rapid analysis and visualization of the data on different spatial levels, as well as automatically finding a suitable number of decomposition components.
Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.
Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun
2017-01-01
Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann
2017-07-01
Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M m i n = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics lead to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.
Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.
Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun
2017-12-01
Allergens tend to sensitize simultaneously. Etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. To investigate the allergen sensitization characteristics according to gender. Multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, 39items were grouped into 8 clusters. Each cluster had characteristic features. When compared with female, the male group tended to be sensitized more frequently to all tested allergens, except for fungus allergens cluster. The cluster and comparative analysis results demonstrate that the allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize female group more frequently than male group.
Globular cluster metallicity scale: evidence from stellar models
International Nuclear Information System (INIS)
Demarque, P.; King, C.R.; Diaz, A.
1982-01-01
Theoretical giant branches have been constructed to determine their relative positions for metallicities in the range -2.3 0 )/sub 0,g/ based on these models is presented which yields good agreement over the observed range of metallicities for galactic globular clusters and old disk clusters. The metallicity of 47 Tuc and M71 given by this calibration is about -0.8 dex. Subject headings: clusters, globular: stars: abundances: stars: interiors
Krivitsky, Pavel N; Handcock, Mark S; Raftery, Adrian E; Hoff, Peter D
2009-07-01
Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does.
Clustering of users of digital libraries through log file analysis
Directory of Open Access Journals (Sweden)
Juan Antonio Martínez-Comeche
2017-09-01
Full Text Available This study analyzes how users perform information retrieval tasks when introducing queries to the Hispanic Digital Library. Clusters of users are differentiated based on their distinct information behavior. The study used the log files collected by the server over a year and different possible clustering algorithms are compared. The k-means algorithm is found to be a suitable clustering method for the analysis of large log files from digital libraries. In the case of the Hispanic Digital Library the results show three clusters of users and the characteristic information behavior of each group is described.
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.
Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the realtionship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms-Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
Directory of Open Access Journals (Sweden)
Rosemary M McCloskey
2017-11-01
Full Text Available Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis-where individuals are sampled sooner post-infection-rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP, which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85% and specificity (91% than the nonparametric methods. When we applied these clustering methods to published sequences from a study of HIV-1 genetic clusters in Seattle, USA, we found that the MMPP method categorized about half (46% as many individuals to clusters compared to the other methods. Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using MMPP, but not by other methods. We determined that the computing time for the MMPP method scaled linearly with the size of trees, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer. This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where
Higgs Pair Production: Choosing Benchmarks With Cluster Analysis
Carvalho, Alexandra; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia
2016-01-01
New physics theories often depend on a large number of free parameters. The precise values of those parameters in some cases drastically affect the resulting phenomenology of fundamental physics processes, while in others finite variations can leave it basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics of different models; a clustering algorithm using that metric may then allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmark points are then guaranteed to be sensitive to a large area of the parameter space. In this doc...
Ho, Hsuan-Fu; Hung, Chia-Chi
2008-01-01
Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…
A SURVEY ON DOCUMENT CLUSTERING APPROACH FOR COMPUTER FORENSIC ANALYSIS
Monika Raghuvanshi*, Rahul Patel
2016-01-01
In a forensic analysis, large numbers of files are examined. Much of the information comprises of in unstructured format, so it’s quite difficult task for computer forensic to perform such analysis. That’s why to do the forensic analysis of document within a limited period of time require a special approach such as document clustering. This paper review different document clustering algorithms methodologies for example K-mean, K-medoid, single link, complete link, average link in accorandance...
Lukyanov, V. K.; Kadrev, D. N.; Zemlyanaya, E. V.; Spasova, K.; Lukyanov, K. V.; Antonov, A. N.; Gaidarov, M. K.
2015-03-01
The density distributions of 10Be and 11Be nuclei obtained within the quantum Monte Carlo model and the generator coordinate method are used to calculate the microscopic optical potentials (OPs) and cross sections of elastic scattering of these nuclei on protons and 12C at energies E energy approximation. In this hybrid model of OP the free parameters are the depths of the real and imaginary parts obtained by fitting the experimental data. The well-known energy dependence of the volume integrals is used as a physical constraint to resolve the ambiguities of the parameter values. The role of the spin-orbit potential and the surface contribution to the OP is studied for an adequate description of available experimental elastic scattering cross-section data. Also, the cluster model, in which 11Be consists of a n -halo and the 10Be core, is adopted. Within the latter, the breakup cross sections of 11Be nucleus on 9Be,93Nb,181Ta , and 238U targets and momentum distributions of 10Be fragments are calculated and compared with the existing experimental data.
Participant intimacy: A cluster analysis of the intranuclear cascade
International Nuclear Information System (INIS)
Cugnon, J.; Knoll, J.; Randrup, J.
1981-01-01
The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of clusters consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model ( rows-on-rows ). (orig.)
Quark cluster model in the three-nucleon system
International Nuclear Information System (INIS)
Osman, A.
1986-11-01
The quark cluster model is used to investigate the structure of the three-nucleon systems. The nucleon-nucleon interaction is proposed considering the colour-nucleon clusters and incorporating the quark degrees of freedom. The quark-quark potential in the quark compound bag model agrees with the central force potentials. The confinement potential reduces the short-range repulsion. The colour van der Waals force is determined. Then, the probability of quark clusters in the three-nucleon bound state systems are numerically calculated using realistic nuclear wave functions. The results of the present calculations show that quarks cluster themselves in three-quark systems building the quark cluster model for the trinucleon system. (author)
Clustering disaggregated load profiles using a Dirichlet process mixture model
International Nuclear Information System (INIS)
Granell, Ramon; Axon, Colin J.; Wallom, David C.H.
2015-01-01
Highlights: • We show that the Dirichlet process mixture model is scaleable. • Our model does not require the number of clusters as an input. • Our model creates clusters only by the features of the demand profiles. • We have used both residential and commercial data sets. - Abstract: The increasing availability of substantial quantities of power-use data in both the residential and commercial sectors raises the possibility of mining the data to the advantage of both consumers and network operations. We present a Bayesian non-parametric model to cluster load profiles from households and business premises. Evaluators show that our model performs as well as other popular clustering methods, but unlike most other methods it does not require the number of clusters to be predetermined by the user. We used the so-called ‘Chinese restaurant process’ method to solve the model, making use of the Dirichlet-multinomial distribution. The number of clusters grew logarithmically with the quantity of data, making the technique suitable for scaling to large data sets. We were able to show that the model could distinguish features such as the nationality, household size, and type of dwelling between the cluster memberships
Merging Galaxy Clusters: Analysis of Simulated Analogs
Nguyen, Jayke; Wittman, David; Cornell, Hunter
2018-01-01
The nature of dark matter can be better constrained by observing merging galaxy clusters. However, uncertainty in the viewing angle leads to uncertainty in dynamical quantities such as 3-d velocities, 3-d separations, and time since pericenter. The classic timing argument links these quantities via equations of motion, but neglects effects of nonzero impact parameter (i.e. it assumes velocities are parallel to the separation vector), dynamical friction, substructure, and larger-scale environment. We present a new approach using n-body cosmological simulations that naturally incorporate these effects. By uniformly sampling viewing angles about simulated cluster analogs, we see projected merger parameters in the many possible configurations of a given cluster. We select comparable simulated analogs and evaluate the likelihood of particular merger parameters as a function of viewing angle. We present viewing angle constraints for a sample of observed mergers including the Bullet cluster and El Gordo, and show that the separation vectors are closer to the plane of the sky than previously reported.
Fitting Latent Cluster Models for Networks with latentnet
Directory of Open Access Journals (Sweden)
Pavel N. Krivitsky
2007-12-01
Full Text Available latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoﬀ, Raftery, and Handcock (2002 suggested an approach to modeling networks based on positing the existence of an latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007.The package implements Bayesian inference for the models based on an Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering. It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.
Influence of birth cohort on age of onset cluster analysis in bipolar I disorder
DEFF Research Database (Denmark)
Bauer, M; Glenn, T; Alda, M
2015-01-01
Purpose: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset...... cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. Results: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After...... on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more...
Directory of Open Access Journals (Sweden)
Sorana D. BOLBOACĂ
2011-06-01
Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with K-means algorithm was applied in order to investigate if the assignment of compounds was or not proper. The Euclidian distance and maximization of the initial distance using a cross-validation with a v-fold of 10 was applied. Results: All five variables included in analysis proved to have statistically significant contribution in identification of clusters. Three clusters were identified, each of them containing both carboquinone derivatives belonging to training as well as to test sets. The observed activity of carboquinone derivatives proved to be normal distributed on every. The presence of training and test sets in all clusters identified using generalized cluster analysis with K-means algorithm and the distribution of observed activity within clusters sustain a proper assignment of compounds in training and test set. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.
Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.
Conley, Samantha
2017-12-01
The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.
RELICS: Strong Lens Models for Five Galaxy Clusters from the Reionization Lensing Cluster Survey
Cerny, Catherine; Sharon, Keren; Andrade-Santos, Felipe; Avila, Roberto J.; Bradač, Maruša; Bradley, Larry D.; Carrasco, Daniela; Coe, Dan; Czakon, Nicole G.; Dawson, William A.; Frye, Brenda L.; Hoag, Austin; Huang, Kuang-Han; Johnson, Traci L.; Jones, Christine; Lam, Daniel; Lovisari, Lorenzo; Mainali, Ramesh; Oesch, Pascal A.; Ogaz, Sara; Past, Matthew; Paterno-Mahler, Rachel; Peterson, Avery; Riess, Adam G.; Rodney, Steven A.; Ryan, Russell E.; Salmon, Brett; Sendra-Server, Irene; Stark, Daniel P.; Strolger, Louis-Gregory; Trenti, Michele; Umetsu, Keiichi; Vulcani, Benedetta; Zitrin, Adi
2018-06-01
Strong gravitational lensing by galaxy clusters magnifies background galaxies, enhancing our ability to discover statistically significant samples of galaxies at {\\boldsymbol{z}}> 6, in order to constrain the high-redshift galaxy luminosity functions. Here, we present the first five lens models out of the Reionization Lensing Cluster Survey (RELICS) Hubble Treasury Program, based on new HST WFC3/IR and ACS imaging of the clusters RXC J0142.9+4438, Abell 2537, Abell 2163, RXC J2211.7–0349, and ACT-CLJ0102–49151. The derived lensing magnification is essential for estimating the intrinsic properties of high-redshift galaxy candidates, and properly accounting for the survey volume. We report on new spectroscopic redshifts of multiply imaged lensed galaxies behind these clusters, which are used as constraints, and detail our strategy to reduce systematic uncertainties due to lack of spectroscopic information. In addition, we quantify the uncertainty on the lensing magnification due to statistical and systematic errors related to the lens modeling process, and find that in all but one cluster, the magnification is constrained to better than 20% in at least 80% of the field of view, including statistical and systematic uncertainties. The five clusters presented in this paper span the range of masses and redshifts of the clusters in the RELICS program. We find that they exhibit similar strong lensing efficiencies to the clusters targeted by the Hubble Frontier Fields within the WFC3/IR field of view. Outputs of the lens models are made available to the community through the Mikulski Archive for Space Telescopes.
COCOA code for creating mock observations of star cluster models
Askar, Abbas; Giersz, Mirek; Pych, Wojciech; Dalessandro, Emanuele
2018-04-01
We introduce and present results from the COCOA (Cluster simulatiOn Comparison with ObservAtions) code that has been developed to create idealized mock photometric observations using results from numerical simulations of star cluster evolution. COCOA is able to present the output of realistic numerical simulations of star clusters carried out using Monte Carlo or N-body codes in a way that is useful for direct comparison with photometric observations. In this paper, we describe the COCOA code and demonstrate its different applications by utilizing globular cluster (GC) models simulated with the MOCCA (MOnte Carlo Cluster simulAtor) code. COCOA is used to synthetically observe these different GC models with optical telescopes, perform point spread function photometry, and subsequently produce observed colour-magnitude diagrams. We also use COCOA to compare the results from synthetic observations of a cluster model that has the same age and metallicity as the Galactic GC NGC 2808 with observations of the same cluster carried out with a 2.2 m optical telescope. We find that COCOA can effectively simulate realistic observations and recover photometric data. COCOA has numerous scientific applications that maybe be helpful for both theoreticians and observers that work on star clusters. Plans for further improving and developing the code are also discussed in this paper.
Alloy design as an inverse problem of cluster expansion models
DEFF Research Database (Denmark)
Larsen, Peter Mahler; Kalidindi, Arvind R.; Schmidt, Søren
2017-01-01
Central to a lattice model of an alloy system is the description of the energy of a given atomic configuration, which can be conveniently developed through a cluster expansion. Given a specific cluster expansion, the ground state of the lattice model at 0 K can be solved by finding the configurat......Central to a lattice model of an alloy system is the description of the energy of a given atomic configuration, which can be conveniently developed through a cluster expansion. Given a specific cluster expansion, the ground state of the lattice model at 0 K can be solved by finding...... the inverse problem in terms of energetically distinct configurations, using a constraint satisfaction model to identify constructible configurations, and show that a convex hull can be used to identify ground states. To demonstrate the approach, we solve for all ground states for a binary alloy in a 2D...
A Flocking Based algorithm for Document Clustering Analysis
Energy Technology Data Exchange (ETDEWEB)
Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL
2006-01-01
Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithm such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K- means and the Ant clustering algorithm for real document clustering.
Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.
Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost
2018-04-01
Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).
X-Ray Morphological Analysis of the Planck ESZ Clusters
Energy Technology Data Exchange (ETDEWEB)
Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Andrade-Santos, Felipe; Randall, Scott; Kraft, Ralph [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ettori, Stefano [INAF, Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127 Bologna (Italy); Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DRF—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France)
2017-09-01
X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton . We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.
Network Analysis Tools: from biological networks to clusters and pathways.
Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques
2008-01-01
Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.
A self-consistent model of rich clusters of galaxies. I. The galactic component of a cluster
International Nuclear Information System (INIS)
Konyukov, M.V.
1985-01-01
It is shown that to obtain the distribution function for the galactic component of a cluster reduces in the last analysis to solving the boundary-value problem for the gravitational potential of a self-consistent field. The distribution function is determined by two main parameters. An algorithm is constructed for the solution of the problem, and a program is set up to solve it. It is used to establish the region of values of the parameters in the problem for which solutions exist. The scheme proposed is extended to the case where there exists in the cluster a separate central body with a known density distribution (for example, a cD galaxy). A method is indicated for the estimation of the parameters of the model from the results of observations of clusters of galaxies in the optical range
Directory of Open Access Journals (Sweden)
Xiao-Juan Jiang
Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis. METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes and mammalian (54-59 genes clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles
Reliability analysis of cluster-based ad-hoc networks
International Nuclear Information System (INIS)
Cook, Jason L.; Ramirez-Marquez, Jose Emmanuel
2008-01-01
The mobile ad-hoc wireless network (MAWN) is a new and emerging network scheme that is being employed in a variety of applications. The MAWN varies from traditional networks because it is a self-forming and dynamic network. The MAWN is free of infrastructure and, as such, only the mobile nodes comprise the network. Pairs of nodes communicate either directly or through other nodes. To do so, each node acts, in turn, as a source, destination, and relay of messages. The virtue of a MAWN is the flexibility this provides; however, the challenge for reliability analyses is also brought about by this unique feature. The variability and volatility of the MAWN configuration makes typical reliability methods (e.g. reliability block diagram) inappropriate because no single structure or configuration represents all manifestations of a MAWN. For this reason, new methods are being developed to analyze the reliability of this new networking technology. New published methods adapt to this feature by treating the configuration probabilistically or by inclusion of embedded mobility models. This paper joins both methods together and expands upon these works by modifying the problem formulation to address the reliability analysis of a cluster-based MAWN. The cluster-based MAWN is deployed in applications with constraints on networking resources such as bandwidth and energy. This paper presents the problem's formulation, a discussion of applicable reliability metrics for the MAWN, and illustration of a Monte Carlo simulation method through the analysis of several example networks
Time series clustering analysis of health-promoting behavior
Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng
2013-10-01
Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January, 2012 to March, 2013, behavior patterns were identified by fuzzy c-mean time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology and has been provided to Taiwan National Health Insurance Administration to assess the elder health-promoting behavior.
Shrimankar, D. D.; Sathe, S. R.
2016-01-01
Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
Shrimankar, D D; Sathe, S R
2016-01-01
Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures.
Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S
2017-08-01
Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia
Directory of Open Access Journals (Sweden)
Nazarudin Safian
2008-09-01
Full Text Available Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step to developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or whether they were evenly distributed throughout the area. We also wanted to know at what spatial distances they were clustered. All confirmed typhoid cases that were reported to the Kota Bharu District Health Department from the year 2001 to June of 2005 were taken as the samples. From the home address of the cases, the location of the house was traced and a coordinate was taken using handheld GPS devices. Spatial statistical analysis was done to determine the distribution of typhoid cases, whether clustered, random or dispersed. The spatial statistical analysis was done using CrimeStat III software to determine whether typhoid cases occur in clusters, and later on to determine at what distances it clustered. From 736 cases involved in the study there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005. There was no significant clustering in year 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable to describe spatial epidemiology for a specific area. (Med J Indones 2008; 17: 175-82Keywords: typhoid, clustering, spatial epidemiology, GIS
Old star clusters: Bench tests of low mass stellar models
Directory of Open Access Journals (Sweden)
Salaris M.
2013-03-01
Full Text Available Old star clusters in the Milky Way and external galaxies have been (and still are traditionally used to constrain the age of the universe and the timescales of galaxy formation. A parallel avenue of old star cluster research considers these objects as bench tests of low-mass stellar models. This short review will highlight some recent tests of stellar evolution models that make use of photometric and spectroscopic observations of resolved old star clusters. In some cases these tests have pointed to additional physical processes efficient in low-mass stars, that are not routinely included in model computations. Moreover, recent results from the Kepler mission about the old open cluster NGC6791 are adding new tight constraints to the models.
Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis
de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.
2006-01-01
K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…
Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J
2001-04-01
Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin, using mathematical models. This classification is based on clinical, biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Square (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. So, from the 36 hypothetical combinations of the theoretical skin types classification, we tended to a strengthened six classes proposal. Our data suggest that the association of the PLS discriminant analysis and the clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.
Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis
Directory of Open Access Journals (Sweden)
Rita Ismayilova
2014-01-01
Full Text Available Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19% reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.
International Nuclear Information System (INIS)
Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S.; Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)
1985-01-01
The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references
Tokuda, Tomoki; Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Directory of Open Access Journals (Sweden)
Tomoki Tokuda
Full Text Available We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Using cluster analysis to organize and explore regional GPS velocities
Simpson, Robert W.; Thatcher, Wayne; Savage, James C.
2012-01-01
Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.
Directory of Open Access Journals (Sweden)
Gromov Vladislav Vladimirovich
2013-08-01
Full Text Available We used the complex mathematical modeling, multivariate statistical-analysis, fuzzy sets to analyze the financial and economic state of the crop production in Russian regions. We developed a system of indicators, detecting the state agricultural sector in the region, based on the results of correlation, factor, cluster analysis and statistics of the Federal State Statistics Service. We performed clustering analyses to divide regions of Russia on selected factors into five groups. A qualitative and quantitative characteristics of each cluster was received.
Clustering of European winter storms: A multi-model perspective
Renggli, Dominik; Buettner, Annemarie; Scherb, Anke; Straub, Daniel; Zimmerli, Peter
2016-04-01
The storm series over Europe in 1990 (Daria, Vivian, Wiebke, Herta) and 1999 (Anatol, Lothar, Martin) are very well known. Such clusters of severe events strongly affect the seasonally accumulated damage statistics. The (re)insurance industry has quantified clustering by using distribution assumptions deduced from the historical storm activity of the last 30 to 40 years. The use of storm series simulated by climate models has only started recently. Climate model runs can potentially represent 100s to 1000s of years, allowing a more detailed quantification of clustering than the history of the last few decades. However, it is unknown how sensitive the representation of clustering is to systematic biases. Using a multi-model ensemble allows quantifying that uncertainty. This work uses CMIP5 decadal ensemble hindcasts to study clustering of European winter storms from a multi-model perspective. An objective identification algorithm extracts winter storms (September to April) in the gridded 6-hourly wind data. Since the skill of European storm predictions is very limited on the decadal scale, the different hindcast runs are interpreted as independent realizations. As a consequence, the available hindcast ensemble represents several 1000 simulated storm seasons. The seasonal clustering of winter storms is quantified using the dispersion coefficient. The benchmark for the decadal prediction models is the 20th Century Reanalysis. The decadal prediction models are able to reproduce typical features of the clustering characteristics observed in the reanalysis data. Clustering occurs in all analyzed models over the North Atlantic and European region, in particular over Great Britain and Scandinavia as well as over Iberia (i.e. the exit regions of the North Atlantic storm track). Clustering is generally weaker in the models compared to reanalysis, although the differences between different models are substantial. In contrast to existing studies, clustering is driven by weak
Cheng, K.; Guo, L. M.; Wang, Y. K.; Zafar, M. T.
2017-11-01
In order to select effective samples in the large number of data of PV power generation years and improve the accuracy of PV power generation forecasting model, this paper studies the application of clustering analysis in this field and establishes forecasting model based on neural network. Based on three different types of weather on sunny, cloudy and rainy days, this research screens samples of historical data by the clustering analysis method. After screening, it establishes BP neural network prediction models using screened data as training data. Then, compare the six types of photovoltaic power generation prediction models before and after the data screening. Results show that the prediction model combining with clustering analysis and BP neural networks is an effective method to improve the precision of photovoltaic power generation.
CLUSTERS AS A MODEL OF ECONOMIC DEVELOPMENT OF SERBIA
Directory of Open Access Journals (Sweden)
Marko Laketa
2013-12-01
Full Text Available Insufficient competitiveness of small and medium enterprises in Serbia can be significantly improved by a system of business associations through clusters, business incubators and technology parks. This connection contributes to the growth and development of not only the cluster members, but has a regional and national dimension as well because without it there is no significant breakthrough on the international market. The process of association of small and medium enterprises in clusters and other forms of interconnection in Serbia is far from the required and potential level.The awareness on the importance of clusters in a local economic development through contributions to the advancement of small and medium sized enterprises is not yet sufficiently mature. Support to associating into clusters and usage of their benefits after the model of highly developed countries is the basis for leading a successful economic policy and in Serbia there are all necessary prerequisites for it.
A Distributed Flocking Approach for Information Stream Clustering Analysis
Energy Technology Data Exchange (ETDEWEB)
Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL
2006-01-01
Intelligence analysts are currently overwhelmed with the amount of information streams generated everyday. There is a lack of comprehensive tool that can real-time analyze the information streams. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied for analyzing the static document collection because they normally require a large amount of computation resource and long time to get accurate result. It is very difficult to cluster a dynamic changed text information streams on an individual computer. Our early research has resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to the change of document contents. This character makes the algorithm suitable for cluster analyzing dynamic changed document information, such as text information stream. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase the clustering speed of the algorithm. In this paper, we present a distributed multi-agent flocking approach for the text information stream clustering and discuss the decentralized architectures and communication schemes for load balance and status information synchronization in this approach.
Cluster analysis of clinical data identifies fibromyalgia subgroups.
Directory of Open Access Journals (Sweden)
Elisa Docampo
Full Text Available INTRODUCTION: Fibromyalgia (FM is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: VARIABLES CLUSTERED INTO THREE INDEPENDENT DIMENSIONS: "symptomatology", "comorbidities" and "clinical scales". Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1, high symptomatology and comorbidities (Cluster 2, and high symptomatology but low comorbidities (Cluster 3, showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.
Modeling the formation of globular cluster systems in the Virgo cluster
International Nuclear Information System (INIS)
Li, Hui; Gnedin, Oleg Y.
2014-01-01
The mass distribution and chemical composition of globular cluster (GC) systems preserve fossil record of the early stages of galaxy formation. The observed distribution of GC colors within massive early-type galaxies in the ACS Virgo Cluster Survey (ACSVCS) reveals a multi-modal shape, which likely corresponds to a multi-modal metallicity distribution. We present a simple model for the formation and disruption of GCs that aims to match the ACSVCS data. This model tests the hypothesis that GCs are formed during major mergers of gas-rich galaxies and inherit the metallicity of their hosts. To trace merger events, we use halo merger trees extracted from a large cosmological N-body simulation. We select 20 halos in the mass range of 2 × 10 12 to 7 × 10 13 M ☉ and match them to 19 Virgo galaxies with K-band luminosity between 3 × 10 10 and 3 × 10 11 L ☉ . To set the [Fe/H] abundances, we use an empirical galaxy mass-metallicity relation. We find that a minimal merger ratio of 1:3 best matches the observed cluster metallicity distribution. A characteristic bimodal shape appears because metal-rich GCs are produced by late mergers between massive halos, while metal-poor GCs are produced by collective merger activities of less massive hosts at early times. The model outcome is robust to alternative prescriptions for cluster formation rate throughout cosmic time, but a gradual evolution of the mass-metallicity relation with redshift appears to be necessary to match the observed cluster metallicities. We also affirm the age-metallicity relation, predicted by an earlier model, in which metal-rich clusters are systematically several billion younger than their metal-poor counterparts.
Clustering Trajectories by Relevant Parts for Air Traffic Analysis.
Andrienko, Gennady; Andrienko, Natalia; Fuchs, Georg; Garcia, Jose Manuel Cordero
2018-01-01
Clustering of trajectories of moving objects by similarity is an important technique in movement analysis. Existing distance functions assess the similarity between trajectories based on properties of the trajectory points or segments. The properties may include the spatial positions, times, and thematic attributes. There may be a need to focus the analysis on certain parts of trajectories, i.e., points and segments that have particular properties. According to the analysis focus, the analyst may need to cluster trajectories by similarity of their relevant parts only. Throughout the analysis process, the focus may change, and different parts of trajectories may become relevant. We propose an analytical workflow in which interactive filtering tools are used to attach relevance flags to elements of trajectories, clustering is done using a distance function that ignores irrelevant elements, and the resulting clusters are summarized for further analysis. We demonstrate how this workflow can be useful for different analysis tasks in three case studies with real data from the domain of air traffic. We propose a suite of generic techniques and visualization guidelines to support movement data analysis by means of relevance-aware trajectory clustering.
Modeling familial clustered breast cancer using published data
Jonker, MA; Jacobi, CE; Hoogendoorn, WE; Nagelkerke, NJD; de Bock, GH; van Houwelingen, JC
2003-01-01
The purpose of this research was to model the familial clustering of breast cancer and to provide an accurate risk estimate for individuals from the general population, based on their family history of breast and ovarian cancer. We constructed a genetic model as an extension of a model by Claus et
Quantitative properties of clustering within modern microscopic nuclear models
International Nuclear Information System (INIS)
Volya, A.; Tchuvil’sky, Yu. M.
2016-01-01
A method for studying cluster spectroscopic properties of nuclear fragmentation, such as spectroscopic amplitudes, cluster form factors, and spectroscopic factors, is developed on the basis of modern precision nuclear models that take into account the mixing of large-scale shell-model configurations. Alpha-cluster channels are considered as an example. A mathematical proof of the need for taking into account the channel-wave-function renormalization generated by exchange terms of the antisymmetrization operator (Fliessbach effect) is given. Examples where this effect is confirmed by a high quality of the description of experimental data are presented. By and large, the method in question extends substantially the possibilities for studying clustering phenomena in nuclei and for improving the quality of their description.
A Clustered Extragalactic Foreground Model for the EoR
Murray, S. G.; Trott, C. M.; Jordan, C. H.
2018-05-01
We review an improved statistical model of extra-galactic point-source foregrounds first introduced in Murray et al. (2017), in the context of the Epoch of Reionization. This model extends the instrumentally-convolved foreground covariance used in inverse-covariance foreground mitigation schemes, by considering the cosmological clustering of the sources. In this short work, we show that over scales of k ~ (0.6, 40.)hMpc-1, ignoring source clustering is a valid approximation. This is in contrast to Murray et al. (2017), who found a possibility of false detection if the clustering was ignored. The dominant cause for this change is the introduction of a Galactic synchrotron component which shadows the clustering of sources.
Development of small scale cluster computer for numerical analysis
Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.
2017-09-01
In this study, two units of personal computer were successfully networked together to form a small scale cluster. Each of the processor involved are multicore processor which has four cores in it, thus made this cluster to have eight processors. Here, the cluster incorporate Ubuntu 14.04 LINUX environment with MPI implementation (MPICH2). Two main tests were conducted in order to test the cluster, which is communication test and performance test. The communication test was done to make sure that the computers are able to pass the required information without any problem and were done by using simple MPI Hello Program where the program written in C language. Additional, performance test was also done to prove that this cluster calculation performance is much better than single CPU computer. In this performance test, four tests were done by running the same code by using single node, 2 processors, 4 processors, and 8 processors. The result shows that with additional processors, the time required to solve the problem decrease. Time required for the calculation shorten to half when we double the processors. To conclude, we successfully develop a small scale cluster computer using common hardware which capable of higher computing power when compare to single CPU processor, and this can be beneficial for research that require high computing power especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
Depth data research of GIS based on clustering analysis algorithm
Xiong, Yan; Xu, Wenli
2018-03-01
The data of GIS have spatial distribution. Geographic data has both spatial characteristics and attribute characteristics, and also changes with time. Therefore, the amount of data is very large. Nowadays, many industries and departments in the society are using GIS. However, without proper data analysis and mining scheme, GIS will not exert its maximum effectiveness and will waste a lot of data. In this paper, we use the geographic information demand of a national security department as the experimental object, combining the characteristics of GIS data, taking into account the characteristics of time, space, attributes and so on, and using cluster analysis algorithm. We further study the mining scheme for depth data, and get the algorithm model. This algorithm can automatically classify sample data, and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information from the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.
Binary model for the coma cluster of galaxies
International Nuclear Information System (INIS)
Valtonen, M.J.; Byrd, G.G.
1979-01-01
We study the dynamics of galaxies in the Coma cluster and find that the cluster is probably dominated by a central binary of galaxies NGC 4874--NGC4889. We estimate their total mass to be about 3 x 10 14 M/sub sun/ by two independent methods (assuming in Hubble constant of 100 km s -1 Mpc -1 ). This binary is efficient in dynamically ejecting smaller galaxies, some of of which are seen in projection against the inner 3 0 radius of the cluster and which, if erroneously considered as bound members, cause a serious overestimate of the mass of the entire cluster. Taking account of the ejected galaxies, we estimate the total cluster mass to be 4--9 x 10 14 M/sub sun/, with a corresponding mass-to-light ratio for a typical galaxy in the range of 20--120 solar units. The origin of the secondary maximum observed in the radial surface density profile is studied. We consider it to be a remnant of a shell of galaxies which formed around the central binary. This shell expanded, then collapsed into the binary, and is now reexpanding. This is supported by the coincidence of the minimum in the cluster eccentricity and radical velocity dispersion at the same radial distance as the secondary maximum. Numerical simulations of a cluster model with a massive central binary and a spherical shell of test particles are performed, and they reproduce the observed shape, galaxy density, and radial velocity distributions in the Coma cluster fairly well. Consequences of extending the model to other clusters are discussed
Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033
Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.
2018-04-01
Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence enables to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aim. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods: In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works which will analyze these type of system in order to characterize them. Results: Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions: It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.
Predicting healthcare outcomes in prematurely born infants using cluster analysis.
MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne
2018-05-23
Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥or <882 g respectively), had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001) CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.
Electromagnetic properties of 6Li in a cluster model with breathing clusters
International Nuclear Information System (INIS)
Kruppa, A.T.; Beck, R.; Dickmann, F.
1987-01-01
Electromagnetic properties of 6 Li are studied using a microscopic (α+δ) cluster model. In addition to the ground state of the clusters, their breathing excited states are included in the wave function in order to take into account the distortion of the clusters. The elastic charge form factor is in good agreement with experiment up to a momentum transfer of 8 fm -2 . The ground state magnetic form factor and the inelastic charge form factor are also well described. The effect of the breathing states of α on the form factors proves to be negligible except at high momentum transfer. The ground-state charge density, rms charge radius, the magnetic dipole moment and a reduced transition strength are also obtained in fair agreement with experiment. (author)
Energy Technology Data Exchange (ETDEWEB)
Kohnert, Aaron A.; Wirth, Brian D. [University of Tennessee, Knoxville, Tennessee 37996-2300 (United States)
2015-04-21
The microstructure that develops under low temperature irradiation in ferritic alloys is dominated by a high density of small (2–5 nm) defects. These defects have been widely observed to move via occasional discrete hops during in situ thin film irradiation experiments. Cluster dynamics models are used to describe the formation of these defects as an aggregation process of smaller clusters created as primary damage. Multiple assumptions regarding the mobility of these damage features are tested in the models, both with and without explicit consideration of such irradiation induced hops. Comparison with experimental data regarding the density of these defects demonstrates the importance of including such motions in a valid model. In particular, discrete hops inform the limited dependence of defect density on irradiation temperature observed in experiments, which the model was otherwise incapable of producing.
Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey
2005-01-01
The objective of this study was to develop a rule based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such system we qualitatively and quantitatively compared a hierarchical (Wardâs) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups and plots...
Molecular dynamics modelling of EGCG clusters on ceramide bilayers
Energy Technology Data Exchange (ETDEWEB)
Yeo, Jingjie; Cheng, Yuan; Li, Weifeng; Zhang, Yong-Wei [Institute of High Performance Computing, A*STAR, 138632 (Singapore)
2015-12-31
A novel method of atomistic modelling and characterization of both pure ceramide and mixed lipid bilayers is being developed, using only the General Amber ForceField. Lipid bilayers modelled as pure ceramides adopt hexagonal packing after equilibration, and the area per lipid and bilayer thickness are consistent with previously reported theoretical results. Mixed lipid bilayers are modelled as a combination of ceramides, cholesterol, and free fatty acids. This model is shown to be stable after equilibration. Green tea extract, also known as epigallocatechin-3-gallate, is introduced as a spherical cluster on the surface of the mixed lipid bilayer. It is demonstrated that the cluster is able to bind to the bilayers as a cluster without diffusing into the surrounding water.
GENERALISED MODEL BASED CONFIDENCE INTERVALS IN TWO STAGE CLUSTER SAMPLING
Directory of Open Access Journals (Sweden)
Christopher Ouma Onyango
2010-09-01
Full Text Available Chambers and Dorfman (2002 constructed bootstrap confidence intervals in model based estimation for finite population totals assuming that auxiliary values are available throughout a target population and that the auxiliary values are independent. They also assumed that the cluster sizes are known throughout the target population. We now extend to two stage sampling in which the cluster sizes are known only for the sampled clusters, and we therefore predict the unobserved part of the population total. Jan and Elinor (2008 have done similar work, but unlike them, we use a general model, in which the auxiliary values are not necessarily independent. We demonstrate that the asymptotic properties of our proposed estimator and its coverage rates are better than those constructed under the model assisted local polynomial regression model.
Cluster analysis of radionuclide concentrations in beach sand
de Meijer, R.J.; James, I.; Jennings, P.J.; Keoyers, J.E.
This paper presents a method in which natural radionuclide concentrations of beach sand minerals are traced along a stretch of coast by cluster analysis. This analysis yields two groups of mineral deposit with different origins. The method deviates from standard methods of following dispersal of
Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis
Sinyor, Mark; Schaffer, Ayal; Streiner, David L
2014-01-01
Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321
Pattern recognition in menstrual bleeding diaries by statistical cluster analysis
Directory of Open Access Journals (Sweden)
Wessel Jens
2009-07-01
Full Text Available Abstract Background The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods We used the four cluster analysis methods single, average and complete linkage as well as the method of Ward for the pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R2, the cubic cluster criterion, the pseudo-F- and the pseudo-t2-statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non cyclic bleeding patterns were well separated. Conclusion Using this cluster analysis with the method of Ward medications and devices having an impact on bleeding can be easily compared and categorized.
Higgs pair production: choosing benchmarks with cluster analysis
Energy Technology Data Exchange (ETDEWEB)
Carvalho, Alexandra; Dall’Osso, Martino; Dorigo, Tommaso [Dipartimento di Fisica e Astronomia and INFN, Sezione di Padova,Via Marzolo 8, I-35131 Padova (Italy); Goertz, Florian [CERN,1211 Geneva 23 (Switzerland); Gottardo, Carlo A. [Physikalisches Institut, Universität Bonn,Nussallee 12, 53115 Bonn (Germany); Tosi, Mia [CERN,1211 Geneva 23 (Switzerland)
2016-04-20
New physics theories often depend on a large number of free parameters. The phenomenology they predict for fundamental physics processes is in some cases drastically affected by the precise value of those free parameters, while in other cases is left basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics predicted by different models; a clustering algorithm using that metric may allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmarks are then guaranteed to be sensitive to a large area of the parameter space. In this document we show a practical implementation of the above strategy for the study of non-resonant production of Higgs boson pairs in the context of extensions of the standard model with anomalous couplings of the Higgs bosons. A non-standard value of those couplings may significantly enhance the Higgs boson pair-production cross section, such that the process could be detectable with the data that the LHC will collect in Run 2.
International Nuclear Information System (INIS)
Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan
2013-01-01
The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis
Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis
Directory of Open Access Journals (Sweden)
Gabjo Kim
2016-12-01
Full Text Available This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors’ levels of sustainable research and development (R&D and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP clustering, which is effective while grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.
International Nuclear Information System (INIS)
Meslin-Chiffon, E.
2007-11-01
The embrittlement of reactor pressure vessel (RPV) under irradiation is partly due to the formation of point defects (PD) and solute clusters. The aim of this work was to gain more insight into the formation mechanisms of solute clusters in low copper ([Cu] = 0.1 wt%) FeCu and FeCuMnNi model alloys, in a copper free FeMnNi model alloy and in a low copper French RPV steel (16MND5). These materials were neutron-irradiated around 300 C in a test reactor. Solute clusters were characterized by tomographic atom probe whereas PD clusters were simulated with a rate theory numerical code calibrated under cascade damage conditions using transmission electron microscopy analysis. The confrontation between experiments and simulation reveals that a heterogeneous irradiation-induced solute precipitation/segregation probably occurs on PD clusters. (author)
Variational cluster perturbation theory for Bose-Hubbard models
International Nuclear Information System (INIS)
Koller, W; Dupuis, N
2006-01-01
We discuss the application of the variational cluster perturbation theory (VCPT) to the Mott-insulator-to-superfluid transition in the Bose-Hubbard model. We show how the VCPT can be formulated in such a way that it gives a translation invariant excitation spectrum-free of spurious gaps-despite the fact that it formally breaks translation invariance. The phase diagram and the single-particle Green function in the insulating phase are obtained for one-dimensional systems. When the chemical potential of the cluster is taken as a variational parameter, the VCPT reproduces the dimensional dependence of the phase diagram even for one-site clusters. We find a good quantitative agreement with the results of the density-matrix renormalization group when the number of sites in the cluster becomes of order 10. The extension of the method to the superfluid phase is discussed
A Collaboration Service Model for a Global Port Cluster
Directory of Open Access Journals (Sweden)
Keith K.T. Toh
2010-03-01
Full Text Available The importance of port clusters to a global city may be viewed from a number of perspectives. The development of port clusters and economies of agglomeration and their contribution to a regional economy is underpinned by information and physical infrastructure that facilitates collaboration between business entities within the cluster. The maturity of technologies providing portals, web and middleware services provides an opportunity to push the boundaries of contemporary service reference models and service catalogues to what the authors propose to be "collaboration services". Servicing port clusters, portal engineers of the future must consider collaboration services to benefit a region. Particularly, service orchestration through a "public user portal" must gain better utilisation of publically owned infrastructure, to share knowledge and collaborate among organisations through information systems.
CLUSTER ANALYSIS UKRAINIAN REGIONAL DISTRIBUTION BY LEVEL OF INNOVATION
Directory of Open Access Journals (Sweden)
Roman Shchur
2016-07-01
Full Text Available SWOT-analysis of the threats and benefits of innovation development strategy of Ivano-Frankivsk region in the context of financial support was сonducted. Methodical approach to determine of public-private partnerships potential that is tool of innovative economic development financing was identified. Cluster analysis of possibilities of forming public-private partnership in a particular region was carried out. Optimal set of problem areas that require urgent solutions and financial security is defined on the basis of cluster approach. It will help to form practical recommendations for the formation of an effective financial mechanism in the regions of Ukraine. Key words: the mechanism of innovation development financial provision, innovation development, public-private partnerships, cluster analysis, innovative development strategy.
Aerosol cluster impact and break-up: model and implementation
International Nuclear Information System (INIS)
Lechman, Jeremy B.
2010-01-01
In this report a model for simulating aerosol cluster impact with rigid walls is presented. The model is based on JKR adhesion theory and is implemented as an enhancement to the granular (DEM) package within the LAMMPS code. The theory behind the model is outlined and preliminary results are shown. Modeling the interactions of small particles is relevant to a number of applications (e.g., soils, powders, colloidal suspensions, etc.). Modeling the behavior of aerosol particles during agglomeration and cluster dynamics upon impact with a wall is of particular interest. In this report we describe preliminary efforts to develop and implement physical models for aerosol particle interactions. Future work will consist of deploying these models to simulate aerosol cluster behavior upon impact with a rigid wall for the purpose of developing relationships for impact speed and probability of stick/bounce/break-up as well as to assess the distribution of cluster sizes if break-up occurs. These relationships will be developed consistent with the need for inputs into system-level codes. Section 2 gives background and details on the physical model as well as implementations issues. Section 3 presents some preliminary results which lead to discussion in Section 4 of future plans.
Mathematical modelling of complex contagion on clustered networks
O'sullivan, David J.; O'Keeffe, Gary; Fennell, Peter; Gleeson, James
2015-09-01
The spreading of behavior, such as the adoption of a new innovation, is influenced bythe structure of social networks that interconnect the population. In the experiments of Centola (Science, 2010), adoption of new behavior was shown to spread further and faster across clustered-lattice networks than across corresponding random networks. This implies that the “complex contagion” effects of social reinforcement are important in such diffusion, in contrast to “simple” contagion models of disease-spread which predict that epidemics would grow more efficiently on random networks than on clustered networks. To accurately model complex contagion on clustered networks remains a challenge because the usual assumptions (e.g. of mean-field theory) regarding tree-like networks are invalidated by the presence of triangles in the network; the triangles are, however, crucial to the social reinforcement mechanism, which posits an increased probability of a person adopting behavior that has been adopted by two or more neighbors. In this paper we modify the analytical approach that was introduced by Hebert-Dufresne et al. (Phys. Rev. E, 2010), to study disease-spread on clustered networks. We show how the approximation method can be adapted to a complex contagion model, and confirm the accuracy of the method with numerical simulations. The analytical results of the model enable us to quantify the level of social reinforcement that is required to observe—as in Centola’s experiments—faster diffusion on clustered topologies than on random networks.
Mathematical modelling of complex contagion on clustered networks
Directory of Open Access Journals (Sweden)
David J. P. O'Sullivan
2015-09-01
Full Text Available The spreading of behavior, such as the adoption of a new innovation, is influenced bythe structure of social networks that interconnect the population. In the experiments of Centola (Science, 2010, adoption of new behavior was shown to spread further and faster across clustered-lattice networks than across corresponding random networks. This implies that the complex contagion effects of social reinforcement are important in such diffusion, in contrast to simple contagion models of disease-spread which predict that epidemics would grow more efficiently on random networks than on clustered networks. To accurately model complex contagion on clustered networks remains a challenge because the usual assumptions (e.g. of mean-field theory regarding tree-like networks are invalidated by the presence of triangles in the network; the triangles are, however, crucial to the social reinforcement mechanism, which posits an increased probability of a person adopting behavior that has been adopted by two or more neighbors. In this paper we modify the analytical approach that was introduced by Hebert-Dufresne et al. (Phys. Rev. E, 2010, to study disease-spread on clustered networks. We show how the approximation method can be adapted to a complex contagion model, and confirm the accuracy of the method with numerical simulations. The analytical results of the model enable us to quantify the level of social reinforcement that is required to observe—as in Centola’s experiments—faster diffusion on clustered topologies than on random networks.
Running and rotating: modelling the dynamics of migrating cell clusters
Copenhagen, Katherine; Gov, Nir; Gopinathan, Ajay
Collective motion of cells is a common occurrence in many biological systems, including tissue development and repair, and tumor formation. Recent experiments have shown cells form clusters in a chemical gradient, which display three different phases of motion: translational, rotational, and random. We present a model for cell clusters based loosely on other models seen in the literature that involves a Vicsek-like alignment as well as physical collisions and adhesions between cells. With this model we show that a mechanism for driving rotational motion in this kind of system is an increased motility of rim cells. Further, we examine the details of the relationship between rim and core cells, and find that the phases of the cluster as a whole are correlated with the creation and annihilation of topological defects in the tangential component of the velocity field.
Soft and diffractive scattering with the cluster model in Herwig
Energy Technology Data Exchange (ETDEWEB)
Gieseke, Stefan; Loshaj, Frasher; Kirchgaesser, Patrick [Karlsruhe Institute of Technology, Institute for Theoretical Physics, Karlsruhe (Germany)
2017-03-15
We present a new model for soft interactions in the event-generator Herwig. The model consists of two components. One to model diffractive final states on the basis of the cluster hadronization model and a second component that addresses soft multiple interactions as multiple particle production in multiperipheral kinematics. We present much improved results for minimum-bias measurements at various LHC energies. (orig.)
Application of microarray analysis on computer cluster and cloud platforms.
Bernau, C; Boulesteix, A-L; Knaus, J
2013-01-01
Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
Dynamic analysis of clustered building structures using substructures methods
International Nuclear Information System (INIS)
Leimbach, K.R.; Krutzik, N.J.
1989-01-01
The dynamic substructure approach to the building cluster on a common base mat starts with the generation of Ritz-vectors for each building on a rigid foundation. The base mat plus the foundation soil is subjected to kinematic constraint modes, for example constant, linear, quadratic or cubic constraints. These constraint modes are also imposed on the buildings. By enforcing kinematic compatibility of the complete structural system on the basis of the constraint modes a reduced Ritz model of the complete cluster is obtained. This reduced model can now be analyzed by modal time history or response spectrum methods
Point Cluster Analysis Using a 3D Voronoi Diagram with Applications in Point Cloud Segmentation
Directory of Open Access Journals (Sweden)
Shen Ying
2015-08-01
Full Text Available Three-dimensional (3D point analysis and visualization is one of the most effective methods of point cluster detection and segmentation in geospatial datasets. However, serious scattering and clotting characteristics interfere with the visual detection of 3D point clusters. To overcome this problem, this study proposes the use of 3D Voronoi diagrams to analyze and visualize 3D points instead of the original data item. The proposed algorithm computes the cluster of 3D points by applying a set of 3D Voronoi cells to describe and quantify 3D points. The decompositions of point cloud of 3D models are guided by the 3D Voronoi cell parameters. The parameter values are mapped from the Voronoi cells to 3D points to show the spatial pattern and relationships; thus, a 3D point cluster pattern can be highlighted and easily recognized. To capture different cluster patterns, continuous progressive clusters and segmentations are tested. The 3D spatial relationship is shown to facilitate cluster detection. Furthermore, the generated segmentations of real 3D data cases are exploited to demonstrate the feasibility of our approach in detecting different spatial clusters for continuous point cloud segmentation.
Fuzzy Modeled K-Cluster Quality Mining of Hidden Knowledge for Decision Support
S. Parkash Kumar; K. S. Ramaswami
2011-01-01
Problem statement: The work presented Fuzzy Modeled K-means Cluster Quality Mining of hidden knowledge for Decision Support. Based on the number of clusters, number of objects in each cluster and its cohesiveness, precision and recall values, the cluster quality metrics is measured. The fuzzy k-means is adapted approach by using heuristic method which iterates the cluster to form an efficient valid cluster. With the obtained data clusters, quality assessment is made by predictive mining using...
Automated analysis of organic particles using cluster SIMS
Energy Technology Data Exchange (ETDEWEB)
Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike
2004-06-15
Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF{sub 5}{sup +} or C{sub 8}{sup -}) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.
Modeling and clustering users with evolving profiles in usage streams
Zhang, Chongsheng
2012-09-01
Today, there is an increasing need of data stream mining technology to discover important patterns on the fly. Existing data stream models and algorithms commonly assume that users\\' records or profiles in data streams will not be updated or revised once they arrive. Nevertheless, in various applications such asWeb usage, the records/profiles of the users can evolve along time. This kind of streaming data evolves in two forms, the streaming of tuples or transactions as in the case of traditional data streams, and more importantly, the evolving of user records/profiles inside the streams. Such data streams bring difficulties on modeling and clustering for exploring users\\' behaviors. In this paper, we propose three models to summarize this kind of data streams, which are the batch model, the Evolving Objects (EO) model and the Dynamic Data Stream (DDS) model. Through creating, updating and deleting user profiles, these models summarize the behaviors of each user as a profile object. Based upon these models, clustering algorithms are employed to discover interesting user groups from the profile objects. We have evaluated all the proposed models on a large real-world data set, showing that the DDS model summarizes the data streams with evolving tuples more efficiently and effectively, and provides better basis for clustering users than the other two models. © 2012 IEEE.
Modeling and clustering users with evolving profiles in usage streams
Zhang, Chongsheng; Masseglia, Florent; Zhang, Xiangliang
2012-01-01
Today, there is an increasing need of data stream mining technology to discover important patterns on the fly. Existing data stream models and algorithms commonly assume that users' records or profiles in data streams will not be updated or revised once they arrive. Nevertheless, in various applications such asWeb usage, the records/profiles of the users can evolve along time. This kind of streaming data evolves in two forms, the streaming of tuples or transactions as in the case of traditional data streams, and more importantly, the evolving of user records/profiles inside the streams. Such data streams bring difficulties on modeling and clustering for exploring users' behaviors. In this paper, we propose three models to summarize this kind of data streams, which are the batch model, the Evolving Objects (EO) model and the Dynamic Data Stream (DDS) model. Through creating, updating and deleting user profiles, these models summarize the behaviors of each user as a profile object. Based upon these models, clustering algorithms are employed to discover interesting user groups from the profile objects. We have evaluated all the proposed models on a large real-world data set, showing that the DDS model summarizes the data streams with evolving tuples more efficiently and effectively, and provides better basis for clustering users than the other two models. © 2012 IEEE.
Assessment of surface water quality using hierarchical cluster analysis
Directory of Open Access Journals (Sweden)
Dheeraj Kumar Dabgerwal
2016-02-01
Full Text Available This study was carried out to assess the physicochemical quality river Varuna inVaranasi,India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of relationship between physicochemical parameters. Hierarchical Cluster analysis was also performed to determine the sources of pollution in the river Varuna. The result showed quite high value of DO, Nitrate, BOD, COD and Total Alkalinity, above the BIS permissible limit. The results of correlation analysis identified key water parameters as pH, electrical conductivity, total alkalinity and nitrate, which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of total 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about the river water quality.International Journal of Environment Vol. 5 (1 2016, pp: 32-44
application of single-linkage clustering method in the analysis of ...
African Journals Online (AJOL)
Admin
ANALYSIS OF GROWTH RATE OF GROSS DOMESTIC PRODUCT. (GDP) AT ... The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are ..... Number of cluster sum from from observations of ...
Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups
Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José
2013-01-01
Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674
Transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis
Directory of Open Access Journals (Sweden)
Riccardi Giovanna
2009-03-01
Full Text Available Abstract Background The ESAT-6 (early secreted antigenic target, 6 kDa family collects small mycobacterial proteins secreted by Mycobacterium tuberculosis, particularly in the early phase of growth. There are 23 ESAT-6 family members in M. tuberculosis H37Rv. In a previous work, we identified the Zur- dependent regulation of five proteins of the ESAT-6/CFP-10 family (esxG, esxH, esxQ, esxR, and esxS. esxG and esxH are part of ESAT-6 cluster 3, whose expression was already known to be induced by iron starvation. Results In this research, we performed EMSA experiments and transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis (msmeg0615-msmeg0625 and M. tuberculosis. In contrast to what we had observed in M. tuberculosis, we found that in M. smegmatis ESAT-6 cluster 3 responds only to iron and not to zinc. In both organisms we identified an internal promoter, a finding which suggests the presence of two transcriptional units and, by consequence, a differential expression of cluster 3 genes. We compared the expression of msmeg0615 and msmeg0620 in different growth and stress conditions by means of relative quantitative PCR. The expression of msmeg0615 and msmeg0620 genes was essentially similar; they appeared to be repressed in most of the tested conditions, with the exception of acid stress (pH 4.2 where msmeg0615 was about 4-fold induced, while msmeg0620 was repressed. Analysis revealed that in acid stress conditions M. tuberculosis rv0282 gene was 3-fold induced too, while rv0287 induction was almost insignificant. Conclusion In contrast with what has been reported for M. tuberculosis, our results suggest that in M. smegmatis only IdeR-dependent regulation is retained, while zinc has no effect on gene expression. The role of cluster 3 in M. tuberculosis virulence is still to be defined; however, iron- and zinc-dependent expression strongly suggests that cluster 3 is highly expressed in the infective process, and that the cluster
Graph analysis of cell clusters forming vascular networks
Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.
2018-03-01
This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequence of images of the vascular growth were thresholded, and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessels density, number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics such as: the degree distribution, average clustering coefficient, average short path length and assortativity, among others. This analysis allows to monitor how the largest connected cluster of the vascular network evolves in time and provides with quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.
Megacity analysis: a clustering approach to classification
2017-06-01
overview. Retrieved from https://www.usaid.gov/news-information/fact-sheets/kabul- urban -water-supply- kuws USGS. (2009). Conceptual model of water...PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Postgraduate School Monterey, CA 93943-5000 8. PERFORMING ORGANIZATION REPORT NUMBER 9...is interested in these megacity networks and their implications for potential urban operations. We develop a methodology to group like megacities
Vector Nonlinear Time-Series Analysis of Gamma-Ray Burst Datasets on Heterogeneous Clusters
Directory of Open Access Journals (Sweden)
Ioana Banicescu
2005-01-01
Full Text Available The simultaneous analysis of a number of related datasets using a single statistical model is an important problem in statistical computing. A parameterized statistical model is to be fitted on multiple datasets and tested for goodness of fit within a fixed analytical framework. Definitive conclusions are hopefully achieved by analyzing the datasets together. This paper proposes a strategy for the efficient execution of this type of analysis on heterogeneous clusters. Based on partitioning processors into groups for efficient communications and a dynamic loop scheduling approach for load balancing, the strategy addresses the variability of the computational loads of the datasets, as well as the unpredictable irregularities of the cluster environment. Results from preliminary tests of using this strategy to fit gamma-ray burst time profiles with vector functional coefficient autoregressive models on 64 processors of a general purpose Linux cluster demonstrate the effectiveness of the strategy.
Indian Academy of Sciences (India)
2017-09-27
Sep 27, 2017 ... Author for correspondence (zh4403701@126.com). MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, con- vergence thresholds ..... and Postgraduate Research & Practice Innovation Program of. Jiangsu Province ...
Indian Academy of Sciences (India)
environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.
A grand unified model for liganded gold clusters
Xu, Wen Wu; Zhu, Beien; Zeng, Xiao Cheng; Gao, Yi
2016-12-01
A grand unified model (GUM) is developed to achieve fundamental understanding of rich structures of all 71 liganded gold clusters reported to date. Inspired by the quark model by which composite particles (for example, protons and neutrons) are formed by combining three quarks (or flavours), here gold atoms are assigned three `flavours' (namely, bottom, middle and top) to represent three possible valence states. The `composite particles' in GUM are categorized into two groups: variants of triangular elementary block Au3(2e) and tetrahedral elementary block Au4(2e), all satisfying the duet rule (2e) of the valence shell, akin to the octet rule in general chemistry. The elementary blocks, when packed together, form the cores of liganded gold clusters. With the GUM, structures of 71 liganded gold clusters and their growth mechanism can be deciphered altogether. Although GUM is a predictive heuristic and may not be necessarily reflective of the actual electronic structure, several highly stable liganded gold clusters are predicted, thereby offering GUM-guided synthesis of liganded gold clusters by design.
Oxide-supported metal clusters: models for heterogeneous catalysts
International Nuclear Information System (INIS)
Santra, A K; Goodman, D W
2003-01-01
Understanding the size-dependent electronic, structural and chemical properties of metal clusters on oxide supports is an important aspect of heterogeneous catalysis. Recently model oxide-supported metal catalysts have been prepared by vapour deposition of catalytically relevant metals onto ultra-thin oxide films grown on a refractory metal substrate. Reactivity and spectroscopic/microscopic studies have shown that these ultra-thin oxide films are excellent models for the corresponding bulk oxides, yet are sufficiently electrically conductive for use with various modern surface probes including scanning tunnelling microscopy (STM). Measurements on metal clusters have revealed a metal to nonmetal transition as well as changes in the crystal and electronic structures (including lattice parameters, band width, band splitting and core-level binding energy shifts) as a function of cluster size. Size-dependent catalytic reactivity studies have been carried out for several important reactions, and time-dependent catalytic deactivation has been shown to arise from sintering of metal particles under elevated gas pressures and/or reactor temperatures. In situ STM methodologies have been developed to follow the growth and sintering kinetics on a cluster-by-cluster basis. Although several critical issues have been addressed by several groups worldwide, much more remains to be done. This article highlights some of these accomplishments and summarizes the challenges that lie ahead. (topical review)
Regional SAR Image Segmentation Based on Fuzzy Clustering with Gamma Mixture Model
Li, X. L.; Zhao, Q. H.; Li, Y.
2017-09-01
Most of stochastic based fuzzy clustering algorithms are pixel-based, which can not effectively overcome the inherent speckle noise in SAR images. In order to deal with the problem, a regional SAR image segmentation algorithm based on fuzzy clustering with Gamma mixture model is proposed in this paper. First, initialize some generating points randomly on the image, the image domain is divided into many sub-regions using Voronoi tessellation technique. Each sub-region is regarded as a homogeneous area in which the pixels share the same cluster label. Then, assume the probability of the pixel to be a Gamma mixture model with the parameters respecting to the cluster which the pixel belongs to. The negative logarithm of the probability represents the dissimilarity measure between the pixel and the cluster. The regional dissimilarity measure of one sub-region is defined as the sum of the measures of pixels in the region. Furthermore, the Markov Random Field (MRF) model is extended from pixels level to Voronoi sub-regions, and then the regional objective function is established under the framework of fuzzy clustering. The optimal segmentation results can be obtained by the solution of model parameters and generating points. Finally, the effectiveness of the proposed algorithm can be proved by the qualitative and quantitative analysis from the segmentation results of the simulated and real SAR images.
Cluster Analysis of International Information and Social Development.
Lau, Jesus
1990-01-01
Analyzes information activities in relation to socioeconomic characteristics in low, middle, and highly developed economies for the years 1960 and 1977 through the use of cluster analysis. Results of data from 31 countries suggest that information development is achieved mainly by countries that have also achieved social development. (26…
Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes
Pell, Tony; Hargreaves, Linda
2011-01-01
Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of the Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching yet its implications for practice are rarely implemented. It has been subject also to negative…
Cluster analysis for validated climatology stations using precipitation in Mexico
Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.
2012-01-01
Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40
A Cluster Analysis of Personality Style in Adults with ADHD
Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita
2008-01-01
Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…
Characterization of population exposure to organochlorines: A cluster analysis application
R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)
2013-01-01
textabstractThis study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticides residues
Robustness in cluster analysis in the presence of anomalous observations
Zhuk, EE
Cluster analysis of multivariate observations in the presence of "outliers" (anomalous observations) in a sample is studied. The expected (mean) fraction of erroneous decisions for the decision rule is computed analytically by minimizing the intraclass scatter. A robust decision rule (stable to
Language Learner Motivational Types: A Cluster Analysis Study
Papi, Mostafa; Teimouri, Yasser
2014-01-01
The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…
Cluster analysis as a prediction tool for pregnancy outcomes.
Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L
2015-03-01
Considering specific physiology changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered as crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis method is a statistical method which makes possible to group individuals based on sets of identifying variables. The method was chosen in order to determine possibility for classification of pregnant women at early pregnancy to analyze unknown correlations between different variables so that the certain outcomes could be predicted. 222 pregnant women from two general obstetric offices' were recruited. The main orient was set on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis gained a 94.1% classification accuracy rate with three branch- es or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results are showing that pregnant women both of older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering baby of higher birth weight but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have significantly higher probability for complications during pregnancy (gestosis) and higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.
Performance Analysis of Unsupervised Clustering Methods for Brain Tumor Segmentation
Directory of Open Access Journals (Sweden)
Tushar H Jaware
2013-10-01
Full Text Available Medical image processing is the most challenging and emerging field of neuroscience. The ultimate goal of medical image analysis in brain MRI is to extract important clinical features that would improve methods of diagnosis & treatment of disease. This paper focuses on methods to detect & extract brain tumour from brain MR images. MATLAB is used to design, software tool for locating brain tumor, based on unsupervised clustering methods. K-Means clustering algorithm is implemented & tested on data base of 30 images. Performance evolution of unsupervised clusteringmethods is presented.
Identifying clinical course patterns in SMS data using cluster analysis
DEFF Research Database (Denmark)
Kent, Peter; Kongsted, Alice
2012-01-01
ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there are alternative ways of managing SMS data and many different methods...
Emergence of clustering in an acquaintance model without homophily
Bhat, Uttam; Krapivsky, P. L.; Redner, S.
2014-11-01
We introduce an agent-based acquaintance model in which social links are created by processes in which there is no explicit homophily. In spite of the homogeneous nature of the social interactions, highly-clustered social networks can arise. The crucial feature of our model is that of variable transitive interactions. Namely, when an agent introduces two unconnected friends, the rate at which a connection actually occurs between them depends on the number of their mutual acquaintances. As this transitive interaction rate is varied, the social network undergoes a dramatic clustering transition. Close to the transition, the network consists of a collection of well-defined communities. As a function of time, the network can also undergo an incomplete gelation transition, in which the gel, or giant cluster, does not constitute the entire network, even at infinite time. Some of the clustering properties of our model also arise, but in a more gradual manner, in Facebook networks. Finally, we discuss a more realistic variant of our original model in which network realizations can be constructed that quantitatively match Facebook networks.
Emergence of clustering in an acquaintance model without homophily
International Nuclear Information System (INIS)
Bhat, Uttam; Krapivsky, P L; Redner, S
2014-01-01
We introduce an agent-based acquaintance model in which social links are created by processes in which there is no explicit homophily. In spite of the homogeneous nature of the social interactions, highly-clustered social networks can arise. The crucial feature of our model is that of variable transitive interactions. Namely, when an agent introduces two unconnected friends, the rate at which a connection actually occurs between them depends on the number of their mutual acquaintances. As this transitive interaction rate is varied, the social network undergoes a dramatic clustering transition. Close to the transition, the network consists of a collection of well-defined communities. As a function of time, the network can also undergo an incomplete gelation transition, in which the gel, or giant cluster, does not constitute the entire network, even at infinite time. Some of the clustering properties of our model also arise, but in a more gradual manner, in Facebook networks. Finally, we discuss a more realistic variant of our original model in which network realizations can be constructed that quantitatively match Facebook networks. (paper)
Large psub(T) pion production and clustered parton model
Energy Technology Data Exchange (ETDEWEB)
Kanki, T [Osaka Univ., Toyonaka (Japan). Coll. of General Education
1977-05-01
Recent experimental results on the large p sub(T) inclusive ..pi../sup 0/ productions by pp and ..pi..p collisions are interpreted by the parton model in which the constituent quarks are defined to be the clusters of the quark-partons and gluons.
Metal cluster fission: jellium model and Molecular dynamics simulations
DEFF Research Database (Denmark)
Lyalin, Andrey G.; Obolensky, Oleg I.; Solov'yov, Ilia
2004-01-01
Fission of doubly charged sodium clusters is studied using the open-shell two-center deformed jellium model approximation and it ab initio molecular dynamic approach accounting for all electrons in the system. Results of calculations of fission reactions Na_10^2+ --> Na_7^+ + Na_3^+ and Na_18...
The dilute random field Ising model by finite cluster approximation
International Nuclear Information System (INIS)
Benyoussef, A.; Saber, M.
1987-09-01
Using the finite cluster approximation, phase diagrams of bond and site diluted three-dimensional simple cubic Ising models with a random field have been determined. The resulting phase diagrams have the same general features for both bond and site dilution. (author). 7 refs, 4 figs
Performance prediction model for distributed applications on multicore clusters
CSIR Research Space (South Africa)
Khanyile, NP
2012-07-01
Full Text Available discusses some of the short comings of this law in the current age. We propose a theoretical model for predicting the behavior of a distributed algorithm given the network restrictions of the cluster used. The paper focuses on the impact of latency...
Fault detection of flywheel system based on clustering and principal component analysis
Directory of Open Access Journals (Sweden)
Wang Rixin
2015-12-01
Full Text Available Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.
Viederytė, Rasa; Didžiokas, Rimantas
2014-01-01
Paper analyses several cluster models on the basis of competitiveness: Nine-factor model, Double diamond model, Funnel model of cluster determinants, Destination Competitiveness and sustainability models, which are related to Porter’s Diamond model and concentrate to the classical one - adopt M. Porter’s Diamond model methodology to the evaluation of Lithuanian Maritime sector’s clustering on the basis of competitiveness. Despite the advances in cluster research, this model remains a complex ...
Beyond Hydrodynamic Modeling of AGN Heating in Galaxy Clusters
Yang, Hsiang-Yi Karen
Clusters of galaxies hold a unique position in hierarchical structure formation - they are both powerful cosmological probes and excellent astrophysical laboratories. Accurate modeling of the cluster properties is crucial for reducing systematic uncertainties in cluster cosmology. However, theoretical modeling of the intracluster medium (ICM) has long suffered from the "cooling-flow problem" - clusters with short central times or cool cores (CCs) are predicted to host massive inflows of gas that are not observed. Feedback from active galactic nuclei (AGN) is by far the most promising heating mechanism to counteract radiative cooling. Recent hydrodynamic simulations have made remarkable progress reproducing properties of the CCs. However, there remain two major questions that cannot be probed using purely hydrodynamic models: (1) what are the roles of cosmic rays (CRs)? (2) how is the existing picture altered when the ICM is modeled as weakly collisional plasma? We propose to move beyond limitations of pure hydrodynamics and progress toward a complete understanding of how AGN jet-inflated bubbles interact with their surroundings and provide heat to the ICM. Our objectives include: (1) understand how CR-dominated bubbles heat the ICM; (2) understand bubble evolution and sound-wave dissipation in the ICM with different assumptions of plasma properties, e.g., collisionality of the ICM, with or without anisotropic transport processes; (3) Develop a subgrid model of AGN heating that can be adopted in cosmological simulations based on state-of-the-art isolated simulations. We will use a combination of analytical calculations and idealized simulations to advance our understanding of each individual physical process. We will then perform the first three-dimensional (3D) magnetohydrodynamic (MHD) simulations of self-regulated AGN feedback with relevant CR and anisotropic transport processes in order to quantify the amount and distribution of heating from the AGN. Our
High-dimensional cluster analysis with the Masked EM Algorithm
Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.
2014-01-01
Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster member-ship of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
A cluster analysis investigation of workaholism as a syndrome.
Aziz, Shahnaz; Zickar, Michael J
2006-01-01
Workaholism has been conceptualized as a syndrome although there have been few tests that explicitly consider its syndrome status. The authors analyzed a three-dimensional scale of workaholism developed by Spence and Robbins (1992) using cluster analysis. The authors identified three clusters of individuals, one of which corresponded to Spence and Robbins's profile of the workaholic (high work involvement, high drive to work, low work enjoyment). Consistent with previously conjectured relations with workaholism, individuals in the workaholic cluster were more likely to label themselves as workaholics, more likely to have acquaintances label them as workaholics, and more likely to have lower life satisfaction and higher work-life imbalance. The importance of considering workaholism as a syndrome and the implications for effective interventions are discussed. Copyright 2006 APA.
Cluster model calculations of alpha decays across the periodic table
International Nuclear Information System (INIS)
Merchant, A.C.; Buck, B.
1988-10-01
The cluster model of Buck, Dover and Vary has been used to calculate partial widths for alpha decay from the ground states of all nuclei for which experimental measurements exist. The cluster-core potential is represented by a simple three-parameter form having fixed diffuseness, a radius which scales as A 1/3 and a depth which is adjusted to fit the Q-value of the particular decay. The calculations yield excellent agreement with the vast majority of the available data, and some typical examples are presented. (author) [pt
The effect of alkylating agents on model supported metal clusters
Energy Technology Data Exchange (ETDEWEB)
Erdem-Senatalar, A.; Blackmond, D.G.; Wender, I. (Pittsburgh Univ., PA (USA). Dept. of Chemical and Petroleum Engineering); Oukaci, R. (CERHYD, Algiers (Algeria))
1988-01-01
Interactions between model supported metal clusters and alkylating agents were studied in an effort to understand a novel chemical trapping technique developed for identifying species adsorbed on catalyst surfaces. It was found that these interactions are more complex than had previously been suggested. Studies were completed using deuterium-labeled dimethyl sulfate (DMS), (CH{sub 3}){sub 2}SO{sub 4}, as a trapping agent to interact with the supported metal cluster ethylidyne tricobalt enneacarbonyl. Results showed that oxygenated products formed during the trapping reaction contained {minus}OCD{sub 3} groups from the DMS, indicating that the interaction was not a simple alkylation. 18 refs., 1 fig., 3 tabs.
Cosmological analysis of galaxy clusters surveys in X-rays
International Nuclear Information System (INIS)
Clerc, N.
2012-01-01
Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study allows to test cosmological scenarios of structure formation with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed to the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Propositions are made for applying this method to future surveys as XMM-XXL and eRosita. (author) [fr
Cluster analysis by optimal decomposition of induced fuzzy sets
Energy Technology Data Exchange (ETDEWEB)
Backer, E
1978-01-01
Nonsupervised pattern recognition is addressed and the concept of fuzzy sets is explored in order to provide the investigator (data analyst) additional information supplied by the pattern class membership values apart from the classical pattern class assignments. The basic ideas behind the pattern recognition problem, the clustering problem, and the concept of fuzzy sets in cluster analysis are discussed, and a brief review of the literature of the fuzzy cluster analysis is given. Some mathematical aspects of fuzzy set theory are briefly discussed; in particular, a measure of fuzziness is suggested. The optimization-clustering problem is characterized. Then the fundamental idea behind affinity decomposition is considered. Next, further analysis takes place with respect to the partitioning-characterization functions. The iterative optimization procedure is then addressed. The reclassification function is investigated and convergence properties are examined. Finally, several experiments in support of the method suggested are described. Four object data sets serve as appropriate test cases. 120 references, 70 figures, 11 tables. (RWR)
Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.
Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si
2017-07-01
Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected form coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which are widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
Phenotypic clustering: a novel method for microglial morphology analysis.
Verdonk, Franck; Roux, Pascal; Flamant, Patricia; Fiette, Laurence; Bozza, Fernando A; Simard, Sébastien; Lemaire, Marc; Plaud, Benoit; Shorte, Spencer L; Sharshar, Tarek; Chrétien, Fabrice; Danckaert, Anne
2016-06-17
Microglial cells are tissue-resident macrophages of the central nervous system. They are extremely dynamic, sensitive to their microenvironment and present a characteristic complex and heterogeneous morphology and distribution within the brain tissue. Many experimental clues highlight a strong link between their morphology and their function in response to aggression. However, due to their complex "dendritic-like" aspect that constitutes the major pool of murine microglial cells and their dense network, precise and powerful morphological studies are not easy to realize and complicate correlation with molecular or clinical parameters. Using the knock-in mouse model CX3CR1(GFP/+), we developed a 3D automated confocal tissue imaging system coupled with morphological modelling of many thousands of microglial cells revealing precise and quantitative assessment of major cell features: cell density, cell body area, cytoplasm area and number of primary, secondary and tertiary processes. We determined two morphological criteria that are the complexity index (CI) and the covered environment area (CEA) allowing an innovative approach lying in (i) an accurate and objective study of morphological changes in healthy or pathological condition, (ii) an in situ mapping of the microglial distribution in different neuroanatomical regions and (iii) a study of the clustering of numerous cells, allowing us to discriminate different sub-populations. Our results on more than 20,000 cells by condition confirm at baseline a regional heterogeneity of the microglial distribution and phenotype that persists after induction of neuroinflammation by systemic injection of lipopolysaccharide (LPS). Using clustering analysis, we highlight that, at resting state, microglial cells are distributed in four microglial sub-populations defined by their CI and CEA with a regional pattern and a specific behaviour after challenge. Our results counteract the classical view of a homogenous regional resting
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions
International Nuclear Information System (INIS)
Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard
2014-01-01
Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions
Energy Technology Data Exchange (ETDEWEB)
Nedialkova, Lilia V.; Amat, Miguel A. [Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544 (United States); Kevrekidis, Ioannis G., E-mail: yannis@princeton.edu, E-mail: gerhard.hummer@biophys.mpg.de [Department of Chemical and Biological Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544 (United States); Hummer, Gerhard, E-mail: yannis@princeton.edu, E-mail: gerhard.hummer@biophys.mpg.de [Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438 Frankfurt am Main (Germany)
2014-09-21
Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.
Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei
2013-01-01
This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…
Model study in chemisorption: atomic hydrogen on beryllium clusters
International Nuclear Information System (INIS)
Bauschlicher, C.W. Jr.
1976-08-01
The interaction between atomic hydrogen and the (0001) surface of Be metal has been studied by ab initio electronic structure theory. Self-consistent-field (SCF) calculations have been performed using minimum, optimized minimum, double zeta and mixed basis sets for clusters as large as 22 Be atoms. The binding energy and equilibrium geometry (the distance to the surface) were determined for 4 sites. Both spatially restricted (the wavefunction was constrained to transform as one of the irreducible representations of the molecular point group) and unrestricted SCF calculations were performed. Using only the optimized minimum basis set, clusters containing as many as 22 beryllium atoms have been investigated. From a variety of considerations, this cluster is seen to be nearly converged within the model used, providing the most reliable results for chemisorption. The site dependence of the frequency is shown to be a geometrical effect depending on the number and angle of the bonds. The diffusion of atomic hydrogen through a perfect beryllium crystal is predicted to be energetically unfavorable. The cohesive energy, the ionization energy and the singlet-triplet separation were computed for the clusters without hydrogen. These quantities can be seen as a measure of the total amount of edge effects. The chemisorptive properties are not related to the total amount of edge effects, but rather the edge effects felt by the adsorbate bonding berylliums. This lack of correlation with the total edge effects illustrates the local nature of the bonding, further strengthening the cluster model for chemisorption. A detailed discussion of the bonding and electronic structure is included. The remaining edge effects for the Be 22 cluster are discussed
Small traveling clusters in attractive and repulsive Hamiltonian mean-field models.
Barré, Julien; Yamaguchi, Yoshiyuki Y
2009-03-01
Long-lasting small traveling clusters are studied in the Hamiltonian mean-field model by comparing between attractive and repulsive interactions. Nonlinear Landau damping theory predicts that a Gaussian momentum distribution on a spatially homogeneous background permits the existence of traveling clusters in the repulsive case, as in plasma systems, but not in the attractive case. Nevertheless, extending the analysis to a two-parameter family of momentum distributions of Fermi-Dirac type, we theoretically predict the existence of traveling clusters in the attractive case; these findings are confirmed by direct N -body numerical simulations. The parameter region with the traveling clusters is much reduced in the attractive case with respect to the repulsive case.
A two-stage method for microcalcification cluster segmentation in mammography by deformable models
International Nuclear Information System (INIS)
Arikidis, N.; Kazantzi, A.; Skiadopoulos, S.; Karahaliou, A.; Costaridou, L.; Vassiou, K.
2015-01-01
Purpose: Segmentation of microcalcification (MC) clusters in x-ray mammography is a difficult task for radiologists. Accurate segmentation is prerequisite for quantitative image analysis of MC clusters and subsequent feature extraction and classification in computer-aided diagnosis schemes. Methods: In this study, a two-stage semiautomated segmentation method of MC clusters is investigated. The first stage is targeted to accurate and time efficient segmentation of the majority of the particles of a MC cluster, by means of a level set method. The second stage is targeted to shape refinement of selected individual MCs, by means of an active contour model. Both methods are applied in the framework of a rich scale-space representation, provided by the wavelet transform at integer scales. Segmentation reliability of the proposed method in terms of inter and intraobserver agreements was evaluated in a case sample of 80 MC clusters originating from the digital database for screening mammography, corresponding to 4 morphology types (punctate: 22, fine linear branching: 16, pleomorphic: 18, and amorphous: 24) of MC clusters, assessing radiologists’ segmentations quantitatively by two distance metrics (Hausdorff distance—HDIST cluster , average of minimum distance—AMINDIST cluster ) and the area overlap measure (AOM cluster ). The effect of the proposed segmentation method on MC cluster characterization accuracy was evaluated in a case sample of 162 pleomorphic MC clusters (72 malignant and 90 benign). Ten MC cluster features, targeted to capture morphologic properties of individual MCs in a cluster (area, major length, perimeter, compactness, and spread), were extracted and a correlation-based feature selection method yielded a feature subset to feed in a support vector machine classifier. Classification performance of the MC cluster features was estimated by means of the area under receiver operating characteristic curve (Az ± Standard Error) utilizing tenfold cross
DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab
Directory of Open Access Journals (Sweden)
Alexander Chailytko
2016-05-01
Full Text Available Domain Generation Algorithms (DGA is a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate DGA feed using reversed DGAs, third-party feeds, and a smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Mobility in Europe: Recent Trends from a Cluster Analysis
Directory of Open Access Journals (Sweden)
Ioana Manafi
2017-08-01
Full Text Available During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but the trends in net migration flows during the years. Therewith, the results provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.
Full text clustering and relationship network analysis of biomedical publications.
Directory of Open Access Journals (Sweden)
Renchu Guan
Full Text Available Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.
Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz
2016-07-01
Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that Sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than the observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to define its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia
2016-04-01
Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
Full text clustering and relationship network analysis of biomedical publications.
Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu
2014-01-01
Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
Directory of Open Access Journals (Sweden)
Jocelyn H Bolin
2014-04-01
Full Text Available Although traditional clustering methods (e.g., K-means have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C
2014-01-01
Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
The Parental Environment Cluster Model of Child Neglect: An Integrative Conceptual Model.
Burke, Judith; Chandy, Joseph; Dannerbeck, Anne; Watt, J. Wilson
1998-01-01
Presents Parental Environment Cluster model of child neglect which identifies three clusters of factors involved in parents' neglectful behavior: (1) parenting skills and functions; (2) development and use of positive social support; and (3) resource availability and management skills. Model offers a focal theory for research, structure for…
Energy Technology Data Exchange (ETDEWEB)
Krumholz, Mark R. [Department of Astronomy and Astrophysics, University of California, Santa Cruz, CA 95064 (United States); Adamo, Angela [Department of Astronomy, Oskar Klein Centre, Stockholm University, SE-10691 Stockholm (Sweden); Fumagalli, Michele [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Department of Physics, Durham University, South Road, Durham DH1 3LE (United Kingdom); Wofford, Aida [Institut d’Astrophysique de Paris, 98bis Boulevard Arago, F-75014 Paris (France); Calzetti, Daniela; Grasha, Kathryn [Department of Astronomy, University of Massachusetts–Amherst, Amherst, MA (United States); Lee, Janice C.; Whitmore, Bradley C.; Bright, Stacey N.; Ubeda, Leonardo [Space Telescope Science Institute, Baltimore, MD (United States); Gouliermis, Dimitrios A. [Centre for Astronomy, Institute for Theoretical Astrophysics, University of Heidelberg, Heidelberg (Germany); Kim, Hwihyun [Korea Astronomy and Space Science Institute, Daejeon (Korea, Republic of); Nair, Preethi [Department of Physics and Astronomy, University of Alabama, Tuscaloosa, AL (United States); Ryon, Jenna E. [Department of Astronomy, University of Wisconsin–Madison, Madison, WI (United States); Smith, Linda J. [European Space Agency/Space Telescope Science Institute, Baltimore, MD (United States); Thilker, David [Department of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD (United States); Zackrisson, Erik, E-mail: mkrumhol@ucsc.edu, E-mail: adamo@astro.su.se [Department of Physics and Astronomy, Uppsala University, Uppsala (Sweden)
2015-10-20
We investigate a novel Bayesian analysis method, based on the Stochastically Lighting Up Galaxies (slug) code, to derive the masses, ages, and extinctions of star clusters from integrated light photometry. Unlike many analysis methods, slug correctly accounts for incomplete initial mass function (IMF) sampling, and returns full posterior probability distributions rather than simply probability maxima. We apply our technique to 621 visually confirmed clusters in two nearby galaxies, NGC 628 and NGC 7793, that are part of the Legacy Extragalactic UV Survey (LEGUS). LEGUS provides Hubble Space Telescope photometry in the NUV, U, B, V, and I bands. We analyze the sensitivity of the derived cluster properties to choices of prior probability distribution, evolutionary tracks, IMF, metallicity, treatment of nebular emission, and extinction curve. We find that slug's results for individual clusters are insensitive to most of these choices, but that the posterior probability distributions we derive are often quite broad, and sometimes multi-peaked and quite sensitive to the choice of priors. In contrast, the properties of the cluster population as a whole are relatively robust against all of these choices. We also compare our results from slug to those derived with a conventional non-stochastic fitting code, Yggdrasil. We show that slug's stochastic models are generally a better fit to the observations than the deterministic ones used by Yggdrasil. However, the overall properties of the cluster populations recovered by both codes are qualitatively similar.
Testing Numerical Models of Cool Core Galaxy Cluster Formation with X-Ray Observations
Henning, Jason W.; Gantner, Brennan; Burns, Jack O.; Hallman, Eric J.
2009-12-01
Using archival Chandra and ROSAT data along with numerical simulations, we compare the properties of cool core and non-cool core galaxy clusters, paying particular attention to the region beyond the cluster cores. With the use of single and double β-models, we demonstrate a statistically significant difference in the slopes of observed cluster surface brightness profiles while the cluster cores remain indistinguishable between the two cluster types. Additionally, through the use of hardness ratio profiles, we find evidence suggesting cool core clusters are cooler beyond their cores than non-cool core clusters of comparable mass and temperature, both in observed and simulated clusters. The similarities between real and simulated clusters supports a model presented in earlier work by the authors describing differing merger histories between cool core and non-cool core clusters. Discrepancies between real and simulated clusters will inform upcoming numerical models and simulations as to new ways to incorporate feedback in these systems.
Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.
Ben-Sasson, Ayelet; Podoly, Tamar Yonit
2017-02-01
Several studies have examined the sensory component in Obsesseive Compulsive Disorder (OCD) and described an OCD subtype which has a unique profile, and that Sensory Phenomena (SP) is a significant component of this subtype. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although there are some studies that have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), literature lacks sufficient data on this interplay. First to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). Second, to investigate the cluster analysis of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires for measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately significantly correlated (n=274), significant correlations between all SOR modalities and OCS subscales were found with no specific higher correlation between one modality to one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscales scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found reliable and suitable to detect SOR, the sample SPQ scores was normally distributed (n=350). SOR is a
Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations
Directory of Open Access Journals (Sweden)
F. Darrouzet
2006-07-01
Full Text Available Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles have been derived from the electron plasma frequency identified by the WHISPER sounder supplemented, in-between soundings, by relative variations of the spacecraft potential measured by the electric field instrument EFW; ion velocity is also measured onboard these satellites. The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 RE every 10 min; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations of three plume events and compare CLUSTER in-situ data with global images of the plasmasphere obtained by IMAGE. In particular, we study the geometry and the orientation of plasmaspheric plumes by using four-point analysis methods. We compare several aspects of plume motion as determined by different methods: (i inner and outer plume boundary velocity calculated from time delays of this boundary as observed by the wave experiment WHISPER on the four spacecraft, (ii drift velocity measured by the electron drift instrument EDI onboard CLUSTER and (iii global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.
Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis
Yen, Chi-Fu; Sivasankar, Sanjeevi
2018-03-01
Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM-tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, particularly when the number of unbinding events are limited and not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
Development of an interdisciplinary model cluster for tidal water environments
Dietrich, Stephan; Winterscheid, Axel; Jens, Wyrwa; Hartmut, Hein; Birte, Hein; Stefan, Vollmer; Andreas, Schöl
2013-04-01
Global climate change has a high potential to influence both the persistence and the transport pathways of water masses and its constituents in tidal waters and estuaries. These processes are linked through dispersion processes, thus directly influencing the sediment and solid suspend matter budgets, and thus the river morphology. Furthermore, the hydrologic regime has an impact on the transport of nutrients, phytoplankton, suspended matter, and temperature that determine the oxygen content within water masses, which is a major parameter describing the water quality. This project aims at the implementation of a so-called (numerical) model cluster in tidal waters, which includes the model compartments hydrodynamics, morphology and ecology. For the implementation of this cluster it is required to continue with the integration of different models that work in a wide range of spatial and temporal scales. The model cluster is thus suggested to lead to a more precise knowledge of the feedback processes between the single interdisciplinary model compartments. In addition to field measurements this model cluster will provide a complementary scientific basis required to address a spectrum of research questions concerning the integral management of estuaries within the Federal Institute of Hydrology (BfG, Germany). This will in particular include aspects like sediment and water quality management as well as adaptation strategies to climate change. The core of the model cluster will consist of the 3D-hydrodynamic model Delft3D (Roelvink and van Banning, 1994), long-term hydrodynamics in the estuaries are simulated with the Hamburg Shelf Ocean Model HAMSOM (Backhaus, 1983; Hein et al., 2012). The simulation results will be compared with the unstructured grid based SELFE model (Zhang and Bapista, 2008). The additional coupling of the BfG-developed 1D-water quality model QSim (Kirchesch and Schöl, 1999; Hein et al., 2011) with the morphological/hydrodynamic models is an
Clustering Multivariate Time Series Using Hidden Markov Models
Directory of Open Access Journals (Sweden)
Shima Ghassempour
2014-03-01
Full Text Available In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs, where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistic knowledge, and therefore are accessible to a wide range of researchers.
A cluster expansion approach to exponential random graph models
International Nuclear Information System (INIS)
Yin, Mei
2012-01-01
The exponential family of random graphs are among the most widely studied network models. We show that any exponential random graph model may alternatively be viewed as a lattice gas model with a finite Banach space norm. The system may then be treated using cluster expansion methods from statistical mechanics. In particular, we derive a convergent power series expansion for the limiting free energy in the case of small parameters. Since the free energy is the generating function for the expectations of other random variables, this characterizes the structure and behavior of the limiting network in this parameter region
Performance Analysis of a Cluster-Based MAC Protocol for Wireless Ad Hoc Networks
Directory of Open Access Journals (Sweden)
Jesús Alonso-Zárate
2010-01-01
Full Text Available An analytical model to evaluate the non-saturated performance of the Distributed Queuing Medium Access Control Protocol for Ad Hoc Networks (DQMANs in single-hop networks is presented in this paper. DQMAN is comprised of a spontaneous, temporary, and dynamic clustering mechanism integrated with a near-optimum distributed queuing Medium Access Control (MAC protocol. Clustering is executed in a distributed manner using a mechanism inspired by the Distributed Coordination Function (DCF of the IEEE 802.11. Once a station seizes the channel, it becomes the temporary clusterhead of a spontaneous cluster and it coordinates the peer-to-peer communications between the clustermembers. Within each cluster, a near-optimum distributed queuing MAC protocol is executed. The theoretical performance analysis of DQMAN in single-hop networks under non-saturation conditions is presented in this paper. The approach integrates the analysis of the clustering mechanism into the MAC layer model. Up to the knowledge of the authors, this approach is novel in the literature. In addition, the performance of an ad hoc network using DQMAN is compared to that obtained when using the DCF of the IEEE 802.11, as a benchmark reference.
Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio.
Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh
2012-01-01
Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to
Cluster shell model: I. Structure of 9Be, 9B
Della Rocca, V.; Iachello, F.
2018-05-01
We calculate energy spectra, electromagnetic transition rates, longitudinal and transverse electron scattering form factors and log ft values for beta decay in 9Be, 9B, within the framework of a cluster shell model. By comparing with experimental data, we find strong evidence for the structure of these nuclei to be two α-particles in a dumbbell configuration with Z2 symmetry, plus an additional nucleon.
Efficient image duplicated region detection model using sequential block clustering
Czech Academy of Sciences Publication Activity Database
Sekeh, M. A.; Maarof, M. A.; Rohani, M. F.; Mahdian, Babak
2013-01-01
Roč. 10, č. 1 (2013), s. 73-84 ISSN 1742-2876 Institutional support: RVO:67985556 Keywords : Image forensic * Copy–paste forgery * Local block matching Subject RIV: IN - Informatics, Computer Science Impact factor: 0.986, year: 2013 http://library.utia.cas.cz/separaty/2013/ZOI/mahdian-efficient image duplicated region detection model using sequential block clustering.pdf
Cluster Dynamics Modeling with Bubble Nucleation, Growth and Coalescence
Energy Technology Data Exchange (ETDEWEB)
de Almeida, Valmor F. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Blondel, Sophie [Univ. of Tennessee, Knoxville, TN (United States); Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Wirth, Brian D. [Univ. of Tennessee, Knoxville, TN (United States)
2017-06-01
The topic of this communication pertains to defect formation in irradiated solids such as plasma-facing tungsten submitted to helium implantation in fusion reactor com- ponents, and nuclear fuel (metal and oxides) submitted to volatile ssion product generation in nuclear reactors. The purpose of this progress report is to describe ef- forts towards addressing the prediction of long-time evolution of defects via continuum cluster dynamics simulation. The di culties are twofold. First, realistic, long-time dynamics in reactor conditions leads to a non-dilute di usion regime which is not accommodated by the prevailing dilute, stressless cluster dynamics theory. Second, long-time dynamics calls for a large set of species (ideally an in nite set) to capture all possible emerging defects, and this represents a computational bottleneck. Extensions beyond the dilute limit is a signi cant undertaking since no model has been advanced to extend cluster dynamics to non-dilute, deformable conditions. Here our proposed approach to model the non-dilute limit is to monitor the appearance of a spatially localized void volume fraction in the solid matrix with a bell shape pro le and insert an explicit geometrical bubble onto the support of the bell function. The newly cre- ated internal moving boundary provides the means to account for the interfacial ux of mobile species into the bubble, and the growth of bubbles allows for coalescence phenomena which captures highly non-dilute interactions. We present a preliminary interfacial kinematic model with associated interfacial di usion transport to follow the evolution of the bubble in any number of spatial dimensions and any number of bubbles, which can be further extended to include a deformation theory. Finally we comment on a computational front-tracking method to be used in conjunction with conventional cluster dynamics simulations in the non-dilute model proposed.
Directory of Open Access Journals (Sweden)
Jie Wu
2010-12-01
Full Text Available The operational performance of container ports has received more and more attentions in both academic and practitioner circles, the performance evaluation and process improvement of container ports have also been the focus of several studies. In this paper, Data Envelopment Analysis (DEA, an effective tool for relative efficiency assessment, is utilized for measuring the performances and benchmarking of the 77 world container ports in 2007. The used approaches in the current study consider four inputs (Capacity of Cargo Handling Machines, Number of Berths, Terminal Area and Storage Capacity and a single output (Container Throughput. The results for the efficiency scores are analyzed, and a unique ordering of the ports based on average cross efficiency is provided, also cluster analysis technique is used to select the more appropriate targets for poorly performing ports to use as benchmarks.
Riemannian multi-manifold modeling and clustering in brain networks
Slavakis, Konstantinos; Salsabilian, Shiva; Wack, David S.; Muldoon, Sarah F.; Baidoo-Williams, Henry E.; Vettel, Jean M.; Cieslak, Matthew; Grafton, Scott T.
2017-08-01
This paper introduces Riemannian multi-manifold modeling in the context of brain-network analytics: Brainnetwork time-series yield features which are modeled as points lying in or close to a union of a finite number of submanifolds within a known Riemannian manifold. Distinguishing disparate time series amounts thus to clustering multiple Riemannian submanifolds. To this end, two feature-generation schemes for brain-network time series are put forth. The first one is motivated by Granger-causality arguments and uses an auto-regressive moving average model to map low-rank linear vector subspaces, spanned by column vectors of appropriately defined observability matrices, to points into the Grassmann manifold. The second one utilizes (non-linear) dependencies among network nodes by introducing kernel-based partial correlations to generate points in the manifold of positivedefinite matrices. Based on recently developed research on clustering Riemannian submanifolds, an algorithm is provided for distinguishing time series based on their Riemannian-geometry properties. Numerical tests on time series, synthetically generated from real brain-network structural connectivity matrices, reveal that the proposed scheme outperforms classical and state-of-the-art techniques in clustering brain-network states/structures.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of image the signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attentions in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Model of defect reactions and the influence of clustering in pulse-neutron-irradiated Si
International Nuclear Information System (INIS)
Myers, S. M.; Cooper, P. J.; Wampler, W. R.
2008-01-01
Transient reactions among irradiation defects, dopants, impurities, and carriers in pulse-neutron-irradiated Si were modeled taking into account the clustering of the primal defects in recoil cascades. Continuum equations describing the diffusion, field drift, and reactions of relevant species were numerically solved for a submicrometer spherical volume, within which the starting radial distributions of defects could be varied in accord with the degree of clustering. The radial profiles corresponding to neutron irradiation were chosen through pair-correlation-function analysis of vacancy and interstitial distributions obtained from the binary-collision code MARLOWE, using a spectrum of primary recoil energies computed for a fast-burst fission reactor. Model predictions of transient behavior were compared with a variety of experimental results from irradiated bulk Si, solar cells, and bipolar-junction transistors. The influence of defect clustering during neutron bombardment was further distinguished through contrast with electron irradiation, where the primal point defects are more uniformly dispersed
Diagnostics of subtropical plants functional state by cluster analysis
Directory of Open Access Journals (Sweden)
Oksana Belous
2016-05-01
Full Text Available The article presents an application example of statistical methods for data analysis on diagnosis of the adaptive capacity of subtropical plants varieties. We depicted selection indicators and basic physiological parameters that were defined as diagnostic. We used evaluation on a set of parameters of water regime, there are: determination of water deficit of the leaves, determining the fractional composition of water and detection parameters of the concentration of cell sap (CCS (for tea culture flushes. These settings are characterized by high liability and high responsiveness to the effects of many abiotic factors that determined the particular care in the selection of plant material for analysis and consideration of the impact on sustainability. On the basis of the experimental data calculated the coefficients of pair correlation between climatic factors and used physiological indicators. The result was a selection of physiological and biochemical indicators proposed to assess the adaptability and included in the basis of methodical recommendations on diagnostics of the functional state of the studied cultures. Analysis of complex studies involving a large number of indicators is quite difficult, especially does not allow to quickly identify the similarity of new varieties for their adaptive responses to adverse factors, and, therefore, to set general requirements to conditions of cultivation. Use of cluster analysis suggests that in the analysis of only quantitative data; define a set of variables used to assess varieties (and the more sampling, the more accurate the clustering will happen, be sure to ascertain the measure of similarity (or difference between objects. It is shown that the identification of diagnostic features, which are subjected to statistical processing, impact the accuracy of the varieties classification. Selection in result of the mono-clusters analysis (variety tea Kolhida; hazelnut Lombardsky red; variety kiwi Monty
Modeling and Testing Dark Energy and Gravity with Galaxy Cluster Data
Rapetti, David; Cataneo, Matteo; Heneka, Caroline; Mantz, Adam; Allen, Steven W.; Von Der Linden, Anja; Schmidt, Fabian; Lombriser, Lucas; Li, Baojiu; Applegate, Douglas; Kelly, Patrick; Morris, Glenn
2018-06-01
The abundance of galaxy clusters is a powerful probe to constrain the properties of dark energy and gravity at large scales. We employed a self-consistent analysis that includes survey, observable-mass scaling relations and weak gravitational lensing data to obtain constraints on f(R) gravity, which are an order of magnitude tighter than the best previously achieved, as well as on cold dark energy of negligible sound speed. The latter implies clustering of the dark energy fluid at all scales, allowing us to measure the effects of dark energy perturbations at cluster scales. For this study, we recalibrated the halo mass function using the following non-linear characteristic quantities: the spherical collapse threshold, the virial overdensity and an additional mass contribution for cold dark energy. We also presented a new modeling of the f(R) gravity halo mass function that incorporates novel corrections to capture key non-linear effects of the Chameleon screening mechanism, as found in high resolution N-body simulations. All these results permit us to predict, as I will also exemplify, and eventually obtain the next generation of cluster constraints on such models, and provide us with frameworks that can also be applied to other proposed dark energy and modified gravity models using cluster abundance observations.
Analysis of risk factors for cluster behavior of dental implant failures.
Chrcanovic, Bruno Ramos; Kisch, Jenö; Albrektsson, Tomas; Wennerberg, Ann
2017-08-01
Some studies indicated that implant failures are commonly concentrated in few patients. To identify and analyze cluster behavior of dental implant failures among subjects of a retrospective study. This retrospective study included patients receiving at least three implants only. Patients presenting at least three implant failures were classified as presenting a cluster behavior. Univariate and multivariate logistic regression models and generalized estimating equations analysis evaluated the effect of explanatory variables on the cluster behavior. There were 1406 patients with three or more implants (8337 implants, 592 failures). Sixty-seven (4.77%) patients presented cluster behavior, with 56.8% of all implant failures. The intake of antidepressants and bruxism were identified as potential negative factors exerting a statistically significant influence on a cluster behavior at the patient-level. The negative factors at the implant-level were turned implants, short implants, poor bone quality, age of the patient, the intake of medicaments to reduce the acid gastric production, smoking, and bruxism. A cluster pattern among patients with implant failure is highly probable. Factors of interest as predictors for implant failures could be a number of systemic and local factors, although a direct causal relationship cannot be ascertained. © 2017 Wiley Periodicals, Inc.
Wagstaff, Kiri L.
2012-03-01
On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained
Cluster Analysis of the International Stellarator Confinement Database
International Nuclear Information System (INIS)
Kus, A.; Dinklage, A.; Preuss, R.; Ascasibar, E.; Harris, J. H.; Okamura, S.; Yamada, H.; Sano, F.; Stroth, U.; Talmadge, J.
2008-01-01
Heterogeneous structure of collected data is one of the problems that occur during derivation of scalings for energy confinement time, and whose analysis tourns out to be wide and complicated matter. The International Stellarator Confinement Database [1], shortly ISCDB, comprises in its latest version 21 a total of 3647 observations from 8 experimental devices, 2067 therefrom beeing so far completed for upcoming analyses. For confinement scaling studies 1933 observation were chosen as the standard dataset. Here we describe a statistical method of cluster analysis for identification of possible cohesive substructures in ISDCB and present some preliminary results
Service Quality in Tourist Destination Pipa/Brazil: A Study Based on a Cluster Analysis
Directory of Open Access Journals (Sweden)
Domingos Fernandes Campos
2015-08-01
Full Text Available This study aims to evaluate the Attractiveness and Quality factors at the tourism services provided by Pipa/RN destination. Based on 28 services attributes, the expectations of 760 tourists have been collected. The service has been evaluated by Gap Model, verifying the (disconfirmation of expectations and perceived service. Two questions have been used to evaluate: (a Have the expectations been varied with the social and demographic factors? (b Have the clusters identified by cluster analysis been guided by social and demographic factors? The groups identified were marked by different priorities in relation to the attributes and by different levels of demanding on expected service.
Neuro-fuzzy system modeling based on automatic fuzzy clustering
Institute of Scientific and Technical Information of China (English)
Yuangang TANG; Fuchun SUN; Zengqi SUN
2005-01-01
A neuro-fuzzy system model based on automatic fuzzy clustering is proposed.A hybrid model identification algorithm is also developed to decide the model structure and model parameters.The algorithm mainly includes three parts:1) Automatic fuzzy C-means (AFCM),which is applied to generate fuzzy rules automatically,and then fix on the size of the neuro-fuzzy network,by which the complexity of system design is reducesd greatly at the price of the fitting capability;2) Recursive least square estimation (RLSE).It is used to update the parameters of Takagi-Sugeno model,which is employed to describe the behavior of the system;3) Gradient descent algorithm is also proposed for the fuzzy values according to the back propagation algorithm of neural network.Finally,modeling the dynamical equation of the two-link manipulator with the proposed approach is illustrated to validate the feasibility of the method.
Accommodating error analysis in comparison and clustering of molecular fingerprints.
Salamon, H; Segal, M R; Ponce de Leon, A; Small, P M
1998-01-01
Molecular epidemiologic studies of infectious diseases rely on pathogen genotype comparisons, which usually yield patterns comprising sets of DNA fragments (DNA fingerprints). We use a highly developed genotyping system, IS6110-based restriction fragment length polymorphism analysis of Mycobacterium tuberculosis, to develop a computational method that automates comparison of large numbers of fingerprints. Because error in fragment length measurements is proportional to fragment length and is positively correlated for fragments within a lane, an align-and-count method that compensates for relative scaling of lanes reliably counts matching fragments between lanes. Results of a two-step method we developed to cluster identical fingerprints agree closely with 5 years of computer-assisted visual matching among 1,335 M. tuberculosis fingerprints. Fully documented and validated methods of automated comparison and clustering will greatly expand the scope of molecular epidemiology.
Accident patterns for construction-related workers: a cluster analysis
Liao, Chia-Wen; Tyan, Yaw-Yauan
2012-01-01
The construction industry has been identified as one of the most hazardous industries. The risk of constructionrelated workers is far greater than that in a manufacturing based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.
Cluster analysis in systems of magnetic spheres and cubes
Energy Technology Data Exchange (ETDEWEB)
Pyanzina, E.S., E-mail: elena.pyanzina@urfu.ru [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); Gudkova, A.V. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); Donaldson, J.G. [University of Vienna, Sensengasse 8, Vienna (Austria); Kantorovich, S.S. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); University of Vienna, Sensengasse 8, Vienna (Austria)
2017-06-01
In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube. - Highlights: • A comparison of the degree of self-assembly in systems of magnetic spheres and cubes. • Spheres are more likely to form larger clusters than cubes. • Differences in microstructure will manifest in the magnetic response of each system.
Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis
Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song
2018-01-01
To resolve the problem of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, Harris corner detection algorithm is used to extract the feature points of two images. Secondly, use Normalized Cross Correlation (NCC) function to perform the approximate matching of feature points, and the initial feature pair is obtained. Then, according to the parallax constraint condition, the initial feature pair is preprocessed by K-means clustering algorithm, which is used to remove the feature point pairs with obvious errors in the approximate matching process. Finally, adopt Random Sample Consensus (RANSAC) algorithm to optimize the feature points to obtain the final feature point matching result, and the fast and accurate image registration is realized. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of the image matching while ensuring the real-time performance of the algorithm.
Network clustering coefficient approach to DNA sequence analysis
Energy Technology Data Exchange (ETDEWEB)
Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br
2006-05-15
In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism genome through a triplet network. In this network, triplets in DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main bias: the guanine-cytosine (GC) content and 3-bp (base pairs) periodicity of DNA sequence. We perform the test constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in 3-bp periodicity neither in the variable GC content.
Multiscale visual quality assessment for cluster analysis with self-organizing maps
Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias
2011-01-01
Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality plays an important aspect, as for most practical data sets, typically many different clusterings are possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual cluster algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it on experimental data sets, and show its applicability.
Modelling clustering of vertically aligned carbon nanotube arrays.
Schaber, Clemens F; Filippov, Alexander E; Heinlein, Thorsten; Schneider, Jörg J; Gorb, Stanislav N
2015-08-06
Previous research demonstrated that arrays of vertically aligned carbon nanotubes (VACNTs) exhibit strong frictional properties. Experiments indicated a strong decrease of the friction coefficient from the first to the second sliding cycle in repetitive measurements on the same VACNT spot, but stable values in consecutive cycles. VACNTs form clusters under shear applied during friction tests, and self-organization stabilizes the mechanical properties of the arrays. With increasing load in the range between 300 µN and 4 mN applied normally to the array surface during friction tests the size of the clusters increases, while the coefficient of friction decreases. To better understand the experimentally obtained results, we formulated and numerically studied a minimalistic model, which reproduces the main features of the system with a minimum of adjustable parameters. We calculate the van der Waals forces between the spherical friction probe and bunches of the arrays using the well-known Morse potential function to predict the number of clusters, their size, instantaneous and mean friction forces and the behaviour of the VACNTs during consecutive sliding cycles and at different normal loads. The data obtained by the model calculations coincide very well with the experimental data and can help in adapting VACNT arrays for biomimetic applications.
A clustering analysis of lipoprotein diameters in the metabolic syndrome
Directory of Open Access Journals (Sweden)
Frazier-Wood Alexis C
2011-12-01
Full Text Available Abstract Background The presence of smaller low-density lipoproteins (LDL has been associated with atherosclerosis risk, and the insulin resistance (IR underlying the metabolic syndrome (MetS. In addition, some research has supported the association of very low-, low- and high-density lipoprotein (VLDL HDL particle diameters with components of the metabolic syndrome (MetS, although this has been the focus of less research. We aimed to explore the relationship of VLDL, LDL and HDL diameters to MetS and its features, and by clustering individuals by their diameters of VLDL, LDL and HDL particles, to capture information across all three fractions of lipoprotein into a unified phenotype. Methods We used nuclear magnetic resonance spectroscopy measurements on fasting plasma samples from a general population sample of 1,036 adults (mean ± SD, 48.8 ± 16.2 y of age. Using latent class analysis, the sample was grouped by the diameter of their fasting lipoproteins, and mixed effects models tested whether the distribution of MetS components varied across the groups. Results Eight discrete groups were identified. Two groups (N = 251 were enriched with individuals meeting criteria for the MetS, and were characterized by the smallest LDL/HDL diameters. One of those two groups, one was additionally distinguished by large VLDL, and had significantly higher blood pressure, fasting glucose, triglycerides, and waist circumference (WC; P Conclusions While small LDL diameters remain associated with IR and the MetS, the occurrence of these in conjunction with a shift to overall larger VLDL diameter may identify those with the highest fasting glucose, TG and WC within the MetS. If replicated, the association of this phenotype with more severe IR-features indicated that it may contribute to identifying of those most at risk for incident type II diabetes and cardiometabolic disease.
Steady state subchannel analysis of AHWR fuel cluster
International Nuclear Information System (INIS)
Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.
2006-09-01
Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)
Hyperon-nucleon interaction in the quark cluster model
International Nuclear Information System (INIS)
Straub, U.; Zhang Zongye; Braeuer, K.; Faessler, A.; Khadkikar, S.B.; Luebeck, G.
1988-01-01
The lambda-nucleon and sigma-nucleon interaction is described in the nonrelativistic quark cluster model. The SU(3) flavor symmetry breaking due to the different quark masses is taken into account, i.e. different wavefunctions for the light (up, down) and heavy (strange) quarks are used in flavor and orbital space. The six-quark wavefunction is fully antisymmetrized. The model hamiltonian contains gluon exchange, pseudoscalar meson exchange and a phenomenological σ-meson exchange. The six-quark scattering problem is solved within the resonating group method. The experimental lambda-nucleon and sigma-nucleon cross sections are well reproduced. (orig.)
Franke, R.
2016-11-01
In many networks discovered in biology, medicine, neuroscience and other disciplines special properties like a certain degree distribution and hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy and interactions on a mesoscale. It is not trivial choosing an appropriate detection algorithm because there are multiple network, cluster and algorithmic properties to be considered. Edges can be weighted and/or directed, clusters overlap or build a hierarchy in several ways. Algorithms differ not only in runtime, memory requirements but also in allowed network and cluster properties. They are based on a specific definition of what a cluster is, too. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. But what is left is a precise sandbox model to build hierarchical, overlapping and directed clusters for undirected or directed, binary or weighted complex random networks on basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis) which will be introduced and described here for the first time.
CLUSTERING ANALYSIS OF OFFICER'S BEHAVIOURS IN LONDON POLICE FOOT PATROL ACTIVITIES
Directory of Open Access Journals (Sweden)
J. Shen
2015-07-01
Full Text Available In this small paper we aim at presenting a framework of conceptual representation and clustering analysis of police officers’ patrol pattern obtained from mining their raw movement trajectory data. This have been achieved by a model developed to accounts for the spatio-temporal dynamics human movements by incorporating both the behaviour features of the travellers and the semantic meaning of the environment they are moving in. Hence, the similarity metric of traveller behaviours is jointly defined according to the stay time allocation in each Spatio-temporal region of interests (ST-ROI to support clustering analysis of patrol behaviours. The proposed framework enables the analysis of behaviour and preferences on higher level based on raw moment trajectories. The model is firstly applied to police patrol data provided by the Metropolitan Police and will be tested by other type of dataset afterwards.
NUCORE - A system for nuclear structure calculations with cluster-core models
International Nuclear Information System (INIS)
Heras, C.A.; Abecasis, S.M.
1982-01-01
Calculation of nuclear energy levels and their electromagnetic properties, modelling the nucleus as a cluster of a few particles and/or holes interacting with a core which in turn is modelled as a quadrupole vibrator (cluster-phonon model). The members of the cluster interact via quadrupole-quadrupole and pairing forces. (orig.)
Semi-Supervised Generation with Cluster-aware Generative Models
DEFF Research Database (Denmark)
Maaløe, Lars; Fraccaro, Marco; Winther, Ole
2017-01-01
Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real life data sets contain a small amount of labelled data points, that are typically disregarded when training generative models. We propose the Clust...... a log-likelihood of −79.38 nats on permutation invariant MNIST, while also achieving competitive semi-supervised classification accuracies. The model can also be trained fully unsupervised, and still improve the log-likelihood performance with respect to related methods.......Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real life data sets contain a small amount of labelled data points, that are typically disregarded when training generative models. We propose the Cluster...
Fine‐Grained Mobile Application Clustering Model Using Retrofitted Document Embedding
Directory of Open Access Journals (Sweden)
Yeo‐Chan Yoon
2017-08-01
Full Text Available In this paper, we propose a fine‐grained mobile application clustering model using retrofitted document embedding. To automatically determine the clusters and their numbers with no predefined categories, the proposed model initializes the clusters based on title keywords and then merges similar clusters. For improved clustering performance, the proposed model distinguishes between an accurate clustering step with titles and an expansive clustering step with descriptions. During the accurate clustering step, an automatically tagged set is constructed as a result. This set is utilized to learn a high‐performance document vector. During the expansive clustering step, more applications are then classified using this document vector. Experimental results showed that the purity of the proposed model increased by 0.19, and the entropy decreased by 1.18, compared with the K‐means algorithm. In addition, the mean average precision improved by more than 0.09 in a comparison with a support vector machine classifier.
Performance Evaluation of Hadoop-based Large-scale Network Traffic Analysis Cluster
Directory of Open Access Journals (Sweden)
Tao Ran
2016-01-01
Full Text Available As Hadoop has gained popularity in big data era, it is widely used in various fields. The self-design and self-developed large-scale network traffic analysis cluster works well based on Hadoop, with off-line applications running on it to analyze the massive network traffic data. On purpose of scientifically and reasonably evaluating the performance of analysis cluster, we propose a performance evaluation system. Firstly, we set the execution times of three benchmark applications as the benchmark of the performance, and pick 40 metrics of customized statistical resource data. Then we identify the relationship between the resource data and the execution times by a statistic modeling analysis approach, which is composed of principal component analysis and multiple linear regression. After training models by historical data, we can predict the execution times by current resource data. Finally, we evaluate the performance of analysis cluster by the validated predicting of execution times. Experimental results show that the predicted execution times by trained models are within acceptable error range, and the evaluation results of performance are accurate and reliable.
Directory of Open Access Journals (Sweden)
Tariq Ahmad
Full Text Available Classification of acute decompensated heart failure (ADHF is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies.To derive cluster analysis-based groupings for patients hospitalized with ADHF, and compare their prognostic performance to hemodynamic classifications derived at the bedside.We performed a cluster analysis on baseline clinical variables and PAC measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry. We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data.We identified four advanced HF clusters: 1 male Caucasians with ischemic cardiomyopathy, multiple comorbidities, lowest B-type natriuretic peptide (BNP levels; 2 females with non-ischemic cardiomyopathy, few comorbidities, most favorable hemodynamics; 3 young African American males with non-ischemic cardiomyopathy, most adverse hemodynamics, advanced disease; and 4 older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70. For all adverse clinical outcomes, Cluster 4 had the highest risk, and Cluster 2, the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not.By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernable relationship to hemodynamic profiles, but distinct associations with adverse outcomes. Our analysis suggests that ADHF classification using
Mathematical model on malicious attacks in a mobile wireless network with clustering
International Nuclear Information System (INIS)
Haldar, Kaushik; Mishra, Bimal Kumar
2015-01-01
A mathematical model has been formulated for the analysis of a wireless epidemic on a clustered heterogeneous network. The model introduces mobility into the epidemic framework assuming that the component nodes have a tendency to be attached with a frequently visited home cluster. This underlines the inherent regularity in the mobility pattern of mobile nodes in a wireless network. The analysis focuses primarily on features that arise because of the mobility considerations compared in the larger scenario formed by the epidemic aspects. A result on the invariance of the home cluster populations with respect to time provides an important view-point of the long-term behavior of the system. The analysis also focuses on obtaining a basic threshold condition that guides the epidemic behavior of the system. Analytical as well as numerical results have also been obtained to establish the asymptotic behavior of the connected components of the network, and that of the whole network when the underlying graph turns out to be irreducible. Applications to proximity based attacks and to scenarios with high cluster density have also been outlined
Analysis of Learning Development With Sugeno Fuzzy Logic And Clustering
Directory of Open Access Journals (Sweden)
Maulana Erwin Saputra
2017-06-01
Full Text Available In the first journal, I made this attempt to analyze things that affect the achievement of students in each school of course vary. Because students are one of the goals of achieving the goals of successful educational organizations. The mental influence of students’ emotions and behaviors themselves in relation to learning performance. Fuzzy logic can be used in various fields as well as Clustering for grouping, as in Learning Development analyzes. The process will be performed on students based on the symptoms that exist. In this research will use fuzzy logic and clustering. Fuzzy is an uncertain logic but its excess is capable in the process of language reasoning so that in its design is not required complicated mathematical equations. However Clustering method is K-Means method is method where data analysis is broken down by group k (k = 1,2,3, .. k. To know the optimal number of Performance group. The results of the research is with a questionnaire entered into matlab will produce a value that means in generating the graph. And simplify the school in seeing Student performance in the learning process by using certain criteria. So from the system that obtained the results for a decision-making required by the school.
IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.
Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing
2016-01-01
Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness in the traditional methods is that they neglect the heterogeneity of genes expressions in samples which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severities of disease and provide expression profiles of gene sets for individuals. We developed an application software called IGSA that leverages a powerful analytical capacity in gene sets enrichment and samples clustering. IGSA calculates gene sets expression scores for each sample and takes an accumulating clustering strategy to let the samples gather into the set according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets for the performance of IGSA. We also compared the results of IGSA in KEGG pathways enrichment with David, GSEA, SPIA, ssGSEA and analyzed the results of IGSA clustering and different similarity measurement methods. Notably, IGSA is proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides with significant gene sets profile for each sample.
Segmentation of Residential Gas Consumers Using Clustering Analysis
Directory of Open Access Journals (Sweden)
Marta P. Fernandes
2017-12-01
Full Text Available The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.
Evolutionary-Hierarchical Bases of the Formation of Cluster Model of Innovation Economic Development
Directory of Open Access Journals (Sweden)
Yuliya Vladimirovna Dubrovskaya
2016-10-01
Full Text Available The functioning of a modern economic system is based on the interaction of objects of different hierarchical levels. Thus, the problem of the study of innovation processes taking into account the mutual influence of the activities of these economic actors becomes important. The paper dwells evolutionary basis for the formation of models of innovation development on the basis of micro and macroeconomic analysis. Most of the concepts recognized that despite a big number of diverse models, the coordination of the relations between economic agents is of crucial importance for the successful innovation development. According to the results of the evolutionary-hierarchical analysis, the authors reveal key phases of the development of forms of business cooperation, science and government in the domestic economy. It has become the starting point of the conception of the characteristics of the interaction in the cluster models of innovation development of the economy. Considerable expectancies on improvement of the national innovative system are connected with the development of cluster and network structures. The main objective of government authorities is the formation of mechanisms and institutions that will foster cooperation between members of the clusters. The article explains that the clusters cannot become the factors in the growth of the national economy, not being an effective tool for interaction between the actors of the regional innovative systems.
Number of Clusters and the Quality of Hybrid Predictive Models in Analytical CRM
Directory of Open Access Journals (Sweden)
Łapczyński Mariusz
2014-08-01
Full Text Available Making more accurate marketing decisions by managers requires building effective predictive models. Typically, these models specify the probability of customer belonging to a particular category, group or segment. The analytical CRM categories refer to customers interested in starting cooperation with the company (acquisition models, customers who purchase additional products (cross- and up-sell models or customers intending to resign from the cooperation (churn models. During building predictive models researchers use analytical tools from various disciplines with an emphasis on their best performance. This article attempts to build a hybrid predictive model combining decision trees (C&RT algorithm and cluster analysis (k-means. During experiments five different cluster validity indices and eight datasets were used. The performance of models was evaluated by using popular measures such as: accuracy, precision, recall, G-mean, F-measure and lift in the first and in the second decile. The authors tried to find a connection between the number of clusters and models' quality.
A Global Model for Circumgalactic and Cluster-core Precipitation
Voit, G. Mark; Meece, Greg; Li, Yuan; O'Shea, Brian W.; Bryan, Greg L.; Donahue, Megan
2017-08-01
We provide an analytic framework for interpreting observations of multiphase circumgalactic gas that is heavily informed by recent numerical simulations of thermal instability and precipitation in cool-core galaxy clusters. We start by considering the local conditions required for the formation of multiphase gas via two different modes: (1) uplift of ambient gas by galactic outflows, and (2) condensation in a stratified stationary medium in which thermal balance is explicitly maintained. Analytic exploration of these two modes provides insights into the relationships between the local ratio of the cooling and freefall timescales (I.e., {t}{cool}/{t}{ff}), the large-scale gradient of specific entropy, and the development of precipitation and multiphase media in circumgalactic gas. We then use these analytic findings to interpret recent simulations of circumgalactic gas in which global thermal balance is maintained. We show that long-lasting configurations of gas with 5≲ \\min ({t}{cool}/{t}{ff})≲ 20 and radial entropy profiles similar to observations of cool cores in galaxy clusters are a natural outcome of precipitation-regulated feedback. We conclude with some observational predictions that follow from these models. This work focuses primarily on precipitation and AGN feedback in galaxy-cluster cores, because that is where the observations of multiphase gas around galaxies are most complete. However, many of the physical principles that govern condensation in those environments apply to circumgalactic gas around galaxies of all masses.
Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters
Muraoka, Masae; Okuda, Hiroshi
With the rapid growth of WAN infrastructure and development of Grid middleware, it's become a realistic and attractive methodology to connect cluster machines on wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA) based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Thorough these experiments, by comparing the performances and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.
Organizational Model of the Southern Asia Cluster Family Businesses
Directory of Open Access Journals (Sweden)
Vipin Gupta
2013-07-01
Full Text Available Recently, there has been an increased interest in the family business organization. Traditionally, the ideal typical organizational model was one where the management, governance, and ownership entities are kept separate. This principal agent model has been a subject of public debate in the wake of several corporate scandals. In the family business organization, significant management, governance and ownership is often with the members of a family & its trusted partners. It is common in the US to regulate the management, governance, and ownership roles of the family members by using competitive criteria for the involvement of different members. In Southern Asia cluster (Gupta & Hanges, 2004, on the other hand, it is quite common for the family involvement to be holistic and undivided, where the family collectively owns the shares in the family business. In this work, this organizational model of the Southern Asian family businesses is investigated. Keywords: Southern Asia, family business, organizational model
Cluster analysis in systems of magnetic spheres and cubes
Pyanzina, E. S.; Gudkova, A. V.; Donaldson, J. G.; Kantorovich, S. S.
2017-06-01
In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube.
Hassan, Kazi; Allen, Deonie; Haynes, Heather
2016-04-01
. Results illustrate that clustered flood events generated sediment loads up to an order of magnitude greater than that of individual events of the same flood volume. Correlations were significant for sediment volume compared to both maximum flow discharge (R2<0.8) and number of events (R2 -0.5 to -0.7) within the cluster. The strongest correlations occurred for clusters with a greater number of flow events only slightly above-threshold. This illustrates that the numerical model can capture a degree of the non-linear morphological response to flow magnitude. Analysis of the relationship between morphological change and the skewness of flow events within each cluster was also determined, illustrating only minor sensitivity to cluster peak distribution skewness. This is surprising and discussion is presented on model limitations, including the capability of sediment transport formulae to effectively account for temporal processes of antecedent flow, hysteresis, local supply etc.
A cluster analysis on road traffic accidents using genetic algorithms
Saharan, Sabariah; Baragona, Roberto
2017-04-01
The analysis of traffic road accidents is increasingly important because of the accidents cost and public road safety. The availability or large data sets makes the study of factors that affect the frequency and severity accidents are viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of the road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009 with a total of 26440 accidents. The data is in a binary set and there are 50 factors road traffic accidents with four level of severity. We used genetic algorithm for the analysis because we are in the presence of a large unbalanced data set and standard clustering like k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K, (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accidents severity level and suggest that the two main factors that contributes to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and the independent component analysis is performed to validate the results.
Multiscale deep drawing analysis of dual-phase steels using grain cluster-based RGC scheme
International Nuclear Information System (INIS)
Tjahjanto, D D; Eisenlohr, P; Roters, F
2015-01-01
Multiscale modelling and simulation play an important role in sheet metal forming analysis, since the overall material responses at macroscopic engineering scales, e.g. formability and anisotropy, are strongly influenced by microstructural properties, such as grain size and crystal orientations (texture). In the present report, multiscale analysis on deep drawing of dual-phase steels is performed using an efficient grain cluster-based homogenization scheme.The homogenization scheme, called relaxed grain cluster (RGC), is based on a generalization of the grain cluster concept, where a (representative) volume element consists of p × q × r (hexahedral) grains. In this scheme, variation of the strain or deformation of individual grains is taken into account through the, so-called, interface relaxation, which is formulated within an energy minimization framework. An interfacial penalty term is introduced into the energy minimization framework in order to account for the effects of grain boundaries.The grain cluster-based homogenization scheme has been implemented and incorporated into the advanced material simulation platform DAMASK, which purposes to bridge the macroscale boundary value problems associated with deep drawing analysis to the micromechanical constitutive law, e.g. crystal plasticity model. Standard Lankford anisotropy tests are performed to validate the model parameters prior to the deep drawing analysis. Model predictions for the deep drawing simulations are analyzed and compared to the corresponding experimental data. The result shows that the predictions of the model are in a very good agreement with the experimental measurement. (paper)
A first packet processing subdomain cluster model based on SDN
Chen, Mingyong; Wu, Weimin
2017-08-01
For the current controller cluster packet processing performance bottlenecks and controller downtime problems. An SDN controller is proposed to allocate the priority of each device in the SDN (Software Defined Network) network, and the domain contains several network devices and Controller, the controller is responsible for managing the network equipment within the domain, the switch performs data delivery based on the load of the controller, processing network equipment data. The experimental results show that the model can effectively solve the risk of single point failure of the controller, and can solve the performance bottleneck of the first packet processing.
The random cluster model and a new integration identity
International Nuclear Information System (INIS)
Chen, L C; Wu, F Y
2005-01-01
We evaluate the free energy of the random cluster model at its critical point for 0 -1 (√q/2) is a rational number. As a by-product, our consideration leads to a closed-form evaluation of the integral 1/(4π 2 ) ∫ 0 2π dΘ ∫ 0 2π dΦ ln[A+B+C - AcosΘ - BcosΦ - Ccos(Θ+Φ)] = -ln(2S) + (2/π)[Ti 2 (AS) + Ti 2 (BS) + Ti 2 (CS)], which arises in lattice statistics, where A, B, C ≥ 0 and S=1/√(AB + BC + CA)
Wilderjans, Tom F; Ceulemans, Eva; Van Mechelen, Iven; Depril, Dirk
2011-03-01
In many areas of psychology, one is interested in disclosing the underlying structural mechanisms that generated an object by variable data set. Often, based on theoretical or empirical arguments, it may be expected that these underlying mechanisms imply that the objects are grouped into clusters that are allowed to overlap (i.e., an object may belong to more than one cluster). In such cases, analyzing the data with Mirkin's additive profile clustering model may be appropriate. In this model: (1) each object may belong to no, one or several clusters, (2) there is a specific variable profile associated with each cluster, and (3) the scores of the objects on the variables can be reconstructed by adding the cluster-specific variable profiles of the clusters the object in question belongs to. Until now, however, no software program has been publicly available to perform an additive profile clustering analysis. For this purpose, in this article, the ADPROCLUS program, steered by a graphical user interface, is presented. We further illustrate its use by means of the analysis of a patient by symptom data matrix.
Shen, Chung-Wei; Chen, Yi-Hau
2018-03-13
We propose a model selection criterion for semiparametric marginal mean regression based on generalized estimating equations. The work is motivated by a longitudinal study on the physical frailty outcome in the elderly, where the cluster size, that is, the number of the observed outcomes in each subject, is "informative" in the sense that it is related to the frailty outcome itself. The new proposal, called Resampling Cluster Information Criterion (RCIC), is based on the resampling idea utilized in the within-cluster resampling method (Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134) and accommodates informative cluster size. The implementation of RCIC, however, is free of performing actual resampling of the data and hence is computationally convenient. Compared with the existing model selection methods for marginal mean regression, the RCIC method incorporates an additional component accounting for variability of the model over within-cluster subsampling, and leads to remarkable improvements in selecting the correct model, regardless of whether the cluster size is informative or not. Applying the RCIC method to the longitudinal frailty study, we identify being female, old age, low income and life satisfaction, and chronic health conditions as significant risk factors for physical frailty in the elderly. © 2018, The International Biometric Society.
Modified genetic algorithms to model cluster structures in medium-size silicon clusters
International Nuclear Information System (INIS)
Bazterra, Victor E.; Ona, Ofelia; Caputo, Maria C.; Ferraro, Marta B.; Fuentealba, Patricio; Facelli, Julio C.
2004-01-01
This paper presents the results obtained using a genetic algorithm (GA) to search for stable structures of medium size silicon clusters. In this work the GA uses a semiempirical energy function to find the best cluster structures, which are further optimized using density-functional theory. For small clusters our results agree well with previously reported structures, but for larger ones different structures appear. This is the case of Si 36 where we report a different structure, with significant lower energy than those previously found using limited search approaches on common structural motifs. This demonstrates the need for global optimization schemes when searching for stable structures of medium-size silicon clusters
A mixture model-based approach to the clustering of microarray expression data.
McLachlan, G J; Bean, R W; Peel, D
2002-03-01
This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
International Nuclear Information System (INIS)
Zeng, J.; Li, G.; Sun, J.
2013-01-01
Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour which processed by drying milling were determined. The results showed that the chemical compositions and physicochemical properties were significantly different among twenty six corn varieties. The quality of corn flour was concerned with five principal components from principal component analysis and the contribution rate of starch pasting properties was important, which could account for 48.90%. Twenty six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)
Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat
2014-07-01
Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.
[Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].
Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés
2018-03-06
To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following four clusters stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document (2, 4 and 5). Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL
2006-01-01
The Flocking model, first proposed by Craig Reynolds, is one of the first bio-inspired computational collective behavior models that has many popular applications, such as animation. Our early research has resulted in a flock clustering algorithm that can achieve better performance than the Kmeans or the Ant clustering algorithms for data clustering. This algorithm generates a clustering of a given set of data through the embedding of the highdimensional data items on a two-dimensional grid for efficient clustering result retrieval and visualization. In this paper, we propose a bio-inspired clustering model, the Multiple Species Flocking clustering model (MSF), and present a distributed multi-agent MSF approach for document clustering.
MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS
Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...
Shape Analysis of HII Regions - I. Statistical Clustering
Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred
2018-04-01
We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation is linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordinance technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.
Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method
DEFF Research Database (Denmark)
Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels
2014-01-01
The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probabi......The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models...... the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace...... method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-04-01
Synchronous fluorescence spectra, combined with multivariate analysis were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral intervals selection method called clustering based partial least square (CL-PLS), which selected informative wavelengths by combining clustering concept and partial least square (PLS) methods to improve models’ performance by synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained and k-means and kohonen-self organizing map clustering algorithms were carried out to cluster full spectra into several clusters, and sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. Correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least square (VIP-PLS), selectivity ratio partial least square (SR-PLS), interval partial least square (iPLS) models and full spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.
Tappi, D.
2005-01-01
Over recent decades, clusters like industrial districts have increasingly attracted attention in economic debate. The study of clusters, particularly in the Italian literature, highlights the inadequacy of the mainstream body of explanation to provide a theory of the emergence and transformation
Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.
Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D
2017-06-01
Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. To identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement fraction of exhaled nitric oxide. Blood eosinophilic count, serum total IgE and periostin levels were determined. Two-step cluster approach, hierarchical clustering method and k-mean analysis were used for identification of the clusters. We have identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function, Cluster 2 (n=13) - late-onset, atopic asthma, Cluster 3 (n=6) - late-onset, aspirin sensitivity, eosinophilic asthma, and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with greatest force for differentiation in our study were: age of asthma onset, duration of diseases, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drugs hypersensitivity, baseline FEV1/FVC and symptoms severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be an useful tool for phenotyping of disease and personalized approach to the treatment of patients.
International Nuclear Information System (INIS)
Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.
2014-01-01
For the improvement of qualitative and quantitative traits, existence of variability has prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed for correlation,agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for future breeding program. Correlation analysis revealed significant positive association between yield and yield components like fruit diameter, single fruit weight and number of fruits plant-1. Principal component (PC) analysis depicted first three PCs with Eigen-value higher than 1 contributing 81.72% of total variability for different traits. The PC-I showed positive factor loadings for all the traits except number of fruits plant-1. The contribution of single fruit weight and fruit diameter was highest in PC-1. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistics confirmed highest distance between cluster- III and cluster-V while maximum similarity was observed in cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programme. (author)
Energy Technology Data Exchange (ETDEWEB)
Liu, Z.; Bessa, M. A.; Liu, W.K.
2017-10-25
A predictive computational theory is shown for modeling complex, hierarchical materials ranging from metal alloys to polymer nanocomposites. The theory can capture complex mechanisms such as plasticity and failure that span across multiple length scales. This general multiscale material modeling theory relies on sound principles of mathematics and mechanics, and a cutting-edge reduced order modeling method named self-consistent clustering analysis (SCA) [Zeliang Liu, M.A. Bessa, Wing Kam Liu, “Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials,” Comput. Methods Appl. Mech. Engrg. 306 (2016) 319–341]. SCA reduces by several orders of magnitude the computational cost of micromechanical and concurrent multiscale simulations, while retaining the microstructure information. This remarkable increase in efficiency is achieved with a data-driven clustering method. Computationally expensive operations are performed in the so-called offline stage, where degrees of freedom (DOFs) are agglomerated into clusters. The interaction tensor of these clusters is computed. In the online or predictive stage, the Lippmann-Schwinger integral equation is solved cluster-wise using a self-consistent scheme to ensure solution accuracy and avoid path dependence. To construct a concurrent multiscale model, this scheme is applied at each material point in a macroscale structure, replacing a conventional constitutive model with the average response computed from the microscale model using just the SCA online stage. A regularized damage theory is incorporated in the microscale that avoids the mesh and RVE size dependence that commonly plagues microscale damage calculations. The SCA method is illustrated with two cases: a carbon fiber reinforced polymer (CFRP) structure with the concurrent multiscale model and an application to fatigue prediction for additively manufactured metals. For the CFRP problem, a speed up estimated to be about
Subgrid Modeling of AGN-driven Turbulence in Galaxy Clusters
Scannapieco, Evan; Brüggen, Marcus
2008-10-01
Hot, underdense bubbles powered by active galactic nuclei (AGNs) are likely to play a key role in halting catastrophic cooling in the centers of cool-core galaxy clusters. We present three-dimensional simulations that capture the evolution of such bubbles, using an adaptive mesh hydrodynamic code, FLASH3, to which we have added a subgrid model of turbulence and mixing. While pure hydro simulations indicate that AGN bubbles are disrupted into resolution-dependent pockets of underdense gas, proper modeling of subgrid turbulence indicates that this is a poor approximation to a turbulent cascade that continues far beyond the resolution limit. Instead, Rayleigh-Taylor instabilities act to effectively mix the heated region with its surroundings, while at the same time preserving it as a coherent structure, consistent with observations. Thus, bubbles are transformed into hot clouds of mixed material as they move outward in the hydrostatic intracluster medium (ICM), much as large airbursts lead to a distinctive "mushroom cloud" structure as they rise in the hydrostatic atmosphere of Earth. Properly capturing the evolution of such clouds has important implications for many ICM properties. In particular, it significantly changes the impact of AGNs on the distribution of entropy and metals in cool-core clusters such as Perseus.
Sparsity enabled cluster reduced-order models for control
Kaiser, Eurika; Morzyński, Marek; Daviller, Guillaume; Kutz, J. Nathan; Brunton, Bingni W.; Brunton, Steven L.
2018-01-01
Characterizing and controlling nonlinear, multi-scale phenomena are central goals in science and engineering. Cluster-based reduced-order modeling (CROM) was introduced to exploit the underlying low-dimensional dynamics of complex systems. CROM builds a data-driven discretization of the Perron-Frobenius operator, resulting in a probabilistic model for ensembles of trajectories. A key advantage of CROM is that it embeds nonlinear dynamics in a linear framework, which enables the application of standard linear techniques to the nonlinear system. CROM is typically computed on high-dimensional data; however, access to and computations on this full-state data limit the online implementation of CROM for prediction and control. Here, we address this key challenge by identifying a small subset of critical measurements to learn an efficient CROM, referred to as sparsity-enabled CROM. In particular, we leverage compressive measurements to faithfully embed the cluster geometry and preserve the probabilistic dynamics. Further, we show how to identify fewer optimized sensor locations tailored to a specific problem that outperform random measurements. Both of these sparsity-enabled sensing strategies significantly reduce the burden of data acquisition and processing for low-latency in-time estimation and control. We illustrate this unsupervised learning approach on three different high-dimensional nonlinear dynamical systems from fluids with increasing complexity, with one application in flow control. Sparsity-enabled CROM is a critical facilitator for real-time implementation on high-dimensional systems where full-state information may be inaccessible.
Nguyen, Hien D; Ullmann, Jeremy F P; McLachlan, Geoffrey J; Voleti, Venkatakaushik; Li, Wenze; Hillman, Elizabeth M C; Reutens, David C; Janke, Andrew L
2018-02-01
Calcium is a ubiquitous messenger in neural signaling events. An increasing number of techniques are enabling visualization of neurological activity in animal models via luminescent proteins that bind to calcium ions. These techniques generate large volumes of spatially correlated time series. A model-based functional data analysis methodology via Gaussian mixtures is suggested for the clustering of data from such visualizations is proposed. The methodology is theoretically justified and a computationally efficient approach to estimation is suggested. An example analysis of a zebrafish imaging experiment is presented.
Sensitization trajectories in childhood revealed by using a cluster analysis
DEFF Research Database (Denmark)
Schoos, Ann-Marie M.; Chawes, Bo L.; Melen, Erik
2017-01-01
Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent......BACKGROUND: Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more...... biologically and clinically relevant. OBJECTIVE: We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. METHODS: We investigated 398 children from the at-risk Copenhagen...
A Variational Level Set Model Combined with FCMS for Image Clustering Segmentation
Directory of Open Access Journals (Sweden)
Liming Tang
2014-01-01
Full Text Available The fuzzy C means clustering algorithm with spatial constraint (FCMS is effective for image segmentation. However, it lacks essential smoothing constraints to the cluster boundaries and enough robustness to the noise. Samson et al. proposed a variational level set model for image clustering segmentation, which can get the smooth cluster boundaries and closed cluster regions due to the use of level set scheme. However it is very sensitive to the noise since it is actually a hard C means clustering model. In this paper, based on Samson’s work, we propose a new variational level set model combined with FCMS for image clustering segmentation. Compared with FCMS clustering, the proposed model can get smooth cluster boundaries and closed cluster regions due to the use of level set scheme. In addition, a block-based energy is incorporated into the energy functional, which enables the proposed model to be more robust to the noise than FCMS clustering and Samson’s model. Some experiments on the synthetic and real images are performed to assess the performance of the proposed model. Compared with some classical image segmentation models, the proposed model has a better performance for the images contaminated by different noise levels.
Integrating PROOF Analysis in Cloud and Batch Clusters
International Nuclear Information System (INIS)
Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto
2012-01-01
High Energy Physics (HEP) analysis are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use-case for cloud computing. In this work we take profit from Cloud Infrastructure at IFCA in order to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analysis and offers a transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.
Determining wood chip size: image analysis and clustering methods
Directory of Open Access Journals (Sweden)
Paolo Febbi
2013-09-01
Full Text Available One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149- 1:2010. Recent literature demonstrated how image analysis could return highly accurate measure of the dimensions defined for each individual particle, and could promote a new method depending on the geometrical shape to determine the chip size in a more accurate way. A sample of wood chips (8 litres was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm; the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors and size descriptors (area, perimeter, Feret diameters, eccentricity was applied to observe the chips distribution. The UPGMA algorithm was applied on Euclidean distance. The obtained dendrogram shows a group separation according with the original three sieving fractions. A comparison has been made between the traditional sieve and clustering results. This preliminary result shows how the image analysis-based method has a high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chips size characterization. An improvement of the results is expected by using supervised multivariate methods that utilize known class memberships. The main objective of the future activities will be to shift the analysis from a 2-dimensional method to a 3- dimensional acquisition process.
Testing the Bose-Einstein Condensate dark matter model at galactic cluster scale
International Nuclear Information System (INIS)
Harko, Tiberiu; Liang, Pengxiang; Liang, Shi-Dong; Mocanu, Gabriela
2015-01-01
The possibility that dark matter may be in the form of a Bose-Einstein Condensate (BEC) has been extensively explored at galactic scale. In particular, good fits for the galactic rotations curves have been obtained, and upper limits for the dark matter particle mass and scattering length have been estimated. In the present paper we extend the investigation of the properties of the BEC dark matter to the galactic cluster scale, involving dark matter dominated astrophysical systems formed of thousands of galaxies each. By considering that one of the major components of a galactic cluster, the intra-cluster hot gas, is described by King's β-model, and that both intra-cluster gas and dark matter are in hydrostatic equilibrium, bound by the same total mass profile, we derive the mass and density profiles of the BEC dark matter. In our analysis we consider several theoretical models, corresponding to isothermal hot gas and zero temperature BEC dark matter, non-isothermal gas and zero temperature dark matter, and isothermal gas and finite temperature BEC, respectively. The properties of the finite temperature BEC dark matter cluster are investigated in detail numerically. We compare our theoretical results with the observational data of 106 galactic clusters. Using a least-squares fitting, as well as the observational results for the dark matter self-interaction cross section, we obtain some upper bounds for the mass and scattering length of the dark matter particle. Our results suggest that the mass of the dark matter particle is of the order of μ eV, while the scattering length has values in the range of 10 −7 fm
Stochastic cluster algorithms for discrete Gaussian (SOS) models
International Nuclear Information System (INIS)
Evertz, H.G.; Hamburg Univ.; Hasenbusch, M.; Marcu, M.; Tel Aviv Univ.; Pinn, K.; Muenster Univ.; Solomon, S.
1990-10-01
We present new Monte Carlo cluster algorithms which eliminate critical slowing down in the simulation of solid-on-solid models. In this letter we focus on the two-dimensional discrete Gaussian model. The algorithms are based on reflecting the integer valued spin variables with respect to appropriately chosen reflection planes. The proper choice of the reflection plane turns out to be crucial in order to obtain a small dynamical exponent z. Actually, the successful versions of our algorithm are a mixture of two different procedures for choosing the reflection plane, one of them ergodic but slow, the other one non-ergodic and also slow when combined with a Metropolis algorithm. (orig.)
Cluster analysis of received constellations for optical performance monitoring
van Weerdenburg, J.J.A.; van Uden, R.; Sillekens, E.; de Waardt, H.; Koonen, A.M.J.; Okonkwo, C.
2016-01-01
Performance monitoring based on centroid clustering to investigate constellation generation offsets. The tool allows flexibility in constellation generation tolerances by forwarding centroids to the demapper. The relation of fibre nonlinearities and singular value decomposition of intra-cluster
DEFF Research Database (Denmark)
Pedersen, Anders Bro; Aabrandt, Andreas; Østergaard, Jacob
2014-01-01
In order to provide a vehicle fleet that realistically represents the predicted Electric Vehicle (EV) penetration for the future, a model is required that mimics people driving behaviour rather than simply playing back collected data. When the focus is broadened from on a traditional user...... scales, which calls for a statistically correct, yet flexible model. This paper describes a method for modelling EV, based on non-categorized data, which takes into account the plug in locations of the vehicles. By using clustering analysis to extrapolate and classify the primary locations where...
The composite sequential clustering technique for analysis of multispectral scanner data
Su, M. Y.
1972-01-01
The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.
Directory of Open Access Journals (Sweden)
M. Preethi Shree
2018-04-01
Full Text Available Genetic diversity analysis was conducted for biometric attributes in 20 progenies of Neolamarckia cadamba. The application of D2 clustering technique in Neolamarckia cadamba genetic resources resolved the 20 progenies into five clusters. The maximum intra cluster distance was shown by the cluster II. The maximum inter cluster distance was recorded between cluster III and V which indicated the presence of wider genetic distance between Neolamarckia cadamba progenies. Among the growth attributes, volume (36.84 % contributed maximum towards genetic divergence followed by bole height, basal diameter, tree height, number of branches in Neolamarckia cadamba progenies.
QTL global meta-analysis: are trait determining genes clustered?
Directory of Open Access Journals (Sweden)
Adelson David L
2009-04-01
Full Text Available Abstract Background A key open question in biology is if genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL where a QTL region could contain a number of genes that contribute to the trait being measured. Results We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO with relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over represented in QTL regions. Conclusion Our analysis provides evidence of over represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that from a biological standpoint, makes sense with respect to QTL category type.
Alpha-cluster preformation factor within cluster-formation model for odd-A and odd-odd heavy nuclei
Saleh Ahmed, Saad M.
2017-06-01
The alpha-cluster probability that represents the preformation of alpha particle in alpha-decay nuclei was determined for high-intensity alpha-decay mode odd-A and odd-odd heavy nuclei, 82 CSR) and the hypothesised cluster-formation model (CFM) as in our previous work. Our previous successful determination of phenomenological values of alpha-cluster preformation factors for even-even nuclei motivated us to expand the work to cover other types of nuclei. The formation energy of interior alpha cluster needed to be derived for the different nuclear systems with considering the unpaired-nucleon effect. The results showed the phenomenological value of alpha preformation probability and reflected the unpaired nucleon effect and the magic and sub-magic effects in nuclei. These results and their analyses presented are very useful for future work concerning the calculation of the alpha decay constants and the progress of its theory.
Maximum-entropy clustering algorithm and its global convergence analysis
Institute of Scientific and Technical Information of China (English)
无
2001-01-01
Constructing a batch of differentiable entropy functions touniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
Hybrid Tracking Algorithm Improvements and Cluster Analysis Methods.
1982-02-26
UPGMA ), and Ward’s method. Ling’s papers describe a (k,r) clustering method. Each of these methods have individual characteristics which make them...Reference 7), UPGMA is probably the most frequently used clustering strategy. UPGMA tries to group new points into an existing cluster by using an
Social Media Use and Depression and Anxiety Symptoms: A Cluster Analysis.
Shensa, Ariel; Sidani, Jaime E; Dew, Mary Amanda; Escobar-Viera, César G; Primack, Brian A
2018-03-01
Individuals use social media with varying quantity, emotional, and behavioral at- tachment that may have differential associations with mental health outcomes. In this study, we sought to identify distinct patterns of social media use (SMU) and to assess associations between those patterns and depression and anxiety symptoms. In October 2014, a nationally-representative sample of 1730 US adults ages 19 to 32 completed an online survey. Cluster analysis was used to identify patterns of SMU. Depression and anxiety were measured using respective 4-item Patient-Reported Outcome Measurement Information System (PROMIS) scales. Multivariable logistic regression models were used to assess associations between clus- ter membership and depression and anxiety. Cluster analysis yielded a 5-cluster solu- tion. Participants were characterized as "Wired," "Connected," "Diffuse Dabblers," "Concentrated Dabblers," and "Unplugged." Membership in 2 clusters - "Wired" and "Connected" - increased the odds of elevated depression and anxiety symptoms (AOR = 2.7, 95% CI = 1.5-4.7; AOR = 3.7, 95% CI = 2.1-6.5, respectively, and AOR = 2.0, 95% CI = 1.3-3.2; AOR = 2.0, 95% CI = 1.3-3.1, respectively). SMU pattern characterization of a large population suggests 2 pat- terns are associated with risk for depression and anxiety. Developing educational interventions that address use patterns rather than single aspects of SMU (eg, quantity) would likely be useful.
Prediction of strontium bromide laser efficiency using cluster and decision tree analysis
Directory of Open Access Journals (Sweden)
Iliev Iliycho
2018-01-01
Full Text Available Subject of investigation is a new high-powered strontium bromide (SrBr2 vapor laser emitting in multiline region of wavelengths. The laser is an alternative to the atom strontium lasers and electron free lasers, especially at the line 6.45 μm which line is used in surgery for medical processing of biological tissues and bones with minimal damage. In this paper the experimental data from measurements of operational and output characteristics of the laser are statistically processed by means of cluster analysis and tree-based regression techniques. The aim is to extract the more important relationships and dependences from the available data which influence the increase of the overall laser efficiency. There are constructed and analyzed a set of cluster models. It is shown by using different cluster methods that the seven investigated operational characteristics (laser tube diameter, length, supplied electrical power, and others and laser efficiency are combined in 2 clusters. By the built regression tree models using Classification and Regression Trees (CART technique there are obtained dependences to predict the values of efficiency, and especially the maximum efficiency with over 95% accuracy.
tclust: An R Package for a Trimming Approach to Cluster Analysis
Directory of Open Access Journals (Sweden)
2012-04-01
Full Text Available Outlying data can heavily influence standard clustering methods. At the same time, clustering principles can be useful when robustifying statistical procedures. These two reasons motivate the development of feasible robust model-based clustering approaches. With this in mind, an R package for performing non-hierarchical robust clustering, called tclust, is presented here. Instead of trying to “fit” noisy data, a proportion α of the most outlying observations is trimmed. The tclust package efficiently handles different cluster scatter constraints. Graphical exploratory tools are also provided to help the user make sensible choices for the trimming proportion as well as the number of clusters to search for.
Liu, Jingxia; Colditz, Graham A
2018-05-01
There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyze a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of variance of the estimator of the treatment effect for equal to unequal cluster sizes. We discuss a commonly used structure in CRTs-exchangeable, and derive the simpler formula of RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size due to efficiency loss. Additionally, we also propose an optimal sample size estimation based on the GEE models under a fixed budget for known and unknown association parameter (ρ) in the working correlation structure within the cluster. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Deuterium cluster model for low energy nuclear reactions (LENR)
Miley, George; Hora, Heinrich
2007-11-01
For studying the possible reactions of high density deuterons on the background of a degenerate electron gas, a summary of experimental observations resulted in the possibility of reactions in pm distance and more than ksec duration similar to the K-shell electron capture [1]. The essential reason was the screening of the deuterons by a factor of 14 based on the observations. Using the bosonic properties for a cluster formation of the deuterons and a model of compound nuclear reactions [2], the measured distribution of the resulting nuclei may be explained as known from the Maruhn-Greiner theory for fission. The local maximum of the distribution at the main minimum indicates the excited states of the compound nuclei during their intermediary state. This measured local maximum may be an independent proof for the deuteron clusters at LENR. [1] H. Hora, G.H. Miley et al. Physics Letters A175, 138 (1993) [2] H. Hora and G.H. Miley, APS March Meeting 2007, Program p. 116
Directory of Open Access Journals (Sweden)
ASLI SUNER
2013-06-01
Full Text Available Multiple correspondence analysis is a method making easy to interpret the categorical variables given in contingency tables, showing the similarities, associations as well as divergences among these variables via graphics on a lower dimensional space. Clustering methods are helped to classify the grouped data according to their similarities and to get useful summarized data from them. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting referred health institute such as age, disease group and health insurance are examined and it is aimed to compare results of the methods.
MMPI profiles of males accused of severe crimes: a cluster analysis
Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.
2009-01-01
In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious
Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John
2015-09-01
We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ(2) analysis (PD and DH, P cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarch clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.
Diagrammatic analysis of correlations in polymer fluids: Cluster diagrams via Edwards' field theory
International Nuclear Information System (INIS)
Morse, David C.
2006-01-01
Edwards' functional integral approach to the statistical mechanics of polymer liquids is amenable to a diagrammatic analysis in which free energies and correlation functions are expanded as infinite sums of Feynman diagrams. This analysis is shown to lead naturally to a perturbative cluster expansion that is closely related to the Mayer cluster expansion developed for molecular liquids by Chandler and co-workers. Expansion of the functional integral representation of the grand-canonical partition function yields a perturbation theory in which all quantities of interest are expressed as functionals of a monomer-monomer pair potential, as functionals of intramolecular correlation functions of non-interacting molecules, and as functions of molecular activities. In different variants of the theory, the pair potential may be either a bare or a screened potential. A series of topological reductions yields a renormalized diagrammatic expansion in which collective correlation functions are instead expressed diagrammatically as functionals of the true single-molecule correlation functions in the interacting fluid, and as functions of molecular number density. Similar renormalized expansions are also obtained for a collective Ornstein-Zernicke direct correlation function, and for intramolecular correlation functions. A concise discussion is given of the corresponding Mayer cluster expansion, and of the relationship between the Mayer and perturbative cluster expansions for liquids of flexible molecules. The application of the perturbative cluster expansion to coarse-grained models of dense multi-component polymer liquids is discussed, and a justification is given for the use of a loop expansion. As an example, the formalism is used to derive a new expression for the wave-number dependent direct correlation function and recover known expressions for the intramolecular two-point correlation function to first-order in a renormalized loop expansion for coarse-grained models of
Modeling jet and outflow feedback during star cluster formation
Energy Technology Data Exchange (ETDEWEB)
Federrath, Christoph [Monash Centre for Astrophysics, School of Mathematical Sciences, Monash University, VIC 3800 (Australia); Schrön, Martin [Department of Computational Hydrosystems, Helmholtz Centre for Environmental Research-UFZ, Permoserstr. 15, D-04318 Leipzig (Germany); Banerjee, Robi [Hamburger Sternwarte, Gojenbergsweg 112, D-21029 Hamburg (Germany); Klessen, Ralf S., E-mail: christoph.federrath@monash.edu [Universität Heidelberg, Zentrum für Astronomie, Institut für Theoretische Astrophysik, Albert-Ueberle-Strasse 2, D-69120 Heidelberg (Germany)
2014-08-01
Powerful jets and outflows are launched from the protostellar disks around newborn stars. These outflows carry enough mass and momentum to transform the structure of their parent molecular cloud and to potentially control star formation itself. Despite their importance, we have not been able to fully quantify the impact of jets and outflows during the formation of a star cluster. The main problem lies in limited computing power. We would have to resolve the magnetic jet-launching mechanism close to the protostar and at the same time follow the evolution of a parsec-size cloud for a million years. Current computer power and codes fall orders of magnitude short of achieving this. In order to overcome this problem, we implement a subgrid-scale (SGS) model for launching jets and outflows, which demonstrably converges and reproduces the mass, linear and angular momentum transfer, and the speed of real jets, with ∼1000 times lower resolution than would be required without the SGS model. We apply the new SGS model to turbulent, magnetized star cluster formation and show that jets and outflows (1) eject about one-fourth of their parent molecular clump in high-speed jets, quickly reaching distances of more than a parsec, (2) reduce the star formation rate by about a factor of two, and (3) lead to the formation of ∼1.5 times as many stars compared to the no-outflow case. Most importantly, we find that jets and outflows reduce the average star mass by a factor of ∼ three and may thus be essential for understanding the characteristic mass of the stellar initial mass function.
Fuzzy C-Means Clustering Model Data Mining For Recognizing Stock Data Sampling Pattern
Directory of Open Access Journals (Sweden)
Sylvia Jane Annatje Sumarauw
2007-06-01
Full Text Available Abstract Capital market has been beneficial to companies and investor. For investors, the capital market provides two economical advantages, namely deviden and capital gain, and a non-economical one that is a voting .} hare in Shareholders General Meeting. But, it can also penalize the share owners. In order to prevent them from the risk, the investors should predict the prospect of their companies. As a consequence of having an abstract commodity, the share quality will be determined by the validity of their company profile information. Any information of stock value fluctuation from Jakarta Stock Exchange can be a useful consideration and a good measurement for data analysis. In the context of preventing the shareholders from the risk, this research focuses on stock data sample category or stock data sample pattern by using Fuzzy c-Me, MS Clustering Model which providing any useful information jar the investors. lite research analyses stock data such as Individual Index, Volume and Amount on Property and Real Estate Emitter Group at Jakarta Stock Exchange from January 1 till December 31 of 204. 'he mining process follows Cross Industry Standard Process model for Data Mining (CRISP,. DM in the form of circle with these steps: Business Understanding, Data Understanding, Data Preparation, Modelling, Evaluation and Deployment. At this modelling process, the Fuzzy c-Means Clustering Model will be applied. Data Mining Fuzzy c-Means Clustering Model can analyze stock data in a big database with many complex variables especially for finding the data sample pattern, and then building Fuzzy Inference System for stimulating inputs to be outputs that based on Fuzzy Logic by recognising the pattern. Keywords: Data Mining, AUz..:y c-Means Clustering Model, Pattern Recognition
Cluster analysis of midlatitude oceanic cloud regimes: mean properties and temperature sensitivity
Directory of Open Access Journals (Sweden)
N. D. Gordon
2010-07-01
Full Text Available Clouds play an important role in the climate system by reducing the amount of shortwave radiation reaching the surface and the amount of longwave radiation escaping to space. Accurate simulation of clouds in computer models remains elusive, however, pointing to a lack of understanding of the connection between large-scale dynamics and cloud properties. This study uses a k-means clustering algorithm to group 21 years of satellite cloud data over midlatitude oceans into seven clusters, and demonstrates that the cloud clusters are associated with distinct large-scale dynamical conditions. Three clusters correspond to low-level cloud regimes with different cloud fraction and cumuliform or stratiform characteristics, but all occur under large-scale descent and a relatively dry free troposphere. Three clusters correspond to vertically extensive cloud regimes with tops in the middle or upper troposphere, and they differ according to the strength of large-scale ascent and enhancement of tropospheric temperature and humidity. The final cluster is associated with a lower troposphere that is dry and an upper troposphere that is moist and experiencing weak ascent and horizontal moist advection.
Since the present balance of reflection of shortwave and absorption of longwave radiation by clouds could change as the atmosphere warms from increasing anthropogenic greenhouse gases, we must also better understand how increasing temperature modifies cloud and radiative properties. We therefore undertake an observational analysis of how midlatitude oceanic clouds change with temperature when dynamical processes are held constant (i.e., partial derivative with respect to temperature. For each of the seven cloud regimes, we examine the difference in cloud and radiative properties between warm and cold subsets. To avoid misinterpreting a cloud response to large-scale dynamical forcing as a cloud response to temperature, we require horizontal and vertical
Comparative analysis on the selection of number of clusters in community detection
Kawamoto, Tatsuro; Kabashima, Yoshiyuki
2018-02-01
We conduct a comparative analysis on various estimates of the number of clusters in community detection. An exhaustive comparison requires testing of all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, map equation, Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendency of overfit and underfit that the assessment criteria and algorithms have becomes apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful to determine the number of clusters.
Directory of Open Access Journals (Sweden)
Ichiro IWASAKI
2010-06-01
Full Text Available Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.
Energy Technology Data Exchange (ETDEWEB)
Bush, B. [National Renewable Energy Lab. (NREL), Golden, CO (United States); Melaina, M. [National Renewable Energy Lab. (NREL), Golden, CO (United States); Penev, M. [National Renewable Energy Lab. (NREL), Golden, CO (United States); Daniel, W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2013-09-01
This report describes the development and analysis of detailed temporal and spatial scenarios for early market hydrogen fueling infrastructure clustering and fuel cell electric vehicle rollout using the Scenario Evaluation, Regionalization and Analysis (SERA) model. The report provides an overview of the SERA scenario development framework and discusses the approach used to develop the nationwidescenario.
Angelov, Kiril; Kaynakchieva, Vesela
2017-12-01
The aim of the current study is to research and analyze Mathematical model for research and analyze of relations and functions between enterprises, members of cluster, and its approbation in given cluster. Subject of the study are theoretical mechanisms for the definition of mathematical models for research and analyze of relations and functions between enterprises, members of cluster. Object of the study are production enterprises, members of cluster. Results of this study show that described theoretical mathematical model is applicable for research and analyze of functions and relations between enterprises, members of cluster from different industrial sectors. This circumstance creates alternatives for election of cluster, where is experimented this model for interaction improvement between enterprises, members of cluster.
Adrian Ioana; Tiberiu Socaciu
2013-01-01
The article presents specific aspects of management and models for economic analysis. Thus, we present the main types of economic analysis: statistical analysis, dynamic analysis, static analysis, mathematical analysis, psychological analysis. Also we present the main object of the analysis: the technological activity analysis of a company, the analysis of the production costs, the economic activity analysis of a company, the analysis of equipment, the analysis of labor productivity, the anal...
Directory of Open Access Journals (Sweden)
Zhang Zhang
2009-06-01
Full Text Available A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.
Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model
International Nuclear Information System (INIS)
Ellefsen, Karl J.; Smith, David B.
2016-01-01
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.
Latent class factor and cluster models, bi-plots and tri-plots and related graphical displays
Magidson, J.; Vermunt, J.K.
2001-01-01
We propose an alternative method of conducting exploratory latent class analysis that utilizes latent class factor models, and compare it to the more traditional approach based on latent class cluster models. We show that when formulated in terms of R mutually independent, dichotomous latent
Model catalysis by size-selected cluster deposition
Energy Technology Data Exchange (ETDEWEB)
Anderson, Scott [Univ. of Utah, Salt Lake City, UT (United States)
2015-11-20
This report summarizes the accomplishments during the last four years of the subject grant. Results are presented for experiments in which size-selected model catalysts were studied under surface science and aqueous electrochemical conditions. Strong effects of cluster size were found, and by correlating the size effects with size-dependent physical properties of the samples measured by surface science methods, it was possible to deduce mechanistic insights, such as the factors that control the rate-limiting step in the reactions. Results are presented for CO oxidation, CO binding energetics and geometries, and electronic effects under surface science conditions, and for the electrochemical oxygen reduction reaction, ethanol oxidation reaction, and for oxidation of carbon by water.
Clustering network layers with the strata multilayer stochastic block model.
Stanley, Natalie; Shai, Saray; Taylor, Dane; Mucha, Peter J
2016-01-01
Multilayer networks are a useful data structure for simultaneously capturing multiple types of relationships between a set of nodes. In such networks, each relational definition gives rise to a layer. While each layer provides its own set of information, community structure across layers can be collectively utilized to discover and quantify underlying relational patterns between nodes. To concisely extract information from a multilayer network, we propose to identify and combine sets of layers with meaningful similarities in community structure. In this paper, we describe the "strata multilayer stochastic block model" (sMLSBM), a probabilistic model for multilayer community structure. The central extension of the model is that there exist groups of layers, called "strata", which are defined such that all layers in a given stratum have community structure described by a common stochastic block model (SBM). That is, layers in a stratum exhibit similar node-to-community assignments and SBM probability parameters. Fitting the sMLSBM to a multilayer network provides a joint clustering that yields node-to-community and layer-to-stratum assignments, which cooperatively aid one another during inference. We describe an algorithm for separating layers into their appropriate strata and an inference technique for estimating the SBM parameters for each stratum. We demonstrate our method using synthetic networks and a multilayer network inferred from data collected in the Human Microbiome Project.
Performance analysis of clustering techniques over microarray data: A case study
Dash, Rasmita; Misra, Bijan Bihari
2018-03-01
Handling big data is one of the major issues in the field of statistical data analysis. In such investigation cluster analysis plays a vital role to deal with the large scale data. There are many clustering techniques with different cluster analysis approach. But which approach suits a particular dataset is difficult to predict. To deal with this problem a grading approach is introduced over many clustering techniques to identify a stable technique. But the grading approach depends on the characteristic of dataset as well as on the validity indices. So a two stage grading approach is implemented. In this study the grading approach is implemented over five clustering techniques like hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of grading approach that a cluster technique is significant is also established by Nemenyi post-hoc hypothetical test.
Fokkema, M; Smits, N; Zeileis, A; Hothorn, T; Kelderman, H
2017-10-25
Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are particularly common in psychological research. Therefore, we propose the generalized linear mixed-effects model tree (GLMM tree) algorithm, which allows for the detection of treatment-subgroup interactions, while accounting for the clustered structure of a dataset. The algorithm uses model-based recursive partitioning to detect treatment-subgroup interactions, and a GLMM to estimate the random-effects parameters. In a simulation study, GLMM trees show higher accuracy in recovering treatment-subgroup interactions, higher predictive accuracy, and lower type II error rates than linear-model-based recursive partitioning and mixed-effects regression trees. Also, GLMM trees show somewhat higher predictive accuracy than linear mixed-effects models with pre-specified interaction effects, on average. We illustrate the application of GLMM trees on an individual patient-level data meta-analysis on treatments for depression. We conclude that GLMM trees are a promising exploratory tool for the detection of treatment-subgroup interactions in clustered datasets.
Minku, Leandro L.
2017-10-06
Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom\\'s predictive performance. Aims: This paper investigates whether clustering methods can be used to help finding good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom\\'s CC subsets.
Peuten, M.; Zocchi, A.; Gieles, M.; Hénault-Brunet, V.
2017-09-01
Lowered isothermal models, such as the multimass Michie-King models, have been successful in describing observational data of globular clusters. In this study, we assess whether such models are able to describe the phase space properties of evolutionary N-body models. We compare the multimass models as implemented in limepy (Gieles & Zocchi) to N-body models of star clusters with different retention fractions for the black holes and neutron stars evolving in a tidal field. We find that multimass models successfully reproduce the density and velocity dispersion profiles of the different mass components in all evolutionary phases and for different remnants retention. We further use these results to study the evolution of global model parameters. We find that over the lifetime of clusters, radial anisotropy gradually evolves from the low- to the high-mass components and we identify features in the properties of observable stars that are indicative of the presence of stellar-mass black holes. We find that the model velocity scale depends on mass as m-δ, with δ ≃ 0.5 for almost all models, but the dependence of central velocity dispersion on m can be shallower, depending on the dark remnant content, and agrees well with that of the N-body models. The reported model parameters, and correlations amongst them, can be used as theoretical priors when fitting these types of mass models to observational data.
Structure and substructure analysis of DAFT/FADA galaxy clusters in the [0.4-0.9] redshift range
Guennou, L.; Adami, C.; Durret, F.; Lima Neto, G. B.; Ulmer, M. P.; Clowe, D.; LeBrun, V.; Martinet, N.; Allam, S.; Annis, J.; Basa, S.; Benoist, C.; Biviano, A.; Cappi, A.; Cypriano, E. S.; Gavazzi, R.; Halliday, C.; Ilbert, O.; Jullo, E.; Just, D.; Limousin, M.; Márquez, I.; Mazure, A.; Murphy, K. J.; Plana, H.; Rostagni, F.; Russeil, D.; Schirmer, M.; Slezak, E.; Tucker, D.; Zaritsky, D.; Ziegler, B.
2014-01-01
Context. The DAFT/FADA survey is based on the study of ~90 rich (masses found in the literature >2 × 1014 M⊙) and moderately distant clusters (redshifts 0.4 DAFT/FADA survey for which XMM-Newton and/or a sufficient number of galaxy redshifts in the cluster range are available, with the aim of detecting substructures and evidence for merging events. These properties are discussed in the framework of standard cold dark matter (ΛCDM) cosmology. Methods: In X-rays, we analysed the XMM-Newton data available, fit a β-model, and subtracted it to identify residuals. We used Chandra data, when available, to identify point sources. In the optical, we applied a Serna & Gerbal (SG) analysis to clusters with at least 15 spectroscopic galaxy redshifts available in the cluster range. We discuss the substructure detection efficiencies of both methods. Results: XMM-Newton data were available for 32 clusters, for which we derive the X-ray luminosity and a global X-ray temperature for 25 of them. For 23 clusters we were able to fit the X-ray emissivity with a β-model and subtract it to detect substructures in the X-ray gas. A dynamical analysis based on the SG method was applied to the clusters having at least 15 spectroscopic galaxy redshifts in the cluster range: 18 X-ray clusters and 11 clusters with no X-ray data. The choice of a minimum number of 15 redshifts implies that only major substructures will be detected. Ten substructures were detected both in X-rays and by the SG method. Most of the substructures detected both in X-rays and with the SG method are probably at their first cluster pericentre approach and are relatively recent infalls. We also find hints of a decreasing X-ray gas density profile core radius with redshift. Conclusions: The percentage of mass included in substructures was found to be roughly constant with redshift values of 5-15%, in agreement both with the general CDM framework and with the results of numerical simulations. Galaxies in substructures
Zhang, Ying; Moges, Semu; Block, Paul
2018-01-01
Prediction of seasonal precipitation can provide actionable information to guide management of various sectoral activities. For instance, it is often translated into hydrological forecasts for better water resources management. However, many studies assume homogeneity in precipitation across an entire study region, which may prove ineffective for operational and local-level decisions, particularly for locations with high spatial variability. This study proposes advancing local-level seasonal precipitation predictions by first conditioning on regional-level predictions, as defined through objective cluster analysis, for western Ethiopia. To our knowledge, this is the first study predicting seasonal precipitation at high resolution in this region, where lives and livelihoods are vulnerable to precipitation variability given the high reliance on rain-fed agriculture and limited water resources infrastructure. The combination of objective cluster analysis, spatially high-resolution prediction of seasonal precipitation, and a modeling structure spanning statistical and dynamical approaches makes clear advances in prediction skill and resolution, as compared with previous studies. The statistical model improves versus the non-clustered case or dynamical models for a number of specific clusters in northwestern Ethiopia, with clusters having regional average correlation and ranked probability skill score (RPSS) values of up to 0.5 and 33 %, respectively. The general skill (after bias correction) of the two best-performing dynamical models over the entire study region is superior to that of the statistical models, although the dynamical models issue predictions at a lower resolution and the raw predictions require bias correction to guarantee comparable skills.
Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.
Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço
2017-11-01
The variations found in the elemental composition in ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets apprehended by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with C4.5 decision tree to help us interpret the clustering results. We found a better number of two clusters within the data, which can refer to the approximated number of sources of the drug which supply the cities of seizures. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using the leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples. © 2017 American Academy of Forensic Sciences.
Directory of Open Access Journals (Sweden)
Marco Borri
Full Text Available To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment.The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4. Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters.The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4, determined with cluster validation, produced the best separation between reducing and non-reducing clusters.The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
Directory of Open Access Journals (Sweden)
John Stewart
2012-10-01
Full Text Available This study examined the evolution of student responses to seven contextually different versions of two Force Concept Inventory questions in an introductory physics course at the University of Arkansas. The consistency in answering the closely related questions evolved little over the seven-question exam. A model for the state of student knowledge involving the probability of selecting one of the multiple-choice answers was developed. Criteria for using clustering algorithms to extract model parameters were explored and it was found that the overlap between the probability distributions of the model vectors was an important parameter in characterizing the cluster models. The course data were then clustered and the extracted model showed that students largely fit into two groups both pre- and postinstruction: one that answered all questions correctly with high probability and one that selected the distracter representing the same misconception with high probability. For the course studied, 14% of the students were left with persistent misconceptions post instruction on a static force problem and 30% on a dynamic Newton’s third law problem. These students selected the answer representing the predominant misconception slightly more consistently postinstruction, indicating that the course studied had been ineffective at moving this subgroup of students nearer a Newtonian force concept and had instead moved them slightly farther away from a correct conceptual understanding of these two problems. The consistency in answering pairs of problems with varied physical contexts is shown to be an important supplementary statistic to the score on the problems and suggests that the inclusion of such problem pairs in future conceptual inventories would be efficacious. Multiple, contextually varied questions further probe the structure of students’ knowledge. To allow working instructors to make use of the additional insight gained from cluster analysis, it
Developing a Clustering-Based Empirical Bayes Analysis Method for Hotspot Identification
Directory of Open Access Journals (Sweden)
Yajie Zou
2017-01-01
Full Text Available Hotspot identification (HSID is a critical part of network-wide safety evaluations. Typical methods for ranking sites are often rooted in using the Empirical Bayes (EB method to estimate safety from both observed crash records and predicted crash frequency based on similar sites. The performance of the EB method is highly related to the selection of a reference group of sites (i.e., roadway segments or intersections similar to the target site from which safety performance functions (SPF used to predict crash frequency will be developed. As crash data often contain underlying heterogeneity that, in essence, can make them appear to be generated from distinct subpopulations, methods are needed to select similar sites in a principled manner. To overcome this possible heterogeneity problem, EB-based HSID methods that use common clustering methodologies (e.g., mixture models, K-means, and hierarchical clustering to select “similar” sites for building SPFs are developed. Performance of the clustering-based EB methods is then compared using real crash data. Here, HSID results, when computed on Texas undivided rural highway cash data, suggest that all three clustering-based EB analysis methods are preferred over the conventional statistical methods. Thus, properly classifying the road segments for heterogeneous crash data can further improve HSID accuracy.
Cardiometabolic risk clustering in spinal cord injury: results of exploratory factor analysis.
Libin, Alexander; Tinsley, Emily A; Nash, Mark S; Mendez, Armando J; Burns, Patricia; Elrod, Matt; Hamm, Larry F; Groah, Suzanne L
2013-01-01
Evidence suggests an elevated prevalence of cardiometabolic risks among persons with spinal cord injury (SCI); however, the unique clustering of risk factors in this population has not been fully explored. The purpose of this study was to describe unique clustering of cardiometabolic risk factors differentiated by level of injury. One hundred twenty-one subjects (mean 37 ± 12 years; range, 18-73) with chronic C5 to T12 motor complete SCI were studied. Assessments included medical histories, anthropometrics and blood pressure, and fasting serum lipids, glucose, insulin, and hemoglobin A1c (HbA1c). The most common cardiometabolic risk factors were overweight/obesity, high levels of low-density lipoprotein (LDL-C), and low levels of high-density lipoprotein (HDL-C). Risk clustering was found in 76.9% of the population. Exploratory principal component factor analysis using varimax rotation revealed a 3-factor model in persons with paraplegia (65.4% variance) and a 4-factor solution in persons with tetraplegia (73.3% variance). The differences between groups were emphasized by the varied composition of the extracted factors: Lipid Profile A (total cholesterol [TC] and LDL-C), Body Mass-Hypertension Profile (body mass index [BMI], systolic blood pressure [SBP], and fasting insulin [FI]); Glycemic Profile (fasting glucose and HbA1c), and Lipid Profile B (TG and HDL-C). BMI and SBP formed a separate factor only in persons with tetraplegia. Although the majority of the population with SCI has risk clustering, the composition of the risk clusters may be dependent on level of injury, based on a factor analysis group comparison. This is clinically plausible and relevant as tetraplegics tend to be hypo- to normotensive and more sedentary, resulting in lower HDL-C and a greater propensity toward impaired carbohydrate metabolism.
Energy Technology Data Exchange (ETDEWEB)
Schrabback, T.; et al.
2016-11-11
We present an HST/ACS weak gravitational lensing analysis of 13 massive high-redshift (z_median=0.88) galaxy clusters discovered in the South Pole Telescope (SPT) Sunyaev-Zel'dovich Survey. This study is part of a larger campaign that aims to robustly calibrate mass-observable scaling relations over a wide range in redshift to enable improved cosmological constraints from the SPT cluster sample. We introduce new strategies to ensure that systematics in the lensing analysis do not degrade constraints on cluster scaling relations significantly. First, we efficiently remove cluster members from the source sample by selecting very blue galaxies in V-I colour. Our estimate of the source redshift distribution is based on CANDELS data, where we carefully mimic the source selection criteria of the cluster fields. We apply a statistical correction for systematic photometric redshift errors as derived from Hubble Ultra Deep Field data and verified through spatial cross-correlations. We account for the impact of lensing magnification on the source redshift distribution, finding that this is particularly relevant for shallower surveys. Finally, we account for biases in the mass modelling caused by miscentring and uncertainties in the mass-concentration relation using simulations. In combination with temperature estimates from Chandra we constrain the normalisation of the mass-temperature scaling relation ln(E(z) M_500c/10^14 M_sun)=A+1.5 ln(kT/7.2keV) to A=1.81^{+0.24}_{-0.14}(stat.) +/- 0.09(sys.), consistent with self-similar redshift evolution when compared to lower redshift samples. Additionally, the lensing data constrain the average concentration of the clusters to c_200c=5.6^{+3.7}_{-1.8}.
Energy Technology Data Exchange (ETDEWEB)
Schrabback, T.; Applegate, D.; Dietrich, J. P.; Hoekstra, H.; Bocquet, S.; Gonzalez, A. H.; der Linden, A. von; McDonald, M.; Morrison, C. B.; Raihan, S. F.; Allen, S. W.; Bayliss, M.; Benson, B. A.; Bleem, L. E.; Chiu, I.; Desai, S.; Foley, R. J.; de Haan, T.; High, F. W.; Hilbert, S.; Mantz, A. B.; Massey, R.; Mohr, J.; Reichardt, C. L.; Saro, A.; Simon, P.; Stern, C.; Stubbs, C. W.; Zenteno, A.
2017-10-14
We present an HST/Advanced Camera for Surveys (ACS) weak gravitational lensing analysis of 13 massive high-redshift (z(median) = 0.88) galaxy clusters discovered in the South Pole Telescope (SPT) Sunyaev-Zel'dovich Survey. This study is part of a larger campaign that aims to robustly calibrate mass-observable scaling relations over a wide range in redshift to enable improved cosmological constraints from the SPT cluster sample. We introduce new strategies to ensure that systematics in the lensing analysis do not degrade constraints on cluster scaling relations significantly. First, we efficiently remove cluster members from the source sample by selecting very blue galaxies in V - I colour. Our estimate of the source redshift distribution is based on Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) data, where we carefully mimic the source selection criteria of the cluster fields. We apply a statistical correction for systematic photometric redshift errors as derived from Hubble Ultra Deep Field data and verified through spatial cross-correlations. We account for the impact of lensing magnification on the source redshift distribution, finding that this is particularly relevant for shallower surveys. Finally, we account for biases in the mass modelling caused by miscentring and uncertainties in the concentration-mass relation using simulations. In combination with temperature estimates from Chandra we constrain the normalization of the mass-temperature scaling relation ln (E(z) M-500c/10(14)M(circle dot)) = A + 1.5ln (kT/7.2 keV) to A = 1.81(-0.14)(+0.24)(stat.)+/- 0.09(sys.), consistent with self-similar redshift evolution when compared to lower redshift samples. Additionally, the lensing data constrain the average concentration of the clusters to c(200c) = 5.6(-1.8)(+3.7).
Schrabback, T.; Applegate, D.; Dietrich, J. P.; Hoekstra, H.; Bocquet, S.; Gonzalez, A. H.; von der Linden, A.; McDonald, M.; Morrison, C. B.; Raihan, S. F.; Allen, S. W.; Bayliss, M.; Benson, B. A.; Bleem, L. E.; Chiu, I.; Desai, S.; Foley, R. J.; de Haan, T.; High, F. W.; Hilbert, S.; Mantz, A. B.; Massey, R.; Mohr, J.; Reichardt, C. L.; Saro, A.; Simon, P.; Stern, C.; Stubbs, C. W.; Zenteno, A.
2018-02-01
We present an HST/Advanced Camera for Surveys (ACS) weak gravitational lensing analysis of 13 massive high-redshift (zmedian = 0.88) galaxy clusters discovered in the South Pole Telescope (SPT) Sunyaev-Zel'dovich Survey. This study is part of a larger campaign that aims to robustly calibrate mass-observable scaling relations over a wide range in redshift to enable improved cosmological constraints from the SPT cluster sample. We introduce new strategies to ensure that systematics in the lensing analysis do not degrade constraints on cluster scaling relations significantly. First, we efficiently remove cluster members from the source sample by selecting very blue galaxies in V - I colour. Our estimate of the source redshift distribution is based on Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) data, where we carefully mimic the source selection criteria of the cluster fields. We apply a statistical correction for systematic photometric redshift errors as derived from Hubble Ultra Deep Field data and verified through spatial cross-correlations. We account for the impact of lensing magnification on the source redshift distribution, finding that this is particularly relevant for shallower surveys. Finally, we account for biases in the mass modelling caused by miscentring and uncertainties in the concentration-mass relation using simulations. In combination with temperature estimates from Chandra we constrain the normalization of the mass-temperature scaling relation ln (E(z)M500c/1014 M⊙) = A + 1.5ln (kT/7.2 keV) to A=1.81^{+0.24}_{-0.14}(stat.) {± } 0.09(sys.), consistent with self-similar redshift evolution when compared to lower redshift samples. Additionally, the lensing data constrain the average concentration of the clusters to c_200c=5.6^{+3.7}_{-1.8}.
Interpolation of daily rainfall using spatiotemporal models and clustering
Militino, A. F.
2014-06-11
Accumulated daily rainfall in non-observed locations on a particular day is frequently required as input to decision-making tools in precision agriculture or for hydrological or meteorological studies. Various solutions and estimation procedures have been proposed in the literature depending on the auxiliary information and the availability of data, but most such solutions are oriented to interpolating spatial data without incorporating temporal dependence. When data are available in space and time, spatiotemporal models usually provide better solutions. Here, we analyse the performance of three spatiotemporal models fitted to the whole sampled set and to clusters within the sampled set. The data consists of daily observations collected from 87 manual rainfall gauges from 1990 to 2010 in Navarre, Spain. The accuracy and precision of the interpolated data are compared with real data from 33 automated rainfall gauges in the same region, but placed in different locations than the manual rainfall gauges. Root mean squared error by months and by year are also provided. To illustrate these models, we also map interpolated daily precipitations and standard errors on a 1km2 grid in the whole region. © 2014 Royal Meteorological Society.
Interpolation of daily rainfall using spatiotemporal models and clustering
Militino, A. F.; Ugarte, M. D.; Goicoa, T.; Genton, Marc G.
2014-01-01
Accumulated daily rainfall in non-observed locations on a particular day is frequently required as input to decision-making tools in precision agriculture or for hydrological or meteorological studies. Various solutions and estimation procedures have been proposed in the literature depending on the auxiliary information and the availability of data, but most such solutions are oriented to interpolating spatial data without incorporating temporal dependence. When data are available in space and time, spatiotemporal models usually provide better solutions. Here, we analyse the performance of three spatiotemporal models fitted to the whole sampled set and to clusters within the sampled set. The data consists of daily observations collected from 87 manual rainfall gauges from 1990 to 2010 in Navarre, Spain. The accuracy and precision of the interpolated data are compared with real data from 33 automated rainfall gauges in the same region, but placed in different locations than the manual rainfall gauges. Root mean squared error by months and by year are also provided. To illustrate these models, we also map interpolated daily precipitations and standard errors on a 1km2 grid in the whole region. © 2014 Royal Meteorological Society.
GraphCrunch 2: Software tool for network modeling, alignment and clustering.
Kuchaiev, Oleksii; Stevanović, Aleksandar; Hayes, Wayne; Pržulj, Nataša
2011-01-19
Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. Finally, GraphCruch 2 implements an
GraphCrunch 2: Software tool for network modeling, alignment and clustering
Directory of Open Access Journals (Sweden)
Hayes Wayne
2011-01-01
Full Text Available Abstract Background Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. Results We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL" for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other
The cluster model and the generalized Brody-Moshinsky coefficients
International Nuclear Information System (INIS)
Silvestre-Brac, B.
1985-01-01
Cluster theories, which rigorously eliminate the centre of mass motion, need intrinsic cluster coordinates. It is shown that the Jacobi coordinates of the various clusters are related by an orthogonal transformation and that the use of generalized Brody-Moshinsky coefficients allows an exact calculation of the exchange kernels. This procedure is illustrated by the description of nucleon-nucleon interaction in terms of constituent quarks
COCOA Code for Creating Mock Observations of Star Cluster Models
Askar, Abbas; Giersz, Mirek; Pych, Wojciech; Dalessandro, Emanuele
2017-01-01
We introduce and present results from the COCOA (Cluster simulatiOn Comparison with ObservAtions) code that has been developed to create idealized mock photometric observations using results from numerical simulations of star cluster evolution. COCOA is able to present the output of realistic numerical simulations of star clusters carried out using Monte Carlo or \\textit{N}-body codes in a way that is useful for direct comparison with photometric observations. In this paper, we describe the C...
Directory of Open Access Journals (Sweden)
Quan Zhou
2017-07-01
Full Text Available The construction of large-scale wind farms results in a dramatic increase of wind turbine (WT faults. The failure mode is also becoming increasingly complex. This study proposes a new model for early warning and diagnosis of WT faults to solve the problem of Supervisory Control And Data Acquisition (SCADA systems, given that the traditional threshold method cannot provide timely warning. First, the characteristic quantity of fault early warning and diagnosis analyzed by clustering analysis can obtain in advance abnormal data in the normal threshold range by considering the effects of wind speed. Based on domain knowledge, Adaptive Neuro-fuzzy Inference System (ANFIS is then modified to establish the fault early warning and diagnosis model. This approach improves the accuracy of the model under the condition of absent and sparse training data. Case analysis shows that the effect of the early warning and diagnosis model in this study is better than that of the traditional threshold method.
Identification and validation of asthma phenotypes in Chinese population using cluster analysis.
Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang
2017-10-01
Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Genome-scale analysis of positional clustering of mouse testis-specific genes
Directory of Open Access Journals (Sweden)
Lee Bernett TK
2005-01-01
Full Text Available Abstract Background Genes are not randomly distributed on a chromosome as they were thought even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
ANALYSIS OF DEVELOPING BATIK INDUSTRY CLUSTER IN BAKARAN VILLAGE CENTRAL JAVA PROVINCE
Directory of Open Access Journals (Sweden)
Hermanto Hermanto
2017-06-01
Full Text Available SMEs grow in a cluster in a certain geographical area. The entrepreneurs grow and thrive through the business cluster. Central Java Province has a lot of business clusters in improving the regional economy, one of which is batik industry cluster. Pati Regency is one of regencies / city in Central Java that has the lowest turnover. Batik industy cluster in Pati develops quite well, which can be seen from the increasing number of batik industry incorporated in the cluster. This research examines the strategy of developing the batik industry cluster in Pati Regency. The purpose of this research is to determine the proper strategy for developing the batik industry clusters in Pati. The method of research is quantitative. The analysis tool of this research is the Strengths, Weakness, Opportunity, Threats (SWOT analysis. The result of SWOT analysis in this research shows that the proper strategy for developing the batik industry cluster in Pati is optimizing the management of batik business cluster in Bakaran Village; the local government provides information of the facility of business capital loans; the utilization of labors from Bakaran Village while improving the quality of labors by training, and marketing the Bakaran batik to the broader markets while maintaining the quality of batik. Advice that can be given from this research is that the parties who have a role in batik industry cluster development in Bakaran Village, Pati Regency, such as the Local Government.
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiment. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
Cluster analysis of HZE particle tracks as applied to space radiobiology problems
International Nuclear Information System (INIS)
Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.
2013-01-01
A cluster analysis is performed of ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-mean-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and particle's linear energy transfer
Application of cluster analysis and unsupervised learning to multivariate tissue characterization
International Nuclear Information System (INIS)
Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.
1987-01-01
This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data
a Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis
Huang, W.; Li, S.; Xu, S.
2016-06-01
How people move in cities and what they do in various locations at different times form human activity patterns. Human activity pattern plays a key role in in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activities before further activity pattern analysis. In the era of Big Data, the emerging of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only a location and time that people stay and spend are collected, but also what people "say" for in a location at a time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, where some of new methodologies should be accordingly developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to our best knowledge, few of clustering algorithms are specifically developed for handling the datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the
A THREE-STEP SPATIAL-TEMPORAL-SEMANTIC CLUSTERING METHOD FOR HUMAN ACTIVITY PATTERN ANALYSIS
Directory of Open Access Journals (Sweden)
W. Huang
2016-06-01
Full Text Available How people move in cities and what they do in various locations at different times form human activity patterns. Human activity pattern plays a key role in in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activities before further activity pattern analysis. In the era of Big Data, the emerging of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time to four dimensions (space, time and semantics. More specifically, not only a location and time that people stay and spend are collected, but also what people “say” for in a location at a time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, where some of new methodologies should be accordingly developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to our best knowledge, few of clustering algorithms are specifically developed for handling the datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The
Directory of Open Access Journals (Sweden)
Tae Won Chung
2016-12-01
Full Text Available Measurement and discussions of logistics cluster competitiveness with a national approach are required to boost agglomeration effects and potentially create logistics efficiency and productivity. This study developed assessment criteria of logistics cluster competitiveness based on Porter's diamond model, calculated the weight of each criterion by the AHP method, and finally evaluated and discussed logistics cluster competitiveness among Asia main countries. The results indicate that there was a large difference in logistics cluster competitiveness among six countries. The logistics cluster competitiveness scores of Singapore (7.93, Japan (7.38, and Hong Kong (7.04 are observably different from those of China (5.40, Korea (5.08, and Malaysia (3.46. Singapore, with the highest competitiveness score, revealed its absolute advantage in logistics cluster indices. These research results intend to provide logistics policy makers with some strategic recommendations, and may serve as a baseline for further logistics cluster studies using Porter's diamond model.
An evaluation of centrality measures used in cluster analysis
Engström, Christopher; Silvestrov, Sergei
2014-12-01
Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large as they often are in for example bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work have been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as the above mentioned and others such as PageRank and related measures.
International Nuclear Information System (INIS)
Regentova, E.; Zhang, L.; Veni, G.; Zheng, J.
2007-01-01
A system is designed for detecting microcalcification clusters (MCC) in digital mammograms. The system is intended for computer-aided diagnostic prompting. Further discrimination of MCC as benign or malignant is assumed to be performed by radiologists. Processing of mammograms is based on the statistical modeling by means of wavelet domain hidden markov trees (WHMT). Segmentation is performed by the weighted likelihood evaluation followed by the classification based on spatial filters for a single microcalcification (MC) and a cluster of MC detection. The analysis is carried out on FROC curves for 40 mammograms from the mini-MIAS database and for 100 mammograms with 50 cancerous and 50 benign cases from DDSM database. The designed system is capable to detect 100% of true positive cases in these sets. The rate of false positives is 2.9 per case for mini-MIAS dataset; and 0.01 for the DDSM images. (orig.)
Modeling the pinning of Au and Ni clusters on graphite
Smith, R.; Nock, C.; Kenny, S.D.; Belbruno, J.J.; Di Vece, M.; Paloma, S.; Palmer, R.E.
2006-01-01
The pinning of size-selected AuN and NiN clusters on graphite, for N=7–100, is investigated by means of molecular dynamics simulations and the results are compared to experiment and previous work with Ag clusters. Ab initio calculations of the binding of the metal adatom and dimers on a graphite
Minku, Leandro L.; Hou, Siqing
2017-01-01
baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number
*K-means and cluster models for cancer signatures.
Kakushadze, Zura; Yu, Willie
2017-09-01
We present *K-means clustering algorithm and source code by expanding statistical clustering methods applied in https://ssrn.com/abstract=2802753 to quantitative finance. *K-means is statistically deterministic without specifying initial centers, etc. We apply *K-means to extracting cancer signatures from genome data without using nonnegative matrix factorization (NMF). *K-means' computational cost is a fraction of NMF's. Using 1389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.
Statistical Clustering and Compositional Modeling of Iapetus VIMS Spectral Data
Pinilla-Alonso, N.; Roush, T. L.; Marzo, G.; Dalle Ore, C. M.; Cruikshank, D. P.
2009-12-01
It has long been known that the surfaces of Saturn's major satellites are predominantly icy objects [e.g. 1 and references therein]. Since 2004, these bodies have been the subject of observations by the Cassini-VIMS (Visual and Infrared Mapping Spectrometer) experiment [2]. Iapetus has the unique property that the hemisphere centered on the apex of its locked synchronous orbital motion around Saturn has a very low geometrical albedo of 2-6%, while the opposite hemisphere is about 10 times more reflective. The nature and origin of the dark material of Iapetus has remained a question since its discovery [3 and references therein]. The nature of this material and how it is distributed on the surface of this body, can shed new light into the knowledge of the Saturnian system. We apply statistical clustering [4] and theoretical modeling [5,6] to address the surface composition of Iapetus. The VIMS data evaluated were obtained during the second flyby of Iapetus, in September 2007. This close approach allowed VIMS to obtain spectra at relatively high spatial resolution, ~1-22 km/pixel. The data we study sampled the trailing hemisphere and part of the dark leading one. The statistical clustering [4] is used to identify statistically distinct spectra on Iapetus. The composition of these distinct spectra are evaluated using theoretical models [5,6]. We thank Allan Meyer for his help. This research was supported by an appointment to the NASA Postdoctoral Program at the Ames Research Center, administered by Oak Ridge Associated Universities through a contract with NASA. [1] A, Coradini et al., 2009, Earth, Moon & Planets, 105, 289-310. [2] Brown et al., 2004, Space Science Reviews, 115, 111-168. [3] Cruikshank, D. et al Icarus, 2008, 193, 334-343. [4] Marzo, G. et al. 2008, Journal of Geophysical Research, 113, E12, CiteID E12009. [5] Hapke, B. 1993, Theory of reflectance and emittance spectroscopy, Cambridge University Press. [6] Shkuratov, Y. et al. 1999, Icarus, 137, 235-246.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Andersen, Erlend K F; Kristensen, Gunnar B; Lyng, Heidi; Malinen, Eirik
2011-08-01
Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, K(trans) and ύ(e), were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control.
International Nuclear Information System (INIS)
Andersen, Erlend K. F.; Kristensen, Gunnar B.; Lyng, Heidi; Malinen, Eirik
2011-01-01
Introduction. Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Materials and methods. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, K trans and u e , were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Results. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Conclusions. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control
Energy Technology Data Exchange (ETDEWEB)
Andersen, Erlend K. F. (Dept. of Medical Physics, The Norwegian Radium Hospital, Oslo Univ. Hospital, Oslo (Norway)), e-mail: eirik.malinen@fys.uio.no; Kristensen, Gunnar B. (Section for Gynaecological Oncology, The Norwegian Radium Hospital, Oslo Univ. Hospital, Oslo (Norway)); Lyng, Heidi (Dept. of Radiation Biology, The Norwegian Radium Hospital, Oslo Univ. Hospital, Oslo (Norway)); Malinen, Eirik (Dept. of Medical Physics, The Norwegian Radium Hospital, Oslo Univ. Hospital, Oslo (Norway); Dept. of Physics, Univ. of Oslo, Oslo (Norway))
2011-08-15
Introduction. Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Materials and methods. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, Ktrans and u{sub e}, were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Results. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Conclusions. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control
Common Factor Analysis Versus Principal Component Analysis: Choice for Symptom Cluster Research
Directory of Open Access Journals (Sweden)
Hee-Ju Kim, PhD, RN
2008-03-01
Conclusion: If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research, CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.
Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.
Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo
2018-07-01
Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiology and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12-month occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.
The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis
Directory of Open Access Journals (Sweden)
Chen Yidong
2004-01-01
Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC method, is proposed for identifying clusters in experiment data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noises, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchic clustering method, the -mean clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999.
The Flemish frozen-vegetable industry as an example of cluster analysis : Flanders Vegetable Valley
Vanhaverbeke, W.P.M.; Larosse, J.; Winnen, W.; Hulsink, W.; Dons, J.J.M.
2008-01-01
In this contribution we present a strategic analysis of the cluster dynamics in the frozen-vegetable industry in Flanders (Belgium)1. The main purpose of this case is twofold. First, we determine the added value of using data about customer and supplier relationships in cluster analysis. Second, we
Brown, S. J.; White, S.; Power, N.
2015-01-01
A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…
Directory of Open Access Journals (Sweden)
Mojmír Sabolovič
2011-01-01
Full Text Available The paper deals with theoretical conception of value analysis of regions, municipal corporations and clusters. The subject of this paper is heterodox approach to sensitivity analysis of finite set of variables based on non-additive measure. For dynamic analysis of trajectory of general value are sufficient robust models based on maximum entropy principle. Findings concern explanation of proper fuzzy integral – Choquet integral. The fuzzy measure is represented by theory of capacities (Choquet, 1953 on powerset. In fine, the conception of the New integral for capacities (Lehler, 2005 is discussed. Value analysis and transmission constitutes remarkable aspect of performance evaluation of regions, municipal corporations and clusters. In the light of high ratio of soft variables, social behavior, intangible assets and human capital within those types of subjects the fuzzy integral introduce useful tool for modeling. The New integral afterwards concerns considerable characteristic of people behavior – risk averse articulated concave function and non-additive operator. Results comprehended tools enabling observation of synergy, redundancy and inhibition of value variables as consequence of non-additive measure. In fine, results induced issues for future research.
Performance Analysis of Cluster Formation in Wireless Sensor Networks
Directory of Open Access Journals (Sweden)
Edgar Romo Montiel
2017-12-01
Full Text Available Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
Performance Analysis of Cluster Formation in Wireless Sensor Networks.
Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo
2017-12-13
Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
Meng, Xiaocheng; Che, Renfei; Gao, Shi; He, Juntao
2018-04-01
With the advent of large data age, power system research has entered a new stage. At present, the main application of large data in the power system is the early warning analysis of the power equipment, that is, by collecting the relevant historical fault data information, the system security is improved by predicting the early warning and failure rate of different kinds of equipment under certain relational factors. In this paper, a method of line failure rate warning is proposed. Firstly, fuzzy dynamic clustering is carried out based on the collected historical information. Considering the imbalance between the attributes, the coefficient of variation is given to the corresponding weights. And then use the weighted fuzzy clustering to deal with the data more effectively. Then, by analyzing the basic idea and basic properties of the relational analysis model theory, the gray relational model is improved by combining the slope and the Deng model. And the incremental composition and composition of the two sequences are also considered to the gray relational model to obtain the gray relational degree between the various samples. The failure rate is predicted according to the principle of weighting. Finally, the concrete process is expounded by an example, and the validity and superiority of the proposed method are verified.
Kindschuh, Sarah R.; Cain, James W.; Daniel, David; Peyton, Mark A.
2016-01-01
The capacity to describe and quantify predation by large carnivores expanded considerably with the advent of GPS technology. Analyzing clusters of GPS locations formed by carnivores facilitates the detection of predation events by identifying characteristics which distinguish predation sites. We present a performance assessment of GPS cluster analysis as applied to the predation and scavenging of an omnivore, the American black bear (Ursus americanus), on ungulate prey and carrion. Through field investigations of 6854 GPS locations from 24 individual bears, we identified 54 sites where black bears formed a cluster of locations while predating or scavenging elk (Cervus elaphus), mule deer (Odocoileus hemionus), or cattle (Bos spp.). We developed models for three data sets to predict whether a GPS cluster was formed at a carnivory site vs. a non-carnivory site (e.g., bed sites or non-ungulate foraging sites). Two full-season data sets contained GPS locations logged at either 3-h or 30-min intervals from April to November, and a third data set contained 30-min interval data from April through July corresponding to the calving period for elk. Longer fix intervals resulted in the detection of fewer carnivory sites. Clusters were more likely to be carnivory sites if they occurred in open or edge habitats, if they occurred in the early season, if the mean distance between all pairs of GPS locations within the cluster was less, and if the cluster endured for a longer period of time. Clusters were less likely to be carnivory sites if they were initiated in the morning or night compared to the day. The top models for each data set performed well and successfully predicted 71–96% of field-verified carnivory events, 55–75% of non–carnivory events, and 58–76% of clusters overall. Refinement of this method will benefit from further application across species and ecological systems.
Clusters of galaxies as tools in observational cosmology : results from x-ray analysis
International Nuclear Information System (INIS)
Weratschnig, J.M.
2009-01-01
Clusters of galaxies are the largest gravitationally bound structures in the universe. They can be used as ideal tools to study large scale structure formation (e.g. when studying merger clusters) and provide highly interesting environments to analyse several characteristic interaction processes (like ram pressure stripping of galaxies, magnetic fields). In this dissertation thesis, we have studied several clusters of galaxies using X-ray observations. To obtain scientific results, we have applied different data reduction and analysis methods. With a combination of morphological and spectral analysis, the merger cluster Abell 514 was studied in much detail. It has a highly interesting morphology and shows signs for an ongoing merger as well as a shock. using a new method to detect substructure, we have analysed several clusters to determine whether any substructure is present in the X-ray image. This hints towards a real structure in the distribution of the intra-cluster medium (ICM) and is evidence for ongoing mergers. The results from this analysis are extensively used with the cluster of galaxies Abell S1136. Here, we study the ICM distribution and compare its structure with the spatial distribution of star forming galaxies. Cluster magnetic fields are another important topic of my thesis. They can be studied in Radio observations, which can be put into relation with results from X-ray observations. using observational data from several clusters, we could support the theory that cluster magnetic fields are frozen into the ICM. (author)
Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang
2017-01-01
Clustering algorithm as a basis of data analysis is widely used in analysis systems. However, as for the high dimensions of the data, the clustering algorithm may overlook the business relation between these dimensions especially in the medical fields. As a result, usually the clustering result may not meet the business goals of the users. Then, in the clustering process, if it can combine the knowledge of the users, that is, the doctor's knowledge or the analysis intent, the clustering result can be more satisfied. In this paper, we propose an interactive K -means clustering method to improve the user's satisfactions towards the result. The core of this method is to get the user's feedback of the clustering result, to optimize the clustering result. Then, a particle swarm optimization algorithm is used in the method to optimize the parameters, especially the weight settings in the clustering algorithm to make it reflect the user's business preference as possible. After that, based on the parameter optimization and adjustment, the clustering result can be closer to the user's requirement. Finally, we take an example in the breast cancer, to testify our method. The experiments show the better performance of our algorithm.
Cluster Computing For Real Time Seismic Array Analysis.
Martini, M.; Giudicepietro, F.
A seismic array is an instrument composed by a dense distribution of seismic sen- sors that allow to measure the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over the last years arrays have been widely used in different fields of seismological researches. In particular they are applied in the investigation of seismic sources on volcanoes where they can be suc- cessfully used for studying the volcanic microtremor and long period events which are critical for getting information on the volcanic systems evolution. For this reason arrays could be usefully employed for the volcanoes monitoring, however the huge amount of data produced by this type of instruments and the processing techniques which are quite time consuming limited their potentiality for this application. In order to favor a direct application of arrays techniques to continuous volcano monitoring we designed and built a small PC cluster able to near real time computing the kinematics properties of the wavefield (slowness or wavenumber vector) produced by local seis- mic source. The cluster is composed of 8 Intel Pentium-III bi-processors PC working at 550 MHz, and has 4 Gigabytes of RAM memory. It runs under Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source imple- mentation of the Message Passing Interface (MPI). The developed software system includes modules devote to receiving date by internet and graphical applications for the continuous displaying of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999 when two dense seismic arrays have been deployed on the northeast and the southeast flanks of this volcano. A real time continuous acquisition system has been simulated by
Concordance of X-ray cluster data with big bang nucleosynthesis in mixed dark matter models
International Nuclear Information System (INIS)
Strickland, R.W.; Schramm, D.N.
1997-01-01
If the hot, X-ray-emitting gas in rich clusters forms a fair sample of the universe as in cold dark matter (CDM) models and the universe is at the critical density Ω T =1, then the data appear to imply a baryon fraction Ω b,X (Ω b,X ≡Ω b derived from X-ray cluster data), larger than that predicted by big bang nucleosynthesis (BBN). While other systematic effects such as clumping can lower Ω b,X , in this paper we use an elementary analysis to show that a simple admixture of hot dark matter (HDM; low-mass neutrinos) with CDM to yield mixed dark matter shifts Ω b,X down so that significant overlap with Ω b from BBN can occur for H 0 approx-lt 73kms -1 Mpc -1 , even without invoking the possible aforementioned effects. The overlap interval is slightly larger for lower mass neutrinos since fewer of them cluster on the scale of the hot X-ray gas. We illustrate this result quantitatively in terms of a simple isothermal model. More realistic velocity dispersion profiles, with less centrally peaked density profiles, imply that fewer neutrinos are trapped and thus further increase the interval of overlap. copyright 1997 The American Astronomical Society
Modelling of Krn+ Clusters. II. Photoabsorption Spectra of Small Clusters (n=2 - 5)
Czech Academy of Sciences Publication Activity Database
Kalus, R.; Paidarová, Ivana; Hrivňák, D.; Gadea, F. X.
2004-01-01
Roč. 298, 1/3 (2004), s. 155-166 ISSN 0301-0104 R&D Projects: GA ČR GA203/02/1204 Grant - others:Barrande Program(XE) 2003-024-1 Institutional research plan: CEZ:AV0Z4040901 Keywords : krypton * rare gases * cluster ions Subject RIV: CF - Physical ; Theoretical Chemistry Impact factor: 2.316, year: 2004
Choi, Jiwoong; Leblanc, Lawrence; Choi, Sanghun; Haghighi, Babak; Hoffman, Eric; Lin, Ching-Long
2017-11-01
The goal of this study is to assess inter-subject variability in delivery of orally inhaled drug products to small airways in asthmatic lungs. A recent multiscale imaging-based cluster analysis (MICA) of computed tomography (CT) lung images in an asthmatic cohort identified four clusters with statistically distinct structural and functional phenotypes associating with unique clinical biomarkers. Thus, we aimed to address inter-subject variability via inter-cluster variability. We selected a representative subject from each of the 4 asthma clusters as well as 1 male and 1 female healthy controls, and performed computational fluid and particle simulations on CT-based airway models of these subjects. The results from one severe and one non-severe asthmatic cluster subjects characterized by segmental airway constriction had increased particle deposition efficiency, as compared with the other two cluster subjects (one non-severe and one severe asthmatics) without airway constriction. Constriction-induced jets impinging on distal bifurcations led to excessive particle deposition. The results emphasize the impact of airway constriction on regional particle deposition rather than disease severity, demonstrating the potential of using cluster membership to tailor drug delivery. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837. XSEDE.
Cluster model calculations of the solid state materials electron structure
International Nuclear Information System (INIS)
Pelikan, P.; Biskupic, S.; Banacky, P.; Zajac, A.; Svrcek, A.; Noga, J.
1997-01-01
Materials of the general composition ACuO 2 are the parent compounds of so called infinite layer superconductors. In the paper presented the electron structure of the compounds CaCuO 2 , SrCuO2, Ca 0.86 Sr 0.14 CuO2 and Ca 0.26 Sr 0.74 CuO 2 were calculated. The cluster models consisting of 192 atoms were computed using quasi relativistic version of semiempirical INDO method. The obtained results indicate the strong ionicity of Ca/Sr-O bonds and high covalency of Cu-bonds. The width of energy gap at the Fermi level increases as follows: Ca 0.26 Sr 0.74 CuO 2 0.86 Sr 0.14 CuO2 2 . This order correlates with the fact that materials of the composition Ca x Sr 1-x CuO 2 have have the high temperatures of the superconductive transition (up to 110 K). Materials partially substituted by Sr 2+ have also the higher density of states in the close vicinity at the Fermi level that ai the additional condition for the possibility of superconductive transition. It was calculated the strong influence of the vibration motions to the energy gap at the Fermi level. (authors). 1 tabs., 2 figs., 10 refs
Comparative analysis of clustering methods for gene expression time course data
Directory of Open Access Journals (Sweden)
Ivan G. Costa
2004-01-01
Full Text Available This work performs a data driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series. Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
Internal validation of risk models in clustered data: a comparison of bootstrap schemes
Bouwmeester, W.; Moons, K.G.M.; Kappen, T.H.; van Klei, W.A.; Twisk, J.W.R.; Eijkemans, M.J.C.; Vergouwe, Y.
2013-01-01
Internal validity of a risk model can be studied efficiently with bootstrapping to assess possible optimism in model performance. Assumptions of the regular bootstrap are violated when the development data are clustered. We compared alternative resampling schemes in clustered data for the estimation
Field of Study Choice: Using Conjoint Analysis and Clustering
Shtudiner, Ze'ev; Zwilling, Moti; Kantor, Jeffrey
2017-01-01
Purpose: The purpose of this paper is to measure student's preferences regarding various attributes that affect their decision process while choosing a higher education area of study. Design/ Methodology/Approach: The paper exhibits two different models which shed light on the perceived value of each examined area of study: conjoint analysis and…
Analysis of brood sex ratios: implications of offspring clustering
Czech Academy of Sciences Publication Activity Database
Krackow, S.; Tkadlec, Emil
Roc. 50, č. 4 (2001), s. 293-301 ISSN 0340-5443 R&D Projects: GA ČR GA524/01/1316 Institutional research plan: CEZ:AV0Z6093917 Keywords : generalized linear mixed models * random coefficients * multilevel analysis Subject RIV: EG - Zoology Impact factor: 2.353, year: 2001
Testing dark energy and dark matter cosmological models with clusters of galaxies
Energy Technology Data Exchange (ETDEWEB)
Boehringer, Hans [Max-Planck-Institut fuer Extraterrestrische Physik, Garching (Germany)
2008-07-01
Galaxy clusters are, as the largest building blocks of our Universe, ideal probes to study the large-scale structure and to test cosmological models. The principle approach und the status of this research is reviewed. Clusters lend themselves for tests in serveral ways: the cluster mass function, the spatial clustering, the evolution of both functions with reshift, and the internal composition can be used to constrain cosmological parameters. X-ray observations are currently the best means of obtaining the relevant data on the galaxy cluster population. We illustrate in particular all the above mentioned methods with our ROSAT based cluster surveys. The mass calibration of clusters is an important issue, that is currently solved with XMM-Newton and Chandra studies. Based on the current experience we provide an outlook for future research, especially with eROSITA.
The effect of mining data k-means clustering toward students profile model drop out potential
Purba, Windania; Tamba, Saut; Saragih, Jepronel
2018-04-01
The high of student success and the low of student failure can reflect the quality of a college. One of the factors of fail students was drop out. To solve the problem, so mining data with K-means Clustering was applied. K-Means Clustering method would be implemented to clustering the drop out students potentially. Firstly the the result data would be clustering to get the information of all students condition. Based on the model taken was found that students who potentially drop out because of the unexciting students in learning, unsupported parents, diffident students and less of students behavior time. The result of process of K-Means Clustering could known that students who more potentially drop out were in Cluster 1 caused Credit Total System, Quality Total, and the lowest Grade Point Average (GPA) compared between cluster 2 and 3.
A Coupled Hidden Markov Random Field Model for Simultaneous Face Clustering and Tracking in Videos
Wu, Baoyuan
2016-10-25
Face clustering and face tracking are two areas of active research in automatic facial video processing. They, however, have long been studied separately, despite the inherent link between them. In this paper, we propose to perform simultaneous face clustering and face tracking from real world videos. The motivation for the proposed research is that face clustering and face tracking can provide useful information and constraints to each other, thus can bootstrap and improve the performances of each other. To this end, we introduce a Coupled Hidden Markov Random Field (CHMRF) to simultaneously model face clustering, face tracking, and their interactions. We provide an effective algorithm based on constrained clustering and optimal tracking for the joint optimization of cluster labels and face tracking. We demonstrate significant improvements over state-of-the-art results in face clustering and tracking on several videos.
The cosmological analysis of X-ray cluster surveys - I. A new method for interpreting number counts
Clerc, N.; Pierre, M.; Pacaud, F.; Sadibekova, T.
2012-07-01
We present a new method aimed at simplifying the cosmological analysis of X-ray cluster surveys. It is based on purely instrumental observable quantities considered in a two-dimensional X-ray colour-magnitude diagram (hardness ratio versus count rate). The basic principle is that even in rather shallow surveys, substantial information on cluster redshift and temperature is present in the raw X-ray data and can be statistically extracted; in parallel, such diagrams can be readily predicted from an ab initio cosmological modelling. We illustrate the methodology for the case of a 100-deg2XMM survey having a sensitivity of ˜10-14 erg s-1 cm-2 and fit at the same time, the survey selection function, the cluster evolutionary scaling relations and the cosmology; our sole assumption - driven by the limited size of the sample considered in the case study - is that the local cluster scaling relations are known. We devote special attention to the realistic modelling of the count-rate measurement uncertainties and evaluate the potential of the method via a Fisher analysis. In the absence of individual cluster redshifts, the count rate and hardness ratio (CR-HR) method appears to be much more efficient than the traditional approach based on cluster counts (i.e. dn/dz, requiring redshifts). In the case where redshifts are available, our method performs similar to the traditional mass function (dn/dM/dz) for the purely cosmological parameters, but constrains better parameters defining the cluster scaling relations and their evolution. A further practical advantage of the CR-HR method is its simplicity: this fully top-down approach totally bypasses the tedious steps consisting in deriving cluster masses from X-ray temperature measurements.
A weak lensing analysis of the PLCK G100.2-30.4 cluster
Radovich, M.; Formicola, I.; Meneghetti, M.; Bartalucci, I.; Bourdin, H.; Mazzotta, P.; Moscardini, L.; Ettori, S.; Arnaud, M.; Pratt, G. W.; Aghanim, N.; Dahle, H.; Douspis, M.; Pointecouteau, E.; Grado, A.
2015-07-01
We present a mass estimate of the Planck-discovered cluster PLCK G100.2-30.4, derived from a weak lensing analysis of deep Subaru griz images. We perform a careful selection of the background galaxies using the multi-band imaging data, and undertake the weak lensing analysis on the deep (1 h) r -band image. The shape measurement is based on the Kaiser-Squires-Broadhurst algorithm; we adopt the PSFex software to model the point spread function (PSF) across the field and correct for this in the shape measurement. The weak lensing analysis is validated through extensive image simulations. We compare the resulting weak lensing mass profile and total mass estimate to those obtained from our re-analysis of XMM-Newton observations, derived under the hypothesis of hydrostatic equilibrium. The total integrated mass profiles agree remarkably well, within 1σ across their common radial range. A mass M500 ~ 7 × 1014M⊙ is derived for the cluster from our weak lensing analysis. Comparing this value to that obtained from our reanalysis of XMM-Newton data, we obtain a bias factor of (1-b) = 0.8 ± 0.1. This is compatible within 1σ with the value of (1-b) obtained in Planck 2015 from the calibration of the bias factor using newly available weak lensing reconstructed masses. Based on data collected at Subaru Telescope (University of Tokyo).
Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun
2014-04-29
To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.
OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.
Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A
2014-10-16
The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.
Comparison of Outputs for Variable Combinations Used in Cluster Analysis on Polarmetric Imagery
National Research Council Canada - National Science Library
Petre, Melinda
2008-01-01
.... More specifically, two techniques, Cluster Analysis (CA) and Principle Component Analysis (PCA) can be combined to process Stoke s imagery by distinguishing between pixels, and producing groups of pixels with similar characteristics...
Amir Farbin
The ATLAS Analysis Model is a continually developing vision of how to reconcile physics analysis requirements with the ATLAS offline software and computing model constraints. In the past year this vision has influenced the evolution of the ATLAS Event Data Model, the Athena software framework, and physics analysis tools. These developments, along with the October Analysis Model Workshop and the planning for CSC analyses have led to a rapid refinement of the ATLAS Analysis Model in the past few months. This article introduces some of the relevant issues and presents the current vision of the future ATLAS Analysis Model. Event Data Model The ATLAS Event Data Model (EDM) consists of several levels of details, each targeted for a specific set of tasks. For example the Event Summary Data (ESD) stores calorimeter cells and tracking system hits thereby permitting many calibration and alignment tasks, but will be only accessible at particular computing sites with potentially large latency. In contrast, the Analysis...
Xu, Zhiqiang
2017-02-16
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approach