WorldWideScience

Sample records for latent class cluster

  1. Context-sensitive intra-class clustering

    Yu, Yingwei

    2014-02-01

    This paper describes a new semi-supervised learning algorithm for intra-class clustering (ICC). ICC partitions each class into sub-classes in order to minimize overlap across clusters from different classes. This is achieved by allowing the partitioning of a given class to be assisted by data points from other classes in a context-dependent fashion. The result is that overlap across sub-classes (both within and across classes) is greatly reduced. ICC is particularly useful when combined with algorithms that assume that each class has a unimodal Gaussian distribution (e.g., Linear Discriminant Analysis (LDA), quadratic classifiers), an assumption that does not hold in many real-world situations. ICC can help partition non-Gaussian, multimodal distributions to overcome such a problem. In this sense, ICC works as a preprocessor. Experiments with our ICC algorithm on synthetic data sets and real-world data sets indicated that it can significantly improve the performance of LDA and quadratic classifiers. We expect our approach to be applicable to a broader class of pattern recognition problems where class-conditional densities are significantly non-Gaussian or multi-modal. © 2013 Elsevier Ltd. All rights reserved.

  2. Identifying Clusters of Concepts in a Low Cohesive Class for Extract Class Refactoring Using Metrics Supplemented Agglomerative Clustering Technique

    Rao, A Ananda

    2012-01-01

    Object-oriented software with low cohesive classes can increase maintenance cost. Low cohesive classes are likely to be introduced into the software during initial design due to deviation from design principles and during evolution due to software deterioration. A low cohesive class performs operations that should be done by two or more classes. Low cohesive classes need to be identified and refactored using extract class refactoring to improve cohesion. Two aspects are involved: the first is to identify the low cohesive classes, and the second is to identify the clusters of concepts in the low cohesive classes for extract class refactoring. In this paper, we propose a metrics-supplemented agglomerative clustering technique covering both aspects. The proposed metrics are validated using Weyuker's properties. The approach is applied successfully to two examples and a case study.

  3. Class Restricted Clustering and Micro-Perturbation for Data Privacy

    Li, Xiao-Bai; Sarkar, Sumit

    2013-01-01

    The extensive use of information technologies by organizations to collect and share personal data has raised strong privacy concerns. To respond to the public’s demand for data privacy, a class of clustering-based data masking techniques is increasingly being used for privacy-preserving data sharing and analytics. Traditional clustering-based approaches for masking numeric attributes, while addressing re-identification risks, typically do not consider the disclosure risk of categorical confid...

  4. Arabic web pages clustering and annotation using semantic class features

    Hanan M. Alghamdi

    2014-12-01

    Effectively managing the large amount of data on Arabic web pages and enabling the classification of relevant information are important research problems. Studies on sentiment text mining have been very limited in the Arabic language because they need to involve deep semantic processing. Therefore, in this paper, we aim to retrieve machine-understandable data with the help of a Web content mining technique to detect covert knowledge within these data. We propose an approach to achieve clustering with semantic similarities. This approach integrates k-means document clustering with semantic feature extraction and document vectorization to group Arabic web pages according to semantic similarities and then show the semantic annotation. The document vectorization helps to transform text documents into a semantic class probability distribution or semantic class density. To reach semantic similarities, the approach extracts the semantic class features and integrates them into the similarity weighting schema. The quality of the clustering result was evaluated using the purity and the mean intra-cluster distance (MICD) evaluation measures. We evaluated the proposed approach on a set of common Arabic news web pages. We obtained favorable clustering results that are effective in minimizing the MICD, increasing the purity and lowering the runtime.
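The two evaluation measures named in this abstract can be illustrated with a short sketch. It assumes that purity is the fraction of documents falling in the majority true class of their cluster, and that MICD is the average Euclidean distance from each point to its own cluster centroid; the abstract does not give the exact formulas, so both definitions, the function names, and the toy data are assumptions made for illustration only.

```python
from collections import Counter
from math import dist


def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority true class of their cluster."""
    clusters = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        clusters.setdefault(cluster_id, []).append(true_label)
    # For each cluster, count how many members carry its most common true label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(labels_true)


def mean_intra_cluster_distance(points, labels_pred):
    """Average Euclidean distance from each point to its cluster centroid
    (one possible reading of MICD; assumed, not taken from the paper)."""
    groups = {}
    for point, cluster_id in zip(points, labels_pred):
        groups.setdefault(cluster_id, []).append(tuple(point))
    total = 0.0
    for members in groups.values():
        centroid = tuple(sum(coord) / len(members) for coord in zip(*members))
        total += sum(dist(point, centroid) for point in members)
    return total / len(points)
```

Higher purity and lower MICD both indicate tighter, more class-consistent clusters, which matches the direction of improvement reported in the abstract.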

  5. Filling the gap: a new class of old star cluster?

    Forbes, Duncan; Usher, Christopher; Strader, Jay; Romanowsky, Aaron; Brodie, Jean; Arnold, Jacob; Spitler, Lee

    2013-01-01

    It is not understood whether long-lived star clusters possess a continuous range of sizes and masses (and hence densities), or if rather, they should be considered as distinct types with different origins. Utilizing the Hubble Space Telescope (HST) to measure sizes, and long exposures on the Keck 10m telescope to obtain distances, we have discovered the first confirmed star clusters that lie within a previously claimed size-luminosity gap dubbed the 'avoidance zone' by Hwang et al. (2011). The existence of these star clusters extends the range of sizes, masses and densities for star clusters, and argues against current formation models that predict well-defined size-mass relationships (such as stripped nuclei, giant globular clusters or merged star clusters). The red colours of these gap objects suggest that they are not a new class of object but are related to Faint Fuzzies observed in nearby lenticular galaxies. We also report a number of low luminosity UCDs with sizes of up to 50 pc. Future, statistically ...

  6. ADHD latent class clusters: DSM-IV subtypes and comorbidity.

    Elia, Josephine; Arcos-Burgos, Mauricio; Bolton, Kelly L; Ambrosini, Paul J; Berrettini, Wade; Muenke, Maximilian

    2009-12-30

    ADHD (Attention Deficit Hyperactivity Disorder) has a complex, heterogeneous phenotype only partially captured by Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. In this report, latent class analyses (LCA) are used to identify ADHD phenotypes using K-SADS-IVR (Schedule for Affective Disorders & Schizophrenia for School Age Children-IV-Revised) symptoms and symptom severity data from a clinical sample of 500 ADHD subjects, ages 6-18, participating in an ADHD genetic study. Results show that LCA identified six separate ADHD clusters, some corresponding to specific DSM-IV subtypes while others included several subtypes. DSM-IV comorbid anxiety and mood disorders were generally similar across all clusters, and subjects without comorbidity did not aggregate within any one cluster. Age and gender composition also varied. These results support findings from population-based LCA studies. The six clusters provide additional homogeneous groups that can be used to define ADHD phenotypes in genetic association studies. The limited age ranges aggregating in the different clusters may prove to be a particular advantage in genetic studies where candidate gene expression may vary during developmental phases. DSM-IV comorbid mood and anxiety disorders also do not appear to increase cluster heterogeneity; however, longitudinal studies that cover the period of risk are needed to support this finding. PMID:19900717
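As an illustration of how a latent class analysis groups subjects, the following is a minimal sketch of an expectation-maximization fit for a latent class model. It assumes binary (present/absent) symptom indicators and a fixed number of classes; the function `fit_latent_classes`, its parameters, and the toy data are hypothetical and are not the software or procedure used in the study, which also modelled symptom severity.

```python
import random


def fit_latent_classes(rows, n_classes, n_iter=100, init_item_probs=None, seed=0):
    """EM for a latent class model with binary (0/1) indicators.

    Returns (class_priors, item_probs, responsibilities), where
    item_probs[k][j] is the estimated probability that item j is 1 in class k
    and responsibilities[i][k] is the posterior probability of class k for row i.
    """
    n_items = len(rows[0])
    rng = random.Random(seed)
    priors = [1.0 / n_classes] * n_classes
    if init_item_probs is None:
        # Random starting values away from 0 and 1 (an arbitrary choice).
        item_probs = [
            [rng.uniform(0.25, 0.75) for _ in range(n_items)]
            for _ in range(n_classes)
        ]
    else:
        item_probs = [list(probs) for probs in init_item_probs]
    eps = 1e-6  # keeps probabilities strictly inside (0, 1)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior class membership for every row.
        resp = []
        for row in rows:
            joint = []
            for k in range(n_classes):
                likelihood = priors[k]
                for x, p in zip(row, item_probs[k]):
                    likelihood *= p if x else (1.0 - p)
                joint.append(likelihood)
            total = sum(joint)
            resp.append([value / total for value in joint])
        # M-step: re-estimate class sizes and item probabilities.
        for k in range(n_classes):
            weight = sum(r[k] for r in resp)
            priors[k] = weight / len(rows)
            for j in range(n_items):
                weighted_ones = sum(r[k] * row[j] for r, row in zip(resp, rows))
                p = weighted_ones / max(weight, eps)
                item_probs[k][j] = min(max(p, eps), 1.0 - eps)
    return priors, item_probs, resp
```

Each subject is then assigned to the class with the largest posterior probability. In practice the number of classes is chosen by comparing fits with different `n_classes` using an information criterion or a likelihood ratio test, a step this sketch omits.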

  7. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Rita Ismayilova; Emilya Nasirova; Colleen Hanou; Rivard, Robert G.; Bautista, Christian T.

    2014-01-01

    Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model...

  8. Parallel unstructured AMR and gigabit networking for Beowulf-class clusters

    Norton, C. D.; Cwik, T. A.

    2001-01-01

    The impact of gigabit networking with Myrinet 2000 hardware and MPICH-GM software on a 2-way SMP Beowulf-class cluster for parallel unstructured adaptive mesh refinement using the PYRAMID library is described.

  9. Teleportation of an Arbitrary Two-Particle State via a Single Cluster-Class State

    Teleportation of an arbitrary two-qubit state with a single partially entangled state, a four-qubit linear cluster-class state, is studied. The case is more practical than previous ones using maximally entangled states as the quantum channel. In order to realize teleportation, we first construct a cluster-basis of 16 orthonormal cluster states. We show that quantum teleportation can be successfully implemented with a certain probability if the receiver can adopt appropriate unitary transformations after receiving the sender's cluster-basis measurement information. In addition, an important conclusion can be obtained that a four-qubit maximally entangled state (cluster state) can be extracted from a single copy of the cluster-class state with the same probability as the teleportation in principle. (general)

  10. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Landfors Mattias

    2010-10-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  11. ADHD latent class clusters: DSM-IV subtypes and comorbidity

    Elia, Josephine; Arcos-Burgos, Mauricio; Bolton, Kelly L.; Ambrosini, Paul J.; Berrettini, Wade; Muenke, Maximilian

    2009-01-01

    ADHD (Attention Deficit Hyperactivity Disorder) has a complex, heterogeneous phenotype only partially captured by Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. In this report, latent class analyses (LCA) are used to identify ADHD phenotypes using K-SADS-IVR (Schedule for Affective Disorders & Schizophrenia for School Age Children-IV-Revised) symptoms and symptom severity data from a clinical sample of 500 ADHD subjects, ages 6–18, participating in an ADHD genetic st...

  12. Cyclist–motorist crash patterns in Denmark: A latent class clustering approach

    Kaplan, Sigal; Prato, Carlo Giacomo

    2013-01-01

    Objective: The current study aimed at uncovering patterns of cyclist–motorist crashes in Denmark and investigating their prevalence and severity. The importance of implementing clustering techniques for providing a holistic overview of vulnerable road users’ crash patterns derives from the need to prioritize safety issues and to devise efficient preventive measures. Method: The current study focused on cyclist–motorist crashes that occurred in Denmark during the period between 2007 and 2011. To uncover crash patterns, the current analysis applied latent class clustering, an unsupervised probabilistic clustering approach that relies on the statistical concept of likelihood and allows partial overlap across clusters. Results: The analysis yielded 13 distinguishable cyclist–motorist latent classes. Specific crash patterns for urban and rural areas were revealed. Prevalent features that allowed...

  13. Bond percolation on a class of correlated and clustered random graphs

    We introduce a formalism for computing bond percolation properties of a class of correlated and clustered random graphs. This class of graphs is a generalization of the configuration model where nodes of different types are connected via different types of hyperedges, edges that can link more than two nodes. We argue that the multitype approach coupled with the use of clustered hyperedges can reproduce a wide spectrum of complex patterns, and thus enhances our capability to model real complex networks. As an illustration of this claim, we use our formalism to highlight unusual behaviours of the size and composition of the components (small and giant) in a synthetic, albeit realistic, social network. (paper)

  14. Semi-Automatically Inducing Semantic Classes of Clinical Research Eligibility Criteria Using UMLS and Hierarchical Clustering

    Luo, Zhihui; Johnson, Stephen B.; Weng, Chunhua

    2010-01-01

    This paper presents a novel approach to learning semantic classes of clinical research eligibility criteria. It uses the UMLS Semantic Types to represent semantic features and the Hierarchical Clustering method to group similar eligibility criteria. By establishing a gold standard using two independent raters, we evaluated the coverage and accuracy of the induced semantic classes. On 2,718 random eligibility criteria sentences, the inter-rater classification agreement was 85.73%. In a 10-fold...

  15. The X-CLASS - redMaPPer galaxy cluster comparison: I. Identification procedures

    Sadibekova, Tatyana; Clerc, Nicolas; Faccioli, Lorenzo; Gastaud, Rene; Fevre, Jean-Paul Le; Rozo, Eduardo; Rykoff, Eli S

    2014-01-01

    We performed a detailed and, for a large part interactive, analysis of the matching output between the X-CLASS and redMaPPer cluster catalogues. The overlap between the two catalogues has been accurately determined and possible cluster positional errors were manually recovered. The final samples comprise 270 and 355 redMaPPer and X-CLASS clusters respectively. X-ray cluster matching rates were analysed as a function of optical richness. In a second step, the redMaPPer clusters were correlated with the entire X-ray catalogue, containing point and uncharacterised sources (down to a few 10^{-15} erg s^{-1} cm^{-2} in the [0.5-2] keV band). A stacking analysis was performed for the remaining undetected optical clusters. Main results show that neither of the wavebands misses any massive cluster (as coded by X-ray luminosity or optical richness). After correcting for obvious pipeline short-comings (about 10% of the cases both in optical and X-ray), ~50% of the redMaPPer (down to a richness of 20) are found to coinc...

  16. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Rita Ismayilova

    2014-01-01

    Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  17. Exploring the Relationship between Autism Spectrum Disorder and Epilepsy Using Latent Class Cluster Analysis

    Cuccaro, Michael L.; Tuchman, Roberto F.; Hamilton, Kara L.; Wright, Harry H.; Abramson, Ruth K.; Haines, Jonathan L.; Gilbert, John R.; Pericak-Vance, Margaret

    2012-01-01

    Epilepsy co-occurs frequently in autism spectrum disorders (ASD). Understanding this co-occurrence requires a better understanding of the ASD-epilepsy phenotype (or phenotypes). To address this, we conducted latent class cluster analysis (LCCA) on an ASD dataset (N = 577) which included 64 individuals with epilepsy. We identified a 5-cluster…

  18. Semi-Automatically Inducing Semantic Classes of Clinical Research Eligibility Criteria Using UMLS and Hierarchical Clustering.

    Luo, Zhihui; Johnson, Stephen B; Weng, Chunhua

    2010-01-01

    This paper presents a novel approach to learning semantic classes of clinical research eligibility criteria. It uses the UMLS Semantic Types to represent semantic features and the Hierarchical Clustering method to group similar eligibility criteria. By establishing a gold standard using two independent raters, we evaluated the coverage and accuracy of the induced semantic classes. On 2,718 random eligibility criteria sentences, the inter-rater classification agreement was 85.73%. In a 10-fold validation test, the average Precision, Recall and F-score of the classification results of a decision-tree classifier were 87.8%, 88.0%, and 87.7% respectively. Our induced classes aligned well with 16 out of 17 eligibility criteria classes defined by the BRIDGE model. We discuss the potential of this method and our future work. PMID:21347026
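The Precision, Recall and F-score figures reported here follow standard definitions, which the sketch below computes for a single target class. The function name and the toy labels are illustrative assumptions; the abstract does not state how the per-class scores were averaged across the semantic classes or folds, so that step is left out.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Calling the function once per class and averaging the results gives a macro-averaged score, which is one common way such multi-class figures are summarized.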

  19. Partnership effects in general practice: identification of clustering using intra-class correlation coefficients.

    Ashworth, Mark; Armstrong, David

    2003-01-01

    Although most United Kingdom general practitioners (GPs) work together in a shared professional arrangement termed 'partnership', little is known about the nature of such partnerships. We report the results of a survey of 61 general practice partners in 15 group practices and their attitudes to prescribing and managerial issues related to participation in a commissioning group. Intra-class correlation coefficients (ICCs) were used to explore how these individually held attitudes clustered wit...

  20. Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment

    Mirela Praisler

    2014-01-01

    An essential factor influencing the efficiency of the predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We propose a set of indicators inspired by analytical geometry that may be used for an objective quantitative assessment of the data clustering quality, as well as a global clustering quality coefficient (GCQC) that is a measure of the overall predictive power of the PCA models. The use of these indicators for evaluating the efficiency of the PCA class assignment is illustrated by a comparative study performed to identify the preprocessing function that generates the most efficient PCA system screening for amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained based on estimated density distributions and validated by using quadratic discriminant analysis (QDA).

  1. A class of spherical, truncated, anisotropic models for application to globular clusters

    de Vita, Ruggero; Bertin, Giuseppe; Zocchi, Alice

    2016-05-01

    Recently, a class of non-truncated, radially anisotropic models (the so-called f(ν)-models), originally constructed in the context of violent relaxation and modelling of elliptical galaxies, has been found to possess interesting qualities in relation to observed and simulated globular clusters. In view of new applications to globular clusters, we improve this class of models along two directions. To make them more suitable for the description of small stellar systems hosted by galaxies, we introduce a "tidal" truncation by means of a procedure that guarantees full continuity of the distribution function. The new fT(ν)-models are shown to provide a better fit to the observed photometric and spectroscopic profiles for a sample of 13 globular clusters studied earlier by means of non-truncated models; interestingly, the best-fit models also perform better with respect to the radial-orbit instability. Then, we design a flexible but simple two-component family of truncated models to study the separate issues of mass segregation and multiple populations. We do not aim at a fully realistic description of globular clusters to compete with the description currently obtained by means of dedicated simulations. The goal here is to try to identify the simplest models, that is, those with the smallest number of free parameters, but still have the capacity to provide a reasonable description for clusters that are evidently beyond the reach of one-component models. With this tool, we aim at identifying the key factors that characterize mass segregation or the presence of multiple populations. To reduce the relevant parameter space, we formulate a few physical arguments based on recent observations and simulations. A first application to two well-studied globular clusters is briefly described and discussed.

  2. Cluster analysis of structural stage classes to map wildland fuels in a Madrean ecosystem.

    Miller, Jay D; Danzer, Shelley R; Watts, Joseph M; Stone, Sheridan; Yool, Stephen R

    2003-07-01

    Geospatial information technology is changing the nature of fire mapping science and management. Geographic information systems (GIS) and global positioning system technology coupled with remotely sensed data provide powerful tools for mapping, assessing, and understanding the complex spatial phenomena of wildland fuels and fire hazard. The effectiveness of these technologies for fire management still depends on good baseline fuels data since techniques have yet to be developed to directly interrogate understory fuels with remotely sensed data. We couple field data collections with GIS, remote sensing, and hierarchical clustering to characterize and map the variability of wildland fuels within and across vegetation types. One hundred fifty-six fuel plots were sampled in eight vegetation types ranging in elevation from 1150 to 2600 m surrounding a Madrean 'sky island' mountain range in the southwestern US. Fuel plots within individual vegetation types were divided into classes representing various stages of structural development with unique fuel load characteristics using a hierarchical clustering method. Two Landsat satellite images were then classified into vegetation/fuel classes using a hybrid unsupervised/supervised approach. A back-classification accuracy assessment, which uses the same pixels to test as used to train the classifier, produced an overall Kappa of 50% for the vegetation/fuels map. The map with fuel classes within vegetation type collapsed into single classes was verified with an independent dataset, yielding an overall Kappa of 80%. PMID:12837253
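The overall Kappa values quoted in this abstract are agreement statistics corrected for chance. A minimal sketch of Cohen's kappa computed from a confusion matrix follows; the function name and the example matrix are illustrative assumptions, not data from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose mapped (predicted) class is j.
    """
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, so the reported 50% and 80% correspond to kappa values of 0.5 and 0.8.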

  3. NGC 6273: Towards Defining A New Class of Galactic Globular Clusters?

    Johnson, Christian I.; Rich, Robert Michael; Pilachowski, Catherine A.; Caldwell, Nelson; Mateo, Mario L.; Ira Bailey, John; Crane, Jeffrey D.

    2016-01-01

    A growing number of observations have found that several Galactic globular clusters exhibit abundance dispersions beyond the well-known light element (anti-)correlations. These clusters tend to be very massive, have >0.1 dex intrinsic metallicity dispersions, have complex sub-giant branch morphologies, and have correlated [Fe/H] and s-process element enhancements. Interestingly, nearly all of these clusters discovered so far have [Fe/H]~-1.7. In this context, we have examined the chemical composition of 18 red giant branch (RGB) stars in the massive, metal-poor Galactic bulge globular cluster NGC 6273 using high signal-to-noise, high resolution (R~27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph mounted on the Magellan-Clay 6.5m telescope at Las Campanas Observatory. We find that the cluster exhibits a metallicity range from [Fe/H]=-1.80 to -1.30 and is composed of two dominant populations separated in [Fe/H] and [La/Fe] abundance. The increase in [La/Eu] as a function of [La/H] suggests that the increase in [La/Fe] with [Fe/H] is due to almost pure s-process enrichment. The most metal-rich star in our sample is not strongly La-enhanced, but is α-poor and may belong to a third "anomalous" stellar population. The two dominant populations exhibit the same [Na/Fe]-[Al/Fe] correlation found in other "normal" globular clusters. Therefore, NGC 6273 joins ω Centauri, M 22, M 2, and NGC 5286 as a possible new class of Galactic globular clusters.

  4. Incremental multi-class semi-supervised clustering regularized by Kalman filtering.

    Mehrkanoon, Siamak; Agudelo, Oscar Mauricio; Suykens, Johan A K

    2015-11-01

    This paper introduces an on-line semi-supervised learning algorithm formulated as a regularized kernel spectral clustering (KSC) approach. We consider the case where new data arrive sequentially but only a small fraction of it is labeled. The available labeled data act as prototypes and help to improve the performance of the algorithm to estimate the labels of the unlabeled data points. We adopt a recently proposed multi-class semi-supervised KSC based algorithm (MSS-KSC) and make it applicable for on-line data clustering. Given a few user-labeled data points the initial model is learned and then the class membership of the remaining data points in the current and subsequent time instants are estimated and propagated in an on-line fashion. The update of the memberships is carried out mainly using the out-of-sample extension property of the model. Initially the algorithm is tested on computer-generated data sets, then we show that video segmentation can be cast as a semi-supervised learning problem. Furthermore we show how the tracking capabilities of the Kalman filter can be used to provide the labels of objects in motion and thus regularizing the solution obtained by the MSS-KSC algorithm. In the experiments, we demonstrate the performance of the proposed method on synthetic data sets and real-life videos where the clusters evolve in a smooth fashion over time. PMID:26319050

  5. Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering.

    Sebastian Will

    2007-04-01

    The RFAM database defines families of ncRNAs by means of sequence similarities that are sufficient to establish homology. In some cases, such as microRNAs and box H/ACA snoRNAs, functional commonalities define classes of RNAs that are characterized by structural similarities, and typically consist of multiple RNA families. Recent advances in high-throughput transcriptomics and comparative genomics have produced very large sets of putative noncoding RNAs and regulatory RNA signals. For many of them, evidence for stabilizing selection acting on their secondary structures has been derived, and at least approximate models of their structures have been computed. The overwhelming majority of these hypothetical RNAs cannot be assigned to established families or classes. We present here a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The LocARNA (local alignment of RNA) tool implements a novel variant of the Sankoff algorithm that is sufficiently fast to deal with several thousand candidate sequences. The method is also robust against false positive predictions, i.e., a contamination of the input data with unstructured or nonconserved sequences. We have successfully tested the LocARNA-based clustering approach on the sequences of the RFAM-seed alignments. Furthermore, we have applied it to a previously published set of 3,332 predicted structured elements in the Ciona intestinalis genome (Missal K, Rose D, Stadler PF (2005) Noncoding RNAs in Ciona intestinalis. Bioinformatics 21 (Supplement 2): i77-i78). In addition to recovering, e.g., tRNAs as a structure-based class, the method identifies several RNA families, including microRNA and snoRNA candidates, and suggests several novel classes of ncRNAs for which to date no representative has been experimentally characterized.

  6. A conserved cluster of three PRD-class homeobox genes (homeobrain, rx and orthopedia) in the Cnidaria and Protostomia

    Mazza Maureen E

    2010-07-01

    Background: Homeobox genes are a superclass of transcription factors with diverse developmental regulatory functions, which are found in plants, fungi and animals. In animals, several Antennapedia (ANTP)-class homeobox genes reside in extremely ancient gene clusters (for example, the Hox, ParaHox, and NKL clusters) and the evolution of these clusters has been implicated in the morphological diversification of animal bodyplans. By contrast, similarly ancient gene clusters have not been reported among the other classes of homeobox genes (that is, the LIM, POU, PRD and SIX classes). Results: Using a combination of in silico queries and phylogenetic analyses, we found that a cluster of three PRD-class homeobox genes (Homeobrain (hbn), Rax (rx) and Orthopedia (otp)) is present in cnidarians, insects and mollusks (a partial cluster comprising hbn and rx is present in the placozoan Trichoplax adhaerens). We failed to identify this 'HRO' cluster in deuterostomes; in fact, the Homeobrain gene appears to be missing from the chordate genomes we examined, although it is present in hemichordates and echinoderms. To illuminate the ancestral organization and function of this ancient cluster, we mapped the constituent genes against the assembled genome of a model cnidarian, the sea anemone Nematostella vectensis, and characterized their spatiotemporal expression using in situ hybridization. In N. vectensis, these genes reside in a span of 33 kb with the same gene order as previously reported in insects. Comparisons of genomic sequences and expressed sequence tags revealed the presence of alternative transcripts of Nv-otp and two highly unusual protein-coding polymorphisms in the terminal helix of the Nv-rx homeodomain. A population genetic survey revealed the Rx polymorphisms to be widespread in natural populations. During larval development, all three genes are expressed in the ectoderm, in non-overlapping territories along the oral-aboral axis, with distinct

  7. Cluster exponential synchronization of a class of complex networks with hybrid coupling and time-varying delay

    This paper deals with the cluster exponential synchronization of a class of complex networks with hybrid coupling and time-varying delay. By constructing an appropriate Lyapunov-Krasovskii functional and applying the theory of the Kronecker product of matrices and the linear matrix inequality (LMI) technique, several novel sufficient conditions for cluster exponential synchronization are obtained. These conditions use the bounds of both the time delay and its derivative, which makes them less conservative. Finally, numerical simulations are performed to show the effectiveness of the theoretical results.

  8. Impacts of fast food and food retail environment on overweight and obesity in China: a multilevel latent class cluster approach

    Zhang, Xiaoyong; van der Lans, I.A.; Dagevos, H.

    2012-01-01

    Objective To simultaneously identify consumer segments based on individual-level consumption and community-level food retail environment data and to investigate whether the segments are associated with BMI and dietary knowledge in China. Design A multilevel latent class cluster model was applied to

  9. Identifying victims of workplace bullying by integrating traditional estimation approaches into a latent class cluster model.

    Leon-Perez, Jose M; Notelaers, Guy; Arenas, Alicia; Munduate, Lourdes; Medina, Francisco J

    2014-05-01

    Research findings underline the negative effects of exposure to bullying behaviors and document the detrimental health effects of being a victim of workplace bullying. While no one disputes its negative consequences, debate continues about the magnitude of this phenomenon since very different prevalence rates of workplace bullying have been reported. Methodological aspects may explain these findings. Our contribution to this debate integrates behavioral and self-labeling estimation methods of workplace bullying into a measurement model that constitutes a bullying typology. Results in the present sample (n = 1,619) revealed that six different groups can be distinguished according to the nature and intensity of reported bullying behaviors. These clusters portray different paths for the workplace bullying process, where negative work-related and person-degrading behaviors are strongly intertwined. The analysis of the external validity showed that integrating previous estimation methods into a single measurement latent class model provides a reliable estimation method of workplace bullying, which may overcome previous flaws. PMID:24257593

  10. Human HLA class I- and HLA class II-restricted cloned cytotoxic T lymphocytes identify a cluster of epitopes on the measles virus fusion protein.

    van Binnendijk, R S; Versteeg-van Oosten, J P; Poelen, M C; Brugghe, H F; Hoogerhout, P; Osterhaus, A D; Uytdehaag, F G

    1993-01-01

    The transmembrane fusion (F) glycoprotein of measles virus is an important target antigen of human HLA class I- and class II-restricted cytotoxic T lymphocytes (CTL). Genetically engineered F proteins and nested sets of synthetic peptides spanning the F protein were used to determine sequences of F recognized by a number of F-specific CTL clones. Combined N- and C-terminal deletions of the respective peptides revealed that human HLA class I and HLA class II-restricted CTL efficiently recognize nonapeptides or decapeptides representing epitopes of F. Three distinct sequences recognized by three different HLA class II (DQw1, DR2, and DR4/w53)-restricted CTL clones appear to cluster between amino acids 379 and 466 of F, thus defining an important T-cell epitope area of F. Within this same region, a nonamer peptide of F was found to be recognized by an HLA-B27-restricted CTL clone, as expected on the basis of the structural homology between this peptide and other known HLA-B27 binding peptides. PMID:7680390

  11. Merged consensus clustering to assess and improve class discovery with microarray data

    Jarman Andrew P

    2010-12-01

    Full Text Available Abstract. Background: One of the most commonly performed tasks when analysing high throughput gene expression data is to use clustering methods to classify the data into groups. There are a large number of methods available to perform clustering, but it is often unclear which method is best suited to the data and how to quantify the quality of the classifications produced. Results: Here we describe an R package containing methods to analyse the consistency of clustering results from any number of different clustering methods using resampling statistics. These methods allow the identification of the best supported clusters and additionally rank cluster members by their fidelity within the cluster. These metrics allow us to compare the performance of different clustering algorithms under different experimental conditions and to select those that produce the most reliable clustering structures. We show the application of this method to simulated data, canonical gene expression experiments and our own novel analysis of genes involved in the specification of the peripheral nervous system in the fruitfly, Drosophila melanogaster. Conclusions: Our package enables users to apply the merged consensus clustering methodology conveniently within the R programming environment, providing both analysis and graphical display functions for exploring clustering approaches. It extends the basic principle of consensus clustering by allowing the merging of results between different methods to provide an averaged clustering robustness. We show that this extension is useful in correcting for the tendency of clustering algorithms to treat outliers differently within datasets. The R package, clusterCons, is freely available at CRAN and sourceforge under the GNU public licence.
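
    The consensus and merging ideas described in this record can be sketched roughly as follows. This is a minimal illustration in Python, not the clusterCons R API: the function names, the choice of mean pairwise consensus as a robustness score, and the toy label vectors are all assumptions made for the example.

```python
from itertools import combinations

def consensus_matrix(runs, n):
    """Fraction of resampled runs in which items i and j share a cluster.

    Each run is a dict {item_index: cluster_label} that covers only the
    items drawn in that resample. The diagonal is left at 0.0.
    """
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def merge(matrices):
    """Element-wise average of consensus matrices from different methods."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices)
             for j in range(n)] for i in range(n)]

def cluster_robustness(matrix, members):
    """Mean pairwise consensus among the members of one cluster."""
    pairs = list(combinations(members, 2))
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)

# Hypothetical resampled runs from two clustering methods on four items.
runs_a = [{0: 0, 1: 0, 2: 1, 3: 1}, {0: 0, 1: 0, 2: 1}, {1: 0, 2: 1, 3: 1}]
runs_b = [{0: 0, 1: 0, 2: 0, 3: 1}, {0: 1, 1: 1, 2: 0, 3: 0}]
merged = merge([consensus_matrix(runs_a, 4), consensus_matrix(runs_b, 4)])
print(cluster_robustness(merged, [0, 1]), cluster_robustness(merged, [2, 3]))
```

    In this toy case the pair (0, 1) is grouped together by both methods in every run, while the pair (2, 3) is split in one run of the second method, so the merged score for the second cluster is lower.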

  12. The XMM-LSS survey: optical assessment and properties of different X-ray selected cluster classes

    Adami, C; Pierre, M; Sprimont, P G; Libbrecht, C; Pacaud, F; Clerc, N; Sadibekova, T; Surdej, J; Altieri, B; Duc, P A; Galaz, G; Gueguen, A; Guennou, L; Hertling, G; Ilbert, O; LeFèvre, J P; Quintana, H; Valtchanov, I; Willis, J P; Akiyama, M; Aussel, H; Chiappetti, L; Detal, A; Garilli, B; LeBrun, V; LeFèvre, O; Maccagni, D; Melin, J B; Ponman, T J; Ricci, D; Tresse, L

    2010-01-01

    XMM and Chandra opened a new area for the study of clusters of galaxies, not only for cluster physics but also for the detection of faint and distant clusters that were inaccessible with previous missions. This article presents 66 spectroscopically confirmed clusters (0.05... classes, of extended-sources are defined in a two-dimensional X-ray parameter space allowing for various degrees of completeness and contamination. We describe the procedure developed to assess the reality of these cluster candidates using the CFHTLS photometric data and spectroscopic information from our own follow-up campaigns. Most of these objects are low mass clusters, hence constituting a still poorly studied population. In a second step, we quantify correlations between the optical prop...

  13. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering

    Suraj; Purnendu Tiwari; Subhojit Ghosh; Rakesh Kumar Sinha

    2015-01-01

    Transferring the brain-computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to operate asynchronously, without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates the use of evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means c...

  14. Impacts of fast food and food retail environment on overweight and obesity in China: a multilevel latent class cluster approach

    Zhang, Xiaoyong; van der Lans, I.A.; Dagevos, H.

    2012-01-01

    Objective To simultaneously identify consumer segments based on individual-level consumption and community-level food retail environment data and to investigate whether the segments are associated with BMI and dietary knowledge in China. Design A multilevel latent class cluster model was applied to identify consumer segments based not only on their individual preferences for fast food, salty snack foods, and soft drinks and sugared fruit drinks, but also on the food retail environment at the ...

  15. Signaling at the inhibitory natural killer cell immune synapse regulates lipid raft polarization but not class I MHC clustering.

    Fassett, M S; Davis, D M; Valter, M M; Cohen, G B; Strominger, J L

    2001-12-01

    Natural killer (NK) cell cytotoxicity is determined by a balance of positive and negative signals. Negative signals are transmitted by NK inhibitory receptors (killer immunoglobulin-like receptors, KIR) at the site of membrane apposition between an NK cell and a target cell, where inhibitory receptors become clustered with class I MHC ligands in an organized structure known as an inhibitory NK immune synapse. Immune synapse formation in NK cells is poorly understood. Because signaling by NK inhibitory receptors could be involved in this process, the human NK tumor line YTS was transfected with signal-competent and signal-incompetent KIR2DL1. The latter were generated by truncating the KIR2DL1 cytoplasmic tail or by introducing mutations in the immunoreceptor tyrosine-based inhibition motifs. The KIR2DL1 mutants retained their ability to cluster class I MHC ligands on NK cell interaction with appropriate target cells. Therefore, receptor-ligand clustering at the inhibitory NK immune synapse occurs independently of KIR2DL1 signal transduction. However, parallel examination of NK cell membrane lipid rafts revealed that KIR2DL1 signaling is critical for blocking lipid raft polarization and NK cell cytotoxicity. Moreover, raft polarization was inhibited by reagents that disrupt microtubules and actin filaments, whereas synapse formation was not. Thus, NK lipid raft polarization and inhibitory NK immune synapse formation occur by fundamentally distinct mechanisms. PMID:11724921

  16. PARTIAL TRAINING METHOD FOR HEURISTIC ALGORITHM OF POSSIBLE CLUSTERIZATION UNDER UNKNOWN NUMBER OF CLASSES

    D. A. Viattchenin

    2009-01-01

    Full Text Available A method for constructing a subset of labeled objects, which is used in a heuristic algorithm of possible clusterization with partial training, is proposed in the paper. The method is based on data preprocessing by the heuristic algorithm of possible clusterization using a transitive closure of a fuzzy tolerance. Method efficiency is demonstrated by way of an illustrative example.

  17. Manganese-centered tubular boron cluster - MnB16-: A new class of transition-metal molecules

    Jian, Tian; Li, Wan-Lu; Popov, Ivan A.; Lopez, Gary V.; Chen, Xin; Boldyrev, Alexander I.; Li, Jun; Wang, Lai-Sheng

    2016-04-01

    We report the observation of a manganese-centered tubular boron cluster (MnB16-), which is characterized by photoelectron spectroscopy and ab initio calculations. The relatively simple pattern of the photoelectron spectrum indicates the cluster to be highly symmetric. Ab initio calculations show that MnB16- has a Mn-centered tubular structure with C4v symmetry due to first-order Jahn-Teller effect, while neutral MnB16 reduces to C2v symmetry due to second-order Jahn-Teller effect. In MnB16-, two unpaired electrons are observed, one on the Mn 3dz2 orbital and another on the B16 tube, making it an unusual biradical. Strong covalent bonding is found between the Mn 3d orbitals and the B16 tube, which helps to stabilize the tubular structure. The current result suggests that there may exist a whole class of metal-stabilized tubular boron clusters. These metal-doped boron clusters provide a new bonding modality for transition metals, as well as a new avenue to design boron-based nanomaterials.

  18. CLUSTER TAXOMETRY OF ATTENTION DEFICIT/ HYPERACTIVITY DISORDER WITH LATENT CLASS AND CORRESPONDENCE ANALYSIS

    David A. Pineda

    2007-08-01

    Full Text Available Attention deficit/hyperactivity disorder (ADHD) has heterogeneous symptoms with diverse grades of severity. Latent class cluster analysis (LCCA) can be used to classify children, using direct data from any instrument that reports these symptoms, without previous gold standard diagnosis. One ADHD symptoms checklist and one ADHD comorbidities questionnaire were used. LCCAs were developed for each instrument, which were administered to a sample of 540 children and adolescents, aged 4-17 years, from the regular school of Manizales-Colombia. A simple correspondence analysis (SCA) was done to determine the relationships between the groups classified from both LCCAs. Six clusters were obtained from the ADHD checklist and five from the ADHD comorbidities questionnaire. SCA found four independent groups, derived from the concordances between the 11 clusters obtained by the LCCAs from both instruments. These findings suggest that LCCA and SCA can be used as accurate taxometric procedures to classify externalizing psychopathologies.
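
    The classification step that a latent class cluster model performs can be sketched as below. This is a minimal illustration only: it assumes binary (yes/no) checklist items that are independent within each class, and the priors and item probabilities are made-up values, not estimates from the study.

```python
def class_posteriors(responses, priors, item_probs):
    """Posterior P(class | responses) for a latent class model with
    binary items that are independent within each class."""
    joint = []
    for prior, probs in zip(priors, item_probs):
        likelihood = prior
        for answer, p in zip(responses, probs):
            # p is the probability of endorsing the item in this class.
            likelihood *= p if answer == 1 else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]

# Hypothetical two-class model with three yes/no symptom items.
priors = [0.7, 0.3]
item_probs = [
    [0.1, 0.2, 0.1],  # class 0: symptoms rarely endorsed
    [0.8, 0.9, 0.7],  # class 1: symptoms usually endorsed
]
posterior = class_posteriors([1, 1, 0], priors, item_probs)
# Modal assignment: pick the class with the highest posterior.
assigned = max(range(len(posterior)), key=posterior.__getitem__)
print(posterior, assigned)
```

    In a full analysis the priors and item probabilities would themselves be estimated (typically by an EM algorithm) rather than fixed by hand as they are here.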

  19. Clustering

    Jinfei Liu

    2013-04-01

    Full Text Available DBSCAN is a well-known density-based clustering algorithm which offers advantages for finding clusters of arbitrary shapes compared to partitioning and hierarchical clustering methods. However, there are few papers studying the DBSCAN algorithm under the privacy preserving distributed data mining model, in which the data is distributed between two or more parties, and the parties cooperate to obtain the clustering results without revealing the data at the individual parties. In this paper, we address the problem of two-party privacy preserving DBSCAN clustering. We first propose two protocols for privacy preserving DBSCAN clustering over horizontally and vertically partitioned data respectively and then extend them to arbitrarily partitioned data. We also provide performance analysis and privacy proof of our solution.
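
    For orientation, the plain (single-party, non-private) DBSCAN procedure that the paper builds on can be sketched as follows. This is not the privacy preserving protocol from the record; the function names and the toy points are assumptions for illustration.

```python
import math

NOISE = -1

def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

    The two tight groups become clusters 0 and 1, and the isolated point at (50, 50) is labeled as noise.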

  20. CLUSTER TAXOMETRY OF ATTENTION DEFICIT/ HYPERACTIVITY DISORDER WITH LATENT CLASS AND CORRESPONDENCE ANALYSIS

    David A. Pineda; Daniel Camilo Aguirre-Acevedo; Francisco Lopera; Daniel A. Pineda; Mauricio Arcos-Burgos

    2007-01-01

    Attention deficit/hyperactivity disorder (ADHD) has heterogeneous symptoms with diverse grades of severity. Latent class cluster analysis (LCCA) can be used to classify children, using direct data from any instrument that reports these symptoms, without previous gold standard diagnosis. One ADHD symptoms checklist and one ADHD comorbidities questionnaire were used. LCCAs were developed for each instrument, which were administered to a sample of 540 children and adolescents, aged 4-17 years, from...

  1. Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes

    Cannistraci, Carlo

    2010-09-01

    Motivation: Nonlinear small datasets, which are characterized by low numbers of samples and very high numbers of measures, occur frequently in computational biology, and pose problems in their investigation. Unsupervised hybrid-two-phase (H2P) procedures, specifically dimension reduction (DR) coupled with clustering, provide valuable assistance, not only for unsupervised data classification, but also for visualization of the patterns hidden in high-dimensional feature space. Methods: 'Minimum Curvilinearity' (MC) is a principle that, for small datasets, suggests the approximation of curvilinear sample distances in the feature space by pair-wise distances over their minimum spanning tree (MST), and thus avoids the introduction of any tuning parameter. MC is used to design two novel forms of nonlinear machine learning (NML): Minimum Curvilinear embedding (MCE) for DR, and Minimum Curvilinear affinity propagation (MCAP) for clustering. Results: Compared with several other unsupervised and supervised algorithms, MCE and MCAP, whether individually or combined in H2P, overcome the limits of classical approaches. High performance was attained in the visualization and classification of: (i) pain patients (proteomic measurements) in peripheral neuropathy; (ii) human organ tissues (genomic transcription factor measurements) on the basis of their embryological origin. Conclusion: MC provides a valuable framework to estimate nonlinear distances in small datasets. Its extension to large datasets is prefigured for novel NMLs. Classification of neuropathic pain by proteomic profiles offers new insights for future molecular and systems biology characterization of pain. Improvements in tissue embryological classification refine results obtained in an earlier study, and suggest a possible reinterpretation of skin attribution as mesodermal. © The Author(s) 2010. Published by Oxford University Press.
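
    The core distance idea stated in the Methods (pair-wise distances measured along the minimum spanning tree) can be sketched as below. This is a simplified illustration assuming Euclidean distances in the feature space; it covers only the distance step, not the MCE embedding or MCAP clustering, and the function names are hypothetical.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns adjacency lists {node: [(neighbor, edge_length), ...]}.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n    # cheapest known edge linking each node to the tree
    parent = [None] * n
    best[0] = 0.0
    adjacency = {i: [] for i in range(n)}
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] is not None:
            adjacency[u].append((parent[u], best[u]))
            adjacency[parent[u]].append((u, best[u]))
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    return adjacency

def mst_distances(points):
    """Pair-wise distances measured along the unique MST path."""
    adjacency = minimum_spanning_tree(points)
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for source in range(n):
        stack = [(source, None, 0.0)]
        while stack:
            node, prev, so_far = stack.pop()
            dist[source][node] = so_far
            for neighbor, weight in adjacency[node]:
                if neighbor != prev:
                    stack.append((neighbor, node, so_far + weight))
    return dist

# Samples lying along an L-shaped curve in a 2-D feature space.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
d = mst_distances(points)
print(d[0][4], math.dist(points[0], points[4]))
```

    For the two end points of the L-shaped curve, the tree distance follows the curve (four unit steps), whereas the straight-line distance cuts across it.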

  2. A Cluster Randomized-Controlled Trial of a Classroom-Based Drama Workshop Program to Improve Mental Health Outcomes among Immigrant and Refugee Youth in Special Classes

    Rousseau, Cécile; Beauregard, Caroline; Daignault, Katherine; Petrakos, Harriet; Thombs, Brett D.; Steele, Russell; Vasiliadis, Helen-Maria; Hechtman, Lily

    2014-01-01

    Objectives The aim of this cluster randomized trial was to evaluate the effectiveness of a school-based theatre intervention program for immigrant and refugee youth in special classes for improving mental health and academic outcomes. The primary hypothesis was that students in the theatre intervention group would report a greater reduction in impairment from symptoms compared to students in the control and tutoring groups. Methods Special classrooms in five multiethnic high schools were rand...

  3. Differences in the expressed HLA class I alleles effect the differential clustering of HIV type 1-specific T cell responses in infected Chinese and Caucasians

    Yu, XG; Addo, MM; Perkins, BA; Wej, FL; Rathod, A; Geer, SC; Parta, M; Cohen, D; Stone, DR; Russell, CJ; Tanzi, G; Mei, S; Wureel, AG; Frahm, N; Lichterfeld, M; Heath, L; Mullins, JI; Marincola, F; Goulder, PJR; Brander, C; Allen, T; Cao, YZ; Walker, BD; Altfeld, M

    2005-01-01

    China is a region of the world with a rapidly spreading HIV-1 epidemic. Studies providing insights into HIV-1 pathogenesis in infected Chinese are urgently needed to support the design and testing of an effective HIV-1 vaccine for this population. HIV-1-specific T cell responses were characterized in 32 HIV-1-infected individuals of Chinese origin and compared to 34 infected caucasians using 410 overlapping peptides spanning the entire HIV-1 clade B consensus sequence in an IFN-gamma ELISpot assay. All HIV-1 proteins were targeted with similar frequency in both populations and all study subjects recognized at least one overlapping peptide. HIV-1-specific T cell responses clustered in seven different regions of the HIV-1 genome in the Chinese cohort and in nine different regions in the caucasian cohort. The dominant HLA class I alleles expressed in the two populations differed significantly, and differences in epitope clustering pattern were shown to be influenced by differences in class I alleles that restrict immunodominant epitopes. These studies demonstrate that the clustering of HIV-1-specific T cell responses is influenced by the genetic HLA class I background in the study populations. The design and testing of candidate vaccines to fight the rapidly growing HIV-1 epidemic must therefore take the HLA genetics of the population into account as specific regions of the virus can be expected to be differentially targeted in ethnically diverse populations.

  4. OptimClass: Using species-to-cluster fidelity to determine the optimal partition in classification of ecological communities

    Tichý, L.; Chytrý, M.; Hájek, Michal; Talbot, S. S.; Botta-Dukát, Z.

    2010-01-01

    Vol. 21, No. 2 (2010), pp. 287-299. ISSN 1100-9233. R&D Projects: GA ČR GA206/09/0329. Institutional research plan: CEZ:AV0Z60050516. Keywords: cluster analysis; cover transformation; dendrogram. Subject RIV: EF - Botanics. Impact factor: 2.457 (year: 2010)

  5. Clustering and classification

    Arabie, Phipps

    1996-01-01

    At a moderately advanced level, this book seeks to cover the areas of clustering and related methods of data analysis where major advances are being made. Topics include: hierarchical clustering, variable selection and weighting, additive trees and other network models, relevance of neural network models to clustering, the role of computational complexity in cluster analysis, latent class approaches to cluster analysis, theory and method with applications of a hierarchical classes model in psychology and psychopathology, combinatorial data analysis, clusterwise aggregation of relations, review

  6. Cluster automorphism groups of cluster algebras with coefficients

    Chang, Wen; Zhu, Bin

    2015-01-01

    We study the cluster automorphism group of a skew-symmetric cluster algebra with geometric coefficients. For this, we introduce the notion of gluing free cluster algebra, and show that under a weak condition the cluster automorphism group of a gluing free cluster algebra is a subgroup of the cluster automorphism group of its principal part cluster algebra (i.e. the corresponding cluster algebra without coefficients). We show that several classes of cluster algebras with coefficients are gluin...

  7. A cluster randomized-controlled trial of a classroom-based drama workshop program to improve mental health outcomes among immigrant and refugee youth in special classes.

    Cécile Rousseau

    Full Text Available The aim of this cluster randomized trial was to evaluate the effectiveness of a school-based theatre intervention program for immigrant and refugee youth in special classes for improving mental health and academic outcomes. The primary hypothesis was that students in the theatre intervention group would report a greater reduction in impairment from symptoms compared to students in the control and tutoring groups. Special classrooms in five multiethnic high schools were randomly assigned to theater intervention (n = 10), tutoring (n = 10) or control status (n = 9), for a total of 477 participants. Students and teachers were non-blinded to group assignment. The primary outcome was impairment from emotional and behavioural symptoms assessed by the Impact Supplement of the Strengths and Difficulties Questionnaire (SDQ) completed by the adolescents. The secondary outcomes were the SDQ global scores (teacher and youth reports), impairment assessed by teachers and school performance. The effect of the interventions was assessed through linear mixed effect models which incorporate the correlation between students in the same class, due to the nature of the randomization of the interventions by classroom. The theatre intervention was not associated with a greater reduction in self-reported impairment and symptoms in youth placed in special class because of learning, emotional and behavioural difficulties than a tutoring intervention or a non-active control group. The estimates of the different models show a non-significant decrease in both self-reported and impairment scores in the theatre intervention group for the overall group, but the impairment score decreased significantly for first generation adolescents while it increased for second generation adolescents. The difference between the population of immigrant and refugee youth newcomers studied previously and the sample of this trial may explain some of the differences in the observed impact of

  8. Fuzzy Clustering

    Berks, G.; Keyserlingk, Diedrich Graf von; Jantzen, Jan;

    2000-01-01

    and clustering are the basic concerns in medicine. Classification depends on definitions of the classes and their required degree of participant of the elements in the cases' symptoms. In medicine imprecise conditions are the rule and therefore fuzzy methods are much more suitable than crisp ones...

  9. Implementation and experimental analysis of consensus clustering

    Perc, Domen

    2011-01-01

    Consensus clustering is a machine learning technique for class discovery and clustering validation. The method uses various clustering algorithms in conjunction with different resampling techniques for data clustering. It is based on multiple runs of clustering and sampling algorithms. Data gathered in these runs are used for clustering and for visual representation of the clustering. Visual representation helps us to understand clustering results. In this thesis we compare consensus clustering with ...

  10. Cluster categories and cluster-tilted algebras

    Torkildsen, Hermund Andre

    2006-01-01

    We have given an introduction to the theory of cluster categories and cluster-tilted algebras, and this was one of our main objectives in this thesis. We have seen that cluster-tilted algebras are relation-extension algebras, and this gave us a way of constructing the quiver of a cluster-tilted algebra from a tilted algebra. A cluster-tilted algebra of finite representation type is determined by its quiver, and this raised questions about the generality of this result. We defined a new class...

  11. Meaningful Effect Sizes, Intra-Class Correlations, and Proportions of Variance Explained by Covariates for Planning 3 Level Cluster Randomized Experiments in Prevention Science

    Dong, Nianbo; Reinke, Wendy M.; Herman, Keith C.; Bradshaw, Catherine P.; Murray, Desiree W.

    2015-01-01

    Cluster randomized experiments are now widely used to examine intervention effects in prevention science. It is meaningful to use empirical benchmarks for interpreting effect size in prevention science. The effect size (i.e., the standardized mean difference, calculated by the difference of the means between the treatment and control groups,…

  12. Clustering high dimensional data

    Assent, Ira

    2012-01-01

    High-dimensional data, i.e., data described by a large number of attributes, pose specific challenges to clustering. The so-called ‘curse of dimensionality’, coined originally to describe the general increase in complexity of various computational problems as dimensionality increases, is known... for clustering are required. Consequently, recent research has focused on developing techniques and clustering algorithms specifically for high-dimensional data. Still, open research issues remain. Clustering is a data mining task devoted to the automatic grouping of data based on mutual similarity. Each cluster groups objects that are similar to one another, whereas dissimilar objects are assigned to different clusters, possibly separating out noise. In this manner, clusters describe the data structure in an unsupervised manner, i.e., without the need for class labels. A number of clustering paradigms exist...

  13. Crossings in Clustered Level Graphs

    Forster, Michael

    2005-01-01

    Clustered graphs are an enhanced graph model with a recursive clustering of the vertices according to a given nesting relation. This prime technique for expressing coherence of certain parts of the graph is used in many applications, such as biochemical pathways and UML class diagrams. For directed clustered graphs usually level drawings are used, leading to clustered level graphs. In this thesis we analyze the interrelation of clusters and levels and their influence on edge crossings and clu...

  14. Deep observations of the Super-CLASS super-cluster at 325 MHz with the GMRT: the low-frequency source catalogue

    Riseley, C J; Hales, C A; Harrison, I; Birkinshaw, M; Battye, R A; Beswick, R J; Brown, M L; Casey, C M; Chapman, S C; Demetroullas, C; Hung, C -L; Jackson, N J; Muxlow, T; Watson, B

    2016-01-01

    We present the results of 325 MHz GMRT observations of a super-cluster field, known to contain five Abell clusters at redshift $z \sim 0.2$. We achieve a nominal sensitivity of $34\,\mu$Jy beam$^{-1}$ toward the phase centre. We compile a catalogue of 3257 sources with flux densities in the range $183\,\mu\rm{Jy}\,-\,1.5\,\rm{Jy}$ within the entire $\sim 6.5$ square degree field of view. Subsequently, we use available survey data at other frequencies to derive the spectral index distribution for a sub-sample of these sources, recovering two distinct populations: a dominant population which exhibits spectral index trends typical of steep-spectrum synchrotron emission, and a smaller population of sources with typically flat or rising spectra. We identify a number of sources with ultra-steep spectra or rising spectra for further analysis, finding two candidate high-redshift radio galaxies and three gigahertz-peaked-spectrum radio sources. Finally, we derive the Euclidean-normalised differential source counts us...
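
    A two-point spectral index of the kind mentioned in this record can be computed as sketched below. The sign convention (flux density proportional to frequency raised to the power alpha), the second frequency of 1400 MHz, and the flux values are assumptions for illustration; the record does not state which convention or comparison surveys were used.

```python
import math

def spectral_index(flux_1, freq_1, flux_2, freq_2):
    """Two-point spectral index alpha, assuming S is proportional to nu**alpha.

    Fluxes must share one unit and frequencies must share one unit.
    """
    return math.log(flux_1 / flux_2) / math.log(freq_1 / freq_2)

# Hypothetical source: 10 mJy at 325 MHz and 3.55 mJy at 1400 MHz.
alpha = spectral_index(10.0, 325.0, 3.55, 1400.0)
print(round(alpha, 2))
```

    Under this convention a negative alpha of roughly -0.7 is usually read as steep-spectrum synchrotron emission, while values near zero or above correspond to flat or rising spectra.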

  15. Clustering signatures classify directed networks

    Ahnert, S. E.; Fink, T. M. A.

    2008-09-01

    We use a clustering signature, based on a recently introduced generalization of the clustering coefficient to directed networks, to analyze 16 directed real-world networks of five different types: social networks, genetic transcription networks, word adjacency networks, food webs, and electric circuits. We show that these five classes of networks are cleanly separated in the space of clustering signatures due to the statistical properties of their local neighborhoods, demonstrating the usefulness of clustering signatures as a classifier of directed networks.

  16. A vision for growing a world-class power technology cluster in a smart, sustainable British Columbia : full report to the Premier's Technology Council

    This report presents a framework for power technology in British Columbia and the development of new sources of energy while ensuring the sustainable economic growth. It also explores the opportunities present in the power technology sector. A definition of the power technology industry was provided, and market drivers were identified, describing the region's competitive advantage and assets. Five market opportunities were introduced, comprising the report's targeted innovation strategy: remote power solutions; sustainable urban practices; smart transport; smart grid; and large scale clean green power production. An outline of the current energy market in British Columbia was presented with details of research and development in renewable energy sources. Global power demands were also outlined. A regional action plan was presented in order to develop the power technology cluster. Leadership strategies were presented, with economic development goals and working teams geared towards an implementation resource plan. A commercialization strategy was suggested in order to address local demand, commercialization funds, and increasing access and resources. A growth strategy was also presented to assist in the development of access to world markets, create partnerships and assist in branding and collaborations with industry and government. An innovation strategy was outlined, with the aim of developing research initiatives, support centres in key market and technology areas and connecting existing efforts in basic sciences to power technology applications. It was concluded that in order to achieve full implementation of these strategies, a short term task force is necessary to shape overall plans. Additionally, an ongoing vision team, working groups and coordination is necessary to implement overall strategies and subcomponents. Appendices were included with reference to each of the five market opportunities presented in the report. 58 refs

  17. Projection effects in cluster catalogues

    Van Haarlem, M P; White, S D M

    1997-01-01

    We investigate the importance of projection effects in the identification of galaxy clusters in 2D galaxy maps and their effect on the estimation of cluster velocity dispersions. A volume limited galaxy catalogue that was derived from a Standard CDM N-body simulation was used. We select clusters using criteria that match those employed in the construction of real cluster catalogues and find that our mock Abell cluster catalogues are heavily contaminated and incomplete. Over one third (34 per cent) of clusters of richness class R>=1 are misclassifications arising from the projection of one or more clumps onto an intrinsically poor cluster. Conversely, 32 per cent of intrinsically rich clusters are missed from the R>=1 catalogues, mostly because of statistical fluctuations in the background count. Selection by X-ray luminosity rather than optical richness reduces, but does not completely eliminate, these problems. Contamination by unvirialised sub-clumps near a cluster leads to a considerable overestimation of t...

  18. Voltage Graphs and Cluster Consensus with Point Group Symmetries

    Chen, Xudong; Belabbas, Mohamed-Ali; Basar, Tamer

    2016-01-01

    A cluster consensus system is a multi-agent system in which the autonomous agents communicate to form multiple clusters, with each cluster of agents asymptotically converging to the same clustering point. We introduce in this paper a special class of cluster consensus dynamics, termed the $G$-clustering dynamics for $G$ a point group, whereby the autonomous agents can form as many as $|G|$ clusters, and moreover, the associated $|G|$ clustering points exhibit a geometric symmetry induced by t...

  19. Cluster Automorphisms

    Assem, Ibrahim; Schiffler, Ralf; Shramchenko, Vasilisa

    2010-01-01

    In this article, we introduce the notion of cluster automorphism of a given cluster algebra as a $\mathbb{Z}$-automorphism of the cluster algebra that sends a cluster to another and commutes with mutations. We study the group of cluster automorphisms in detail for acyclic cluster algebras and cluster algebras from surfaces, and we compute this group explicitly for the Dynkin types and the Euclidean types.

  20. Tune Your Brown Clustering, Please

    Derczynski, Leon; Chester, Sean; Bøgh, Kenneth Sejdenfaden

    2015-01-01

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyperparameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, chosen number of classes, and quality of the resulting clusters, which has an impact for any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.

  1. Cluster Structure on Generalized Weyl Algebras

    Saleh, Ibrahim

    2011-01-01

    We introduce a class of non-commutative algebras that carry a non-commutative (geometric) cluster structure and are generated by identical copies of generalized Weyl algebras. Equivalent conditions for the finiteness of the set of the cluster variables of these cluster structures are provided. Some combinatorial data, called cluster strands, arising from the cluster structure are used to construct irreducible representations of generalized Weyl algebras.

  2. Clustering of attribute and/or relational data:

    Ferligoj, Anuška; Kronegger, Luka

    2009-01-01

    A large class of clustering problems can be formulated as an optimization problem in which the best clustering is searched for among all feasible clusterings according to a selected criterion function. This clustering approach can be applied to a variety of very interesting clustering problems, as it is possible to adapt it to a concrete clustering problem by an appropriate specification of the criterion function and/or by the definition of the set of feasible clusterings. Both, the blockmod...

  3. Possibilistic clustering for shape recognition

    Keller, James M.; Krishnapuram, Raghu

    1993-01-01

    Clustering methods have been used extensively in computer vision and pattern recognition. Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering in that total commitment of a vector to a given class is not required at each iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from Bezdek's Fuzzy C-Means (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Unfortunately, the memberships resulting from FCM and its derivatives do not correspond to the intuitive concept of degree of belonging, and moreover, the algorithms have considerable trouble in noisy environments. Recently, the clustering problem was cast into the framework of possibility theory. Our approach was radically different from the existing clustering methods in that the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. An appropriate objective function whose minimum will characterize a good possibilistic partition of the data was constructed, and the membership and prototype update equations from necessary conditions for minimization of our criterion function were derived. The ability of this approach to detect linear and quartic curves in the presence of considerable noise is shown.
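
    The contrast between the FCM sum-to-one constraint and possibilistic memberships can be illustrated with a minimal sketch. It assumes the commonly cited membership formulas (FCM: memberships proportional to inverse powered distances; possibilistic: 1 / (1 + (d^2 / eta)^(1/(m-1)))); the scale parameter `eta`, the fuzzifier `m`, and the toy data are illustrative choices, not values from the paper.

```python
import numpy as np

def squared_distances(points, prototypes):
    """Squared Euclidean distances, shape (n_clusters, n_points)."""
    diff = prototypes[:, None, :] - points[None, :, :]
    return np.maximum((diff ** 2).sum(axis=2), 1e-12)  # avoid division by zero

def fcm_memberships(points, prototypes, m=2.0):
    """FCM memberships: for each point they sum to one across clusters."""
    inv = squared_distances(points, prototypes) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def possibilistic_memberships(points, prototypes, eta, m=2.0):
    """Possibilistic memberships: each depends only on the distance to its own prototype."""
    d2 = squared_distances(points, prototypes)
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

prototypes = np.array([[0.0, 0.0], [10.0, 0.0]])
points = np.array([[1.0, 0.0],     # close to the first prototype
                   [5.0, 0.0],     # midway between the prototypes
                   [5.0, 100.0]])  # far-away noise point, also equidistant
eta = np.array([25.0, 25.0])       # hypothetical per-cluster scale

u_fcm = fcm_memberships(points, prototypes)
u_pos = possibilistic_memberships(points, prototypes, eta)
print(u_fcm.round(3))
print(u_pos.round(3))
```

    Under FCM the noise point receives the same memberships (0.5 and 0.5) as the midpoint, whereas its possibilistic memberships are close to zero for both clusters, which matches the abstract's remark about behaviour in noisy environments.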

  4. Weighted Clustering

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina;

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  5. Cluster Headache

    Frederick G Freitag

    1985-01-01

    Learning Objectives: Review the current understanding of the pathophysiology of cluster headache. Be able to recognize the clinical features of cluster headache. Be able to develop a strategy for treatment of cluster headache. Cluster headache is divided into multiple subtypes under the IHC classification criteria. The vast majority of patients present with episodic cluster headache (3.1.1). This will be the focus of the presentation. The syndrome is characterized by repeated at...

  6. Class size versus class composition

    Jones, Sam

    Raising schooling quality in low-income countries is a pressing challenge. Substantial research has considered the impact of cutting class sizes on skills acquisition. Considerably less attention has been given to the extent to which peer effects, which refer to class composition, also may affect...

  7. Yellow supergiants in open clusters

    Superluminous giant stars (SLGs) have been reported in young globular clusters in the Large Magellanic Cloud (LMC). These stars appear to be in the post-asymptotic-giant-branch phase of evolution. This program was an investigation of galactic SLG candidates in open clusters, which are more like the LMC young globular clusters. These were chosen because luminosity, mass, and age determinations can be made for members since cluster distances and interstellar reddenings are known. Color magnitude diagrams were searched for candidates, using the same selection criteria as for SLGs in the LMC. Classification spectra were obtained of 115 program stars from McGraw-Hill Observatory and of 68 stars from Cerro Tololo Inter-American Observatory Chile. These stars were visually classified on the MK system using spectral scans of standard stars taken at the respective observations. Published information was combined with this program's data for 83 stars in 30 clusters. Membership probabilities were assigned to these stars, and the clusters were analyzed according to age. It was seen that the intrinsically brightest supergiants are found in the youngest clusters. With increasing cluster age, the absolute luminosities attained by the supergiants decline. Also, it appears that the evolutionary tracks of luminosity class II stars are more similar to those of class I than of class III.

  8. Word classes

    Rijkhoff, Jan

    2007-01-01

    This article provides an overview of recent literature and research on word classes, focusing in particular on typological approaches to word classification. The cross-linguistic classification of word class systems (or parts-of-speech systems) presented in this article is based on statements found in grammatical descriptions of some 50 languages, which together constitute a representative sample of the world’s languages (Hengeveld et al. 2004: 529). It appears that there are both quantitative and qualitative differences between word class systems of individual languages. Whereas some languages employ a parts-of-speech system that includes the categories Verb, Noun, Adjective and Adverb, other languages may use only a subset of these four lexical categories. Furthermore, quite a few languages have a major word class whose members cannot be classified in terms of the categories Verb – Noun...

  9. Clustering via Kernel Decomposition

    Have, Anna Szynkowiak; Girolami, Mark A.; Larsen, Jan

    2006-01-01

    Methods for spectral clustering have been proposed recently which rely on the eigenvalue decomposition of an affinity matrix. In this work it is proposed that the affinity matrix is created based on the elements of a non-parametric density estimator. This matrix is then decomposed to obtain posterior probabilities of class membership using an appropriate form of nonnegative matrix factorization. The troublesome selection of hyperparameters such as kernel width and number of clusters can be obtained using standard cross-validation methods as is demonstrated on a number of diverse data sets.

  10. Isotopic clusters

    Spectra of isotopically mixed clusters (dimers of SF6) are calculated as well as transition frequencies. The result leads to speculations about the suitability of the laser-cluster fragmentation process for isotope separation. (Auth.)

  11. Cluster headache

    Histamine headache; Headache - histamine; Migrainous neuralgia; Headache - cluster; Horton's headache ... Doctors do not know exactly what causes cluster headaches. They ... (chemical in the body released during an allergic response) or ...

  12. Weighted Clustering

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina; Loker, David

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that...

  13. Meaningful Clusters

    Sanfilippo, Antonio P.; Calapristi, Augustin J.; Crow, Vernon L.; Hetzler, Elizabeth G.; Turner, Alan E.

    2004-05-26

    We present an approach to the disambiguation of cluster labels that capitalizes on the notion of semantic similarity to assign WordNet senses to cluster labels. The approach provides interesting insights on how document clustering can provide the basis for developing a novel approach to word sense disambiguation.

  14. A possibilistic approach to clustering

    Krishnapuram, Raghu; Keller, James M.

    1993-01-01

    Fuzzy clustering has been shown to be advantageous over crisp (or traditional) clustering methods in that total commitment of a vector to a given class is not required at each image pattern recognition iteration. Recently fuzzy clustering methods have shown spectacular ability to detect not only hypervolume clusters, but also clusters which are actually 'thin shells', i.e., curves and surfaces. Most analytic fuzzy clustering approaches are derived from the 'Fuzzy C-Means' (FCM) algorithm. The FCM uses the probabilistic constraint that the memberships of a data point across classes sum to one. This constraint was used to generate the membership update equations for an iterative algorithm. Recently, we cast the clustering problem into the framework of possibility theory using an approach in which the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values may be interpreted as degrees of possibility of the points belonging to the classes. We show the ability of this approach to detect linear and quartic curves in the presence of considerable noise.

  15. Graph partitioning advance clustering technique

    Madhulatha, T Soni

    2012-01-01

    Clustering is a common technique for statistical data analysis. It is the process of grouping data into classes or clusters so that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. Often, distance measures are used. Clustering is an unsupervised learning technique, where interesting patterns and structures can be found directly from very large data sets with little or no background knowledge. This paper also considers the partitioning of m-dimensional lattice graphs using Fiedler's approach, which requires determining the eigenvector belonging to the second smallest eigenvalue of the Laplacian, combined with the K-means partitioning algorithm.
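
    A minimal sketch of Fiedler-style bisection on a small 2-dimensional lattice graph follows. It is not the paper's implementation: the lattice size is an arbitrary choice, and a simple sign threshold on the Fiedler vector stands in for the K-means step mentioned in the abstract.

```python
import numpy as np

def lattice_laplacian(rows, cols):
    """Graph Laplacian L = D - A of a rows x cols grid with 4-neighbour edges."""
    n = rows * cols
    adjacency = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # edge to the right neighbour
                adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
            if r + 1 < rows:                      # edge to the neighbour below
                adjacency[i, i + cols] = adjacency[i + cols, i] = 1.0
    return np.diag(adjacency.sum(axis=1)) - adjacency

def fiedler_bisection(laplacian):
    """Split nodes by the sign of the eigenvector of the second smallest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # ascending eigenvalues
    fiedler_vector = eigenvectors[:, 1]
    return fiedler_vector >= 0.0

rows, cols = 2, 6                                 # hypothetical lattice size
partition = fiedler_bisection(lattice_laplacian(rows, cols))
print(partition.reshape(rows, cols).astype(int))
```

    On this elongated lattice the split falls across the long axis, separating the three left-hand columns from the three right-hand ones. Which side is labelled 1 depends on the arbitrary sign of the eigenvector.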

  16. Cluster Lenses

    Kneib, Jean-Paul; 10.1007/s00159-011-0047-3

    2012-01-01

    Clusters of galaxies are the most recently assembled, massive, bound structures in the Universe. As predicted by General Relativity, given their masses, clusters strongly deform space-time in their vicinity. Clusters act as some of the most powerful gravitational lenses in the Universe. Light rays traversing through clusters from distant sources are hence deflected, and the resulting images of these distant objects therefore appear distorted and magnified. Lensing by clusters occurs in two regimes, each with unique observational signatures. The strong lensing regime is characterized by effects readily seen by eye, namely, the production of giant arcs, multiple-images, and arclets. The weak lensing regime is characterized by small deformations in the shapes of background galaxies only detectable statistically. Cluster lenses have been exploited successfully to address several important current questions in cosmology: (i) the study of the lens(es) - understanding cluster mass distributions and issues pertaining...

  17. Evaluating Mixture Modeling for Clustering: Recommendations and Cautions

    Steinley, Douglas; Brusco, Michael J.

    2011-01-01

    This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…

  18. Data Clustering

    Wagstaff, Kiri L.

    2012-03-01

    On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained

  19. Cluster Chemistry

    2011-01-01

    Consisting of eight scientists from the State Key Laboratory of Physical Chemistry of Solid Surfaces and Xiamen University, this creative research group is devoted to research on cluster chemistry and the creation of nanomaterials. After three years of hard work, the group has made encouraging progress in the synthesis of clusters with special structures, including novel fullerenes, fullerene-like metal cluster compounds, and other related nanomaterials, and in the study of their properties.

  20. Clustering by Pattern Similarity

    Hai-xun Wang; Jian Pei

    2008-01-01

    The task of clustering is to identify classes of similar objects among a set of objects. The definition of similarity varies from one clustering model to another. However, in most of these models the concept of similarity is often based on such metrics as Manhattan distance, Euclidean distance or other Lp distances. In other words, similar objects must have close values in at least a set of dimensions. In this paper, we explore a more general type of similarity. Under the pCluster model we proposed, two objects are similar if they exhibit a coherent pattern on a subset of dimensions. The new similarity concept models a wide range of applications. For instance, in DNA microarray analysis, the expression levels of two genes may rise and fall synchronously in response to a set of environmental stimuli. Although the magnitude of their expression levels may not be close, the patterns they exhibit can be very much alike. Discovery of such clusters of genes is essential in revealing significant connections in gene regulatory networks. E-commerce applications, such as collaborative filtering, can also benefit from the new model, because it is able to capture not only the closeness of values of certain leading indicators but also the closeness of (purchasing, browsing, etc.) patterns exhibited by the customers. In addition to the novel similarity model, this paper also introduces an effective and efficient algorithm to detect such clusters, and we perform tests on several real and synthetic data sets to show its performance.
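
    The idea of pattern-based similarity can be sketched with a small check. The abstract does not give the formal definition, so the code assumes the pScore commonly attributed to the pCluster model: for objects x, y and dimensions a, b, pScore = |(x_a - x_b) - (y_a - y_b)|, and a submatrix is accepted when every such score is at most a threshold delta. The function names and the toy data are hypothetical.

```python
from itertools import combinations

def max_pscore(matrix):
    """Largest pScore over all pairs of rows and pairs of columns."""
    worst = 0.0
    for x, y in combinations(matrix, 2):
        for a, b in combinations(range(len(x)), 2):
            score = abs((x[a] - x[b]) - (y[a] - y[b]))
            worst = max(worst, score)
    return worst

def is_pcluster(matrix, delta):
    """True when every 2 x 2 submatrix has pScore <= delta (assumed definition)."""
    return max_pscore(matrix) <= delta

gene_x = [1.0, 5.0, 3.0]
gene_y = [11.0, 15.0, 13.0]   # same pattern as gene_x, shifted by 10
gene_z = [1.0, 9.0, 2.0]      # close in value to gene_x, different pattern

shifted_score = max_pscore([gene_x, gene_y])
different_score = max_pscore([gene_x, gene_z])
print(shifted_score, different_score)
```

    The shifted pair is far apart in Euclidean distance yet perfectly coherent, while the third profile is close in value to the first but follows a different pattern.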

  1. COMPARISON OF PURITY AND ENTROPY OF K-MEANS CLUSTERING AND FUZZY C MEANS CLUSTERING

    Satya Chaitanya Sripada

    2011-06-01

    Clustering is one of the main areas in the data mining literature. There are various algorithms for clustering. Their performance is evaluated with validation measures. External validation measures quantify the extent to which cluster labels agree with externally given class labels. The aim of this paper is to compare K-means and Fuzzy C-means clustering using purity and entropy. The data used for evaluating the external measures are medical data.
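
    A minimal sketch of the two external measures, assuming their widely used definitions (purity: the fraction of items that belong to the majority class of their cluster; entropy: the size-weighted average of the class entropy within each cluster, in bits). The paper's exact formulation may differ, and the toy labels are made up for illustration.

```python
from collections import Counter, defaultdict
from math import log2

def group_by_cluster(cluster_ids, class_labels):
    """Map each cluster id to a Counter of the class labels it contains."""
    groups = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        groups[cluster][label] += 1
    return groups

def purity(cluster_ids, class_labels):
    """Higher is better; 1.0 means every cluster holds a single class."""
    groups = group_by_cluster(cluster_ids, class_labels)
    return sum(max(counts.values()) for counts in groups.values()) / len(cluster_ids)

def clustering_entropy(cluster_ids, class_labels):
    """Lower is better; 0.0 means every cluster holds a single class."""
    groups = group_by_cluster(cluster_ids, class_labels)
    total = len(cluster_ids)
    result = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        within = -sum((k / size) * log2(k / size) for k in counts.values())
        result += (size / total) * within
    return result

clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]
purity_value = purity(clusters, classes)
entropy_value = clustering_entropy(clusters, classes)
print(purity_value, entropy_value)
```

    Fuzzy C-means produces graded memberships, so a comparison of this kind would first need a hard assignment, for example by taking each item's highest membership. That step is an assumption here, not something the abstract states.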

  2. Cancer Clusters

    ... of cancer. Cancer clusters can help scientists identify cancer-causing substances in the environment. For example, in the early 1970s, a cluster ... the area and time period over which the cancers were diagnosed. They also ask about specific environmental hazards or concerns in the affected area. If ...

  3. Class distinction

    White, M. Catherine

    Typical 101 courses discourage many students from pursuing higher level science and math courses. Introductory classes in science and math serve largely as a filter, screening out all but the most promising students, and leaving the majority of college graduates—including most prospective teachers—with little understanding of how science works, according to a study conducted for the National Science Foundation. Because few teachers, particularly at the elementary level, experience any collegiate science teaching that stresses skills of inquiry and investigation, they simply never learn to use those methods in their teaching, the report states.

  4. Clustering processes

    Ryabko, Daniil

    2010-01-01

    The problem of clustering is considered, for the case when each data point is a sample generated by a stationary ergodic process. We propose a very natural asymptotic notion of consistency, and show that simple consistent algorithms exist, under most general non-parametric assumptions. The notion of consistency is as follows: two samples should be put into the same cluster if and only if they were generated by the same distribution. With this notion of consistency, clustering generalizes such classical statistical problems as homogeneity testing and process classification. We show that, for the case of a known number of clusters, consistency can be achieved under the only assumption that the joint distribution of the data is stationary ergodic (no parametric or Markovian assumptions, no assumptions of independence, neither between nor within the samples). If the number of clusters is unknown, consistency can be achieved under appropriate assumptions on the mixing rates of the processes. (again, no parametric ...

  5. Attracting higher income class to public transport in socially clustered cities. The case of Caracas. / Atracción del mayor nivel de ingresos al transporte público en las ciudades socialmente segregadas. El caso de Caracas.

    Flórez, Josefina

    2000-03-01

    In Caracas, as in most socially clustered cities, modal split is highly related to income. The high-income population is mostly car-dependent, while lower income people are captive to public transport. This typical situation is explained by world-wide social values and fashion but also by the fact that new, segregated residential areas for the upper social levels have been located in areas poorly served by public transport, creating a dependency on the private car. It is not surprising that, during the 1970s, a high proportion of Caracas's middle and high-income citizens were systematically using their cars even in areas where there was a good supply of public transport. What is more unusual is to realise that, since 1983 when the metro system was inaugurated, there is a new pattern of travel behaviour. The metro has mainly attracted high-income people. Besides the few of them who have transferred from surface to underground public transport, many of the wealthier patrons seem to be regular car users that presently take the metro when it provides a good alternative. Currently, the transit system in Caracas is comprised of four main modes: the metro (since 1983); the "por puesto", which are minibus vehicles of 18 to 32 seats; the jeeps, which are dual traction vehicles of up to 12 seats (most of them serving hilly areas, basically slums); and the bus system, consisting of metro-bus and private operators. CA Metro operates the metro and, since 1987, metro-bus lines, which are bus feeder services to its heavy rail metro operation that extend the coverage area of the system into the less central zones of the city. While the metro and metro-bus offer transit services to middle and high income users, the mini-buses and jeeps provide flexible transit service to low income groups. The metro and metro-bus services are more reliable and offer higher quality than mini-buses and jeeps. This higher quality service is one of the main attributes attracting the wealthier

  6. Clustering analysis

    Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to identify similar entities from the characteristics they possess. Several algorithms can be used for this analysis. This paper discusses similarity measures and hierarchical clustering, including the single linkage, complete linkage, and average linkage methods. A non-hierarchical clustering method, popularly known as the K-means method, is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
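
    The three linkage criteria named above differ only in how they summarise the pairwise distances between two groups. A minimal sketch, assuming Euclidean distance as the dissimilarity measure and using made-up points:

```python
import numpy as np

def pairwise_distances(group_a, group_b):
    """Euclidean distances between every point of group_a and every point of group_b."""
    diff = group_a[:, None, :] - group_b[None, :, :]
    return np.linalg.norm(diff, axis=2)

def single_linkage(group_a, group_b):
    """Distance between the two closest members."""
    return pairwise_distances(group_a, group_b).min()

def complete_linkage(group_a, group_b):
    """Distance between the two farthest members."""
    return pairwise_distances(group_a, group_b).max()

def average_linkage(group_a, group_b):
    """Mean of all between-group distances."""
    return pairwise_distances(group_a, group_b).mean()

group_a = np.array([[0.0, 0.0], [1.0, 0.0]])
group_b = np.array([[3.0, 0.0], [5.0, 0.0]])

d_single = single_linkage(group_a, group_b)
d_complete = complete_linkage(group_a, group_b)
d_average = average_linkage(group_a, group_b)
print(d_single, d_complete, d_average)
```

    In agglomerative clustering, the pair of groups with the smallest linkage distance is merged at each step, so the choice of criterion changes which clusters form.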

  7. Cluster analysis

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  8. Cluster editing

    Böcker, S.; Baumbach, Jan

    2013-01-01

    The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side. The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications of these algorithms for biological problems. © 2013 Springer-Verlag.

  9. Spitzer Clusters

    Krick, Jessica

    This proposal is a specific response to the strategic goal of NASA's research program to "discover how the universe works and explore how the universe evolved into its present form." Towards this goal, we propose to mine the Spitzer archive for all observations of galaxy groups and clusters for the purpose of studying galaxy evolution in clusters, contamination rates for Sunyaev-Zeldovich cluster surveys, and to provide a database of Spitzer observed clusters to the broader community. Funding from this proposal will go towards two years of support for a Postdoc to do this work. After searching the Spitzer Heritage Archive, we have found 194 unique galaxy groups and clusters that have data from both the Infrared Array Camera (IRAC; Fazio et al. 2004) at 3.6 - 8 microns and the Multiband Imaging Photometer for Spitzer (MIPS; Rieke et al. 2004) at 24 microns. This large sample will add value beyond the individual datasets because it will be a larger sample of IR clusters than ever before and will have sufficient diversity in mass, redshift, and dynamical state to allow us to differentiate amongst the effects of these cluster properties. An infrared sample is important because it is unaffected by dust extinction while at the same time is an excellent measure of both stellar mass (IRAC wavelengths) and star formation rate (MIPS wavelengths). Additionally, IRAC can be used to differentiate star forming galaxies (SFG) from active galactic nuclei (AGN), due to their different spectral shapes in this wavelength regime. Specifically, we intend to identify SFG and AGN in galaxy groups and clusters. Groups and clusters differ from the field because the galaxy densities are higher, there is a large potential well due mainly to the mass of the dark matter, and there is hot X-ray gas (the intracluster medium; ICM). We will examine the impact of these differences in environment on galaxy formation by comparing cluster properties of AGN and SFG to those in the field. Also, we will

  10. Cluster Bulleticity

    Massey, Richard; Kitching, Thomas D.; Nagai, Daisuke

    2010-01-01

    The unique properties of dark matter are revealed during collisions between clusters of galaxies, like the bullet cluster (1E 0657-56) and baby bullet (MACSJ0025-12). These systems provide evidence for an additional, invisible mass in the separation between the distribution of their total mass, measured via gravitational lensing, and their ordinary 'baryonic' matter, measured via its X-ray emission. Unfortunately, the information available from these systems is limited by th...

  11. Cluster Bulleticity

    Massey, R; Kitching, T.; Nagai, D.

    2010-01-01

    The unique properties of dark matter are revealed during collisions between clusters of galaxies, such as the bullet cluster (1E 0657−56) and baby bullet (MACS J0025−12). These systems provide evidence for an additional, invisible mass in the separation between the distributions of their total mass, measured via gravitational lensing, and their ordinary ‘baryonic’ matter, measured via its X-ray emission. Unfortunately, the information available from these systems is limited by their rarity. C...

  12. Cluster generator

    Donchev, Todor I.; Petrov, Ivan G.

    2011-05-31

    Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow cathode includes one or more walls. The one or more walls define a sputtering chamber within the hollow cathode and include a material to be sputtered. A hollow anode is positioned at an end of the sputtering chamber, and atom clusters are formed when a gas discharge is generated between the hollow anode and the hollow cathode.

  13. Euclidean Distances, soft and spectral Clustering on Weighted Graphs

    Bavaud, François

    2010-01-01

    We define a class of Euclidean distances on weighted graphs, enabling thermodynamic soft graph clustering. The class can be constructed from the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.

  14. Poincare Invariance, Cluster Properties, and Particle Production

    Polyzou, W. N.

    2002-01-01

    A method is presented for constructing a class of Poincare invariant quantum mechanical models of systems of a finite number of degrees of freedom that satisfy cluster separability, the spectral condition, but do not conserve particle number. The class of models includes the relativistic Lee model and relativistic isobar models.

  15. M$_1$ - M* correlation in galaxy clusters

    Trevese, D; Appodia, B

    1994-01-01

    Photographic F band photometry of a sample of 36 Abell clusters has been used to study the relation between the magnitude M_1 of the brightest cluster member and the Schechter function parameter M^*. Clusters appear segregated in the M_1-M^* plane according to their Rood & Sastry class. We prove on a statistical basis that on average, going from early to late RS classes, M_1 becomes brighter while M^* becomes fainter. The result agrees with the predictions of galactic cannibalism models, never confirmed by previous analyses.

  16. Denominators of cluster variables

    Buan, Aslak Bakke; Marsh, Robert J.; Reiten, Idun

    2007-01-01

    Associated to any acyclic cluster algebra is a corresponding triangulated category known as the cluster category. It is known that there is a one-to-one correspondence between cluster variables in the cluster algebra and exceptional indecomposable objects in the cluster category inducing a correspondence between clusters and cluster-tilting objects. Fix a cluster-tilting object T and a corresponding initial cluster. By the Laurent phenomenon, every cluster variable can be written as a Laurent...

  17. Validity Index and number of clusters

    Mohamed Fadhel Saad

    2012-01-01

    Clustering (or cluster analysis) has been used widely in pattern recognition, image processing, and data analysis. It aims to organize a collection of data items into c clusters, such that items within a cluster are more similar to each other than they are to items in the other clusters. The number of clusters c is the most important parameter, in the sense that the remaining parameters have less influence on the resulting partition. Several methods, called validity indexes, have been devised to determine the best number of classes. This paper presents a new validity index for fuzzy clustering called the Modified Partition Coefficient And Exponential Separation (MPCAES) index. The efficiency of the proposed MPCAES index is compared with several popular validity indexes. More information about these indexes is acquired in a series of numerical comparisons and also on the real Iris data set.
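
    As a rough illustration of what a validity index does, the sketch below computes the classical partition coefficient for a fuzzy membership matrix and picks the number of clusters that maximises it. This is not the MPCAES index proposed in the record above, whose formula the abstract does not give; the membership matrices and the function names are hypothetical and chosen only for the example.

```python
def partition_coefficient(u):
    """Classical partition coefficient of a fuzzy membership matrix.

    u[i][k] is the membership of item i in cluster k; each row sums to 1.
    The value lies in [1/c, 1]; 1 corresponds to a crisp partition.
    """
    n = len(u)
    return sum(m * m for row in u for m in row) / n


def best_number_of_clusters(memberships_by_c):
    """Return the c whose membership matrix has the largest index value."""
    return max(memberships_by_c,
               key=lambda c: partition_coefficient(memberships_by_c[c]))


# Hypothetical memberships of the same 4 items for c = 2 and c = 3.
candidates = {
    2: [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]],
    3: [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.2, 0.3, 0.5], [0.3, 0.3, 0.4]],
}
chosen_c = best_number_of_clusters(candidates)
```

    In practice the membership matrices would come from running a fuzzy clustering algorithm once per candidate c; here they are written by hand to keep the example self-contained.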

  18. Cluster-lensing: A Python Package for Galaxy Clusters & Miscentering

    Ford, Jes

    2016-01-01

    We describe a new open source package for calculating properties of galaxy clusters, including NFW halo profiles with and without the effects of cluster miscentering. This pure-Python package, cluster-lensing, provides well-documented and easy-to-use classes and functions for calculating cluster scaling relations, including mass-richness and mass-concentration relations from the literature, as well as the surface mass density $\Sigma(R)$ and differential surface mass density $\Delta\Sigma(R)$ profiles, probed by weak lensing magnification and shear. Galaxy cluster miscentering is especially a concern for stacked weak lensing shear studies of galaxy clusters, where offsets between the assumed and the true underlying matter distribution can lead to a significant bias in the mass estimates if not accounted for. This software has been developed and released in a public GitHub repository, and is licensed under the permissive MIT license. The cluster-lensing package is archived on Zenodo (Ford 2016). Full documenta...

  19. High Dimensional Data Clustering Using Fast Cluster Based Feature Selection

    Karthikeyan.P

    2014-03-01

    Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view. While the efficiency concerns the time required to find a subset of features, the effectiveness is related to the quality of the subset of features. Based on these criteria, a fast clustering-based feature selection algorithm (FAST) is proposed and experimentally evaluated in this paper. The FAST algorithm works in two steps. In the first step, features are divided into clusters by using graph-theoretic clustering methods. In the second step, the most representative feature that is strongly related to target classes is selected from each cluster to form a subset of features. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient minimum-spanning tree (MST) clustering method using Kruskal's algorithm. The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical study.
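
    The two steps described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes absolute Pearson correlation as the relatedness measure and a fixed cut-off for pruning MST edges, neither of which is stated in the abstract. The function names, the `cut` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def kruskal_mst(n, edges):
    """Kruskal's algorithm: MST edges of an n-node graph; edges are (w, i, j)."""
    parent = list(range(n))
    mst = []
    for w, i, j in sorted(edges):
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))
    return mst


def select_features(features, target, cut=0.5):
    """Step 1: cluster features via a pruned MST. Step 2: pick one per cluster."""
    names = list(features)
    n = len(names)
    # Edge weight is a dissimilarity: 0 for perfectly correlated features.
    edges = [(1.0 - abs_corr(features[names[i]], features[names[j]]), i, j)
             for i, j in combinations(range(n), 2)]
    # Keep only short MST edges (assumed rule); the forest left defines clusters.
    parent = list(range(n))
    for w, i, j in kruskal_mst(n, edges):
        if w <= cut:
            parent[find(parent, i)] = find(parent, j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(parent, i), []).append(names[i])
    # Representative of a cluster: the feature most correlated with the target.
    return sorted(max(group, key=lambda f: abs_corr(features[f], target))
                  for group in clusters.values())


# Toy data: x1/x2 are near-duplicates, and so are z1/z2.
features = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 13],
    "z1": [1, 0, 1, 0, 1, 0],
    "z2": [2, 0, 2, 0, 2, 1],
}
target = [4, 2, 6, 4, 8, 6]
selected = select_features(features, target)
```

    With this toy data the pruned MST separates the x features from the z features, so one representative is kept from each group.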

  20. EM Clustering Analysis of Diabetes Patients Basic Diagnosis Index

    Wu, Cai; Steinbauer, Jeffrey R.; Kuo, Grace M

    2005-01-01

    Cluster analysis can group similar instances into the same group. Partitioning clustering assigns classes to samples without knowing the classes in advance. The most common algorithms are K-means and Expectation Maximization (EM). The EM clustering algorithm can find the number of distributions generating the data and build “mixture models”. It identifies groups that are either overlapping or of varying sizes and shapes. In this project, by using EM in Machine Learning Algorithm in JAVA (WEKA) syste...

  1. Cluster Bulleticity

    Massey, Richard; Nagai, Daisuke

    2010-01-01

    The unique properties of dark matter are revealed during collisions between clusters of galaxies, like the bullet cluster (1E 0657-56) and baby bullet (MACSJ0025-12). These systems provide evidence for an additional, invisible mass in the separation between the distribution of their total mass, measured via gravitational lensing, and their ordinary 'baryonic' matter, measured via its X-ray emission. Unfortunately, the information available from these systems is limited by their rarity. Constraints on the properties of dark matter, such as its interaction cross-section, are therefore restricted by uncertainties in the individual systems' impact velocity, impact parameter and orientation with respect to the line of sight. Here we develop a complementary, statistical measurement in which every piece of substructure falling into every massive cluster is treated as a bullet. We define 'bulleticity' as the mean separation between dark matter and ordinary matter, and we measure a positive signal in hydrodynamical si...

  2. Evolution of major histocompatibility complex class I and class II genes in the brown bear

    Kuduk Katarzyna

    2012-10-01

    Background: Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. Results: We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Conclusions: Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South–north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia.

  3. Document Clustering based on Topic Maps

    Rafi, Muhammad; Farooq, Amir; 10.5120/1640-2204

    2011-01-01

    The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents like the World Wide Web (WWW). The next challenge lies in performing clustering based on the semantic contents of the document. The problem of document clustering has two main components: (1) to represent the document in a form that inherently captures the semantics of the text, which may also help to reduce the dimensionality of the document, and (2) to define a similarity measure based on the semantic representation such that it assigns higher numerical values to document pairs which have a stronger semantic relationship. The feature space of the documents can be very challenging for document clustering. A document may contain multiple topics, and it may contain a large set of class-independent general words and a handful of class-specific core words. With these features in mind, traditional agglomerative clustering algori...

  4. Quotients of cluster categories

    Jorgensen, Peter

    2007-01-01

    Higher cluster categories were recently introduced as a generalization of cluster categories. This paper shows that in Dynkin types A and D, half of all higher cluster categories are actually just quotients of cluster categories. The other half can be obtained as quotients of 2-cluster categories, the "lowest" type of higher cluster categories. Hence, in Dynkin types A and D, all higher cluster phenomena are implicit in cluster categories and 2-cluster categories. In contrast, the same is not...

  5. Regional Innovation Clusters

    Small Business Administration — The Regional Innovation Clusters serve a diverse group of sectors and geographies. Three of the initial pilot clusters, termed Advanced Defense Technology clusters,...

  6. Multiway Spectral Clustering: A Margin-Based Perspective

    Zhang, Zhihua; Jordan, Michael I.

    2008-01-01

    Spectral clustering is a broad class of clustering procedures in which an intractable combinatorial optimization formulation of clustering is "relaxed" into a tractable eigenvector problem, and in which the relaxed solution is subsequently "rounded" into an approximate discrete solution to the original problem. In this paper we present a novel margin-based perspective on multiway spectral clustering. We show that the margin-based perspective illuminates both the relaxation and rounding aspect...

  7. Class Vectors: Embedding representation of Document Classes

    Sachan, Devendra Singh; Kumar, Shailesh

    2015-01-01

    Distributed representations of words and paragraphs as semantic embeddings in high dimensional data are used across a number of Natural Language Understanding tasks such as retrieval, translation, and classification. In this work, we propose "Class Vectors" - a framework for learning a vector per class in the same embedding space as the word and paragraph embeddings. Similarities between these class vectors and word vectors are used as features to classify a document to a class. In experiment o...

  8. Teachers in Class

    Van Galen, Jane

    2008-01-01

    In this article, I argue for a closer read of the daily "class work" of teachers, as posited by Reay, 1998. In developing exploratory class portraits of four teachers who occupy distinctive social positions (two from working-class homes now teaching upper-middle-class children and two from upper-middle-class homes now teaching poor children), I…

  9. Applying Machine Learning to Star Cluster Classification

    Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar

    2016-01-01

    Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time consuming; it thus creates a bottleneck and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) by applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data is HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate the performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.

  10. Clustering experiments

    Wang, Zhengwei; Tan, Ken; Di, Zengru; Roehner, Bertrand M

    2011-01-01

    It is well known that bees cluster together in cold weather, in the process of swarming (when the "old" queen leaves with part of the colony) or absconding (when the queen leaves with all the colony) and in defense against intruders such as wasps or hornets. In this paper we describe a fairly different clustering process which occurs at any temperature and independently of any special stimulus or circumstance. As a matter of fact, this process is about four times faster at 28 degrees Celsius than at 15 degrees. Because of its simplicity and low level of "noise" we think that this phenomenon can provide a means for exploring the strength of inter-individual attraction between bees or other living organisms. For instance, and at first sight fairly surprisingly, our observations showed that this attraction does also exist between bees belonging to different colonies. As this study is aimed at providing a comparative perspective, we also describe a similar clustering experiment for red fire ants.

  11. Does Class Size Matter?

    Ehrenberg, Ronald G.; Brewer, Dominic J.; Gamoran, Adam; Willms, J. Douglas

    2001-01-01

    Reports on the significance of class size to student learning. Includes an overview of class size in various countries, the importance of teacher adaptability, and the Asian paradox of large classes allied to high test scores. (MM)

  12. Factor PD-Clustering

    Gettler Summa, Mireille; Palumbo, Francesco; Tortora, Cristina

    2012-01-01

    Factorial clustering methods have been developed in recent years thanks to improvements in computational power. These methods perform a linear transformation of the data and a clustering of the transformed data, optimizing a common criterion. Factorial PD-clustering is based on Probabilistic Distance clustering (PD-clustering). PD-clustering is an iterative, distribution-free, probabilistic clustering method. Factor PD-clustering makes a linear transformation of original variables into a reduced numb...

  13. Clustering Analysis on E-commerce Transaction Based on K-means Clustering

    Xuan HUANG

    2014-02-01

    Clustering algorithms based on density, increment, grid, etc. usually exhibit shortcomings when facing a large volume of high-dimensional transaction data: poor elasticity, weak handling of high-dimensional data, sensitivity to the time sequence of the data, poor independence of parameters, and weak handling of noise. From experiments on data sampled from 300 mobile phones on Taobao, the following conclusions can be drawn: compared with the Single-pass clustering algorithm, the K-means clustering algorithm has a high intra-class dissimilarity and inter-class similarity when analyzing e-commerce transactions. In addition, the K-means clustering algorithm is very efficient and has strong elasticity when dealing with a large number of data items. However, the clustering results of this algorithm are affected by the number of clusters and the initial positions of the cluster centers, so the results can easily become trapped in a local optimum. How to determine the number of clusters and the initial positions of the cluster centers for this algorithm therefore remains an important topic for future research.
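
    The sensitivity to initial center positions mentioned above can be seen in a very small example. The sketch below is a plain one-dimensional K-means (Lloyd's algorithm) written for illustration only; the data points and both sets of starting centers are hypothetical and unrelated to the Taobao data of the record.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Lloyd's algorithm on 1-D data, started from the given centers.

    Returns (final_centers, within_cluster_sum_of_squares).
    """
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda k: (p - centers[k]) ** 2)
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


points = [0, 1, 5, 6, 20, 21]

# One start per natural group versus two starts inside the same group.
good_centers, good_sse = kmeans_1d(points, [0, 5, 20])
poor_centers, poor_sse = kmeans_1d(points, [0, 20, 21])
```

    Both runs converge, but the second one stops at a partition with a much larger within-cluster sum of squares. Running the algorithm from several starting positions and keeping the best result is a common way to reduce this effect.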

  14. Image Segmentation Via Color Clustering

    Kaveh Heidary

    2014-01-01

    This paper develops a computationally efficient process for segmentation of color images. The input image is partitioned into a set of output images in accordance to color characteristics of various image regions. The algorithm is based on random sampling of the input image and fuzzy clustering of the training data followed by crisp classification of the input image. The user prescribes the number of randomly selected pixels comprising the trainer set and the number of color classes character...

  15. Classification of open clusters by centroid method of taxonomical analysis

    Distributions of open clusters of the Galaxy in spaces whose coordinates are mass, absolute magnitude, integrated colour index, diameter, metallicity, and age are considered. The majority of clusters are shown to fall into several taxons (classes) with sufficiently narrow limits on these parameters. The classes form a linear sequence by age and a two-dimensional sequence on the colour-magnitude diagram. They are not isolated but pass into each other continuously. This possibly means an absence of significant gaps in the cluster formation process. A bifurcation of the age sequence of classes depending on mass and diameter values is found. This allows an evolutionary interpretation.

  16. Magnetic properties of icosahedral MRu12 clusters

    The magnetic properties of icosahedral MRu12 clusters are studied using the discrete-variational local-spin-density-functional method, where M = V, Cr, Mn, Fe, Co, and Ni. The results show that all of the Ih MRu12 clusters, just like the case for the Ih Ru13 cluster, have double magnetic solutions. In contrast to the moment of 4 μB for the Ih Ru13 cluster, the total magnetic moments of the Ih MRu12 clusters, ranging from 1 μB to 20 μB, have been changed greatly by the substitution of the central Ru atom with M. Among them, the NiRu12 cluster has a giant moment of 20 μB. Furthermore, the NiRu12 cluster has nondegenerate ground state and could be expected to be remarkably stable. Therefore, for the purpose of enhancing the magnetic moment of the Ih Ru13 cluster, Ni is a promising candidate as a dopant. Finally, we predict that all the Ih MRu12 clusters except NiRu12 might belong to the class in which the magnetization of the cluster increases with temperature. (author)

  17. Clustering of drinker prototype characteristics: what characterizes the typical drinker?

    van Lettow, Britt; Vermunt, Jeroen K; de Vries, Hein; Burdorf, Alex; van Empelen, Pepijn

    2013-08-01

    Prototypes (social images) have been shown to influence behaviour, which is likely to depend on the type of image. Prototype evaluation is based on (un)desirable characteristics related to that image. By an elicitation procedure we examined which adjectives are attributed to specific drinker prototypes. In total 149 young Dutch adults (18-25 years of age) provided adjectives for five drinker prototypes: abstainer, moderate drinker, heavy drinker, tipsy, and drunk person. Twenty-three unique adjectives were found. Multilevel latent class cluster analysis revealed six adjective clusters, each with unique and minor overlapping adjectives: 'negative, excessive drinker,' 'moderate, responsible drinker,' 'funny tipsy drinker,' 'determined abstainer cluster,' 'uncontrolled excessive drinker,' and 'elated tipsy cluster.' In addition, four respondent classes were identified. Respondent classes showed differences in their focus on specific adjective clusters. Classes could be labelled 'focus-on-control class,' 'focus-on-hedonism class,' 'contrasting-extremes-prototypes class,' and 'focus-on-elation class.' Respondent classes differed in gender, educational level and drinking behaviour. The results underscore the importance to differentiate between various prototypes and in prototype adjectives among young adults: subgroup differences in prototype salience and relevance are possibly due to differences in adjective labelling. The results provide insights into explaining differences in drinking behaviour and could potentially be used to target and tailor interventions aimed at lowering alcohol consumption among young adults via prototype alteration. PMID:23848388

  18. Cluster automorphisms and compatibility of cluster variables

    Assem, Ibrahim; Schiffler, Ralf; Shramchenko, Vasilisa

    2013-01-01

    In this paper, we introduce a notion of unistructural cluster algebras, for which the set of cluster variables uniquely determines the clusters. We prove that cluster algebras of Dynkin type and cluster algebras of rank 2 are unistructural, then prove that if $\mathcal{A}$ is unistructural or of Euclidean type, then $f: \mathcal{A}\to \mathcal{A}$ is a cluster automorphism if and only if $f$ is an automorphism of the ambient field which restricts to a permutation of the cluster variables. In ...

  19. Horizontal Transfer and Death of a Fungal Secondary Metabolic Gene Cluster

    Campbell, Matthew A; Rokas, Antonis; Slot, Jason C.

    2012-01-01

    A cluster composed of four structural and two regulatory genes found in several species of the fungal genus Fusarium (class Sordariomycetes) is responsible for the production of the red pigment bikaverin. We discovered that the unrelated fungus Botrytis cinerea (class Leotiomycetes) contains a cluster of five genes that is highly similar in sequence and gene order to the Fusarium bikaverin cluster. Synteny conservation, nucleotide composition, and phylogenetic analyses of the cluster genes in...

  20. Globular Cluster Formation in the Virgo Cluster

    Moran, C Corbett; Lake, G

    2014-01-01

    Metal poor globular clusters (MPGCs) are a unique probe of the early universe, in particular the reionization era. Systems of globular clusters in galaxy clusters are particularly interesting as it is in the progenitors of galaxy clusters that the earliest reionizing sources first formed. Although the exact physical origin of globular clusters is still debated, it is generally admitted that globular clusters form in early, rare dark matter peaks (Moore et al. 2006; Boley et al. 2009). We provide a fully numerical analysis of the Virgo cluster globular cluster system by identifying the present day globular cluster system with exactly such early, rare dark matter peaks. A popular hypothesis is that the observed truncation of blue metal poor globular cluster formation is due to reionization (Spitler et al. 2012; Boley et al. 2009; Brodie & Strader 2006); adopting this view, constraining the formation epoch of MPGCs provides a complementary constraint on the epoch of reionization. By analyzing both the l...

  1. Adaptive Evolutionary Clustering

    Xu, Kevin S.; Kliger, Mark; Hero III, Alfred O.

    2011-01-01

    In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a st...

  2. Relational visual cluster validity

    Ding, Y.; Harrison, R F

    2007-01-01

    The assessment of cluster validity plays a very important role in cluster analysis. Most commonly used cluster validity methods are based on statistical hypothesis testing or finding the best clustering scheme by computing a number of different cluster validity indices. A number of visual methods of cluster validity have been produced to display directly the validity of clusters by mapping data into two- or three-dimensional space. However, these methods may lose too much information to corre...

  3. Issues Challenges and Tools of Clustering Algorithms

    Parul Agarwal

    2011-05-01

    Clustering is an unsupervised technique of data mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure. This paper captures the problems that are faced in practice when clustering algorithms are implemented. It also considers the most extensively used tools, which are readily available and provide support functions that ease the programming. Once algorithms have been implemented, they also need to be tested for validity. Several validation indexes exist for testing performance and accuracy, and these are also discussed here.

  4. A pattern theorem for lattice clusters

    Madras, Neal

    1999-01-01

    We consider general classes of lattice clusters, including various kinds of animals and trees on different lattices. We prove that if a given local configuration ("pattern") of sites and bonds can occur in large clusters, then it occurs at least cn times in most clusters of size n, for some constant c>0. An analogous theorem for self-avoiding walks was proven in 1963 by Kesten. The results also apply to weighted sums, and in particular we can take $a_n$ to be the probability that the perco...

  5. Cluster formation in quantum critical systems

    The presence of magnetic clusters has been verified in both antiferromagnetic and ferromagnetic quantum critical systems. We review some of the strongest evidence for strongly doped quantum critical systems (Ce(Ru0.24Fe0.76)2Ge2) and we discuss the implications for the response of the system when cluster formation is combined with finite size effects. In particular, we discuss the change of universality class that is observed close to the order-disorder transition. We detail the conditions under which clustering effects will play a significant role also in the response of stoichiometric systems and their experimental signature.

  6. Text Clustering with String Kernels in R

    Karatzoglou, Alexandros; Feinerer, Ingo

    2006-01-01

    We present a package which provides a general framework, including tools and algorithms, for text mining in R using the S4 class system. Using this package and the kernlab R package we explore the use of kernel methods for clustering (e.g., kernel k-means and spectral clustering) on a set of text documents, using string kernels. We compare these methods to a more traditional clustering technique like k-means on a bag of word representation of the text and evaluate the viability of kernel-base...

  7. Double-partition Quantum Cluster Algebras

    Jakobsen, Hans Plesner; Zhang, Hechun

    2012-01-01

    A family of quantum cluster algebras is introduced and studied. In general, these algebras are new, but sub-classes have been studied previously by other authors. The algebras are indexed by double partitions or double flag varieties. Equivalently, they are indexed by broken lines L. By grouping together neighboring mutations into quantum line mutations we can mutate from the cluster algebra of one broken line to another. Compatible pairs can be written down. The algebras are equal to their upper cluster algebras. The variables of the quantum seeds are given by elements of the dual canonical basis.

  8. Loosely coupled class families

    Ernst, Erik

    2001-01-01

    Families of mutually dependent classes that may be accessed polymorphically provide an advanced tool for separation of concerns, in that it enables client code to use a group of instances of related classes safely without depending on the exact classes involved. However, class families which are expressed using virtual classes seem to be very tightly coupled internally. While clients have achieved the freedom to dynamically use one or the other family, it seems that any given family contains a fixed set of classes and we will need to create an entire family of its own just in order to replace one of the members with another class. This paper shows how to express class families in such a manner that the classes in these families can be used in many different combinations, still enabling family polymorphism and ensuring type safety.

  9. Comparative genomics of vertebrate Fox cluster loci

    Shimeld Sebastian M

    2006-10-01

    Background: Vertebrate genomes contain numerous duplicate genes, many of which are organised into paralogous regions indicating duplication of linked groups of genes. Comparison of genomic organisation in different lineages can often allow the evolutionary history of such regions to be traced. A classic example of this is the Hox genes, where the presence of a single continuous Hox cluster in amphioxus and four vertebrate clusters has allowed the genomic evolution of this region to be established. Fox transcription factors of the C, F, L1 and Q1 classes are also organised in clusters in both amphioxus and humans. However, in contrast to the Hox genes, only two clusters of paralogous Fox genes have so far been identified in the human genome and the organisation in other vertebrates is unknown. Results: To uncover the evolutionary history of the Fox clusters, we report on the comparative genomics of these loci. We demonstrate two further paralogous regions in the human genome, and identify orthologous regions in mammalian, chicken, frog and teleost genomes, timing the duplications to before the separation of the actinopterygian and sarcopterygian lineages. An additional Fox class, FoxS, was also found to reside in this duplicated genomic region. Conclusion: Comparison of loci identifies the pattern of gene duplication, loss and cluster break up through multiple lineages, and suggests FoxS1 is a likely remnant of Fox cluster duplication.

  10. NEO-FFI personality clusters in trichotillomania.

    Keuthen, Nancy J; Tung, Esther S; Tung, Matthew G; Curley, Erin E; Flessner, Christopher A

    2016-05-30

    The purpose of this study was to determine whether personality prototypes exist among hair pullers and if these groups differ in hair pulling (HP) characteristics, clinical correlates, and quality of life. 164 adult hair pullers completed the NEO-Five Factor Inventory (NEO-FFI; Costa and McCrae, 1992) and self-report measures of HP severity, HP style, affective state, and quality of life. A latent class cluster analysis using NEO-FFI scores was performed to separate participants into clusters. Bonferroni-corrected t-tests were used to compare clusters on HP, affective, and quality of life variables. Multiple regression was used to determine which variables significantly predicted quality of life. Two distinct personality prototypes were identified. Cluster 1 (n=96) had higher neuroticism and lower extraversion, agreeableness, and conscientiousness when compared to cluster 2 (n=68). No significant differences in demographics were reported for the two personality clusters. The clusters differed on extent of focused HP, severity of depression, anxiety, and stress, as well as quality of life. Those in cluster 1 endorsed greater depression, anxiety, and stress, and worse quality of life. Additionally, only depression and cluster membership (based on NEO scores) significantly predicted quality of life. PMID:27016621

  11. Partitional clustering algorithms

    2015-01-01

    This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...

  12. Cluster Evaluation of Density Based Subspace Clustering

    Sembiring, Rahmat Widia; Zain, Jasni Mohamad

    2010-01-01

    Clustering real-world data is often faced with the curse of dimensionality, since real-world data often consist of many dimensions. Multidimensional data clustering evaluation can be done through a density-based approach. Density approaches are based on the paradigm introduced by DBSCAN clustering. In this approach, the density of each object's neighbourhood is calculated with respect to MinPoints. Clusters change in accordance with changes in the density of each object's neighbours. The neighbours of each object ...

  13. Sparse Convex Clustering

    Wang, Binhuan; Zhang, Yilong; Sun, Wei; Fang, Yixin

    2016-01-01

    Convex clustering, a convex relaxation of k-means clustering and hierarchical clustering, has drawn recent attention since it nicely addresses the instability issue of traditional nonconvex clustering methods. Although its computational and statistical properties have been recently studied, the performance of convex clustering has not yet been investigated in the high-dimensional clustering scenario, where the data contains a large number of features and many of them carry no information abo...

  14. Clustering with Spectral Methods

    Gaertler, Marco

    2002-01-01

    Grouping and sorting are problems with a great tradition in the history of mankind. Clustering and cluster analysis is a small aspect in the wide spectrum. But these topics have applications in most scientific disciplines. Graph clustering is again a little fragment in the clustering area. Nevertheless it has the potential for new pioneering and innovative methods. One such method is the Markov Clustering presented by van Dongen in 'Graph Clustering by Flow Simulation'. We investigated the qu...

  15. A Virtual Class Calculus

    Ernst, Erik; Ostermann, Klaus; Cook, William Randall

    2006-01-01

    , statically typed model for virtual classes has been a long-standing open question. This paper presents a virtual class calculus, vc, that captures the essence of virtual classes in these full-fledged programming languages. The key contributions of the paper are a formalization of the dynamic and static...

  16. FarMon: an extensible, efficient cluster monitoring system

    The authors present the design and implementation of FarMon, a flexible event monitoring system for computing clusters. Using several techniques, including dynamic class loading (DCL), a module publish/subscribe/unsubscribe protocol and a directory service, the authors create a highly efficient, highly extensible and highly portable cluster monitoring system.

  17. Gennclus: New Models for General Nonhierarchical Clustering Analysis.

    Desarbo, Wayne S.

    1982-01-01

    A general class of nonhierarchical clustering models and associated algorithms for fitting them are presented. These models generalize the Shepard-Arabie Additive clusters model. Two applications are given and extensions to three-way models, nonmetric analyses, and other model specifications are provided. (Author/JKS)

  18. A Survey of Popular R Packages for Cluster Analysis

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  19. Entropic One-Class Classifiers.

    Livi, Lorenzo; Sadeghian, Alireza; Pedrycz, Witold

    2015-12-01

    The one-class classification problem is a well-known research endeavor in pattern recognition. The problem is also known under different names, such as outlier and novelty/anomaly detection. The core of the problem consists in modeling and recognizing patterns belonging only to a so-called target class. All other patterns are termed nontarget, and therefore, they should be recognized as such. In this paper, we propose a novel one-class classification system that is based on an interplay of different techniques. Primarily, we follow a dissimilarity representation-based approach; we embed the input data into the dissimilarity space (DS) by means of an appropriate parametric dissimilarity measure. This step allows us to process virtually any type of data. The dissimilarity vectors are then represented by weighted Euclidean graphs, which we use to determine the entropy of the data distribution in the DS and at the same time to derive effective decision regions that are modeled as clusters of vertices. Since the dissimilarity measure for the input data is parametric, we optimize its parameters by means of a global optimization scheme, which considers both mesoscopic and structural characteristics of the data represented through the graphs. The proposed one-class classifier is designed to provide both hard (Boolean) and soft decisions about the recognition of test patterns, allowing an accurate description of the classification process. We evaluate the performance of the system on different benchmarking data sets, containing either feature-based or structured patterns. Experimental results demonstrate the effectiveness of the proposed technique. PMID:25879977

  20. Learning predictive clustering rules

    Ženko, Bernard; Džeroski, Sašo; Struyf, Jan

    2005-01-01

    The two most commonly addressed data mining tasks are predictive modelling and clustering. Here we address the task of predictive clustering, which contains elements of both and generalizes them to some extent. We propose a novel approach to predictive clustering called predictive clustering rules, present an initial implementation and its preliminary experimental evaluation.

  1. Clustering of correlated networks

    Dorogovtsev, S. N.

    2003-01-01

    We obtain the clustering coefficient, the degree-dependent local clustering, and the mean clustering of networks with arbitrary correlations between the degrees of the nearest-neighbor vertices. The resulting formulas allow one to determine the nature of the clustering of a network.
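
    As an illustration of the three quantities named in this abstract, the sketch below computes them empirically on a small undirected graph using the standard textbook definitions. It does not reproduce the paper's analytical formulas for correlated networks; the adjacency-dictionary representation and all function names are assumptions made for this example.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of neighbours of v that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: vertices of degree < 2 contribute zero
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def mean_clustering(adj):
    """Average of the local clustering over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

def clustering_coefficient(adj):
    """Closed connected triples divided by all connected triples (transitivity)."""
    closed = 0
    triples = 0
    for v in adj:
        k = len(adj[v])
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return closed / triples if triples else 0.0

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant vertex 3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

    The mean clustering and the clustering coefficient generally differ, because the former weights every vertex equally while the latter weights every connected triple equally.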

  2. Structures of Mn clusters

    Tina M Briere; Marcel H F Sluiter; Vijay Kumar; Yoshiyuki Kawazoe

    2003-01-01

    The geometries of several Mn clusters in the size range Mn13–Mn23 are studied via the generalized gradient approximation to density functional theory. For the 13- and 19-atom clusters, the icosahedral structures are found to be most stable, while for the 15-atom cluster, the bcc structure is more favoured. The clusters show ferrimagnetic spin configurations.

  3. Foodservice Occupations Cluster Guide.

    Oregon State Dept. of Education, Salem.

    Intended to assist vocational teachers in developing and implementing a cluster program in food service occupations, this guide contains sections on cluster organization and implementation and instructional emphasis areas. The cluster organization and implementation section covers goal-based planning and includes a proposed cluster curriculum, a…

  4. On Comparison of Clustering Methods for Pharmacoepidemiological Data.

    Feuillet, Fanny; Bellanger, Lise; Hardouin, Jean-Benoit; Victorri-Vigneau, Caroline; Sébille, Véronique

    2015-01-01

    The high consumption of psychotropic drugs is a public health problem. Rigorous statistical methods are needed to identify consumption characteristics in post-marketing phase. Agglomerative hierarchical clustering (AHC) and latent class analysis (LCA) can both provide clusters of subjects with similar characteristics. The objective of this study was to compare these two methods in pharmacoepidemiology, on several criteria: number of clusters, concordance, interpretation, and stability over time. From a dataset on bromazepam consumption, the two methods present a good concordance. AHC is a very stable method and it provides homogeneous classes. LCA is an inferential approach and seems to allow identifying more accurately extreme deviant behavior. PMID:24905478

  5. Relevant Subspace Clustering

    Müller, Emmanuel; Assent, Ira; Günnemann, Stephan; Krieger, Ralph; Seidl, Thomas

    Subspace clustering aims at detecting clusters in any subspace projection of a high dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce results to relevant subspace...... clusters. Their results are typically highly redundant, i.e. many clusters are detected multiple times in several projections. In this work, we propose a novel model for relevant subspace clustering (RESCU). We present a global optimization which detects the most interesting non-redundant subspace clusters...... achieves top clustering quality while competing approaches show greatly varying performance....

  6. Tilting theory and cluster algebras

    Reiten, Idun

    2010-01-01

    We give an introduction to the theory of cluster categories and cluster tilted algebras. We include some background on the theory of cluster algebras, and discuss the interplay with cluster categories and cluster tilted algebras.

  7. Cluster ion beam facilities

    A brief state-of-the-art review in the field of cluster-surface interactions is presented. Ionised cluster beams could become a powerful and versatile tool for the modification and processing of surfaces as an alternative to ion implantation and ion assisted deposition. The main effects of cluster-surface collisions and possible applications of cluster ion beams are discussed. The prospects of the Cluster Implantation and Deposition Apparatus (CIDA) being developed at Göteborg University are outlined.

  8. Parallel Local Graph Clustering

    Shun, Julian; Roosta-Khorasani, Farbod; Fountoulakis, Kimon; Mahoney, Michael W.

    2016-01-01

    Graph clustering has many important applications in computing, but due to growing sizes of graph, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest. Motivated partly by this, so-called local algorithms for graph clustering have received significant interest due to the fact that they can find good clusters in a graph with work proportional to the size of the cluster rather than that of the entire graph. T...

  9. Graded cluster algebras

    Grabowski, Jan

    2015-01-01

    In the cluster algebra literature, the notion of a graded cluster algebra has been implicit since the origin of the subject. In this work, we wish to bring this aspect of cluster algebra theory to the foreground and promote its study. We transfer a definition of Gekhtman, Shapiro and Vainshtein to the algebraic setting, yielding the notion of a multi-graded cluster algebra. We then study gradings for finite type cluster algebras without coefficients, giving a full classification. Translating ...

  10. A Density Based Dynamic Data Clustering Algorithm based on Incremental Dataset

    K. R.S. Kumar

    2012-01-01

    Problem statement: Clustering and visualizing high-dimensional dynamic data is a challenging problem. Most of the existing clustering algorithms are based on the static statistical relationship among data. Dynamic clustering is a mechanism to adapt and discover clusters in real-time environments. There are many applications, such as incremental data mining in data warehousing and sensor networks, which rely on dynamic data clustering algorithms. Approach: In this work, we present a density based dynamic data clustering algorithm for clustering incremental datasets and compare its performance with a full run of normal DBSCAN and Chameleon on the dynamic dataset. Most of the clustering algorithms perform well and give ideal performance when measured with clustering accuracy, which is calculated using the original class labels and the calculated class labels. However, if we measure the performance with a cluster validation metric, it gives another kind of result. Results: This study addresses the problem of clustering a dynamic dataset in which the data set increases in size over time as more and more data are added. To evaluate the performance of the algorithms, we used the Generalized Dunn Index (GDI) and the Davies-Bouldin index (DB) as cluster validation metrics, as well as the time taken for clustering. Conclusion: In this study, we have successfully implemented and evaluated the proposed density based dynamic clustering algorithm. The performance of the algorithm was compared with the Chameleon and DBSCAN clustering algorithms. The proposed algorithm performed significantly well in terms of clustering accuracy as well as speed.
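
    To make the cluster validation metric mentioned above concrete, here is a minimal sketch of the Davies-Bouldin index in its common textbook form (mean distance to the centroid as scatter, Euclidean distance between centroids as separation). Whether the authors used exactly this variant is not stated in the abstract; the function name and the toy data are assumptions for illustration.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower values mean compact, well separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    # Centroid of each cluster (coordinate-wise mean).
    centroids = {lab: tuple(sum(c) / len(pts) for c in zip(*pts))
                 for lab, pts in clusters.items()}
    # Scatter: mean distance of the members to their centroid.
    scatter = {lab: sum(math.dist(p, centroids[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        # Note: coincident centroids would cause a division by zero here.
        total += max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                     for j in clusters if j != i)
    return total / len(clusters)

# Hypothetical toy data: two tight groups far apart.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
```

    Unlike clustering accuracy, this index needs no ground-truth class labels, which is why the two kinds of measure can rank algorithms differently.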

  11. Cool Core Clusters from Cosmological Simulations

    Rasia, E; Murante, G; Planelles, S; Beck, A M; Biffi, V; Ragone-Figueroa, C; Granato, G L; Steinborn, L K; Dolag, K

    2015-01-01

    We present results obtained from a set of cosmological hydrodynamic simulations of galaxy clusters, aimed at comparing predictions with observational data on the diversity between cool-core and non-cool-core clusters. Our simulations include the effects of stellar and AGN feedback and are based on an improved version of the Smoothed-Particle-Hydrodynamics code GADGET-3, which ameliorates gas mixing and better captures gas-dynamical instabilities by including a suitable artificial thermal diffusion. In this Letter, we focus our analysis on the entropy profiles, our primary diagnostic to classify the degree of cool-coreness of clusters, and on the iron profiles. In keeping with observations, our simulated clusters display a variety of behaviors in entropy profiles: they range from steadily decreasing profiles at small radii, characteristic of cool-core systems, to nearly flat core isentropic profiles, characteristic of non cool-core systems. Using observational criteria to distinguish between the two classes of...

  12. UCD Candidates in the Hydra Cluster

    Wehner, Elizabeth

    2007-01-01

    NGC 3311, the giant cD galaxy in the Hydra cluster (A1060), has one of the largest globular cluster systems known. We describe new Gemini GMOS (g',i') photometry of the NGC 3311 field which reveals that the red, metal-rich side of its globular cluster population extends smoothly upward into the mass range associated with the new class of Ultra-Compact Dwarfs (UCDs). We identify 29 UCD candidates with estimated masses > 6x10^6 solar masses and discuss their characteristics. This UCD-like sequence is the most well defined one yet seen, and reinforces current ideas that the high-mass end of the globular cluster sequence merges continuously into the UCD sequence, which connects in turn to the E galaxy structural sequence.

  13. Semantic Analysis of Virtual Classes and Nested Classes

    Madsen, Ole Lehrmann

    1999-01-01

    Virtual classes and nested classes are distinguishing features of BETA. Nested classes originated from Simula, but until recently they have not been part of mainstream object-oriented languages. C++ has a restricted form of nested classes and they were included in Java 1.1. Virtual classes is the...... classes and parameterized classes have been made. Although virtual classes and nested classes have been used in BETA for more than a decade, their implementation has not been published. The purpose of this paper is to contribute to the understanding of virtual classes and nested classes by presenting the...

  14. Class network routing

    Bhanot, Gyan; Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2009-09-08

    Class network routing is implemented in a network such as a computer network comprising a plurality of parallel compute processors at nodes thereof. Class network routing allows a compute processor to broadcast a message to a range (one or more) of other compute processors in the computer network, such as processors in a column or a row. Normally this type of operation requires a separate message to be sent to each processor. With class network routing pursuant to the invention, a single message is sufficient, which generally reduces the total number of messages in the network as well as the latency to do a broadcast. Class network routing is also applied to dense matrix inversion algorithms on distributed memory parallel supercomputers with hardware class function (multicast) capability. This is achieved by exploiting the fact that the communication patterns of dense matrix inversion can be served by hardware class functions, which results in faster execution times.

  15. Subpopulation Discovery in Epidemiological Data with Subspace Clustering

    Niemann Uli

    2014-12-01

    A prerequisite of personalized medicine is the identification of groups of people who share specific risk factors towards an outcome. We investigate the potential of subspace clustering for finding such groups in epidemiological data. We propose a workflow that encompasses clusterability assessment before cluster discovery and quality assessment after learning the clusters. Epidemiological data usually do not have a ground truth for the verification of clusters found in subspaces. Hence, we introduce quality assessment through juxtaposition of the learned models to “models-of-randomness”, i.e. models that do not reflect a true cluster structure. On the basis of this workflow, we select subspace clustering methods, and compare and discuss their performance. We use a dataset with hepatic steatosis as outcome, but our findings apply to arbitrary epidemiological cohort data that have tens of variables and exhibit class skew.

  16. Cluster Evaluation of Density Based Subspace Clustering

    Sembiring, Rahmat Widia

    2010-01-01

    Clustering real-world data is often faced with the curse of dimensionality, since real-world data often consist of many dimensions. Multidimensional data clustering evaluation can be done through a density-based approach. Density approaches are based on the paradigm introduced by DBSCAN clustering. In this approach, the density of each object's neighbourhood is calculated with respect to MinPoints. Clusters change in accordance with changes in the density of each object's neighbours. The neighbours of each object are typically determined using a distance function, for example the Euclidean distance. In this paper the SUBCLU, FIRES and INSCY methods are applied to clustering 6x1595 dimension synthetic datasets. IO entropy, F1 measure, coverage, accuracy and time consumption are used as performance evaluation parameters. Evaluation results showed that the SUBCLU method requires considerable time to process subspace clustering; however, its coverage value is better. Meanwhile, the INSCY method is better for accuracy compared with the two other methods, altho...
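
    The density notion described in this abstract can be sketched as follows: a point counts as dense (a "core point" in DBSCAN terminology) when at least MinPoints objects lie within a given radius under the Euclidean distance. This is only the neighbourhood test, not a full DBSCAN, SUBCLU, FIRES or INSCY implementation; the names `core_points`, `eps` and `min_pts` are assumptions for this example.

```python
import math

def core_points(points, eps, min_pts):
    """Indices of points with at least min_pts points (itself included) within eps."""
    core = []
    for i, p in enumerate(points):
        # Neighbourhood under the Euclidean distance; the point itself is counted.
        neighbours = [q for q in points if math.dist(p, q) <= eps]
        if len(neighbours) >= min_pts:
            core.append(i)
    return core

# Hypothetical data: three nearby points and one isolated point.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
```

    Subspace methods apply a test of this kind within selected subsets of the dimensions rather than in the full space, which is what makes them costly when the number of candidate subspaces is large.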

  17. Clustering Algorithm Based on Crowding Niche

    业宁; 董逸生

    2003-01-01

    A new clustering algorithm based on crowding niches is proposed in this paper. As organisms evolve, homogeneous individuals spontaneously resist heterogeneous ones. At the same time, individuals in the same class compete with each other for limited resources, and individuals with poor fitness are eliminated. We propose a clustering algorithm based on this idea. Experimental evaluation has demonstrated its efficiency.

  18. Coupled Two-Way Clustering Analysis of Breast Cancer and Colon Cancer Gene Expression Data

    Getz, Gad; Gal, Hilah; Kela, Itai; Domany, Eytan; Notterman, Dan A.

    2003-01-01

    We present and review Coupled Two Way Clustering, a method designed to mine gene expression data. The method identifies submatrices of the total expression matrix, whose clustering analysis reveals partitions of samples (and genes) into biologically relevant classes. We demonstrate, on data from colon and breast cancer, that we are able to identify partitions that elude standard clustering analysis.

  19. Fostering a Middle Class

    YAO BIN

    2011-01-01

    Though there is no official definition of "middle class" in China, the tag has become one few Chinese people believe they deserve anyway. In early August, the Chinese Academy of Social Sciences released a report on China's urban development, saying China had a middle-class population of 230 million in 2009, or 37 percent of its urban residents. It also forecast half of city dwellers in China would be part of the middle class by 2023.

  20. Media Clusters and Media Cluster Policies

    Karlsson, Charlie; Picard, Robert

    2011-01-01

    Large media clusters have emerged in a limited number of large cities, characterizing the geographical concentration of the global media industry. This paper explores the reasons behind the localization patterns of media industries, the effect of the rapid advancement of Information and Communication Technologies (ICT) on media clusters and the role of media cluster policies. One might draw the conclusion that with the developments of the ICT sector and the fact that there are no raw material...

  1. Class, Culture and Politics

    Harrits, Gitte Sommer

    2013-01-01

    , discussions within political sociology have not yet utilized the merits of a multidimensional conception of class. In light of this, the article suggests a comprehensive Bourdieusian framework for class analysis, integrating culture as both a structural phenomenon co-constitutive of class and as symbolic...... practice. Further, the article explores this theoretical framework in a multiple correspondence analysis of a Danish survey, demonstrating how class and political practices are indeed homologous. However, the analysis also points at several elements of field autonomy, and the concluding discussion...

  2. New laser classes

    An updated international standard on laser safety (IEC 60825-1 + Amendment 2) introduces some new laser classes. The new set of laser classes consists of 1, 1M, 2, 2M, 3R, 3B, and 4. This is the result of intense discussions in the committee; it was laid down in 2000 and slightly adjusted in 2001. The previous classes 1, 2, 3A, 3B, and 4, established for more than 25 years, are partly abandoned. This paper compares the new classes to the old ones. (orig.)

  3. Cluster selection in divisive clustering algorithms

    Savaresi, Sergio; Boley, Daniel L.; Bittanti, Sergio; Gazzaniga, Giovanna

    2002-01-01

    This paper deals with the problem of clustering a data-set. In particular, the bisecting divisive approach is here considered. This approach can be naturally divided into two sub-problems: the problem of choosing which cluster must be divided, and the problem of splitting the selected cluster. The focus here is on the first problem. The contribution of this work is to propose a new technique for the selection of the cluster to split. This technique is based upon the shape of...
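
    The two sub-problems named in this abstract can be sketched as separate functions. The abstract is truncated before it describes the proposed shape-based selection technique, so the sketch below substitutes a simple, commonly used stand-in (split the cluster with the largest scatter) and a deliberately crude splitting rule; neither is the authors' method, and all names are hypothetical.

```python
def scatter(cluster):
    """Sum of squared distances of the points to the cluster mean."""
    dims = len(cluster[0])
    mean = [sum(p[d] for p in cluster) / len(cluster) for d in range(dims)]
    return sum((p[d] - mean[d]) ** 2 for p in cluster for d in range(dims))

def select_cluster(clusters):
    """Sub-problem 1: choose which cluster to divide (stand-in: largest scatter)."""
    return max(range(len(clusters)), key=lambda i: scatter(clusters[i]))

def split_cluster(cluster):
    """Sub-problem 2: bisect at the mean of the coordinate with the largest variance."""
    dims = len(cluster[0])
    means = [sum(p[d] for p in cluster) / len(cluster) for d in range(dims)]
    spread = [sum((p[d] - means[d]) ** 2 for p in cluster) for d in range(dims)]
    axis = spread.index(max(spread))
    left = [p for p in cluster if p[axis] <= means[axis]]
    right = [p for p in cluster if p[axis] > means[axis]]
    return left, right

def bisecting_step(clusters):
    """One step of the divisive procedure: select a cluster, then split it."""
    i = select_cluster(clusters)
    left, right = split_cluster(clusters[i])
    if not left or not right:  # all points coincide, so nothing can be split
        return clusters
    return clusters[:i] + [left, right] + clusters[i + 1:]

# Hypothetical starting partition: one tight cluster and one elongated cluster.
start = [[(0, 0), (1, 0)], [(0, 0), (10, 0), (0, 1), (10, 1)]]
```

    Keeping selection and splitting separate means a different selection criterion can be swapped in without touching the splitting code.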

  4. All about RIKEN Integrated Cluster of Clusters (RICC)

    Maho Nakata

    2012-07-01

    This is an introduction to RIKEN's supercomputer, the RIKEN Integrated Cluster of Clusters (RICC), which has been in operation since August 2009. The basic concept of the RICC is to "provide an environment with high power computational resources to facilitate research and development for RIKEN's researchers". Based on this concept, we have been operating the RICC system as (i) a data analysis environment for experimental researchers, (ii) a development environment targeting the next-generation supercomputer, i.e., the "K" computer, and (iii) GPU (graphics processing unit) computers for exploring challenges in developing a future computer environment. The total performance of RICC is 97.94 TFlops, ranking it 125th on the Top500 list in Nov. 2011. We prepared four job class accounts, based on the researchers' proposals prior to evaluation by our Review Committee. We also provided backup services to RIKEN's researchers, such as conducting RICC training classes, software installation services, and speed-up and visualization support. To encourage affirmative participation and proactive initiation, all the services were free of charge; however, access to RICC was limited to researchers and collaborators of RIKEN. As a result, RICC has been able to maintain a high activity ratio (> 90%) since the beginning of its operation.

  5. Galaxy Luminosity Functions in WINGS clusters

    Moretti, A; Poggianti, B M; Fasano, G; Varela, J; D'Onofrio, M; Vulcani, B; Cava, A; Fritz, J; Couch, W J; Moles, M; Kjærgaard, P

    2015-01-01

    Using V band photometry of the WINGS survey, we derive galaxy luminosity functions (LF) in nearby clusters. This sample is complete down to Mv=-15.15, and it is homogeneous, thus allowing the study of an unbiased sample of clusters with different characteristics. We constructed the photometric LF for 72 out of the original 76 WINGS clusters, excluding only those without a velocity dispersion estimate. For each cluster we obtained the LF for galaxies in a region of radius=0.5 x r200, and fitted them with single and double Schechter functions. We also derive the composite LF for the entire sample, and those pertaining to different morphological classes. Finally we derive the spectroscopic cumulative LF for 2009 galaxies that are cluster members. The double Schechter fit parameters are neither correlated with the cluster velocity dispersion, nor with the X-ray luminosity. Our median values of the Schechter fit slope are, on average, in agreement with measurements of nearby clusters, but are less steep than t...

  6. Young massive star clusters

    Zwart, Simon Portegies; Gieles, Mark

    2010-01-01

    Young massive clusters are dense aggregates of young stars that form the fundamental building blocks of galaxies. Several examples exist in the Milky Way Galaxy and the Local Group, but they are particularly abundant in starburst and interacting galaxies. The few young massive clusters that are close enough to resolve are of prime interest for studying the stellar mass function and the ecological interplay between stellar evolution and stellar dynamics. The distant unresolved clusters may be effectively used to study the star-cluster mass function, and they provide excellent constraints on the formation mechanisms of young cluster populations. Young massive clusters are expected to be the nurseries for many unusual objects, including a wide range of exotic stars and binaries. So far only a few such objects have been found in young massive clusters, although their older cousins, the globular clusters, are unusually rich in stellar exotica. In this review we focus on star clusters younger than ~100 Myr, m...

  7. Determining the Optimal Number of Clusters with the Clustergram

    Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.

    2011-01-01

    Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.

  8. Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning

    Ntampaka, Michelle; Trac, Hy; Sutherland, Dougal; Fromenteau, Sebastien; Poczos, Barnabas; Schneider, Jeff

    2016-01-01

    Galaxy clusters are a rich source of information for examining fundamental astrophysical processes and cosmological parameters; however, employing clusters as cosmological probes requires accurate mass measurements derived from cluster observables. We study dynamical mass measurements of galaxy clusters contaminated by interlopers, and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create a mock catalog from Multidark's publicly available N-body MDPL1 simulation where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power law scaling relation to infer cluster mass from galaxy line of sight (LOS) velocity dispersion. The presence of interlopers in the catalog produces a wide, flat fractional mass error distribution, with width = 2.13. We employ the Support Distribution Machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (width = 0.67). Remarkably, SDM applied to contaminated clusters is better able to recover masses than even a scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.

  9. What Makes Clusters Decline?

    Østergaard, Christian Richter; Park, Eun Kyung

    2015-01-01

    Most studies on regional clusters focus on identifying factors and processes that make clusters grow. However, sometimes technologies and market conditions suddenly shift, and clusters decline. This paper analyses the process of decline of the wireless communication cluster in Denmark. The...... longitudinal study on the high-tech cluster reveals that technological lock-in and exit of key firms have contributed to decline. Entrepreneurship has a positive effect on the cluster’s adaptive capabilities, while multinational companies have contradicting effects by bringing in new resources to the cluster...

  10. Teaching Social Class

    Tablante, Courtney B.; Fiske, Susan T.

    2015-01-01

    Discussing socioeconomic status in college classes can be challenging. Both teachers and students feel uncomfortable, yet social class matters more than ever. This is especially true, given increased income inequality in the United States and indications that higher education does not reduce this inequality as much as many people hope. Resources…

  11. Teaching Large Evening Classes

    Wambuguh, Oscar

    2008-01-01

    High enrollments, conflicting student work schedules, and the sheer convenience of once-a-week classes are pushing many colleges to schedule evening courses. Held from 6 to 9 pm or 7 to 10 pm, these classes are typically packed, sometimes with more than 150 students in a large lecture theater. How can faculty effectively teach, control, or even…

  12. DEFINING THE MIDDLE CLASS

    2011-01-01

    Classifying the middle class remains controversial despite its alleged growth. China’s cities housed more than 230 million middle-class residents in 2009, or 37 percent of the urban population, according to the 2011 Blue Book of Cities in China released on August 3.

  13. Class in disguise

    Faber, Stine Thidemann; Prieur, Annick

    This paper asks how class can have importance in one of the world’s most equal societies: Denmark. The answer is that class here appears in disguised forms. The field under study is a city, Aalborg, in the midst of transition from a stronghold of industrialism to a post-industrial economy...

  14. Morphology of a class of kinetic-growth models

    We study a class of local probabilistic growth processes that includes the kinetic-growth algorithm for generating percolation clusters. The shapes of the growing clusters are controlled by p, the probability of growth. For $p > p_c$, the shapes are scale invariant with time and show interesting morphological features including both smoothly curved sections and straight facets. The facets are shown to be related to the problem of directed percolation and disappear below the directed-percolation threshold. A simple random-walk model for computing the shapes of our clusters is described.

  15. Analysis of Various Clustering Algorithms

    Asst Prof. Sunila Godara; Ms. Amita Verma

    2013-01-01

    Data clustering is a process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than among groups. This paper reviews four types of clustering techniques: k-Means clustering, Farthest First clustering, Density Based clustering, and Filtered clusterer. These clustering techniques are implemented and analyzed using the clustering tool WEKA. The performance of the four techniques is presented and compared.
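
    Of the four techniques named above, farthest-first is probably the least familiar. The following is a minimal sketch of the generic farthest-first traversal idea, assuming Euclidean distance and a fixed starting point; it is not WEKA's implementation, and the function names are hypothetical.

```python
import numpy as np

def farthest_first_centers(points, k, first=0):
    """Pick k center indices: start at `first`, then repeatedly take the
    point farthest from all centers chosen so far."""
    points = np.asarray(points, dtype=float)
    centers = [first]
    # Distance from every point to its nearest chosen center.
    dist = np.linalg.norm(points - points[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers

def assign(points, centers):
    """Label each point with the position (0..k-1) of its nearest center."""
    points = np.asarray(points, dtype=float)
    d = np.linalg.norm(points[:, None, :] - points[centers][None, :, :], axis=2)
    return d.argmin(axis=1)
```

    Unlike k-means, this sketch makes a single pass per center and never moves a center once chosen, which is why it is often used as a fast initializer or a cheap baseline.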

  16. Monopole clusters in Abelian projected gauge theories

    Hart, A.; Teper, M.

    1997-01-01

    We show that the monopole currents which one obtains in the maximally Abelian gauge of SU(2) fall into two quite distinct classes (when the volume is large enough). In each field configuration there is precisely one cluster that permeates the whole lattice volume. It has a current density and a magnetic screening mass that scale and it produces the whole of the string tension. The remaining clusters have a number density that follows an approximate power law proportional to the inverse cube o...

  17. Niching method using clustering crowding

    GUO Guan-qi; GUI Wei-hua; WU Min; YU Shou-yi

    2005-01-01

    This study analyzes drift phenomena of deterministic crowding and probabilistic crowding by using an equivalence class model and expectation proportion equations. It is proved that the replacement errors of deterministic crowding cause the population to converge to a single individual, thus resulting in premature stagnation or the loss of optional optima. Probabilistic crowding, by contrast, can maintain multiple subpopulations in equilibrium when the population size is adequately large. An improved niching method using clustering crowding is proposed. By analyzing the topology of the fitness landscape using a hill valley function and extending the search space for similarity analysis, clustering crowding determines the locality of the search space more accurately, thus greatly decreasing the replacement errors of crowding. The integration of deterministic and probabilistic replacement increases the capacity for both parallel local hill climbing and maintaining multiple subpopulations. The experimental results of optimizing various multimodal functions show that the performance measures of clustering crowding, such as the number of effective peaks maintained, average peak ratio and global optimum ratio, are uniformly superior to those of the evolutionary algorithms using fitness sharing, simple deterministic crowding and probabilistic crowding.

  18. Star clusters and associations

    All 33 papers presented at the symposium were inputted to INIS. They dealt with open clusters, globular clusters, stellar associations and moving groups, and local kinematics and galactic structures. (E.S.)

  19. Melting of clusters

    Haberland, H. [Freiburg Univ., Fakultät für Physik (Germany)]

    2001-07-01

    An experiment is described that allows the caloric curve of size-selected sodium cluster ions to be measured. This makes it rather easy to determine the melting temperatures and latent heats in the size range between 55 and 340 atoms per cluster. A more detailed analysis is necessary to show that the cluster $\mathrm{Na}_{147}^{+}$ has a negative microcanonical heat capacity, and how to determine the entropy of the cluster from the data. (authors)

  20. An Efficient Enhanced Clustering Algorithm of Information System For Law Enforcement

    Dr. A. Malathi; Dr. P. Rajarajeswari

    2014-01-01

    Clustering is a popular data mining technique which is intended to help the user discover and understand the structure or grouping of the data in the set according to a certain similarity measure and predict future structure or group respectively. Clustering is the process of class discovery, where the objects are grouped into clusters. In this paper Enhanced K-Means and Enhanced DBSCAN algorithms are designed and used for clustering crime data in the proposed crime analysis tool. Anothe...

  1. Using Curvature and Markov Clustering in Graphs for Lexical Acquisition and Word Sense Discrimination

    Dorow, Beate; Widdows, Dominic; Ling, Katarina; Eckmann, Jean-Pierre; Sergi, Danilo; Moses, Elisha

    2004-01-01

    We introduce two different approaches for clustering semantically similar words. We accommodate ambiguity by allowing a word to belong to several clusters. Both methods use a graph-theoretic representation of words and their paradigmatic relationships. The first approach is based on the concept of curvature and divides the word graph into classes of similar words by removing words of low curvature which connect several dispersed clusters. The second method, instead of clustering the nodes, cl...

  2. Pattern Formation and a Clustering Transition in Power-Law Sequential Adsorption

    Biham, Ofer; Malcai, Ofer; Lidar, Daniel A.; Avnir, David

    1999-01-01

    A new model that describes adsorption and clustering of particles on a surface is introduced. A clustering transition is found which separates a phase of weakly correlated particle distributions from a phase of strongly correlated distributions in which the particles form localized fractal clusters. The order parameter of the transition is identified and the fractal nature of both phases is examined. The model is relevant to a large class of clustering phenomena such as aggregati...

  3. Cluster beam sources. Part 1. Methods of cluster beams generation

    A.Ju. Karpenko

    2012-10-01

    A short review of cluster beam generation is presented. The basic types of cluster sources are considered and the processes leading to cluster formation are analyzed. The parameters that affect the operation of cluster sources are presented.

  4. Cluster beam sources. Part 1. Methods of cluster beams generation

    A.Ju. Karpenko; V.A. Baturin

    2012-01-01

    A short review of cluster beam generation is presented. The basic types of cluster sources are considered and the processes leading to cluster formation are analyzed. The parameters that affect the operation of cluster sources are presented.

  5. Multireference Coupled Cluster Ansatz

    Jeziorski, Bogumil

    2010-01-01

    The origin of the multireference coupled cluster Ansatz for the wave function and the wave operator, discovered in Quantum Theory Project in 1981, is presented from the historical perspective. Various methods of obtaining the cluster amplitudes, both state universal and state selective, are critically reviewed, and further prospects of using the multireference coupled cluster Ansatz in electronic structure theory are briefly discussed.

  6. Quantum Annealing for Clustering

    Kurihara, Kenichi; Tanaka, Shu; Miyashita, Seiji

    2014-01-01

    This paper studies quantum annealing (QA) for clustering, which can be seen as an extension of simulated annealing (SA). We derive a QA algorithm for clustering and propose an annealing schedule, which is crucial in practice. Experiments show the proposed QA algorithm finds better clustering assignments than SA. Furthermore, QA is as easy as SA to implement.

  7. The Durban Auto Cluster

    Lorentzen, Jochen; Robbins, Glen; Barnes, Justin

    2004-01-01

    The paper describes the formation of the Durban Auto Cluster in the context of trade liberalization. It argues that the improvement of operational competitiveness of firms in the cluster is prominently due to joint action. It tests this proposition by comparing the gains from cluster activities i...

  8. Minimalist's linux cluster

    Using barebone PC components and NICs, we construct a Linux cluster which has a 2-dimensional mesh structure. This cluster has a smaller footprint, is less expensive, and uses less power compared to a conventional Linux cluster. Here, we report our experience in building such a machine and discuss our current lattice project on the machine.

  9. Relational aspects of clusters

    Gjerding, Allan Næs

    The present paper is the first preliminary account of a project being planned for 2013, focussing on the development of the biomedico cluster in North Denmark. The project focusses on the relational capabilities of the cluster in terms of a number of organizational roles which are argued to be...... necessary for the development and growth of the upcoming cluster in question....

  10. Social Class Dialogues and the Fostering of Class Consciousness

    Madden, Meredith

    2015-01-01

    How do critical pedagogies promote undergraduate students' awareness of social class, social class identity, and social class inequalities in education? How do undergraduate students experience class consciousness-raising in the intergroup dialogue classroom? This qualitative study explores undergraduate students' class consciousness-raising in an…

  11. Cluster Physics with Merging Galaxy Clusters

    Sandor M. Molnar

    2016-02-01

    Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard $\Lambda$CDM model, where the total density is dominated by the cosmological constant ($\Lambda$) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.

  12. Clustering high dimensional data using subspace and projected clustering algorithms

    Rahmat Widia Sembiring; Jasni Mohamad Zain; Abdullah Embong

    2010-01-01

    Problem statement: Clustering has a number of techniques that have been developed in statistics, pattern recognition, data mining, and other fields. Subspace clustering enumerates clusters of objects in all subspaces of a dataset. It tends to produce many overlapping clusters. Approach: Subspace clustering and projected clustering are research areas for clustering in high dimensional spaces. In this research we experiment with three clustering-oriented algorithms, PROCLUS, P3C and STATPC. Results...

  13. Teaching Heterogeneous Classes.

    Millrood, Radislav

    2002-01-01

    Discusses an approach to teaching heterogeneous English-as-a-Second/Foreign-Language classes. Draws on classroom research data to describe the features of a success-building lesson context. (Author/VWL)

  14. IELP Class Observation

    陈了了

    2010-01-01

    As an exchange student majoring in English, I am curious about how English is taught to international students here in America. Therefore, I observed an IELP (Intensive English Learning Program) class in Central Connecticut State University where I study.

  15. PSYCH 515 Complete Class

    admin

    2015-01-01

      PSYCH 515 Advanced Abnormal Psychology To purchase this material click on below link http://www.assignmentcloud.com/PSYCH-515/PSYCH-515-Complete-Class-Guide For more details www.assignmentcloud.com

  16. Paradox of class description

    吕光

    2004-01-01

    We have a more active class atmosphere, but more passive self-study situations. We are too talkative when we should bury ourselves in books, but too inefficient when we spend too much time. We complain teachers

  17. Equilibrium and flow of cluster-forming complex fluids

    In this talk, I will present an overview of the unusual properties of a novel class of systems in soft matter physics, in which cluster formation takes place in the complete absence of attractions. After formulating a mathematical criterion as a necessary and sufficient condition for cluster formation, I will discuss the unusual structural, dynamical and phononic properties of cluster solids in equilibrium, showing, among others, that these are diffusive, that they provide for a realization of the Einstein model of solids and that at low temperatures cluster solids possess infinitely many isostructural critical points. Under shear flow, cluster solids organize in forms resembling the Abrikosov lattice of superconductors and they show a pressure flow behavior typical of colloidal glasses. Finally, I will demonstrate the construction of realistic microscopic models that allow for the formation of cluster crystals in the computer, opening the way to their experimental realization. (author)

  18. A Novel Clustering Algorithm Inspired by Membrane Computing

    Hong Peng

    2015-01-01

    P systems are a class of distributed parallel computing models; this paper presents a novel clustering algorithm, called the membrane clustering algorithm, which is inspired by the mechanism of a tissue-like P system with a loop structure of cells. The objects of the cells express the candidate centers of clusters and are evolved by the evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior to or competitive with the k-means algorithm and several evolutionary clustering algorithms recently reported in the literature.

  19. Embodying class and gender

    Geers, Alexie

    2015-01-01

    In March 1937, when the first issue of Marie-Claire was published, the images of the female body it presented to its female readers from working-class backgrounds contrasted sharply with those featured in previous magazines. The female bodies are dressed and groomed to seduce and replace the hieratic bodies that presented fashions synonymous with membership in the upper classes. The present essay examines this shift and shows that the visual repertoire employed is borrowed from that of the fe...

  20. Generalized Fourier transforms classes

    Berntsen, Svend; Møller, Steen

    2002-01-01

    The Fourier class of integral transforms with kernels $B(\omega r)$ has by definition inverse transforms with kernel $B(-\omega r)$. The space of such transforms is explicitly constructed. A slightly more general class of generalized Fourier transforms is introduced. From the general theory it follows that integral transforms with kernels which are products of a Bessel and a Hankel function or which are of a certain general hypergeometric type have inverse transforms of the same structure....

  1. Nordic Walking Classes

    Fitness Club

    2015-01-01

    Four classes of one hour each are held on Tuesdays. RDV barracks parking at Entrance A, 10 minutes before class time. Spring Course 2015: 05.05/12.05/19.05/26.05 Prices 40 CHF per session + 10 CHF club membership 5 CHF/hour pole rental Check out our schedule and enroll at: https://espace.cern.ch/club-fitness/Lists/Nordic%20Walking/NewForm.aspx? Hope to see you among us! fitness.club@cern.ch

  2. Polynuclear technetium halide clusters

    The development of the chemistry of polynuclear technetium halide clusters is considered through works devoted to their synthesis, structure and the investigation of their chemical and physical properties. The role of academician V.I. Spitsyn as an initiator of the investigation of polynuclear technetium halide clusters in the Institute of Physical Chemistry of the Academy of Science of the USSR is noted. Reactions and stability of cluster halides, and their molecular and electronic structures, are analyzed. Prospects for the development of polynuclear technetium halide cluster chemistry, a field at the junction of cluster chemistry and the theory of metal-metal multiple bonds, are assessed.

  3. Cluster analysis for applications

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  4. Chaotic map clustering algorithm for EEG analysis

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.

  5. Survey on Text Document Clustering

    M.Thangamani; Dr.P.Thangaraj

    2010-01-01

    Document clustering is also referred to as text clustering, and its concept is nearly the same as that of data clustering. It is difficult to find selected information in an ‘N’ number of information series, which is why document clustering came into the picture. Basically, a cluster means a group of similar data, and document clustering means segregating the data into different groups of similar data. Clustering can be of mathematical, statistical or numerical domain. Clustering is a fundamental data analysi...

  6. Unconventional methods for clustering

    Kotyrba, Martin

    2016-06-01

    Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is the main task of exploratory data mining and a common technique for statistical data analysis used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. The topic of this paper is one of the modern methods of clustering, namely SOM (Self Organising Map). The paper describes the theory needed to understand the principle of clustering and describes the algorithms used for clustering in our experiments.

  7. Clusters in nuclei

    Beck, Christian

    Following the pioneering discovery of alpha clustering and of molecular resonances, the field of nuclear clustering is today one of those domains of heavy-ion nuclear physics that faces the greatest challenges, yet also contains the greatest opportunities. After many summer schools and workshops, in particular over the last decade, the community of nuclear molecular physicists has decided to collaborate in producing a comprehensive collection of lectures and tutorial reviews covering the field. This third volume follows the successful Lect. Notes Phys. 818 (Vol. 1) and 848 (Vol. 2), and comprises six extensive lectures covering the following topics:  - Gamma Rays and Molecular Structure - Faddeev Equation Approach for Three Cluster Nuclear Reactions - Tomography of the Cluster Structure of Light Nuclei Via Relativistic Dissociation - Clustering Effects Within the Dinuclear Model : From Light to Hyper-heavy Molecules in Dynamical Mean-field Approach - Clusterization in Ternary Fission - Clusters in Light N...

  8. Spatial cluster modelling

    Lawson, Andrew B

    2002-01-01

    Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...

  9. Evolution of Galaxy and Quasar Clustering

    Bagla, J. S.

    1997-01-01

    We study the evolution of correlation function of dark matter halos in the CDM class of models. We show that the halo correlation function does not evolve in proportion with the correlation function of the underlying mass distribution. Earliest halos to collapse, which correspond to rare peaks in the density field, cluster very strongly. The amplitude of halo correlation function decreases from its initial, large, value. This decrease continues till the average peaks have collapsed, after whi...

  10. Agricultural Clusters in the Netherlands

    Schouten, M.A.; Heijman, W.J.M.

    2012-01-01

    Michael Porter was the first to use the term cluster in an economic context. He introduced the term in The Competitive Advantage of Nations (1990). The term cluster is also known as business cluster, industry cluster, competitive cluster or Porterian cluster. This article aims at determining and mea

  11. Endogenous Small RNA Clusters in Plants

    Yong-Xin Liu; Meng Wang; Xiu-Jie Wang

    2014-01-01

    In plants, small RNAs (sRNAs) usually refer to non-coding RNAs (ncRNAs) with lengths of 20-24 nucleotides. sRNAs are involved in the regulation of many essential processes related to plant development and environmental responses. sRNAs in plants are mainly grouped into microRNAs (miRNAs) and small interfering RNAs (siRNAs), and the latter can be further classified into trans-acting siRNAs (ta-siRNAs), repeat-associated siRNAs (ra-siRNAs), natural anti-sense siRNAs (nat-siRNAs), etc. Many sRNAs exhibit a clustered distribution pattern in the genome. Here, we summarize the features and functions of cluster-distributed sRNAs, aimed to not only provide a thorough picture of sRNA clusters (SRCs) in plants, but also shed light on the identification of new classes of functional sRNAs.

  12. Zipf's Law and the Universality Class of the Fragmentation Phase Transition

    Bauer, Wolfgang; Pratt, Scott; Alleman, Brandon

    2005-01-01

    We show that Zipf's Law for the largest clusters is not valid in an exact sense at the critical point of the fragmentation phase transition, contrary to previous claims. Instead, the extracted distributions of the largest clusters reflect the choice of universality class through the value of the critical exponent tau.
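
    For readers unfamiliar with the rank-size form of Zipf's Law referred to here (the n-th largest cluster having a size roughly proportional to 1/n), a minimal sketch of how such a rank-size exponent could be estimated is given below. It is an illustration only, assuming a simple least-squares fit in log-log space; it is not the analysis used in the paper, and the function name is hypothetical.

```python
import numpy as np

def zipf_exponent(sizes):
    """Estimate lam in size(rank) ~ rank**(-lam) from a list of cluster sizes.

    Sizes are sorted in descending order, ranked 1..n, and a straight line is
    fitted to log(size) versus log(rank). Exact Zipf behaviour gives lam = 1.
    """
    s = np.sort(np.asarray(sizes, dtype=float))[::-1]
    ranks = np.arange(1, len(s) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(s), 1)
    return -slope
```

    A fitted exponent that departs systematically from 1 would be the kind of deviation from exact Zipf behaviour that the abstract describes.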

  13. $\mathcal{N}=1$ theories of class $\mathcal{S}_k$

    Gaiotto, Davide; Razamat, Shlomo S.

    2015-07-01

    We construct classes of superconformal theories elements of which are labeled by punctured Riemann surfaces. Degenerations of the surfaces correspond, in some cases, to weak coupling limits. Different classes are labeled by two integers $(N, k)$. The $k = 1$ case coincides with $A_{N-1}$ theories of class $\mathcal{S}$ and simple examples of theories with $k > 1$ are orbifolds of some of the $A_{N-1}$ class $\mathcal{S}$ theories. For the space of theories to be complete in an appropriate sense we find it necessary to conjecture existence of new strongly coupled SCFTs. These SCFTs when coupled to additional matter can be related by dualities to gauge theories. We discuss in detail the $A_1$ case with $k = 2$ using the supersymmetric index as our analysis tool. The index of theories in classes with $k > 1$ can be constructed using eigenfunctions of elliptic quantum mechanical models generalizing the Ruijsenaars-Schneider integrable model. When the elliptic curve of the model degenerates these eigenfunctions become polynomials with coefficients being algebraic expressions in fugacities, generalizing the Macdonald polynomials with rational coefficients appearing when $k = 1$.

  14. Does class attendance still matter?

    Abel Nyamapfene

    2010-01-01

    This paper presents a study on the impact of class attendance on academic performance in a second year Electronics Engineering course module with online notes and no mandatory class attendance policy. The study shows that class attendance is highly correlated to academic performance, despite the availability of online class notes. In addition, there is significant correlation between class attendance and non-class contact with the lecturer and between student performance in the first year of ...

  15. Two steps in the evolution of Antennapedia-class vertebrate homeobox genes.

    Kappen, C. (Christian); Schughart, K; Ruddle, F H

    1989-01-01

    Antennapedia-class vertebrate homeobox genes have been classified with regard to their chromosomal locations and nucleotide sequence similarities within the 183-base-pair homeobox domain. The results of these comparisons support the view that in mammals and most likely the vertebrates, four clusters of homeobox genes exist that were created by duplications of an entire primordial gene cluster. We present evidence that this primordial cluster arose by local gene duplications of homeoboxes that...

  16. Spatial Scan Statistic: Selecting clusters and generating elliptic clusters

    Christiansen, Lasse Engbo; Andersen, Jens Strodl

    2004-01-01

    The spatial scan statistic is widely used to search for clusters. This paper shows that the usually applied elimination of overlapping clusters to find secondary clusters is sensitive to smooth changes in the shape of the clusters. We present an algorithm for generation of set of confocal elliptic...... clusters. In addition, we propose a new way to present the information in a given set of clusters based on the significance of the clusters....

  17. Clustering Categorical Data:A Cluster Ensemble Approach

    He Zengyou(何增友); Xu Xiaofei; Deng Shengchun

    2003-01-01

    Clustering categorical data, an integral part of data mining, has attracted much attention recently. In this paper, the authors formally define the categorical data clustering problem as an optimization problem from the viewpoint of cluster ensemble, and apply a cluster ensemble approach to clustering categorical data. Experimental results on real datasets show that better clustering accuracy can be obtained than with existing categorical data clustering algorithms.
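
    One way to picture the cluster ensemble viewpoint is to treat each categorical attribute as a partition of the records (records sharing a value fall in the same group) and then combine those partitions. The sketch below builds a simple co-association matrix on that assumption; it is an illustrative reading, not the authors' optimization procedure, and the function name is hypothetical.

```python
def co_association(records):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    attributes on which records i and j take the same value.

    Each attribute acts as one base clustering; the matrix summarizes how
    often the base clusterings place two records together.
    """
    n = len(records)
    m = len(records[0])
    return [
        [sum(a == b for a, b in zip(records[i], records[j])) / m for j in range(n)]
        for i in range(n)
    ]
```

    The resulting matrix can be handed to any similarity-based clustering method to obtain a consensus partition.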

  18. Cluster brand as a competitive advantage. Case: Airport cluster Finland

    Väinölä, Lotta-Elviira

    2015-01-01

    Objective of the Study: The objective of this study is to explore the phenomenon of cluster branding. This study investigates cluster brand as a competitive advantage that impacts the success or decline of the cluster. The research questions examine three aspects: (1) cluster branding as a process, (2) the concrete tools that can be used in cluster branding and (3) the perceived benefits of cluster brand. The study aims to produce a generic model for cluster branding, which can be used as...

  19. Integrating cluster formation and cluster evaluation in interactive visual analysis

    Turkay, C.; Parulek, J.; Reuter, N.; Hauser, H.

    2011-01-01

    Cluster analysis is a popular method for data investigation where data items are structured into groups called clusters. This analysis involves two sequential steps, namely cluster formation and cluster evaluation. In this paper, we propose the tight integration of cluster formation and cluster evaluation in interactive visual analysis in order to overcome the challenges that relate to the black-box nature of clustering algorithms. We present our conceptual framework in the form of an interac...

  20. Translation in ESL Classes

    Nagy Imola Katalin

    2015-12-01

    The problem of translation in foreign language classes cannot be dealt with unless we attempt to make an overview of what translation meant for language teaching in different periods of language pedagogy. From the translation-oriented grammar-translation method through the complete ban on translation and mother tongue during the times of the audio-lingual approaches, we have come today to reconsider the role and status of translation in ESL classes. This article attempts to advocate for translation as a useful ESL class activity, which can completely fulfil the requirements of communicativeness. We also attempt to identify some activities and games, which rely on translation in some books published in the 1990s and the 2000s.

  1. Coping With New Challenges for Density-Based Clustering

    Kröger, Peer

    2004-01-01

    Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. The core step of the KDD process is the application of a Data Mining algorithm in order to produce a particular enumeration of patterns and relationships in large databases. Clustering is one of the major data mining tasks and aims at grouping the data objects into meaningful classes (clusters) such that the similarity of objects wi...

  2. Bridging the gap between cluster and grid computing

    Alves, Albano; Pina, António

    2006-01-01

    The Internet computing model with its ubiquitous networking and computing infrastructure is driving a new class of interoperable applications that benefit both from high computing power and multiple Internet connections. In this context, grids are promising computing platforms that allow to aggregate distributed resources such as workstations and clusters to solve large-scale problems. However, because most parallel programming tools were primarily developed for MPP and cluster computing, to ...

  3. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    Grazioli, Jacopo; Tuia, Devis; Berne, Alexis

    2015-01-01

    A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number of hydrometeor classes (nopt) that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assign...

  4. Classification of District TR72 Towns with Fuzzy Clustering Analysis Using Socio-Economic Data

    Erilli, N. Alp

    2014-01-01

    In economy policies, socioeconomic indicators have an important place in determining the development levels. The determination and classification of the current social and economic structure of cities and districts are considerably important in analyzing the development of cities and districts in their probable development tendencies and in developing the regional development policies in parallel to this. It is essential to use fuzzy clustering analysis, whether clusters separated well or the...

  5. Class hierarchy method to find Change-Proneness

    Malan V.Gaikwad

    2011-01-01

    Finding the proneness of software is necessary to identify fault-prone and change-prone classes at earlier stages of development, so that those classes can be given special attention, and also to improve the quality and reliability of the software. For corrective and adaptive maintenance, changes must be made during software evolution. As such changes cluster around a number of key components in software, it is important to analyze the frequency of changes in individual classes and also to identify and show related changes in multiple classes. Early detection of fault-prone and change-prone classes enables developers and experts to spend their valuable time and resources on these areas of software. Prediction of change-prone and fault-prone classes of software is an active topic in the area of software engineering. Such prediction can be used to predict changes to different classes of a system from one release of software to the next. Identifying the change-prone and fault-prone classes in advance helps to focus attention on these classes. In this paper we focus on finding the dependency of software, which can be achieved by estimating the proneness of object-oriented software. Two main types of proneness are associated with OO software: fault proneness and change proneness.

  6. Disentangling Porterian Clusters

    Jagtfelt, Tue

    This dissertation investigates the contemporary phenomenon of industrial clusters based on the work of Michael E. Porter, the central progenitor and promoter of the cluster notion. The dissertation pursues two central questions: 1) What is a cluster? and 2) How could Porter’s seemingly fuzzy, contested theory become so widely disseminated and applied as a normative and prescriptive strategy for economic development? The dissertation traces the introduction of the cluster notion into the EU’s Lisbon Strategy and demonstrates how its inclusion originates from Porter’s colleagues: Professor Örjan Sölvell, Dr. Christian Ketels and Dr. Göran Lindqvist. Taking departure in Porter’s works and the cluster literature, the dissertation shows that a considerable paradigmatic shift has occurred from the first edition of Nations to the present state of cluster cooperation. To elaborate on this change and the...

  7. From collisions to clusters

    Loukonen, Ville; Bork, Nicolai; Vehkamaki, Hanna

    2014-01-01

    The clustering of sulphuric acid with base molecules is one of the main pathways of new-particle formation in the Earth's atmosphere. The first step in the clustering process is likely the formation of a (sulphuric acid)_1(base)_1(water)_n cluster. Here, we present results from direct first-principles molecular dynamics collision simulations of (sulphuric acid)_1(water)_{0,1} + (dimethylamine) → (sulphuric acid)_1(dimethylamine)_1(water)_{0,1} cluster formation processes. The simulations indicate that the sticking factor in the collisions is unity: the interaction between the molecules is strong enough to overcome the possible initial non-optimal collision orientations. No post-collisional cluster break up is observed. The reasons for the efficient clustering are (i) the proton transfer reaction which takes place in each of the collision simulations and (ii) the subsequent competition over the proton...

  8. Cosmology with cluster surveys

    Subhabrata Majumdar

    2004-10-01

    Surveys of clusters of galaxies provide us with a powerful probe of the density and nature of the dark energy. The red-shift distribution of detected clusters is highly sensitive to the dark energy equation of state parameter $w$. Upcoming Sunyaev–Zel'dovich (SZ) surveys would provide us large yields of clusters to very high red-shifts. Self-calibration of cluster scaling relations, possible for such a huge sample, would be able to constrain systematic biases on mass estimators. Combining cluster red-shift abundance with limited mass follow-up and cluster mass power spectrum can then give constraints on $w$, as well as on $\sigma_8$ and $\Omega_M$, to a few per cent.

  9. Cluster Management Institutionalization

    Normann, Leo; Agger Nielsen, Jeppe

    2015-01-01

    This article explores a new management form – cluster management – in Danish public sector day care. Although cluster management has been widely adopted in Danish day care at the municipality level, it has attracted only sparse research attention. We use theoretical insights from Scandinavian institutionalism together with a longitudinal case-based inquiry into how cluster management has entered and penetrated the management practices of day care in Denmark. We demonstrate how cluster management became widely adopted in the day care field not only because of its intrinsic properties but also because of how it was legitimized as a “ready-to-use” management model. Further, our account reveals how cluster management translated into considerably different local variants as it travelled into specific organizations. However, these processes have not occurred sequentially with cluster management first...

  10. BIO 315 Complete Class

    admn

    2015-01-01

    BIO 315 Complete Class Check this A+ tutorial guideline at   http://www.assignmentcloud.com/BIO-315/BIO-315-Complete-Class BIO 315 Week 1 DQ 1 BIO 315 Week 1 DQ 2 BIO 315 Week 1 Individual Assignment Beren Robinson Field Study Paper BIO 315 Week 2 DQ 1 BIO 315 Week 2 DQ 2 BIO 315 Week 2 DQ 3 BIO 315 Week 2 Individual Assignment Environment Resources and Competition BIO 315 Week 2 Week Two Learning Team Exercises BIO 315 Week 3 DQ 1 BIO ...

  11. Class actions in Portugal

    Raimundo, Maria Carlos Miranda

    2013-01-01

    Even eighteen years after the implementation of Law 83/95 of August 31, on the rights of participation in class action litigation in Portugal, there is not sufficient evidence of its applicability. Contrary to what is observed in other countries such as the United States and Brazil, it seems that in Portugal the several parties that would be involved in class action litigation have no interest in obtaining information, or in informing other parties, about the main procedures on it and t...

  12. Talking Class in Tehroon

    Elling, Rasmus Christian; Rezakhani, Khodadad

    2016-01-01

    Persian, like any other language, is laced with references to class, both blatant and subtle. With idioms and metaphors, Iranians can identify and situate others, and thus themselves, within hierarchies of social status and privilege, both real and imagined. Some class-related terms can be traced back to medieval times, whereas others are of modern vintage, the linguistic legacy of television shows, pop songs, social media memes or street vernacular. Every day, it seems, an infectious set of phrases appears that make yesterday’s seem embarrassingly antiquated....

  13. An "expanded" class perspective

    Steur, Luisa Johanna

    2014-01-01

    Adivasis against their age-old colonization or the work of ‘external’ agitators. Capitalist restructuring and ‘globalization’ was generally seen as simply the latest chapter in the suffering of these Adivasis. Little focused attention was paid to the recent class trajectory of their lives under changing...... analysis, as elaborated in Marxian anthropology, this article provides an alternative to the liberal-culturalist explanation of indigenism in Kerala, arguing instead that contemporary class processes—as experienced close to the skin by the people who decided to participate in the Muthanga struggle...

  14. Residues of Chern classes

    Suwa, Tatsuo; 諏訪, 立雄

    2003-01-01

    If we have a finite number of sections of a complex vector bundle E over a manifold M, certain Chern classes of E are localized at the singular set S, i.e., the set of points where the sections fail to be linearly independent. When S is compact, the localizations define the residues at each connected component of S by the Alexander duality. If M itself is compact, the sum of the residues is equal to the Poincaré dual of the corresponding Chern class. This type of theory is also developed for ...

  15. Residues of Chern classes

    Suwa, Tatsuo

    2003-01-01

    If we have a finite number of sections of a complex vector bundle $E$ over a manifold $M$, certain Chern classes of $E$ are localized at the singular set $S$, i.e., the set of points where the sections fail to be linearly independent. When $S$ is compact, the localizations define the residues at each connected component of $S$ by the Alexander duality. If $M$ itself is compact, the sum of the residues is equal to the Poincaré dual of the corresponding Chern class. This type of theory is als...

  16. Nanophase materials assembled from clusters

    Siegel, R.W.

    1992-02-01

    The preparation of metal and ceramic atom clusters by means of the gas-condensation method, followed by their in situ collection and consolidation under high-vacuum conditions, has recently led to the synthesis of a new class of ultrafine-grained materials. These nanophase materials, with typical average grain sizes of 5 to 50 nm and, hence, a large fraction of their atoms in interfaces, exhibit properties that are often considerably improved relative to those of conventional materials. Furthermore, their synthesis and processing characteristics should enable the design of new materials with unique properties. Some examples are ductile ceramics that can be formed and sintered to full density at low temperatures without the need for binding or sintering aids, and metals with dramatically increased strength. The synthesis of these materials is briefly described along with what is presently known of their structure and properties. Their future impact on materials science and technology is also considered.

  17. A Study of the Classification Capabilities of Neural Networks Using Unsupervised Learning: A Comparison with K-Means Clustering.

    Balakrishnan, P. V. (Sunder); And Others

    1994-01-01

    A simulation study compares nonhierarchical clustering capabilities of a class of neural networks using Kohonen learning with a K-means clustering procedure. The focus is on the ability of the procedures to recover correctly the known cluster structure in the data. Advantages and disadvantages of the procedures are reviewed. (SLD)
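
    As a rough illustration of the kind of comparison described above, the sketch below runs a batch K-means procedure and an online winner-take-all update (a simplification of Kohonen learning with no neighbourhood function) from the same starting centers, then scores how well each recovers a known partition. The function names, the learning rate, and the use of the Rand index as the recovery measure are all assumptions made for this example, not details taken from the study.

```python
from math import dist


def nearest(x, centers):
    """Index of the center closest to point x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: dist(x, centers[k]))


def k_means(points, centers, iters=20):
    """Batch K-means (Lloyd's algorithm); returns one cluster label per point."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in points:
            groups[nearest(x, centers)].append(x)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return [nearest(x, centers) for x in points]


def kohonen_style(points, centers, epochs=20, lr=0.3):
    """Online winner-take-all learning: only the winning unit moves toward x."""
    w = [tuple(c) for c in centers]
    for _ in range(epochs):
        for x in points:
            k = nearest(x, w)
            w[k] = tuple(wi + lr * (xi - wi) for wi, xi in zip(w[k], x))
    return [nearest(x, w) for x in points]


def recovery_rate(found, truth):
    """Rand index: fraction of point pairs on which the two partitions agree."""
    n = len(truth)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    agree = sum((found[i] == found[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)
```

    Points are expected as tuples of floats. Because the Rand index compares pairs rather than label values, it does not matter which cluster number each procedure happens to give a group.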

  18. Agricultural Clusters in China

    Kiminami, Lily; Kiminami, Akira

    2009-01-01

    The purpose of this study is to assess the potential of clustering in the development of agriculture and rural communities in China. We shall examine in detail the food industry, which is the link in the food chain that propels the industrialization of agriculture, and identify instances of industrial agglomeration and business collaboration. Next, we shall analyze the externalities (i.e. spillovers) of clusters, demand conditions in cluster formation, and the effectiveness of business collab...

  19. The Durban Auto Cluster

    Lorentzen, Jochen; Robbins, Glen; Barnes, Justin

    2004-01-01

    The paper describes the formation of the Durban Auto Cluster in the context of trade liberalization. It argues that the improvement of operational competitiveness of firms in the cluster is prominently due to joint action. It tests this proposition by comparing the gains from cluster activities in the areas of supplier development, human resource development, logistics, and benchmarking, and by contrasting the impact of joint action against a host of other variables, notably international com...

  20. Clustering Techniques in Bioinformatics

    Muhammad Ali Masood

    2015-01-01

    Dealing with data means grouping information into a set of categories, either in order to learn new artifacts or to understand new domains. For this purpose researchers have always looked for hidden patterns in data that can be defined and compared with other known notions, based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, with its tools of data classification and data clustering, is one of the most powerful techniques for dealing with data in such a manner that it helps researchers identify the required information. As a step towards addressing this challenge, experts have used clustering techniques as a means of exploring hidden structure and patterns in the underlying data. With improved stability, robustness and accuracy of unsupervised data classification in many fields, including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself a reliable tool. To identify the clusters in a dataset, algorithms are used to partition the data set into several groups based on the similarity within a group. There is no single universal clustering algorithm; rather, various algorithms are used depending on the domain of the data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized according to different approaches. This paper is a survey of a few of the many clustering techniques in data mining. For this purpose, five of the most common clustering techniques are discussed: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Self-Organizing Map (SOM) clustering.

  1. Cluster Symmetries and Dynamics

    Freer Martin

    2016-01-01

    Many light nuclei display behaviour that indicates that, rather than behaving as an A-body system, the protons and neutrons condense into clusters. The α-particle is the most obvious example of such clustering. This contribution examines the role of such α-clustering in the structure, symmetries and dynamics of the nuclei 8Be, 12C and 16O, recent experimental measurements and future perspectives.

  2. Cluster headache with aura

    Martínez-Fernández, Eva; Alberca, Roman; Mir, Pablo; Franco, Emilio; Montes, Enrique; Lozano, Pilar

    2002-01-01

    The objective of our study is to report the frequency and characteristics of cluster headache with aura among the population of patients with cluster headache treated in our outpatient neurology clinic. 254 patients were submitted to semi-structured interviews to identify the presence of symptoms similar to the migraine aura. 5 patients who suffered from a cluster headache with aura filled a diary with the characteristics of the pain attacks and the aura. All the patients with either episodic...

  3. Securing personal network clusters

    Jehangir, Assed; Heemstra de Groot, Sonia M.

    2007-01-01

    A Personal Network is a self-organizing, secure and private network of a user’s devices notwithstanding their geographic location. It aims to utilize pervasive computing to provide users with new and improved services. In this paper we propose a model for securing Personal Network clusters. Clusters are ad-hoc networks of co-located personal devices. The ad-hoc makeup of clusters, coupled with the resource constrained nature of many constituent devices, makes enforcing security a challenging ...

  4. 15th Cluster workshop

    Laakso, Harri; Escoubet, C. Philippe; The Cluster Active Archive : Studying the Earth’s Space Plasma Environment

    2010-01-01

    Since the year 2000 the ESA Cluster mission has been investigating the small-scale structures and processes of the Earth's plasma environment, such as those involved in the interaction between the solar wind and the magnetospheric plasma, in global magnetotail dynamics, in cross-tail currents, and in the formation and dynamics of the neutral line and of plasmoids. This book contains presentations made at the 15th Cluster workshop held in March 2008. It also presents several articles about the Cluster Active Archive and its datasets, a few overview papers on the Cluster mission, and articles reporting on scientific findings on the solar wind, the magnetosheath, the magnetopause and the magnetotail.

  5. Management of cluster headache.

    Tfelt-Hansen, Peer C; Jensen, Rigmor H

    2012-07-01

    The prevalence of cluster headache is 0.1% and cluster headache is often not diagnosed or misdiagnosed as migraine or sinusitis. In cluster headache there is often a considerable diagnostic delay - an average of 7 years in a population-based survey. Cluster headache is characterized by very severe or severe orbital or periorbital pain with a duration of 15-180 minutes. The cluster headache attacks are accompanied by characteristic associated unilateral symptoms such as tearing, nasal congestion and/or rhinorrhoea, eyelid oedema, miosis and/or ptosis. In addition, there is a sense of restlessness and agitation. Patients may have up to eight attacks per day. Episodic cluster headache (ECH) occurs in clusters of weeks to months duration, whereas chronic cluster headache (CCH) attacks occur for more than 1 year without remissions. Management of cluster headache is divided into acute attack treatment and prophylactic treatment. In ECH and CCH the attacks can be treated with oxygen (12 L/min) or subcutaneous sumatriptan 6 mg. For both oxygen and sumatriptan there are two randomized, placebo-controlled trials demonstrating efficacy. In both ECH and CCH, verapamil is the prophylactic drug of choice. Verapamil 360 mg/day was found to be superior to placebo in one clinical trial. In clinical practice, daily doses of 480-720 mg are mostly used. Thus, the dose of verapamil used in cluster headache treatment may be double the dose used in cardiology, and with the higher doses the PR interval should be checked with an ECG. At the start of a cluster, transitional preventive treatment such as corticosteroids or greater occipital nerve blockade can be given. In CCH and in long-standing clusters of ECH, lithium, methysergide, topiramate, valproic acid and ergotamine tartrate can be used as add-on prophylactic treatment. In drug-resistant CCH, neuromodulation with either occipital nerve stimulation or deep brain stimulation of the hypothalamus is an alternative treatment strategy

  6. Statistical Properties of Convex Clustering

    Tan, Kean Ming; Witten, Daniela

    2015-01-01

    In this manuscript, we study the statistical properties of convex clustering. We establish that convex clustering is closely related to single linkage hierarchical clustering and $k$-means clustering. In addition, we derive the range of tuning parameter for convex clustering that yields a non-trivial solution. We also provide an unbiased estimate of the degrees of freedom, and provide a finite sample bound for the prediction error for convex clustering. We compare convex clustering to some tr...
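
    For orientation, convex clustering is commonly posed as the penalized least-squares problem below. The notation ($x_i$ for the $n$ observations, $u_i$ for their fitted centroids, $\lambda \ge 0$ for the tuning parameter, $q$ for the norm of the fusion penalty) is assumed here and is not taken from the truncated abstract.

```latex
\min_{u_1,\dots,u_n}\;
  \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \lambda \sum_{i<j} \lVert u_i - u_j \rVert_q ,
  \qquad q \in \{1, 2, \infty\}
```

    Observations $i$ and $j$ are placed in the same cluster when $u_i = u_j$ at the solution. With $\lambda = 0$ every $u_i$ equals $x_i$ (each point is its own cluster), while for sufficiently large $\lambda$ all $u_i$ fuse to the sample mean, which is why only an intermediate range of $\lambda$ gives a non-trivial clustering.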

  7. A Uniqueness Theorem for Clustering

    Zadeh, Reza Bosagh; Ben-David, Shai

    2012-01-01

    Despite the widespread use of Clustering, there is distressingly little general theory of clustering available. Questions like "What distinguishes a clustering of data from other data partitioning?", "Are there any principles governing all clustering paradigms?", "How should a user choose an appropriate clustering algorithm for a particular task?", etc. are almost completely unanswered by the existing body of clustering literature. We consider an axiomatic approach to the theory of Clustering...

  8. Statistical properties of convex clustering

    Tan, Kean Ming; Witten, Daniela

    2015-01-01

    In this manuscript, we study the statistical properties of convex clustering. We establish that convex clustering is closely related to single linkage hierarchical clustering and $k$-means clustering. In addition, we derive the range of the tuning parameter for convex clustering that yields a non-trivial solution. We also provide an unbiased estimator of the degrees of freedom, and provide a finite sample bound for the prediction error for convex clustering. We compare convex clustering to so...

  9. Fast Density Based Clustering Algorithm

    Priyanka Trikha; Singh Vijendra

    2013-01-01

    The clustering problem is an unsupervised learning problem. It is a procedure that partitions data objects into matching clusters. The data objects in the same cluster are quite similar to each other and dissimilar to those in other clusters. Traditional algorithms do not simultaneously meet the latest multiple requirements for objects. Density-based clustering algorithms find clusters based on the density of data points in a region. The DBSCAN algorithm is one of the density-based clustering algorithms. I...
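
    To make the density-based idea concrete, here is a minimal sketch of the textbook DBSCAN procedure using a brute-force neighbourhood search. It is not the faster variant this record proposes, whose details are cut off above; the names `eps`, `min_pts` and `region_query` are the usual textbook ones and are assumptions here.

```python
from math import dist

NOISE = -1  # label given to points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels
```

    Points are expected as tuples of floats. The brute-force `region_query` makes this sketch quadratic in the number of points; practical implementations usually replace it with a spatial index.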

  10. Document Clustering Based on Semi-Supervised Term Clustering

    Hamid Mahmoodi

    2012-05-01

    The study proposes a multi-step feature (term) selection process and, in a semi-supervised fashion, provides initial centers for term clusters. It then uses the fuzzy c-means (FCM) clustering algorithm to cluster terms. Finally, each document is assigned to the closest associated term clusters. While most text clustering algorithms directly use documents for clustering, we propose to first group the terms using the FCM algorithm and then cluster documents based on the term clusters. We evaluate the effectiveness of our technique on several standard text collections and compare our results with some classical text clustering algorithms.
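
    The two-stage idea (cluster the terms with FCM from supplied initial centers, then place documents by their terms) can be sketched as follows. This is only an illustration under stated assumptions: terms are represented as numeric vectors, the fuzzifier is fixed at `m = 2`, and `assign_documents`, which picks the cluster with the largest summed term membership, is a hypothetical assignment rule, since the abstract does not say how "closest" is measured.

```python
from math import dist


def memberships(points, centers, m=2.0):
    """FCM membership matrix U[i][k] for fuzzifier m > 1; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    U = []
    for x in points:
        d = [dist(x, c) for c in centers]
        if 0.0 in d:
            # A point lying exactly on a center belongs fully to that center.
            hits = [1.0 if dk == 0.0 else 0.0 for dk in d]
            U.append([h / sum(hits) for h in hits])
        else:
            U.append([1.0 / sum((dk / dj) ** power for dj in d) for dk in d])
    return U


def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates, starting from init_centers."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(iters):
        U = memberships(points, centers, m)
        new_centers = []
        for k, old in enumerate(centers):
            w = [row[k] ** m for row in U]
            total = sum(w)
            if total == 0.0:
                new_centers.append(old)  # no point has weight here: keep center
                continue
            new_centers.append(tuple(
                sum(wi * x[a] for wi, x in zip(w, points)) / total
                for a in range(len(old))
            ))
        centers = new_centers
    return centers, memberships(points, centers, m)


def assign_documents(doc_terms, U):
    """Hypothetical rule: doc_terms[d] lists the term indices found in document d;
    each document goes to the term cluster with the largest summed membership."""
    labels = []
    for terms in doc_terms:
        scores = [sum(U[t][k] for t in terms) for k in range(len(U[0]))]
        labels.append(scores.index(max(scores)))
    return labels
```

    Term vectors and initial centers are expected as tuples of floats. Passing the initial centers in explicitly mirrors the semi-supervised seeding mentioned in the abstract; how those centers are obtained is left outside this sketch.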