WorldWideScience

Sample records for class cluster analysis

  1. Exploring the Relationship between Autism Spectrum Disorder and Epilepsy Using Latent Class Cluster Analysis

    Science.gov (United States)

    Cuccaro, Michael L.; Tuchman, Roberto F.; Hamilton, Kara L.; Wright, Harry H.; Abramson, Ruth K.; Haines, Jonathan L.; Gilbert, John R.; Pericak-Vance, Margaret

    2012-01-01

    Epilepsy co-occurs frequently in autism spectrum disorders (ASD). Understanding this co-occurrence requires a better understanding of the ASD-epilepsy phenotype (or phenotypes). To address this, we conducted latent class cluster analysis (LCCA) on an ASD dataset (N = 577) which included 64 individuals with epilepsy. We identified a 5-cluster…

  2. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova

    2014-01-01

    Brucellosis infection is a multisystem disease with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  3. Clustering Educational Digital Library Usage Data: A Comparison of Latent Class Analysis and K-Means Algorithms

    Science.gov (United States)

    Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei

    2013-01-01

    This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…

  4. Context-sensitive intra-class clustering

    KAUST Repository

    Yu, Yingwei

    2014-02-01

    This paper describes a new semi-supervised learning algorithm for intra-class clustering (ICC). ICC partitions each class into sub-classes in order to minimize overlap across clusters from different classes. This is achieved by allowing partitioning of a certain class to be assisted by data points from other classes in a context-dependent fashion. The result is that overlap across sub-classes (both within and across classes) is greatly reduced. ICC is particularly useful when combined with algorithms that assume that each class has a unimodal Gaussian distribution (e.g., Linear Discriminant Analysis (LDA), quadratic classifiers), an assumption that is not always true in many real-world situations. ICC can help partition non-Gaussian, multimodal distributions to overcome such a problem. In this sense, ICC works as a preprocessor. Experiments with our ICC algorithm on synthetic data sets and real-world data sets indicated that it can significantly improve the performance of LDA and quadratic classifiers. We expect our approach to be applicable to a broader class of pattern recognition problems where class-conditional densities are significantly non-Gaussian or multi-modal. © 2013 Elsevier Ltd. All rights reserved.

  5. Latent class cluster analysis to understand heterogeneity in prostate cancer treatment utilities

    Directory of Open Access Journals (Sweden)

    Meghani Salimah

    2009-01-01

    Background: Men with prostate cancer are often challenged to choose between conservative management and a range of available treatment options, each carrying varying risks and benefits. The trade-offs are between an improved life-expectancy with treatment accompanied by important risks such as urinary incontinence and erectile dysfunction. Previous studies of preference elicitation for prostate cancer treatment have found considerable heterogeneity in individuals' preferences for health states given similar treatments and clinical risks. Methods: Using a latent class mixture model (LCA), we first sought to understand whether there are unique patterns of heterogeneity or subgroups of individuals based on their prostate cancer treatment utilities (calculated time trade-off utilities for various health states) and, if such unique subgroups exist, what demographic and urological variables may predict membership in these subgroups. Results: The sample (N = 244) included men with prostate cancer (n = 188) and men at risk for disease (n = 56). The sample was predominantly white (77%), with a mean age of 60 years (SD ± 9.5). Most (85.9%) were married or living with a significant other. Using LCA, a three-class solution yielded the best model, as evidenced by the smallest Bayesian Information Criterion (BIC), a substantial reduction in BIC from a 2-class solution, and Lo-Mendell-Rubin significance of < .001. The three identified clusters were named high-traders (n = 31), low-traders (n = 116), and no-traders (n = 97). High-traders were more likely to trade survival time associated with treatment to avoid potential risks of treatment. Low-traders were less likely to trade survival time and accepted risks of treatment. The no-traders were likely to make no trade-offs in any direction, favouring the status quo. There was significant difference among the clusters in the importance of sexual activity (Pearson's χ2 = 16.55, P = 0.002; Goodman and Kruskal tau = 0.039, P < 0.001). In…

  6. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  7. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  8. Genomic sequence analysis of the 238-kb swine segment with a cluster of TRIM and olfactory receptor genes located, but with no class I genes, at the distal end of the SLA class I region.

    Science.gov (United States)

    Ando, Asako; Shigenari, Atsuko; Kulski, Jerzy K; Renard, Christine; Chardon, Patrick; Shiina, Takashi; Inoko, Hidetoshi

    2005-12-01

    Continuous genomic sequence has been previously determined for the swine leukocyte antigen (SLA) class I region from the TNF gene cluster at the border between the major histocompatibility complex (MHC) class III and class I regions to the UBD gene at the telomeric end of the classical class I gene cluster (SLA-1 to SLA-5, SLA-9, SLA-11). To complete the genomic sequence of the entire SLA class I genomic region, we have analyzed the genomic sequences of two BAC clones carrying a continuous 237,633-bp-long segment spanning from the TRIM15 gene to the UBD gene located on the telomeric side of the classical SLA class I gene cluster. Fifteen non-class I genes, including the zinc finger and the tripartite motif (TRIM) ring-finger-related family genes and olfactory receptor genes, were identified in the 238-kilobase (kb) segment, and their location in the segment was similar to their apparent human homologs. In contrast, a human segment (alpha block) spanning about 375 kb from the gene ETF1P1 and from the HLA-J to HLA-F genes was absent from the 238-kb swine segment. We conclude that the gene organization of the MHC non-class I genes located in the telomeric side of the classical SLA class I gene cluster is remarkably similar between the swine and the human segments, although the swine lacks a 375-kb segment corresponding to the human alpha block.

  9. Mutation classes of finite type cluster algebras with principal coefficients

    CERN Document Server

    Seven, Ahmet

    2011-01-01

    In this paper, we prove Conjecture 4.8 of "Cluster algebras IV" by S. Fomin and A. Zelevinsky, stating that the mutation classes of rectangular matrices associated with cluster algebras of finite type are precisely those classes which are finite.

  10. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that…
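
The record above scores each clustering against known classes with the adjusted Rand index. As a rough illustration of that measure (not the authors' code), the following minimal sketch computes it from the standard pair-counting formula; the function name `adjusted_rand_index` and the toy labelings are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items.

    Hypothetical helper for illustration; 1.0 means identical partitions,
    values near 0 mean agreement no better than chance.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: items sharing each (class, cluster) combination.
    cell_counts = Counter(zip(labels_true, labels_pred))
    class_counts = Counter(labels_true)
    cluster_counts = Counter(labels_pred)

    # Number of item pairs that fall together in cells, classes, clusters.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_classes = sum(comb(c, 2) for c in class_counts.values())
    sum_clusters = sum(comb(c, 2) for c in cluster_counts.values())

    expected = sum_classes * sum_clusters / comb(n, 2)
    max_index = (sum_classes + sum_clusters) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. everything in one group
    return (sum_cells - expected) / (max_index - expected)
```

For instance, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` treats the two labelings as the same partition, because only the grouping matters and not the label names.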

  11. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping separate observations according to their degree of similarity. The review provides a definition of the basic concepts of cluster analysis, and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given. PMID:24640781

  12. Class Restricted Clustering and Micro-Perturbation for Data Privacy

    OpenAIRE

    Li, Xiao-Bai; Sarkar, Sumit

    2013-01-01

    The extensive use of information technologies by organizations to collect and share personal data has raised strong privacy concerns. To respond to the public’s demand for data privacy, a class of clustering-based data masking techniques is increasingly being used for privacy-preserving data sharing and analytics. Traditional clustering-based approaches for masking numeric attributes, while addressing re-identification risks, typically do not consider the disclosure risk of categorical confid...

  13. Arabic web pages clustering and annotation using semantic class features

    Directory of Open Access Journals (Sweden)

    Hanan M. Alghamdi

    2014-12-01

    Effectively managing the great amount of data on Arabic web pages and enabling the classification of relevant information are very important research problems. Studies on sentiment text mining have been very limited in the Arabic language because they need to involve deep semantic processing. Therefore, in this paper, we aim to retrieve machine-understandable data with the help of a Web content mining technique to detect covert knowledge within these data. We propose an approach to achieve clustering with semantic similarities. This approach comprises integrating k-means document clustering with semantic feature extraction and document vectorization to group Arabic web pages according to semantic similarities and then show the semantic annotation. The document vectorization helps to transform text documents into a semantic class probability distribution or semantic class density. To reach semantic similarities, the approach extracts the semantic class features and integrates them into the similarity weighting schema. The quality of the clustering result was evaluated using the purity and the mean intra-cluster distance (MICD) evaluation measures. We have evaluated the proposed approach on a set of common Arabic news web pages. We have acquired favorable clustering results that are effective in minimizing the MICD, increasing the purity and lowering the runtime.
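
The record above evaluates clusterings with purity and the mean intra-cluster distance (MICD). The sketch below shows one common way to compute both. It assumes purity is the fraction of items belonging to the majority class of their cluster, and it assumes MICD is the average Euclidean distance from each point to its cluster centroid; the paper's exact MICD definition is not given in the abstract, so that part in particular is a guess, and both function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist


def purity(cluster_ids, class_labels):
    """Fraction of items that carry the majority class of their cluster."""
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, class_labels):
        members[cluster_id].append(label)
    # For each cluster, count how many items carry its most frequent class.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(cluster_ids)


def mean_intra_cluster_distance(points, cluster_ids):
    """Average distance from each point to the centroid of its cluster.

    Assumed definition of MICD (centroid-based, Euclidean distance).
    """
    groups = defaultdict(list)
    for point, cluster_id in zip(points, cluster_ids):
        groups[cluster_id].append(point)
    total = 0.0
    for cluster_points in groups.values():
        # Centroid = coordinate-wise mean of the cluster's points.
        centroid = [sum(axis) / len(cluster_points) for axis in zip(*cluster_points)]
        total += sum(dist(point, centroid) for point in cluster_points)
    return total / len(points)
```

Higher purity and lower MICD both indicate tighter, more class-consistent clusters, which matches the direction of improvement the abstract reports.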

  14. Clustering analysis using Swarm Intelligence

    OpenAIRE

    Farmani, Mohammad Reza

    2016-01-01

    This thesis is concerned with the application of swarm intelligence methods in clustering analysis of datasets. The main objectives of the thesis are: ∙ Take advantage of a novel evolutionary algorithm, called artificial bee colony, to improve the capability of K-means in finding global optimum clusters in nonlinear partitional clustering problems. ∙ Consider partitional clustering as an optimization problem and an improved ant-based algorithm, named Opposition-Based A...

  15. Filling the gap: a new class of old star cluster?

    CERN Document Server

    Forbes, Duncan; Usher, Christopher; Strader, Jay; Romanowsky, Aaron; Brodie, Jean; Arnold, Jacob; Spitler, Lee

    2013-01-01

    It is not understood whether long-lived star clusters possess a continuous range of sizes and masses (and hence densities), or if rather, they should be considered as distinct types with different origins. Utilizing the Hubble Space Telescope (HST) to measure sizes, and long exposures on the Keck 10m telescope to obtain distances, we have discovered the first confirmed star clusters that lie within a previously claimed size-luminosity gap dubbed the 'avoidance zone' by Hwang et al. (2011). The existence of these star clusters extends the range of sizes, masses and densities for star clusters, and argues against current formation models that predict well-defined size-mass relationships (such as stripped nuclei, giant globular clusters or merged star clusters). The red colours of these gap objects suggest that they are not a new class of object but are related to Faint Fuzzies observed in nearby lenticular galaxies. We also report a number of low luminosity UCDs with sizes of up to 50 pc. Future, statistically ...

  16. EM Clustering Analysis of Diabetes Patients Basic Diagnosis Index

    OpenAIRE

    Wu, Cai; Steinbauer, Jeffrey R.; Kuo, Grace M

    2005-01-01

    Cluster analysis can group similar instances into the same group. Partitioning clustering assigns classes to samples without knowing the classes in advance. The most common algorithms are K-means and Expectation Maximization (EM). The EM clustering algorithm can find the number of distributions generating the data and build “mixture models”. It identifies groups that are either overlapping or of varying sizes and shapes. In this project, by using EM in Machine Learning Algorithm in JAVA (WEKA) syste...
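
To make the EM idea in the record above concrete, here is a minimal sketch of expectation-maximization for a two-component, one-dimensional Gaussian mixture. It is not WEKA's implementation: the number of components is fixed at two rather than selected automatically, the data are synthetic, and the names `sample_mixture` and `em_two_gaussians` are hypothetical.

```python
import math
import random


def sample_mixture(n_per_group=200, seed=0):
    """Synthetic data (an assumption): two overlapping groups, N(0, 1) and N(6, 1)."""
    rng = random.Random(seed)
    low = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    high = [rng.gauss(6.0, 1.0) for _ in range(n_per_group)]
    return low + high


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Simple deterministic start: means at the extremes, shared variance.
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: soft assignment of each point to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # keep variances strictly positive
    return weights, means, variances
```

Calling `em_two_gaussians(sample_mixture())` returns mixing weights, means, and variances. Because assignments are soft, the two fitted groups may overlap, which is the property the abstract highlights in contrast to hard partitioning such as K-means.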

  17. The X-CLASS - redMaPPer galaxy cluster comparison: I. Identification procedures

    CERN Document Server

    Sadibekova, Tatyana; Clerc, Nicolas; Faccioli, Lorenzo; Gastaud, Rene; Fevre, Jean-Paul Le; Rozo, Eduardo; Rykoff, Eli S

    2014-01-01

    We performed a detailed and, for a large part interactive, analysis of the matching output between the X-CLASS and redMaPPer cluster catalogues. The overlap between the two catalogues has been accurately determined and possible cluster positional errors were manually recovered. The final samples comprise 270 and 355 redMaPPer and X-CLASS clusters respectively. X-ray cluster matching rates were analysed as a function of optical richness. In a second step, the redMaPPer clusters were correlated with the entire X-ray catalogue, containing point and uncharacterised sources (down to a few 10^{-15} erg s^{-1} cm^{-2} in the [0.5-2] keV band). A stacking analysis was performed for the remaining undetected optical clusters. Main results show that neither of the wavebands misses any massive cluster (as coded by X-ray luminosity or optical richness). After correcting for obvious pipeline short-comings (about 10% of the cases both in optical and X-ray), ~50% of the redMaPPer (down to a richness of 20) are found to coinc...

  18. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  19. Building pathway clusters from Random Forests classification using class votes

    Directory of Open Access Journals (Sweden)

    Zhao Hongyu

    2008-02-01

    Background: Recent years have seen the development of various pathway-based methods for the analysis of microarray gene expression data. These approaches have the potential to bring biological insights into microarray studies. A variety of methods have been proposed to construct networks using gene expression data. Because individual pathways do not act in isolation, it is important to understand how different pathways coordinate to perform cellular functions. However, there are no published methods describing how to build pathway clusters that are closely related to traits of interest. Results: We propose to build pathway clusters from pathway-based classification methods. The proposed methods allow researchers to identify clusters of pathways sharing similar functions. These pathways may or may not share genes. As an illustration, our approach is applied to three human breast cancer microarray data sets. We found that our methods yielded consistent and interpretable results for these three data sets. We further investigated one of the pathway clusters found using PubMatrix. We found that informative genes in the pathway clusters do have more publications with keywords, like estrogen receptor, compared with informative genes in other top pathways. In addition, using the shortest path analysis in GeneGo's MetaCore and Human Protein Reference Database, we were able to identify the links which connect the pathways without shared genes within the pathway cluster. Conclusion: Our proposed pathway clustering methods allow bioinformaticians and biologists to investigate how informative genes within pathways are related to each other and understand possible crosstalk between pathways in a cluster. Therefore, building pathway clusters may lead to a better understanding of molecular mechanisms affecting a trait of interest, and help generate further biological hypotheses from gene expression data.

  20. Chaotic map clustering algorithm for EEG analysis

    Science.gov (United States)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.

  1. Mapping Cigarettes Similarities using Cluster Analysis Methods

    Directory of Open Access Journals (Sweden)

    Lorentz Jäntschi

    2007-09-01

    The aim of the research was to investigate the relationships and/or occurrences in and between chemical composition information (tar, nicotine, carbon monoxide), market information (brand, manufacturer, price), and public health information (class, health warning), as well as the clustering of a sample of cigarette data. Thirty cigarette brands were analyzed. Six categorical (cigarette brand, manufacturer, health warnings, class) and four continuous (tar, nicotine, carbon monoxide concentrations and package price) variables were collected for investigation of chemical composition, market information and public health information. Multiple linear regression and two clusterization techniques were applied. The study yielded several noteworthy observations. The carbon monoxide concentration proved to be linked with tar and nicotine concentration. The applied clusterization methods identified groups of cigarette brands that showed similar characteristics. The tar and carbon monoxide concentrations were the main criteria used in clusterization. An analysis of a larger sample could reveal more relevant and useful information regarding the similarities between cigarette brands.

  2. Pair-Wise Cluster Analysis

    CERN Document Server

    Hardoon, David R

    2010-01-01

    This paper studies the problem of learning clusters which are consistently present in different (continuously valued) representations of observed data. Our setup differs slightly from the standard approach of (co-) clustering as we use the fact that some form of 'labeling' becomes available in this setup: a cluster is only interesting if it has a counterpart in the alternative representation. The contribution of this paper is twofold: (i) the problem setting is explored and an analysis in terms of the PAC-Bayesian theorem is presented, (ii) a practical kernel-based algorithm is derived exploiting the inherent relation to Canonical Correlation Analysis (CCA), as well as its extension to multiple views. A content based information retrieval (CBIR) case study is presented on the multi-lingual aligned Europarl document dataset which supports the above findings.

  3. Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment

    Directory of Open Access Journals (Sweden)

    Mirela Praisler

    2014-01-01

    An essential factor influencing the efficiency of the predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We propose a set of indicators inspired by analytical geometry that may be used for an objective quantitative assessment of the data clustering quality, as well as a global clustering quality coefficient (GCQC) that is a measure of the overall predictive power of the PCA models. The use of these indicators for evaluating the efficiency of the PCA class assignment is illustrated by a comparative study performed for the identification of the preprocessing function that generates the most efficient PCA system screening for amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained based on estimated density distributions and validated by using quadratic discriminant analysis (QDA).

  4. Nonlinear analysis of EAS clusters

    CERN Document Server

    Zotov, M Yu; Fomin, Y A; Fomin, Yu. A.

    2002-01-01

    We apply certain methods of nonlinear time series analysis to the extensive air shower clusters found earlier in the data set obtained with the EAS-1000 Prototype array. In particular, we use the Grassberger-Procaccia algorithm to compute the correlation dimension of samples in the vicinity of the clusters. The validity of the results is checked by surrogate data tests and some additional quantities. We compare our conclusions with the results of similar investigations performed by the EAS-TOP and LAAS groups.
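
The Grassberger-Procaccia approach mentioned in the record above estimates a correlation dimension from the correlation sum C(r), the fraction of point pairs closer than a radius r, via the slope of log C(r) against log r. The sketch below is a simplified two-radius version under stated assumptions: a plain delay embedding, Euclidean distance, and a slope taken between two hand-picked radii instead of a fitted scaling region. All function names are hypothetical and none of this is taken from the paper.

```python
import math
from itertools import combinations


def delay_embed(series, dim, lag=1):
    """Turn a scalar time series into dim-dimensional delay vectors."""
    n_vectors = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n_vectors)]


def correlation_sum(points, radius):
    """Fraction of distinct point pairs separated by less than radius."""
    n_pairs = 0
    n_close = 0
    for p, q in combinations(points, 2):
        n_pairs += 1
        if math.dist(p, q) < radius:
            n_close += 1
    return n_close / n_pairs


def correlation_dimension(points, r_small, r_large):
    """Two-point slope of log C(r) versus log r (a crude estimate).

    Both radii must be large enough that at least one pair lies within them,
    otherwise the logarithm of zero is undefined.
    """
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )
```

As a sanity check, points spread evenly along a line segment should give an estimate close to 1. In practice the slope is read from a range of radii, and, as the abstract notes, surrogate data tests are used to check whether the result is meaningful.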

  5. Survey and Analysis of University Clustering

    Directory of Open Access Journals (Sweden)

    Srinatha Karur

    2013-07-01

    This paper surveys the clustering of universities worldwide with respect to country-level, local, or continent-level policies and their sub-aims. Clustering methods can generally be applied when the objective is specifically stated. For general objectives, clusters are available in the form of logical or physical groups without networks. In this paper we emphasize only university clusters, either on their own or together with other clusters. Data mining methods are used for sampling analysis and clustering of universities and colleges with respect to local clusters [1] pp 1.

  6. Coupled Two-Way Clustering Analysis of Breast Cancer and Colon Cancer Gene Expression Data

    CERN Document Server

    Getz, Gad; Gal, Hilah; Kela, Itai; Domany, Eytan; Notterman, Dan A.

    2003-01-01

    We present and review Coupled Two Way Clustering, a method designed to mine gene expression data. The method identifies submatrices of the total expression matrix, whose clustering analysis reveals partitions of samples (and genes) into biologically relevant classes. We demonstrate, on data from colon and breast cancer, that we are able to identify partitions that elude standard clustering analysis.

  7. The SMART CLUSTER METHOD - adaptive earthquake cluster analysis and declustering

    Science.gov (United States)

    Schaefer, Andreas; Daniell, James; Wenzel, Friedemann

    2016-04-01

    Earthquake declustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity, with usual applications comprising probabilistic seismic hazard assessments (PSHAs) and earthquake prediction methods. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation. Various methods have been developed by other researchers to address this issue. These range in complexity from rather simple statistical window methods to complex epidemic models. This study introduces the smart cluster method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal identification. Hereby, an adaptive search algorithm for data point clusters is adopted. It uses the earthquake density in the spatio-temporal neighbourhood of each event to adjust the search properties. The identified clusters are subsequently analysed to determine directional anisotropy, focussing on a strong correlation along the rupture plane, and the search space is adjusted with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010/2011 Darfield-Christchurch events, an adaptive classification procedure is applied to disassemble subsequent ruptures which may have been grouped into an individual cluster using near-field searches, support vector machines and temporal splitting. The steering parameters of the search behaviour are linked to local earthquake properties like magnitude of completeness, earthquake density and Gutenberg-Richter parameters. The method is capable of identifying and classifying earthquake clusters in space and time. It is tested and validated using earthquake data from California and New Zealand. As a result of the cluster identification process, each event in…

  8. Parallel unstructured AMR and gigabit networking for Beowulf-class clusters

    Science.gov (United States)

    Norton, C. D.; Cwik, T. A.

    2001-01-01

    The impact of gigabit networking with Myrinet 2000 hardware and MPICH-GM software on a 2-way SMP Beowulf-class cluster for parallel unstructured adaptive mesh refinement using the PYRAMID library is described.

  9. Teleportation of an Arbitrary Two-Particle State via a Single Cluster-Class State

    International Nuclear Information System (INIS)

    Teleportation of an arbitrary two-qubit state with a single partially entangled state, a four-qubit linear cluster-class state, is studied. The case is more practical than previous ones using maximally entangled states as the quantum channel. In order to realize teleportation, we first construct a cluster-basis of 16 orthonormal cluster states. We show that quantum teleportation can be successfully implemented with a certain probability if the receiver can adopt appropriate unitary transformations after receiving the sender's cluster-basis measurement information. In addition, an important conclusion can be obtained that a four-qubit maximally entangled state (cluster state) can be extracted from a single copy of the cluster-class state with the same probability as the teleportation in principle. (general)

  10. A Beowulf-class computing cluster for the Monte Carlo production of the LHCb experiment

    CERN Document Server

    Avoni, G; Bertin, A; Bruschi, M; Capponi, M; Carbone, A; Collamati, A; De Castro, S; Fabbri, Franco Luigi; Faccioli, P; Galli, D; Giacobbe, B; Lax, I; Marconi, U; Massa, I; Piccinini, M; Poli, M; Semprini-Cesari, N; Spighi, R; Vagnoni, V M; Vecchi, S; Villa, M; Vitale, A; Zoccoli, A

    2003-01-01

    The computing cluster built at Bologna to provide the LHCb Collaboration with a powerful Monte Carlo production tool is presented. It is a performance oriented Beowulf-class cluster, made of rack mounted commodity components, designed to minimize operational support requirements and to provide full and continuous availability of the computing resources. In this paper we describe the architecture of the cluster, and discuss the technical solutions adopted for each specialized sub-system.

  11. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. METHODS: Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. RESULTS: The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). CONCLUSION: The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
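
As a rough illustration of how a fitted latent class model assigns a record to a class, the sketch below computes posterior class probabilities for a pattern of binary indicators under the usual local-independence assumption. The function name, the two-class model and all numbers are hypothetical and are not taken from the ALS study.

```python
from math import prod

def class_posteriors(pattern, class_weights, item_probs):
    """Posterior P(class | pattern) for binary indicators.

    Assumes local independence: within a class, indicators are independent.
    item_probs[k][j] is the (hypothetical) probability that indicator j
    equals 1 for a member of class k.
    """
    joint = []
    for weight, probs in zip(class_weights, item_probs):
        likelihood = prod(p if x else 1.0 - p for x, p in zip(pattern, probs))
        joint.append(weight * likelihood)
    total = sum(joint)
    return [value / total for value in joint]

# Made-up two-class model with two binary indicators.
weights = [0.6, 0.4]
probs = [[0.9, 0.2], [0.3, 0.8]]
posterior = class_posteriors([1, 0], weights, probs)
```

Each record would then typically be assigned to the class with the largest posterior probability; estimating the weights and item probabilities themselves (for example by EM) is not shown here.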

  12. Semantic Analysis of Virtual Classes and Nested Classes

    DEFF Research Database (Denmark)

    Madsen, Ole Lehrmann

    1999-01-01

    Virtual classes and nested classes are distinguishing features of BETA. Nested classes originated from Simula, but until recently they have not been part of mainstream object-oriented languages. C++ has a restricted form of nested classes and they were included in Java 1.1. Virtual classes is the...... classes and parameterized classes have been made. Although virtual classes and nested classes have been used in BETA for more than a decade, their implementation has not been published. The purpose of this paper is to contribute to the understanding of virtual classes and nested classes by presenting the...

  13. DEPENDENCE ANALYSIS FOR UML CLASS DIAGRAMS

    Institute of Scientific and Technical Information of China (English)

    Wu Fangjun; Yi Tong

    2004-01-01

    Though the Unified Modeling Language (UML) has been widely used in software development, the major problems confronted lie in comprehension and testing. Dependence analysis is an important approach to analyzing, understanding, testing and maintaining programs. A new kind of dependence analysis method for UML class diagrams is developed. A set of dependence relations is defined corresponding to the relations among classes. Thus, the dependence graph of a UML class diagram can be constructed from these dependence relations. Based on this model, both slicing and coupling measurement are further given as its two applications.

  14. Cyclist–motorist crash patterns in Denmark: A latent class clustering approach

    DEFF Research Database (Denmark)

    Kaplan, Sigal; Prato, Carlo Giacomo

    2013-01-01

    differentiating the latent classes were speed limit, infrastructure type, road surface conditions, number of lanes, motorized vehicle precrash maneuvers, the availability of a cycle lane, cyclist intoxication, and helmet wearing behavior. After the latent class clustering, the distribution of cyclists’ injury......Objective: The current study aimed at uncovering patterns of cyclist–motorist crashes in Denmark and investigating their prevalence and severity. The importance of implementing clustering techniques for providing a holistic overview of vulnerable road users’ crash patterns derives from the need...

  15. ASteCA - Automated Stellar Cluster Analysis

    CERN Document Server

    Perren, Gabriel I; Piatti, Andrés E

    2014-01-01

    We present ASteCA (Automated Stellar Cluster Analysis), a suite of tools designed to fully automate the standard tests applied on stellar clusters to determine their basic parameters. The set of functions included in the code make use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with its unce...

  16. Semi-Automatically Inducing Semantic Classes of Clinical Research Eligibility Criteria Using UMLS and Hierarchical Clustering

    OpenAIRE

    Luo, Zhihui; Johnson, Stephen B.; Weng, Chunhua

    2010-01-01

    This paper presents a novel approach to learning semantic classes of clinical research eligibility criteria. It uses the UMLS Semantic Types to represent semantic features and the Hierarchical Clustering method to group similar eligibility criteria. By establishing a gold standard using two independent raters, we evaluated the coverage and accuracy of the induced semantic classes. On 2,718 random eligibility criteria sentences, the inter-rater classification agreement was 85.73%. In a 10-fold...

  17. Fatal and serious road crashes involving young New Zealand drivers: a latent class clustering approach

    DEFF Research Database (Denmark)

    Weiss, Harold B.; Kaplan, Sigal; Prato, Carlo Giacomo

    2015-01-01

    classification that revealed how the identified clusters contain mostly crashes of a particular class and all the crashes of that class. The results raised three major safety concerns for young drivers that should be addressed: (1) reckless driving and traffic law violations; (2) inattention, error, and hazard...... perception problems; and (3) interaction with road geometry and lighting conditions, especially on high-speed open roads and state highways....

  18. Merged consensus clustering to assess and improve class discovery with microarray data

    Directory of Open Access Journals (Sweden)

    Jarman Andrew P

    2010-12-01

    Background: One of the most commonly performed tasks when analysing high throughput gene expression data is to use clustering methods to classify the data into groups. There are a large number of methods available to perform clustering, but it is often unclear which method is best suited to the data and how to quantify the quality of the classifications produced. Results: Here we describe an R package containing methods to analyse the consistency of clustering results from any number of different clustering methods using resampling statistics. These methods allow the identification of the best supported clusters and additionally rank cluster members by their fidelity within the cluster. These metrics allow us to compare the performance of different clustering algorithms under different experimental conditions and to select those that produce the most reliable clustering structures. We show the application of this method to simulated data, canonical gene expression experiments and our own novel analysis of genes involved in the specification of the peripheral nervous system in the fruitfly, Drosophila melanogaster. Conclusions: Our package enables users to apply the merged consensus clustering methodology conveniently within the R programming environment, providing both analysis and graphical display functions for exploring clustering approaches. It extends the basic principle of consensus clustering by allowing the merging of results between different methods to provide an averaged clustering robustness. We show that this extension is useful in correcting for the tendency of clustering algorithms to treat outliers differently within datasets. The R package, clusterCons, is freely available at CRAN and SourceForge under the GNU public licence.
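
The basic principle of consensus clustering mentioned above can be sketched in a few lines: given several clusterings of the same items (for instance from resampled runs), count how often each pair of items lands in the same cluster. This is a generic Python illustration with made-up label lists; it does not reproduce the clusterCons API or its merging step.

```python
from itertools import combinations

def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster.

    `labelings` is a list of label lists, one per clustering run, all over
    the same items in the same order. Label values need not match across runs.
    """
    n_items = len(labelings[0])
    counts = {pair: 0 for pair in combinations(range(n_items), 2)}
    for labels in labelings:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: count / len(labelings) for pair, count in counts.items()}

# Three hypothetical clustering runs over four items.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
consensus = co_association(runs)
```

Pairs with a value near 1 are grouped together consistently across runs, while values near 0.5 flag unstable assignments.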

  19. Cluster Analysis of Adolescent Blogs

    Science.gov (United States)

    Liu, Eric Zhi-Feng; Lin, Chun-Hung; Chen, Feng-Yi; Peng, Ping-Chuan

    2012-01-01

    Emerging web applications and networking systems such as blogs have become popular, and they offer unique opportunities and environments for learners, especially for adolescent learners. This study attempts to explore the writing styles and genres used by adolescents in their blogs by employing content, factor, and cluster analyses. Factor…

  20. Nuclear class 1 piping stress analysis

    International Nuclear Information System (INIS)

    A nuclear class 1 piping stress analysis, according to the ASME code, is presented. The TRHEAT computer code has been used to determine the piping wall thermal gradient. The Nupipe computer code was employed for the piping stress analysis. Computer results were compared with the allowable criteria from the ASME code. (Author)

  1. Clustering analysis of telecommunication customers

    Institute of Scientific and Technical Information of China (English)

    REN Hong; ZHENG Yan; WU Ye-rong

    2009-01-01

    In this article, a clustering method based on genetic algorithm (GA) for telecommunication customer subdivision is presented. First, the features of telecommunication customers (such as the calling behavior and consuming behavior) are extracted. Second, the similarities between the multidimensional feature vectors of telecommunication customers are computed and mapped as the distance between samples on a two-dimensional plane. Finally, the distances are adjusted to approximate the similarities gradually by GA. One advantage of this method is the independent distribution of the sample space. The experiments demonstrate the feasibility of the proposed method.

  2. ASteCA: Automated Stellar Cluster Analysis

    Science.gov (United States)

    Perren, G. I.; Vázquez, R. A.; Piatti, A. E.

    2015-04-01

    We present the Automated Stellar Cluster Analysis package (ASteCA), a suite of tools designed to fully automate the standard tests applied on stellar clusters to determine their basic parameters. The set of functions included in the code make use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with its uncertainties. To validate the code we applied it on a large set of over 400 synthetic MASSCLEAN clusters with varying degrees of field star contamination as well as a smaller set of 20 observed Milky Way open clusters (Berkeley 7, Bochum 11, Czernik 26, Czernik 30, Haffner 11, Haffner 19, NGC 133, NGC 2236, NGC 2264, NGC 2324, NGC 2421, NGC 2627, NGC 6231, NGC 6383, NGC 6705, Ruprecht 1, Tombaugh 1, Trumpler 1, Trumpler 5 and Trumpler 14) studied in the literature. The results show that ASteCA is able to recover cluster parameters with an acceptable precision even for those clusters affected by substantial field star contamination. ASteCA is written in Python and is made available as an open source code which can be downloaded ready to be used from its official site.

  3. Filtering Genes for Cluster and Network Analysis

    Directory of Open Access Journals (Sweden)

    Parkhomenko Elena

    2009-06-01

    Background: Prior to cluster analysis or genetic network analysis it is customary to filter, or remove, genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpretability and reduce bias. Results: This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks. Conclusion: The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.
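
The customary threshold filter described in the Background can be sketched as follows. This shows only the simple variance cut-off, not SUMCOV or the other methods proposed in the paper; the gene names, values and threshold are invented for illustration.

```python
from statistics import pvariance

def filter_by_variance(expression, threshold):
    """Keep genes whose variance across samples is at least `threshold`.

    `expression` maps a gene name to its expression values, one per sample.
    The population variance is used here; that choice is an assumption.
    """
    return {
        gene: values
        for gene, values in expression.items()
        if pvariance(values) >= threshold
    }

# Hypothetical expression values for three genes across four samples.
expression = {
    "geneA": [1.0, 1.1, 0.9, 1.0],   # nearly constant
    "geneB": [0.2, 2.5, 1.1, 3.0],   # strongly varying
    "geneC": [5.0, 5.0, 5.0, 5.0],   # constant
}
kept = filter_by_variance(expression, threshold=0.5)
```

Because the threshold is arbitrary, moving it changes which genes reach the clustering step, which is the sensitivity the abstract reports.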

  4. Semi-Automatically Inducing Semantic Classes of Clinical Research Eligibility Criteria Using UMLS and Hierarchical Clustering.

    Science.gov (United States)

    Luo, Zhihui; Johnson, Stephen B; Weng, Chunhua

    2010-01-01

    This paper presents a novel approach to learning semantic classes of clinical research eligibility criteria. It uses the UMLS Semantic Types to represent semantic features and the Hierarchical Clustering method to group similar eligibility criteria. By establishing a gold standard using two independent raters, we evaluated the coverage and accuracy of the induced semantic classes. On 2,718 random eligibility criteria sentences, the inter-rater classification agreement was 85.73%. In a 10-fold validation test, the average Precision, Recall and F-score of the classification results of a decision-tree classifier were 87.8%, 88.0%, and 87.7% respectively. Our induced classes aligned well with 16 out of 17 eligibility criteria classes defined by the BRIDGE model. We discuss the potential of this method and our future work. PMID:21347026
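
For readers unfamiliar with the three reported metrics, the following sketch shows how precision, recall and the F-score are computed from raw counts for a single class in a single fold. The counts are hypothetical; the paper's per-fold counts are not given in the abstract.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Precision, recall and their harmonic mean (F-score) for one class."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up counts for one class in one validation fold.
p, r, f = precision_recall_f1(true_positives=90, false_positives=10, false_negatives=30)
```

Note that an F-score averaged over folds need not equal the F-score computed from the averaged precision and recall, so averaged figures like those quoted above cannot be derived from one another exactly.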

  5. Structural variation of the ribosomal gene cluster within the class Insecta

    Energy Technology Data Exchange (ETDEWEB)

    Mukha, D.V.; Sidorenko, A.P.; Lazebnaya, I.V. [Vavilov Institute of General Genetics, Moscow (Russian Federation)] [and others

    1995-09-01

    A general estimate of ribosomal DNA variation within the class Insecta is presented. It is shown that, using blot-hybridization, one can detect differences in the structure of the ribosomal gene cluster not only between genera within an order, but also between species within a genus, including sibling species. The structure of the ribosomal gene cluster of the Coccinellidae family (ladybirds) is analyzed. It is shown that cloned highly conserved regions of ribosomal DNA of Tetrahymena pyriformis can be used as probes for analyzing ribosomal genes in insects. 24 refs., 4 figs.

  6. Clustering analysis of seismicity and aftershock identification.

    Science.gov (United States)

    Zaliapin, Ilya; Gabrielov, Andrei; Keilis-Borok, Vladimir; Wong, Henry

    2008-07-01

    We introduce a statistical methodology for clustering analysis of seismicity in the time-space-energy domain and use it to establish the existence of two statistically distinct populations of earthquakes: clustered and nonclustered. This result can be used, in particular, for nonparametric aftershock identification. The proposed approach expands the analysis of Baiesi and Paczuski [Phys. Rev. E 69, 066106 (2004), doi:10.1103/PhysRevE.69.066106] based on the space-time-magnitude nearest-neighbor distance eta between earthquakes. We show that for a homogeneous Poisson marked point field with exponential marks, the distance eta has the Weibull distribution, which bridges our results with classical correlation analysis for point fields. The joint 2D distribution of spatial and temporal components of eta is used to identify the clustered part of a point field. The proposed technique is applied to several seismicity models and to the observed seismicity of southern California.
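
A minimal sketch of the space-time-magnitude distance eta, assuming the commonly quoted Baiesi-Paczuski form: eta = dt * r^d * 10^(-b * m) for an earlier event of magnitude m, and infinity otherwise. The planar coordinates, the default values of b and d, and the toy catalog are assumptions made for illustration, not values from the paper.

```python
import math

def eta(parent, child, b=1.0, d=1.6):
    """Space-time-magnitude distance from an earlier event to a later one.

    Events are (time, x, y, magnitude) tuples in arbitrary planar units.
    Returns infinity when the candidate parent does not precede the child.
    """
    t_i, x_i, y_i, m_i = parent
    t_j, x_j, y_j, _ = child
    dt = t_j - t_i
    if dt <= 0:
        return math.inf
    r = math.hypot(x_j - x_i, y_j - y_i)
    return dt * r ** d * 10 ** (-b * m_i)

def nearest_parent(catalog, j):
    """Index and distance of the event that minimises eta for event j."""
    candidates = [
        (eta(catalog[i], catalog[j]), i) for i in range(len(catalog)) if i != j
    ]
    distance, index = min(candidates)
    return index, distance

# Toy catalog: a large early event followed by two small ones.
catalog = [
    (0.0, 0.0, 0.0, 6.0),
    (1.0, 1.0, 0.0, 3.0),
    (2.0, 1.5, 0.0, 2.5),
]
parent_index, parent_eta = nearest_parent(catalog, 2)
```

In this toy case the third event attaches to the large first event rather than to the closer, more recent small one, because the magnitude term shrinks the distance to large events. Thresholding eta then separates clustered from nonclustered events.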

  7. Identifying victims of workplace bullying by integrating traditional estimation approaches into a latent class cluster model.

    Science.gov (United States)

    Leon-Perez, Jose M; Notelaers, Guy; Arenas, Alicia; Munduate, Lourdes; Medina, Francisco J

    2014-05-01

    Research findings underline the negative effects of exposure to bullying behaviors and document the detrimental health effects of being a victim of workplace bullying. While no one disputes its negative consequences, debate continues about the magnitude of this phenomenon since very different prevalence rates of workplace bullying have been reported. Methodological aspects may explain these findings. Our contribution to this debate integrates behavioral and self-labeling estimation methods of workplace bullying into a measurement model that constitutes a bullying typology. Results in the present sample (n = 1,619) revealed that six different groups can be distinguished according to the nature and intensity of reported bullying behaviors. These clusters portray different paths for the workplace bullying process, where negative work-related and person-degrading behaviors are strongly intertwined. The analysis of the external validity showed that integrating previous estimation methods into a single measurement latent class model provides a reliable estimation method of workplace bullying, which may overcome previous flaws. PMID:24257593

  8. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study allows cosmological scenarios of structure formation to be tested with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed for the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Proposals are made for applying this method to future surveys such as XMM-XXL and eRosita. (author)

  9. A class of spherical, truncated, anisotropic models for application to globular clusters

    Science.gov (United States)

    de Vita, Ruggero; Bertin, Giuseppe; Zocchi, Alice

    2016-05-01

    Recently, a class of non-truncated, radially anisotropic models (the so-called f(ν)-models), originally constructed in the context of violent relaxation and modelling of elliptical galaxies, has been found to possess interesting qualities in relation to observed and simulated globular clusters. In view of new applications to globular clusters, we improve this class of models along two directions. To make them more suitable for the description of small stellar systems hosted by galaxies, we introduce a "tidal" truncation by means of a procedure that guarantees full continuity of the distribution function. The new fT(ν)-models are shown to provide a better fit to the observed photometric and spectroscopic profiles for a sample of 13 globular clusters studied earlier by means of non-truncated models; interestingly, the best-fit models also perform better with respect to the radial-orbit instability. Then, we design a flexible but simple two-component family of truncated models to study the separate issues of mass segregation and multiple populations. We do not aim at a fully realistic description of globular clusters to compete with the description currently obtained by means of dedicated simulations. The goal here is to try to identify the simplest models, that is, those with the smallest number of free parameters, but still have the capacity to provide a reasonable description for clusters that are evidently beyond the reach of one-component models. With this tool, we aim at identifying the key factors that characterize mass segregation or the presence of multiple populations. To reduce the relevant parameter space, we formulate a few physical arguments based on recent observations and simulations. A first application to two well-studied globular clusters is briefly described and discussed.

  10. Psychiatric comorbidity among adults with schizophrenia: a latent class analysis.

    Science.gov (United States)

    Tsai, Jack; Rosenheck, Robert A

    2013-11-30

    Schizophrenia is a severe mental illness that often co-occurs with and can be exacerbated by other psychiatric conditions. There have not been adequate efforts to examine schizophrenia and psychiatric comorbidity beyond pairwise examination using clusters of diagnoses. This study used latent class analysis to characterize patterns of 5-year psychiatric comorbidity among a national sample of adults with schizophrenia. Baseline data from 1446 adults with schizophrenia across 57 sites in the United States were analyzed. Three latent classes were identified and labeled Solely Schizophrenia, Comorbid Anxiety and Depressive Disorders with Schizophrenia, and Comorbid Addiction and Schizophrenia. Adults in the Solely Schizophrenia class had significantly better mental health than those in the two comorbid classes, but poorer illness and treatment insight than those with comorbid anxiety and depressive disorders. These results suggest that addiction and schizophrenia may represent a separate latent profile from depression, anxiety, and schizophrenia. More research is needed on how treatment can take advantage of the greater insight possessed by those with schizophrenia and comorbid anxiety and depression.

  11. Hierarchical genetic clusters for phenotypic analysis

    Directory of Open Access Journals (Sweden)

    Luiza Barbosa da Matta

    2015-10-01

    Methods to obtain phenotypic information were evaluated to help breeders choose the best methodology for analysis of genetic diversity in backcross populations. Phenotypes were simulated for 13 characteristics generated in 10 populations with 100 individuals each. Genotypic information was generated from 100 loci, of which 20 were taken at random to determine the characteristics expressing two alleles. Dissimilarity measures were calculated, and genetic diversity was analyzed through hierarchical clustering and graphic projection of the distances. A backcross was performed from the two most divergent populations. A set of characteristics with variable heritability was taken into account. The environmental effect was simulated assuming . For hierarchical clusters, the following methods were used: Gower Method, average linkage within the cluster, average linkage among clusters, the furthest neighbor method, the nearest neighbor method, Ward’s method, and the median method. The environmental effect and heritability of the analyzed variables had an influence on the pattern of hierarchical clustering of populations according to the backcrossed generations. The nearest neighbor method was the most efficient in reconstructing the system of backcrossing, and it presented the highest cophenetic correlation. The efficiency of the nearest neighbor method was the highest when the analysis involved characteristics of high heritability.
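
The nearest neighbor method named above is single-linkage agglomerative clustering: at each step the two clusters whose closest members are nearest to each other are merged. Below is a minimal sketch on a hypothetical dissimilarity matrix; it does not reproduce the study's simulation or its cophenetic correlation calculation.

```python
def single_linkage(distance, n_clusters):
    """Agglomerative clustering with nearest-neighbour (single) linkage.

    `distance` is a symmetric matrix given as a list of lists.
    Returns a list of sets of item indices.
    """
    clusters = [{i} for i in range(len(distance))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster gap is the smallest pairwise distance.
                gap = min(distance[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters

# Made-up dissimilarities between four populations.
dissimilarity = [
    [0, 1, 6, 7],
    [1, 0, 5, 6],
    [6, 5, 0, 2],
    [7, 6, 2, 0],
]
groups = single_linkage(dissimilarity, n_clusters=2)
```

Replacing `min` with `max` in the gap calculation would give the furthest neighbor (complete linkage) method from the same list.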

  12. Cluster and constraint analysis in tetrahedron packings.

    Science.gov (United States)

    Jin, Weiwei; Lu, Peng; Liu, Lufeng; Li, Shuixiang

    2015-04-01

    The disordered packings of tetrahedra often show no obvious macroscopic orientational or positional order for a wide range of packing densities, and it has been found that the local order in particle clusters is the main order form of tetrahedron packings. Therefore, a cluster analysis is carried out to investigate the local structures and properties of tetrahedron packings in this work. We obtain a cluster distribution of differently sized clusters, and peaks are observed at two special clusters, i.e., dimer and wagon wheel. We then calculate the amounts of dimers and wagon wheels, which are observed to have linear or approximate linear correlations with packing density. Following our previous work, the amount of particles participating in dimers is used as an order metric to evaluate the order degree of the hierarchical packing structure of tetrahedra, and an order map is consequently depicted. Furthermore, a constraint analysis is performed to determine the isostatic or hyperstatic region in the order map. We employ a Monte Carlo algorithm to test jamming and then suggest a new maximally random jammed packing of hard tetrahedra from the order map with a packing density of 0.6337.

  13. Clustering

    Directory of Open Access Journals (Sweden)

    Jinfei Liu

    2013-04-01

    DBSCAN is a well-known density-based clustering algorithm which offers advantages for finding clusters of arbitrary shapes compared to partitioning and hierarchical clustering methods. However, there are few papers studying the DBSCAN algorithm under the privacy preserving distributed data mining model, in which the data is distributed between two or more parties, and the parties cooperate to obtain the clustering results without revealing the data at the individual parties. In this paper, we address the problem of two-party privacy preserving DBSCAN clustering. We first propose two protocols for privacy preserving DBSCAN clustering over horizontally and vertically partitioned data respectively and then extend them to arbitrarily partitioned data. We also provide performance analysis and privacy proof of our solution.
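
For orientation, here is a minimal sketch of ordinary, single-party DBSCAN on plain data. It is not the paper's privacy preserving protocol, which additionally has to compute neighbourhoods without revealing either party's points. The parameter names `eps` and `min_pts` follow common usage, and the example points are invented.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id   # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The neighbourhood query is the step that touches raw coordinates, so it is the natural place where a distributed, privacy preserving variant would need a secure distance comparison.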

  14. NGC 6273: Towards Defining A New Class of Galactic Globular Clusters?

    Science.gov (United States)

    Johnson, Christian I.; Rich, Robert Michael; Pilachowski, Catherine A.; Caldwell, Nelson; Mateo, Mario L.; Ira Bailey, John; Crane, Jeffrey D.

    2016-01-01

    A growing number of observations have found that several Galactic globular clusters exhibit abundance dispersions beyond the well-known light element (anti-)correlations. These clusters tend to be very massive, have >0.1 dex intrinsic metallicity dispersions, have complex sub-giant branch morphologies, and have correlated [Fe/H] and s-process element enhancements. Interestingly, nearly all of these clusters discovered so far have [Fe/H]~-1.7. In this context, we have examined the chemical composition of 18 red giant branch (RGB) stars in the massive, metal-poor Galactic bulge globular cluster NGC 6273 using high signal-to-noise, high resolution (R~27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph mounted on the Magellan-Clay 6.5m telescope at Las Campanas Observatory. We find that the cluster exhibits a metallicity range from [Fe/H]=-1.80 to -1.30 and is composed of two dominant populations separated in [Fe/H] and [La/Fe] abundance. The increase in [La/Eu] as a function of [La/H] suggests that the increase in [La/Fe] with [Fe/H] is due to almost pure s-process enrichment. The most metal-rich star in our sample is not strongly La-enhanced, but is α-poor and may belong to a third "anomalous" stellar population. The two dominant populations exhibit the same [Na/Fe]-[Al/Fe] correlation found in other "normal" globular clusters. Therefore, NGC 6273 joins ω Centauri, M 22, M 2, and NGC 5286 as a possible new class of Galactic globular clusters.

  16. Incremental multi-class semi-supervised clustering regularized by Kalman filtering.

    Science.gov (United States)

    Mehrkanoon, Siamak; Agudelo, Oscar Mauricio; Suykens, Johan A K

    2015-11-01

    This paper introduces an on-line semi-supervised learning algorithm formulated as a regularized kernel spectral clustering (KSC) approach. We consider the case where new data arrive sequentially but only a small fraction of them is labeled. The available labeled data act as prototypes and help to improve the performance of the algorithm in estimating the labels of the unlabeled data points. We adopt a recently proposed multi-class semi-supervised KSC-based algorithm (MSS-KSC) and make it applicable to on-line data clustering. Given a few user-labeled data points, the initial model is learned, and then the class memberships of the remaining data points in the current and subsequent time instants are estimated and propagated in an on-line fashion. The update of the memberships is carried out mainly using the out-of-sample extension property of the model. Initially the algorithm is tested on computer-generated data sets; then we show that video segmentation can be cast as a semi-supervised learning problem. Furthermore, we show how the tracking capabilities of the Kalman filter can be used to provide the labels of objects in motion and thus regularize the solution obtained by the MSS-KSC algorithm. In the experiments, we demonstrate the performance of the proposed method on synthetic data sets and real-life videos where the clusters evolve in a smooth fashion over time. PMID:26319050

  17. Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering.

    Directory of Open Access Journals (Sweden)

    Sebastian Will

    2007-04-01

    The RFAM database defines families of ncRNAs by means of sequence similarities that are sufficient to establish homology. In some cases, such as microRNAs and box H/ACA snoRNAs, functional commonalities define classes of RNAs that are characterized by structural similarities, and typically consist of multiple RNA families. Recent advances in high-throughput transcriptomics and comparative genomics have produced very large sets of putative noncoding RNAs and regulatory RNA signals. For many of them, evidence for stabilizing selection acting on their secondary structures has been derived, and at least approximate models of their structures have been computed. The overwhelming majority of these hypothetical RNAs cannot be assigned to established families or classes. We present here a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The LocARNA (local alignment of RNA) tool implements a novel variant of the Sankoff algorithm that is sufficiently fast to deal with several thousand candidate sequences. The method is also robust against false positive predictions, i.e., a contamination of the input data with unstructured or nonconserved sequences. We have successfully tested the LocARNA-based clustering approach on the sequences of the RFAM-seed alignments. Furthermore, we have applied it to a previously published set of 3,332 predicted structured elements in the Ciona intestinalis genome (Missal K, Rose D, Stadler PF (2005) Noncoding RNAs in Ciona intestinalis. Bioinformatics 21 (Supplement 2): i77-i78). In addition to recovering, e.g., tRNAs as a structure-based class, the method identifies several RNA families, including microRNA and snoRNA candidates, and suggests several novel classes of ncRNAs for which to date no representative has been experimentally characterized.

  18. An Analysis of Particle Swarm Optimization with Data Clustering-Technique for Optimization in Data Mining

    Directory of Open Access Journals (Sweden)

    Amreen Khan,

    2010-07-01

    Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. Clustering aims at representing large datasets by a smaller number of prototypes or clusters. It brings simplicity to data modeling and thus plays a central role in the process of knowledge discovery and data mining. Data mining tasks require fast and accurate partitioning of huge datasets, which may come with a variety of attributes or features. This imposes severe computational requirements on the relevant clustering techniques. A family of bio-inspired algorithms known as Swarm Intelligence (SI) has recently emerged that meets these requirements and has successfully been applied to a number of real-world clustering problems. This paper looks into the use of Particle Swarm Optimization for cluster analysis. The effectiveness of Fuzzy C-means clustering provides enhanced performance, maintains more diversity in the swarm, and allows the particles to robustly track the changing environment.

  19. Using existing questionnaires in latent class analysis

    DEFF Research Database (Denmark)

    Nielsen, Anne Molgaard; Vach, Werner; Kent, Peter;

    2016-01-01

    BACKGROUND: Latent class analysis (LCA) is increasingly being used in health research, but optimal approaches to handling complex clinical data are unclear. One issue is that commonly used questionnaires are multidimensional, but expressed as summary scores. Using the example of low back pain (LBP......), the aim of this study was to explore and descriptively compare the application of LCA when using questionnaire summary scores and when using single items to subgrouping of patients based on multidimensional data. MATERIALS AND METHODS: Baseline data from 928 LBP patients in an observational study were...

  20. Clustering Analysis on E-commerce Transaction Based on K-means Clustering

    Directory of Open Access Journals (Sweden)

    Xuan HUANG

    2014-02-01

    Clustering algorithms based on density, increments, grids and the like usually show shortcomings when facing large volumes of high-dimensional transaction data: poor elasticity, weak handling of high-dimensional data, sensitivity to the time sequence of the data, poor independence of parameters, and weak handling of noise. Experiments on data samples for 300 mobile phones on Taobao lead to the following conclusions: compared with the Single-pass clustering algorithm, the K-means clustering algorithm has a high intra-class dissimilarity and inter-class similarity when analyzing e-commerce transactions. In addition, the K-means clustering algorithm has very high efficiency and strong elasticity when dealing with a large number of data items. However, the clustering results of this algorithm are affected by the number of clusters and the initial positions of the cluster centers, so the results easily settle into a local optimum. How to determine the number of clusters and the initial positions of the cluster centers therefore remains an important topic for future research.
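
    The dependence of K-means on the initial cluster centers, noted in the abstract above, can be illustrated with a minimal sketch. The code below is plain Lloyd-style K-means in Python on four made-up points; it is not the paper's implementation, and the function names `kmeans` and `sse` as well as the data are assumptions chosen for illustration only.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, max_iter=100):
    """Lloyd-style K-means started from the given initial centers (illustrative)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels


def sse(points, centers, labels):
    """Within-cluster sum of squared errors for a given assignment."""
    return sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))


# Hypothetical data: corners of a wide, short rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

good_centers, good_labels = kmeans(points, [(0, 0), (4, 0)])
bad_centers, bad_labels = kmeans(points, [(2, 0), (2, 1)])

good_sse = sse(points, good_centers, good_labels)  # left/right split
bad_sse = sse(points, bad_centers, bad_labels)     # top/bottom split (local optimum)
print(good_sse, bad_sse)
```

    With the second set of initial centers the algorithm converges immediately to a top/bottom split whose error is much larger than that of the left/right split, which is the kind of local optimum the abstract refers to.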

  1. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    The aim of the present article is to show the possibility of using the methods of cluster analysis in the classification of stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The manner of sorting stocks of finished products by clusters is described through a practical example. The resulting clusters are incorporated into the draft layout of the distribution warehouse.

  2. Intelligent Pattern Mining and Data Clustering for Pattern Cluster Analysis using Cancer Data

    Directory of Open Access Journals (Sweden)

    G.Raj Kumar

    2010-12-01

    Data mining techniques are used for the knowledge discovery process in large data set environments. Clustering techniques are used to group related data sets. Hierarchical and partitional clustering techniques are used for the clustering process. The clustering process is a complex task with a high processing time. The pattern extraction scheme is applied to find frequent item sets. Association rule mining techniques are applied to carry out the pattern extraction process. The pattern extraction scheme and the clustering scheme are integrated in the simultaneous pattern extraction and clustering scheme. The clustering process is improved with a pattern comparison and transaction transfer process. The simultaneous clustering scheme is implemented to analyze cancer patient diagnosis reports. The system is implemented as four major modules: data set management, pattern extraction, clustering process and performance analysis. The data sets are preprocessed before the pattern extraction process. The patterns are used in the simultaneous clustering process. The performance analysis compares the data clustering scheme with the pattern clustering schemes. Process time and memory are the factors used in the performance analysis. The cluster accuracy is represented using fitness values. The system is enhanced with the K-means clustering algorithm.

  3. ClusterViz: A Cytoscape APP for Cluster Analysis of Biological Network.

    Science.gov (United States)

    Wang, Jianxin; Zhong, Jiancheng; Chen, Gang; Li, Min; Wu, Fang-xiang; Pan, Yi

    2015-01-01

    Cluster analysis of biological networks is one of the most important approaches for identifying functional modules and predicting protein functions. Furthermore, visualization of clustering results is crucial to uncover the structure of biological networks. In this paper, ClusterViz, an APP of Cytoscape 3 for cluster analysis and visualization, has been developed. In order to reduce complexity and enable extendibility for ClusterViz, we designed the architecture of ClusterViz based on the framework of Open Services Gateway Initiative. According to the architecture, the implementation of ClusterViz is partitioned into three modules: the interface of ClusterViz, clustering algorithms, and visualization and export. ClusterViz facilitates the comparison of the results of different algorithms for further related analysis. Three commonly used clustering algorithms, FAG-EC, EAGLE and MCODE, are included in the current version. Because the clustering-algorithm module adopts an abstract algorithm interface, more clustering algorithms can be included in the future. To illustrate the usability of ClusterViz, we provide three examples with detailed steps from important scientific articles, which show that our tool has helped several research teams in their work on the mechanisms of biological networks. PMID:26357321

  4. Unsupervised Anomaly Detection Based on Clustering and Multiple One-Class SVM

    Science.gov (United States)

    Song, Jungsuk; Takakura, Hiroki; Okabe, Yasuo; Kwon, Yongjin

    Intrusion detection systems (IDS) have played an important role as devices that defend our networks from cyber attacks. However, since they are unable to detect unknown attacks, i.e., 0-day attacks, the ultimate challenge in the intrusion detection field is how to identify such attacks exactly and in an automated manner. Over the past few years, several studies have addressed these problems through anomaly detection using unsupervised learning techniques such as clustering, one-class support vector machine (SVM), etc. Although these techniques make it possible to construct intrusion detection models at low cost and effort, and are capable of detecting unforeseen attacks, they still have two main problems in intrusion detection: a low detection rate and a high false positive rate. In this paper, we propose a new anomaly detection method based on clustering and multiple one-class SVM in order to improve the detection rate while maintaining a low false positive rate. We evaluated our method using the KDD Cup 1999 data set. Evaluation results show that our approach outperforms the existing algorithms reported in the literature, especially in the detection of unknown attacks.

  5. Gennclus: New Models for General Nonhierarchical Clustering Analysis.

    Science.gov (United States)

    Desarbo, Wayne S.

    1982-01-01

    A general class of nonhierarchical clustering models and associated algorithms for fitting them are presented. These models generalize the Shepard-Arabie Additive clusters model. Two applications are given and extensions to three-way models, nonmetric analyses, and other model specifications are provided. (Author/JKS)

  6. Data Clustering Analysis Based on Wavelet Feature Extraction

    Institute of Scientific and Technical Information of China (English)

    QIAN Yuntao; TANG Yuanyan

    2003-01-01

    A novel wavelet-based data clustering method is presented in this paper, which includes wavelet feature extraction and a cluster growing algorithm. The wavelet transform can provide rich and diversified information for representing the global and local inherent structures of a dataset; therefore, it is a very powerful tool for clustering feature extraction. As clustering is an unsupervised classification, its target depends on the specific clustering criteria. Several criteria that should be considered for a general-purpose clustering algorithm are proposed, and the cluster growing algorithm is constructed to connect the clustering criteria with the wavelet features. Compared with other popular clustering methods, our clustering approach provides multi-resolution clustering results, needs few prior parameters, correctly deals with irregularly shaped clusters, and is insensitive to noise and outliers. As this wavelet-based clustering method is aimed at solving the two-dimensional data clustering problem, for high-dimensional datasets the self-organizing map and U-matrix method are applied to transform them into two-dimensional Euclidean space, so that high-dimensional data clustering analysis can also be carried out. Results on some simulated data and standard test data are reported to illustrate the power of our method.

  7. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of the extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  8. Estimating the number of clusters via system evolution for cluster analysis of gene expression data.

    Science.gov (United States)

    Wang, Kaijun; Zheng, Jie; Zhang, Junying; Dong, Jiyang

    2009-09-01

    The estimation of the number of clusters (NC) is one of the crucial problems in the cluster analysis of gene expression data. Most available approaches give their answers without intuitive information about the degree of separation between clusters. However, this information is useful for understanding cluster structures. To provide this information, we propose a system evolution (SE) method to estimate NC based on the partitioning around medoids (PAM) clustering algorithm. SE analyzes the cluster structures of a dataset from the viewpoint of a pseudothermodynamics system. The system goes to its stable equilibrium state, at which the optimal NC is found, via its partitioning process and merging process. The experimental results on simulated and real gene expression data demonstrate that SE works well on data with well-separated clusters and on data with slightly overlapping clusters. PMID:19527960

  9. Adaptive Fuzzy Consensus Clustering Framework for Clustering Analysis of Cancer Data.

    Science.gov (United States)

    Yu, Zhiwen; Chen, Hantao; You, Jane; Liu, Jiming; Wong, Hau-San; Han, Guoqiang; Li, Le

    2015-01-01

    Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While there are quite a number of research works which perform tumor clustering, few of them consider how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we also introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and improve the performance of RDCFCE further by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE work well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms. PMID:26357330

  10. External Defect classification of Citrus Fruit Images using Linear Discriminant Analysis Clustering and ANN classifiers

    Directory of Open Access Journals (Sweden)

    K.Vijayarekha

    2012-12-01

    Linear Discriminant Analysis (LDA) is one technique for transforming raw data into a new feature space in which classification can be carried out more robustly. It is useful where the within-class frequencies are unequal. This method maximizes the ratio of between-class variance to within-class variance in any particular data set, and the maximal separability is guaranteed. LDA clustering models are used to classify objects into different categories. This study makes use of LDA for clustering the features obtained for the citrus fruit images taken in five different domains. Sub-windows of size 40x40 are cropped from the citrus fruit images having defects such as pitting, splitting and stem end rot. Features are extracted in four domains: statistical features, Fourier transform based features, discrete wavelet transform based features and stationary wavelet transform based features. The results of clustering and classification using LDA and ANN classifiers are reported.
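
    The ratio that LDA maximizes can be made concrete for the simplest case of two classes and a single feature, where the Fisher criterion is J = (m1 - m2)^2 / (s1 + s2), with m the class means and s the within-class sums of squared deviations. The sketch below is an illustration in Python with made-up numbers; the function name `fisher_ratio` and the data are assumptions, not taken from the study.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def scatter(values):
    """Within-class scatter: sum of squared deviations from the class mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fisher_ratio(class_a, class_b):
    """Two-class, one-feature Fisher criterion (between / within scatter)."""
    between = (mean(class_a) - mean(class_b)) ** 2
    within = scatter(class_a) + scatter(class_b)
    return between / within


# Hypothetical feature 1: classes are far apart and tight.
separable = fisher_ratio([1, 2, 3], [7, 8, 9])
# Hypothetical feature 2: classes overlap heavily.
overlapping = fisher_ratio([1, 5, 9], [2, 6, 10])

print(separable, overlapping)  # 9.0 0.015625
```

    A feature (or projection direction) with a larger ratio separates the classes better; full LDA searches over linear combinations of all features for the direction that maximizes the multivariate form of this ratio.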

  11. A cluster analysis on students' perceived motivational climate. Implications on psycho-social variables.

    Science.gov (United States)

    Fernandez-Rio, Javier; Méndez-Giménez, Antonio; Cecchini Estrada, Jose A

    2014-01-01

    The aim of this study was to examine how students' perceptions of the class climate influence their basic psychological needs, motivational regulations, social goals and outcomes such as boredom, enjoyment, effort, and pressure/tension. 507 (267 males, 240 females) secondary education students agreed to participate. They completed a questionnaire that included the Spanish validated versions of Perceived Motivational Climate in Sport Questionnaire (PMCSQ-2), Basic Psychological Needs in Exercise (BPNES), Perceived Locus of Causality (PLOC), Social Goal Scale-Physical Education (SGS-PE), and several subscales of the IMI. A hierarchical cluster analysis uncovered four independent class climate profiles that were confirmed by a K-Means cluster analysis: "high ego", "low ego-task", "high ego-medium task", and "high task". Several MANOVAs were performed using these clusters as independent variables and the different outcomes as dependent variables (p responsibility and relationship, as well as low levels of amotivation, boredom and pressure/tension. Students' perceptions of a performance class climate made the positive scores decrease significantly. Cluster 3 revealed that a mastery oriented class structure undermines the negative behavioral and psychological effects of a performance class climate. This finding supports the buffering hypothesis of the achievement goal theory. PMID:25012581

  12. Evidence-Based Clustering of Reads and Taxonomic Analysis of Metagenomic Data

    Science.gov (United States)

    Folino, Gianluigi; Gori, Fabio; Jetten, Mike S. M.; Marchiori, Elena

    The rapidly emerging field of metagenomics seeks to examine the genomic content of communities of organisms to understand their roles and interactions in an ecosystem. In this paper we focus on clustering methods and their application to taxonomic analysis of metagenomic data. Clustering analysis for metagenomics amounts to grouping similar partial sequences, such as raw sequence reads, into clusters in order to discover information about the internal structure of the considered dataset, or the relative abundance of protein families. Different methods for clustering analysis of metagenomic datasets have been proposed. Here we focus on evidence-based methods for clustering that employ knowledge extracted from proteins identified by a BLASTx search (proxygenes). We consider two clustering algorithms introduced in previous works and a new one. We discuss advantages and drawbacks of the algorithms, and use them to perform taxonomic analysis of metagenomic data. To this end, we use three real-life benchmark datasets from previous work on metagenomic data analysis. Comparison of the results indicates satisfactory coherence of the taxonomies output by the three algorithms, with respect to phylogenetic content at the class level and taxonomic distribution at the phylum level. In general, the experimental comparative analysis substantiates the effectiveness of evidence-based clustering methods for taxonomic analysis of metagenomic data.

  13. A Spitzer Survey of Young Stellar Clusters within One Kiloparsec of the Sun: Cluster Core Extraction and Basic Structural Analysis

    CERN Document Server

    Gutermuth, R A; Myers, P C; Allen, L E; Pipher, J L; Fazio, G G

    2009-01-01

    We present a uniform mid-infrared imaging and photometric survey of 36 young, nearby, star-forming clusters and groups using Spitzer IRAC and MIPS. We have confidently identified and classified 2548 young stellar objects using recently established mid-infrared color-based methods. We have devised and applied a new algorithm for the isolation of local surface density enhancements from point source distributions, enabling us to extract the overdense cores of the observed star forming regions for further analysis. We have compiled several basic structural measurements of these cluster cores from the data, such as mean surface densities of sources, cluster core radii, and aspect ratios, in order to characterize the ranges for these quantities. We find that a typical cluster core is 0.39 pc in radius, has 26 members with infrared excess in a ratio of Class II to Class I sources of 3.7, is embedded in a $A_K$=0.8 mag cloud clump, and has a surface density of 60 pc$^{-2}$. We examine the nearest neighbor dista...

  14. A conserved cluster of three PRD-class homeobox genes (homeobrain, rx and orthopedia) in the Cnidaria and Protostomia

    Directory of Open Access Journals (Sweden)

    Mazza Maureen E

    2010-07-01

    Background: Homeobox genes are a superclass of transcription factors with diverse developmental regulatory functions, which are found in plants, fungi and animals. In animals, several Antennapedia (ANTP)-class homeobox genes reside in extremely ancient gene clusters (for example, the Hox, ParaHox, and NKL clusters) and the evolution of these clusters has been implicated in the morphological diversification of animal bodyplans. By contrast, similarly ancient gene clusters have not been reported among the other classes of homeobox genes (that is, the LIM, POU, PRD and SIX classes). Results: Using a combination of in silico queries and phylogenetic analyses, we found that a cluster of three PRD-class homeobox genes (Homeobrain (hbn), Rax (rx) and Orthopedia (otp)) is present in cnidarians, insects and mollusks (a partial cluster comprising hbn and rx is present in the placozoan Trichoplax adhaerens). We failed to identify this 'HRO' cluster in deuterostomes; in fact, the Homeobrain gene appears to be missing from the chordate genomes we examined, although it is present in hemichordates and echinoderms. To illuminate the ancestral organization and function of this ancient cluster, we mapped the constituent genes against the assembled genome of a model cnidarian, the sea anemone Nematostella vectensis, and characterized their spatiotemporal expression using in situ hybridization. In N. vectensis, these genes reside in a span of 33 kb with the same gene order as previously reported in insects. Comparisons of genomic sequences and expressed sequence tags revealed the presence of alternative transcripts of Nv-otp and two highly unusual protein-coding polymorphisms in the terminal helix of the Nv-rx homeodomain. A population genetic survey revealed the Rx polymorphisms to be widespread in natural populations. During larval development, all three genes are expressed in the ectoderm, in non-overlapping territories along the oral-aboral axis, with distinct

  15. Interaction of Fanaroff-Riley class II radio jets with a randomly magnetised intra-cluster medium

    CERN Document Server

    Huarte-Espinosa, Martín; Alexander, Paul

    2011-01-01

    A combination of three-dimensional (3D) magnetohydrodynamics (MHD) and synthetic numerical simulations are presented to follow the evolution of a randomly magnetised plasma that models the intra-cluster medium (ICM), under the isolated effects of powerful, light, hypersonic and bipolar Fanaroff-Riley class II (FR II) jets. We prescribe the cluster magnetic field (CMF) as a Gaussian random field with a Kolmogorov-like energy spectrum. Both the power of the jets and the viewing angle that is used for the synthetic Rotation Measure (RM) observations are investigated. We find the model radio sources introduce and amplify fluctuations on the RM statistical properties which we analyse as a function of time as well as the viewing angle. The average RM and the RM standard deviation are increased by the action of the jets. Energetics, RM statistics and magnetic power spectral analysis consistently show that the effects also correlate with the jets' power, and that the lightest, fastest jets produce the strongest chang...

  16. Challenges for Cluster Analysis in a Virtual Observatory

    CERN Document Server

    Djorgovski, S G; Mahabal, A A; Williams, R; Granat, R; Stolorz, P

    2002-01-01

    There has been an unprecedented and continuing growth in the volume, quality, and complexity of astronomical data sets over the past few years, mainly through large digital sky surveys. The Virtual Observatory (VO) concept represents a scientific and technological framework needed to cope with this data flood. We review some of the applied statistics and computing challenges posed by the analysis of large and complex data sets expected in VO-based research. The challenges are driven by the size and the complexity of the data sets (billions of data vectors in parameter spaces of tens or hundreds of dimensions), by the heterogeneity of the data and measurement errors, the selection effects and censored data, and by the intrinsic clustering properties (functional form, topology) of the data distribution in the parameter space of observed attributes. Examples of scientific questions one may wish to address include: objective determination of the numbers of object classes present in the data, and the membersh...

  17. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    CERN Document Server

    Emmons, Scott; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie network clustering. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms -- Blondel, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 o...
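
    Of the stand-alone quality metrics named above, modularity is easy to compute by hand on a toy graph. For an undirected, unweighted graph with m edges, the standard definition is Q = sum over communities c of [L_c / m - (d_c / (2m))^2], where L_c is the number of edges inside c and d_c is the total degree of its nodes. The Python sketch below applies this to a made-up graph of two triangles joined by one edge; the function name `modularity` and the example graph are illustrative assumptions and do not reproduce the paper's experiments.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, assumed to contain no self-loops or duplicates.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)  # 5/14, about 0.357
q_one = modularity(edges, one_group)   # 0 for the trivial partition
print(q_two, q_one)
```

    Information recovery metrics such as the adjusted Rand score answer a different question (agreement with a known ground-truth partition), which is one reason the two families of metrics can disagree.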

  18. Cluster exponential synchronization of a class of complex networks with hybrid coupling and time-varying delay

    International Nuclear Information System (INIS)

    This paper deals with the cluster exponential synchronization of a class of complex networks with hybrid coupling and time-varying delay. Through constructing an appropriate Lyapunov-Krasovskii functional and applying the theory of the Kronecker product of matrices and the linear matrix inequality (LMI) technique, several novel sufficient conditions for cluster exponential synchronization are obtained. These cluster exponential synchronization conditions adopt the bounds of both time delay and its derivative, which are less conservative. Finally, the numerical simulations are performed to show the effectiveness of the theoretical results. (general)

  19. Cluster analysis of word frequency dynamics

    International Nuclear Information System (INIS)

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  20. Trajectories of acute low back pain: a latent class growth analysis.

    Science.gov (United States)

    Downie, Aron S; Hancock, Mark J; Rzewuska, Magdalena; Williams, Christopher M; Lin, Chung-Wei Christine; Maher, Christopher G

    2016-01-01

    Characterising the clinical course of back pain by mean pain scores over time may not adequately reflect the complexity of the clinical course of acute low back pain. We analysed pain scores over 12 weeks for 1585 patients with acute low back pain presenting to primary care to identify distinct pain trajectory groups and baseline patient characteristics associated with membership of each cluster. This was a secondary analysis of the PACE trial that evaluated paracetamol for acute low back pain. Latent class growth analysis determined a 5 cluster model, which comprised 567 (35.8%) patients who recovered by week 2 (cluster 1, rapid pain recovery); 543 (34.3%) patients who recovered by week 12 (cluster 2, pain recovery by week 12); 222 (14.0%) patients whose pain reduced but did not recover (cluster 3, incomplete pain recovery); 167 (10.5%) patients whose pain initially decreased but then increased by week 12 (cluster 4, fluctuating pain); and 86 (5.4%) patients who experienced high-level pain for the whole 12 weeks (cluster 5, persistent high pain). Patients with longer pain duration were more likely to experience delayed recovery or nonrecovery. Belief in greater risk of persistence was associated with nonrecovery, but not delayed recovery. Higher pain intensity, longer duration, and workers' compensation were associated with persistent high pain, whereas older age and increased number of episodes were associated with fluctuating pain. Identification of discrete pain trajectory groups offers the potential to better manage acute low back pain. PMID:26397929

  1. Semi Supervised Weighted K-Means Clustering for Multi Class Data Classification

    Directory of Open Access Journals (Sweden)

    Vijaya Geeta Dharmavaram

    2013-01-01

    Full Text Available Supervised learning techniques require a large number of labeled examples to train a classifier model. Research on semi-supervised learning is motivated by the abundance of unlabeled examples even in domains with a limited number of labeled examples. In such domains, a semi-supervised classifier uses the results of clustering for classifier development, since clustering does not rely only on labeled examples: it groups objects based on their similarities. In this paper, the authors propose a new algorithm for semi-supervised classification, namely Semi Supervised Weighted K-Means (SSWKM). In this algorithm, the authors suggest using a weighted Euclidean distance metric, designed according to the purpose of clustering, to estimate the proximity between a pair of points, and use it to build a semi-supervised classifier. The authors propose a new approach for estimating the feature weights by appropriately adopting the results of multiple discriminant analysis. The proposed method was then tested on benchmark datasets from the UCI repository with varied percentages of labeled examples and found to be consistent and promising.
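
    As a rough illustration of the weighted Euclidean distance mentioned in this abstract, the sketch below assigns a point to its nearest centroid under per-feature weights. It is not the authors' SSWKM algorithm: the function names and the example weights are hypothetical, and the weights are simply supplied by hand rather than estimated from multiple discriminant analysis.

```python
import math


def weighted_euclidean(x, y, weights):
    """Distance sqrt(sum_j w_j * (x_j - y_j)^2) for non-negative feature weights."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def nearest_centroid(point, centroids, weights):
    """Index of the centroid closest to `point` under the weighted metric."""
    return min(
        range(len(centroids)),
        key=lambda k: weighted_euclidean(point, centroids[k], weights),
    )


# Hypothetical data: two centroids, and a point that lies near each one
# along a different feature.
centroids = [[0.0, 0.0], [5.0, 10.0]]
point = [1.0, 10.0]
print(nearest_centroid(point, centroids, [1.0, 0.0]))  # only feature 0 counts
print(nearest_centroid(point, centroids, [0.0, 1.0]))  # only feature 1 counts
```

    Changing the weights changes which centroid wins, which is why the choice of feature weights matters for the resulting classifier.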

  2. Lung scintigraphy clustering by texture analysis

    International Nuclear Information System (INIS)

    The efficiency of texture analysis parameters, describing the organization of grey level variations of an image, was studied for lung scintigraphic data classification. Twenty-one patients received a 99mTc-MAA perfusion scan and 81mKr and 127Xe ventilation scans. Scans were scaled to 64 grey levels and 100 k events for intersubject comparison. The texture index was the average of the absolute difference between a pixel and its neighbors. Energy, entropy, correlation, local homogeneity and inertia were computed using co-occurrence matrices. A principal component analysis was carried out on each parameter for each type of scan, and the first principal components were selected as clustering indices. Validation was achieved by simulating 2 series of 20 increasingly heterogeneous perfusion and ventilation scans. For most of the texture parameters, one principal component could summarize the patients' data, since it corresponded to relative variances of 67%-88% for perfusion scans, 53%-99% for 81mKr scans and 38%-97% for 127Xe scans. The simulated series demonstrated a linear relationship between the heterogeneity and the first principal component for texture index, energy, entropy and inertia. This was not the case for correlation and local homogeneity. We conclude that heterogeneity of lung scans may be quantified by texture analysis. The texture index is the easiest to compute and provides the most efficient results for clinical purposes. (orig.)
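
    The texture index described in this abstract (the average absolute difference between a pixel and its neighbors) can be sketched as follows. The abstract does not say which neighborhood was used, so the 4-neighborhood here is an assumption, as are the function name and the list-of-lists image representation.

```python
def texture_index(image):
    """Mean absolute grey-level difference between each pixel and its 4-neighbours.

    `image` is a rectangular list of rows of grey levels. The 4-neighbourhood
    (up, down, left, right) is an assumption, not taken from the abstract.
    """
    rows, cols = len(image), len(image[0])
    total, count = 0, 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    total += abs(image[r][c] - image[nr][nc])
                    count += 1
    # A single-pixel image has no neighbours; report zero heterogeneity.
    return total / count if count else 0.0


uniform = [[5, 5], [5, 5]]
checkerboard = [[0, 1], [1, 0]]
print(texture_index(uniform), texture_index(checkerboard))
```

    A uniform image scores zero, and the score grows as neighboring grey levels diverge, which matches the use of the index as a heterogeneity measure.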

  3. Impacts of fast food and food retail environment on overweight and obesity in China: a multilevel latent class cluster approach

    NARCIS (Netherlands)

    Zhang XiaoYong, Xiaoyong; Lans, van der I.A.; Dagevos, H.

    2012-01-01

    Objective To simultaneously identify consumer segments based on individual-level consumption and community-level food retail environment data and to investigate whether the segments are associated with BMI and dietary knowledge in China. Design A multilevel latent class cluster model was applied to

  4. An Analysis on an ESL Class

    Institute of Scientific and Technical Information of China (English)

    杜岳青

    2015-01-01

    Input is the precondition of interaction and output, while output promotes the input and the interaction of language, which could increase the effectiveness of input and enhance the chance of the absorption of interaction. By analyzing an ESL class, this paper gives readers a picture of how second language (SL) learners get information, absorb and digest the information, produce language and form their language system in their SL learning. Finally, it gives suggestions on how to ensure language learners have a better chance to get comprehensible input, assimilate language during interaction and output more accurate and comprehensible second language in an ESL class.

  5. Smartness and Italian Cities. A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Flavio Boscacci

    2014-05-01

    Full Text Available Smart cities have recently been recognized as the most pleasing and attractive places to live in; because of this, both scholars and policy-makers pay close attention to the topic. Specifically, urban “smartness” has been identified by many characteristics that can be grouped into six dimensions (Giffinger et al. 2007): smart Economy (competitiveness), smart People (social and human capital), smart Governance (participation), smart Mobility (both ICTs and transport), smart Environment (natural resources), and smart Living (quality of life). According to this analytical framework, the present paper investigates the relation between urban attractiveness and the “smart” characteristics in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, descriptive statistics were followed by a regression analysis (OLS), where the dependent variable measuring urban attractiveness was proxied by housing market prices. In addition, a Cluster Analysis (CA) was developed in order to find differences and commonalities among the province capitals. The OLS results indicate that living, people and economy are the key drivers for achieving better urban attractiveness. Environment, instead, continues to play a minor role. Besides, the CA groups the province capitals a

  6. Using Cluster Analysis for Data Mining in Educational Technology Research

    Science.gov (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  7. An Analysis on an ESL Class

    Institute of Scientific and Technical Information of China (English)

    杜岳青

    2015-01-01

    Input is the precondition of interaction and output, while output promotes the input and the interaction of language, which could increase the effectiveness of input and enhance the chance of the absorption of interaction. By analyzing an ESL class, this paper gives readers a picture of how second language (SL) learners get information, absorb and digest the information, produce language and form their language system in their SL learning. Finally, it gives suggestions on how to ensure language learners have a better chance to get comprehensible input, assimilate language during interaction and output more accurate and comprehensible second language in an ESL class.

  8. Detection and Analysis of Clones in UML Class Models

    Directory of Open Access Journals (Sweden)

    Dhavleesh Rattan

    2015-07-01

    Full Text Available Copying and pasting code fragments is quite frequent in software development. The copied source code is called a software clone and the activity is referred to as code cloning. The presence of code clones hampers maintenance and may lead to bug propagation. Nowadays, model driven development has become a standard industry practice. Duplicate parts in models, i.e. model clones, pose similar challenges as in source code. This paper presents an approach to detect clones in Unified Modeling Language class models. The core of our technique is the construction of a labeled, ranked tree corresponding to the UML class model, where attributes with their data types and methods with their signatures are represented as subtrees. By grouping and clustering repeating subtrees, the tool is able to detect duplications in a UML class model at different levels of granularity, i.e. complete class diagram, attributes with their data types and methods with their signatures across the model, and clusters of such attributes/methods. We propose a new classification of model clones with the objective of detecting exact and meaningful clones. Empirical evaluation of the tool using open source reverse engineered and forward designed models shows some interesting and relevant clones which provide useful insights into software modeling practice.

  9. Detection and Analysis of Clones in UML Class Models

    Directory of Open Access Journals (Sweden)

    Dhavleesh Rattan

    2016-01-01

    Full Text Available Copying and pasting code fragments is quite frequent in software development. The copied source code is called a software clone and the activity is referred to as code cloning. The presence of code clones hampers maintenance and may lead to bug propagation. Nowadays, model driven development has become a standard industry practice. Duplicate parts in models, i.e. model clones, pose similar challenges as in source code. This paper presents an approach to detect clones in Unified Modeling Language class models. The core of our technique is the construction of a labeled, ranked tree corresponding to the UML class model, where attributes with their data types and methods with their signatures are represented as subtrees. By grouping and clustering repeating subtrees, the tool is able to detect duplications in a UML class model at different levels of granularity, i.e. complete class diagram, attributes with their data types and methods with their signatures across the model, and clusters of such attributes/methods. We propose a new classification of model clones with the objective of detecting exact and meaningful clones. Empirical evaluation of the tool using open source reverse engineered and forward designed models shows some interesting and relevant clones which provide useful insights into software modeling practice.

  10. Human HLA class I- and HLA class II-restricted cloned cytotoxic T lymphocytes identify a cluster of epitopes on the measles virus fusion protein.

    Science.gov (United States)

    van Binnendijk, R S; Versteeg-van Oosten, J P; Poelen, M C; Brugghe, H F; Hoogerhout, P; Osterhaus, A D; Uytdehaag, F G

    1993-01-01

    The transmembrane fusion (F) glycoprotein of measles virus is an important target antigen of human HLA class I- and class II-restricted cytotoxic T lymphocytes (CTL). Genetically engineered F proteins and nested sets of synthetic peptides spanning the F protein were used to determine sequences of F recognized by a number of F-specific CTL clones. Combined N- and C-terminal deletions of the respective peptides revealed that human HLA class I and HLA class II-restricted CTL efficiently recognize nonapeptides or decapeptides representing epitopes of F. Three distinct sequences recognized by three different HLA class II (DQw1, DR2, and DR4/w53)-restricted CTL clones appear to cluster between amino acids 379 and 466 of F, thus defining an important T-cell epitope area of F. Within this same region, a nonamer peptide of F was found to be recognized by an HLA-B27-restricted CTL clone, as expected on the basis of the structural homology between this peptide and other known HLA-B27 binding peptides. PMID:7680390

  11. Clustering and Feature Selection using Sparse Principal Component Analysis

    OpenAIRE

    Luss, Ronny; d'Aspremont, Alexandre

    2007-01-01

    In this paper, we study the application of sparse principal component analysis (PCA) to clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of variance in the data while having only a limited number of nonzero coefficients. PCA is often used as a simple clustering technique and sparse factors allow us here to interpret the clusters in terms of a reduced set of variables. We begin with a brief int...

  12. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)

    ZHANG; Zhihua

    2001-01-01


  13. Portraying Persons Who Inject Drugs Recently Infected with Hepatitis C Accessing Antiviral Treatment: A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Jean-Marie Bamvita

    2014-01-01

    Full Text Available Objectives. To empirically determine a categorization of people who inject drugs (PWIDs) recently infected with hepatitis C virus (HCV), in order to identify profiles most likely associated with early HCV treatment uptake. Methods. The study population was composed of HIV-negative PWIDs with a documented recent HCV infection. Eligibility criteria included being 18 years old or over, and having injected drugs in the 6 months preceding the estimated date of HCV exposure. Participant classification was carried out using a TwoStep cluster analysis. Results. From September 2007 to December 2011, 76 participants were included in the study. 60 participants were eligible for HCV treatment. Twenty-one participants initiated HCV treatment. The cluster analysis yielded 4 classes: class 1: Lukewarm health seekers dismissing HCV treatment offer; class 2: multisubstance users willing to shake off the hell; class 3: PWIDs unlinked to health service use; class 4: health seeker PWIDs willing to reverse the fate. Conclusion. Profiles generated by our analysis suggest that prior health care utilization, a key element for treatment uptake, differs between older and younger PWIDs. Such profiles could inform the development of targeted strategies to improve health outcomes and reduce HCV infection among PWIDs.

  14. Portraying persons who inject drugs recently infected with hepatitis C accessing antiviral treatment: a cluster analysis.

    Science.gov (United States)

    Bamvita, Jean-Marie; Roy, Elise; Zang, Geng; Jutras-Aswad, Didier; Artenie, Andreea Adelina; Levesque, Annie; Bruneau, Julie

    2014-01-01

    Objectives. To empirically determine a categorization of people who inject drugs (PWIDs) recently infected with hepatitis C virus (HCV), in order to identify profiles most likely associated with early HCV treatment uptake. Methods. The study population was composed of HIV-negative PWIDs with a documented recent HCV infection. Eligibility criteria included being 18 years old or over, and having injected drugs in the 6 months preceding the estimated date of HCV exposure. Participant classification was carried out using a TwoStep cluster analysis. Results. From September 2007 to December 2011, 76 participants were included in the study. 60 participants were eligible for HCV treatment. Twenty-one participants initiated HCV treatment. The cluster analysis yielded 4 classes: class 1: Lukewarm health seekers dismissing HCV treatment offer; class 2: multisubstance users willing to shake off the hell; class 3: PWIDs unlinked to health service use; class 4: health seeker PWIDs willing to reverse the fate. Conclusion. Profiles generated by our analysis suggest that prior health care utilization, a key element for treatment uptake, differs between older and younger PWIDs. Such profiles could inform the development of targeted strategies to improve health outcomes and reduce HCV infection among PWIDs. PMID:25349730

  15. Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis

    Directory of Open Access Journals (Sweden)

    S. Muthurajkumar

    2014-05-01

    Full Text Available In this paper, we propose a hybrid clustering-based classification algorithm, built on a mean-based approach, to classify and mine ordered sequences (paths) from weblog data in order to perform social network analysis. In the system proposed in this work for social pattern analysis, sequences of human activities are typically analyzed through switching behaviors, which are likely to produce overlapping clusters. In the proposed system, a robust Modified Boosting algorithm is applied to the hybrid clustering-based classification for clustering the data. This work helps connect the aggregated features of the network data with the traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and hence provides better classification accuracy when tested with a weblog dataset. In addition, the algorithm improves predictive performance, especially for multiclass datasets, which can increase accuracy.

  16. Hierarchical Cluster Analysis – Various Approaches to Data Preparation

    Directory of Open Access Journals (Sweden)

    Z. Pacáková

    2013-09-01

    Full Text Available The article deals with two approaches to data preparation that avoid multicollinearity. The aim of the article is to find similarities in the e-communication level of EU states using hierarchical cluster analysis. The original set of fourteen indicators was first reduced on the basis of correlation analysis; where two indicators were highly correlated, the one with higher variability was kept for further analysis. Secondly, the data were transformed using principal component analysis, since the principal components are poorly correlated. For further analysis, five principal components explaining about 92% of the variance were selected. Hierarchical cluster analysis was performed on both the reduced data set and the principal component scores. Both times three clusters were assumed, following the Pseudo t-Squared and Pseudo F statistics, but the final clusters were not identical. An important characteristic for comparing the two results was the proportion of variance accounted for by the clusters, which was about ten percentage points higher for the principal component scores (57.8% compared to 47%). Therefore it can be stated that, when principal component scores with a sufficiently high explained proportion (about 92% in our analysis) are used as input variables for cluster analysis, the loss of information is lower than with data reduction on the basis of correlation analysis.
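
    The comparison above rests on the proportion of variance accounted for by the clusters. A minimal sketch of one common way to compute such a quantity, 1 minus the within-cluster sum of squares over the total sum of squares, is given below. The abstract does not state the exact formula used, so this definition, the function name and the toy data are assumptions.

```python
def variance_explained(points, labels):
    """Share of total sum of squares accounted for by a cluster assignment.

    Computed as 1 - (within-cluster SS / total SS). `points` is a list of
    equal-length numeric lists and `labels` gives one cluster id per point.
    """
    dims = len(points[0])

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dims)]

    def sum_sq(rows, center):
        return sum((r[j] - center[j]) ** 2 for r in rows for j in range(dims))

    total = sum_sq(points, mean(points))
    if total == 0:
        return 0.0  # all points identical: nothing to explain
    within = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        within += sum_sq(members, mean(members))
    return 1.0 - within / total


# Hypothetical one-dimensional data with two well-separated groups.
data = [[0.0], [1.0], [10.0], [11.0]]
print(variance_explained(data, [0, 0, 1, 1]))
```

    Running the same function on two different inputs (for example a reduced indicator set and principal component scores) with their respective cluster labels gives the kind of side-by-side figure the abstract reports.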

  17. Impacts of fast food and food retail environment on overweight and obesity in China: a multilevel latent class cluster approach

    OpenAIRE

    Zhang XiaoYong, Xiaoyong; Lans, van der, A.M.; Dagevos, H.

    2012-01-01

    Objective To simultaneously identify consumer segments based on individual-level consumption and community-level food retail environment data and to investigate whether the segments are associated with BMI and dietary knowledge in China. Design A multilevel latent class cluster model was applied to identify consumer segments based not only on their individual preferences for fast food, salty snack foods, and soft drinks and sugared fruit drinks, but also on the food retail environment at the ...

  18. Obstructive Sleep Apnea: A Cluster Analysis at Time of Diagnosis

    Science.gov (United States)

    Grillet, Yves; Richard, Philippe; Stach, Bruno; Vivodtzev, Isabelle; Timsit, Jean-Francois; Lévy, Patrick; Tamisier, Renaud; Pépin, Jean-Louis

    2016-01-01

    Background The classification of obstructive sleep apnea is on the basis of sleep study criteria that may not adequately capture disease heterogeneity. Improved phenotyping may improve prognosis prediction and help select therapeutic strategies. Objectives: This study used cluster analysis to investigate the clinical clusters of obstructive sleep apnea. Methods An ascending hierarchical cluster analysis was performed on baseline symptoms, physical examination, risk factor exposure and co-morbidities from 18,263 participants in the OSFP (French national registry of sleep apnea). The probability for criteria to be associated with a given cluster was assessed using odds ratios, determined by univariate logistic regression. Results: Six clusters were identified, in which patients varied considerably in age, sex, symptoms, obesity, co-morbidities and environmental risk factors. The main significant differences between clusters were minimally symptomatic versus sleepy obstructive sleep apnea patients, lean versus obese, and among obese patients different combinations of co-morbidities and environmental risk factors. Conclusions Our cluster analysis identified six distinct clusters of obstructive sleep apnea. Our findings underscore the high degree of heterogeneity that exists within obstructive sleep apnea patients regarding clinical presentation, risk factors and consequences. This may help in both research and clinical practice for validating new prevention programs, in diagnosis and in decisions regarding therapeutic strategies. PMID:27314230

  19. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    Science.gov (United States)

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking <300 minutes per week of moderate activity was significantly greater in cluster 1 than in clusters 2 and 3. Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups. PMID:26020225

  20. Logistics Enterprise Evaluation Model Based On Fuzzy Clustering Analysis

    Science.gov (United States)

    Fu, Pei-hua; Yin, Hong-bo

    In this thesis, we introduce an evaluation model for logistics enterprises based on a fuzzy clustering algorithm. First, we present the evaluation index system, which covers basic information, management level, technical strength, transport capacity, informatization level, market competition and customer service. We set the index weights according to the grades, and evaluated the integrated ability of the logistics enterprises using the fuzzy cluster analysis method. We describe the system evaluation module and the cluster analysis module in detail, and explain how the two modules were implemented. Finally, we give the results of the system.

  1. Cancer incidence in men: a cluster analysis of spatial patterns

    Directory of Open Access Journals (Sweden)

    D'Alò Daniela

    2008-11-01

    Full Text Available Abstract Background Spatial clustering of different diseases has received much less attention than single disease mapping. Besides chance or artifact, clustering of different cancers in a given area may depend on exposure to a shared risk factor or to multiple correlated factors (e.g. cigarette smoking and obesity in a deprived area). Models developed so far to investigate co-occurrence of diseases are not well-suited for analyzing many cancers simultaneously. In this paper we propose a simple two-step exploratory method for screening clusters of different cancers in a population. Methods Cancer incidence data were derived from the regional cancer registry of Umbria, Italy. A cluster analysis was performed on smoothed and non-smoothed standardized incidence ratios (SIRs) of the 13 most frequent cancers in males. The Besag, York and Mollie model (BYM) and Poisson kriging were used to produce smoothed SIRs. Results Cluster analysis on non-smoothed SIRs was poorly informative in terms of clustering of different cancers, as only larynx and oral cavity were grouped, and of characteristic patterns of cancer incidence in specific geographical areas. On the other hand, BYM and Poisson kriging gave similar results, showing that cancers of the oral cavity, larynx, esophagus, stomach and liver formed a main cluster. Lung and urinary bladder cancers clustered together but not with the cancers mentioned above. Both methods, particularly the BYM model, identified distinct geographic clusters of adjacent areas. Conclusion As in single disease mapping, non-smoothed SIRs do not provide reliable estimates of cancer risks because of small area variability. The BYM model produces smooth risk surfaces which, when entered into a cluster analysis, identify well-defined geographical clusters of adjacent areas. It probably enhances or amplifies the signal arising from exposure of more areas (statistical units) to shared risk factors that are associated with different cancers. In

  2. Cluster analysis for anomaly detection in accounting data : an audit approach

    OpenAIRE

    Thiprungsri, Sutapat; Vasarhelyi, Miklos A.

    2011-01-01

    This study examines the application of cluster analysis in the accounting domain, particularly discrepancy detection in audit. Cluster analysis groups data so that points within a single group or cluster are similar to one another and distinct from points in other clusters. Clustering has been shown to be a good candidate for anomaly detection. The purpose of this study is to examine the use of clustering technology to automate fraud filtering during an audit. We use cluster analysis to help ...

  3. Sensitivity Analysis of Gas Production from Class 2 and Class 3 Hydrate Deposits

    Energy Technology Data Exchange (ETDEWEB)

    Reagan, Matthew; Moridis, George; Zhang, Keni

    2008-05-01

    Gas hydrates are solid crystalline compounds in which gas molecules are lodged within the lattices of an ice-like crystalline solid. The vast quantities of hydrocarbon gases trapped in hydrate formations in the permafrost and in deep ocean sediments may constitute a new and promising energy source. Class 2 hydrate deposits are characterized by a Hydrate-Bearing Layer (HBL) that is underlain by a saturated zone of mobile water. Class 3 hydrate deposits are characterized by an isolated Hydrate-Bearing Layer (HBL) that is not in contact with any hydrate-free zone of mobile fluids. Both classes of deposits have been shown to be good candidates for exploitation in earlier studies of gas production via vertical well designs. In this study we extend the analysis to include systems with varying porosity, anisotropy, well spacing, and the presence of permeable boundaries. For Class 2 deposits, the results show that production rate and efficiency depend strongly on formation porosity and only mildly on formation anisotropy, and that tighter well spacing produces gas at higher rates over shorter time periods. For Class 3 deposits, production rates and efficiency also depend significantly on formation porosity and are impacted negatively by anisotropy, and production rates may be larger, over longer times, for well configurations that use a greater well spacing. Finally, we performed preliminary calculations to assess a worst-case scenario for permeable system boundaries, and found that the efficiency of depressurization-based production strategies is compromised by migration of fluids from outside the system.

  4. Credibility analysis of risk classes by generalized linear model

    Science.gov (United States)

    Erdemir, Ovgucan Karadag; Sucu, Meral

    2016-06-01

    In this paper, the generalized linear model (GLM) and credibility theory, both frequently used in nonlife insurance pricing, are combined for reliability analysis. Using the full credibility standard, the GLM is associated with the limited fluctuation credibility approach. Comparison criteria such as asymptotic variance and credibility probability are used to analyze the credibility of risk classes. An application is performed using one-year claim frequency data from a Turkish insurance company, and the results for credible risk classes are interpreted.
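
    For readers unfamiliar with limited fluctuation credibility, the sketch below shows the textbook full credibility standard for claim frequency and the square-root rule for partial credibility. It assumes Poisson-distributed claim counts and the conventional defaults of probability p = 0.90 and tolerance k = 0.05; neither the assumptions nor the function names come from the paper, and the link to the GLM is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def full_credibility_claims(p=0.90, k=0.05):
    """Expected claim count needed for full credibility under a Poisson assumption.

    The observed frequency should fall within a fraction k of its mean with
    probability p, which gives (z / k)^2 with z the (1 + p) / 2 normal quantile.
    """
    z = NormalDist().inv_cdf((1 + p) / 2)
    return (z / k) ** 2


def credibility_factor(n_claims, p=0.90, k=0.05):
    """Square-root rule: partial credibility, capped at 1."""
    return min(1.0, sqrt(n_claims / full_credibility_claims(p, k)))


# Hypothetical risk class with 300 observed claims.
print(round(full_credibility_claims()), credibility_factor(300))
```

    A risk class whose claim count reaches the standard receives a factor of 1 and is treated as fully credible; smaller classes receive a factor below 1.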

  5. HLA class II genes: typing by DNA analysis.

    Science.gov (United States)

    Bidwell, J L; Bidwell, E A; Bradley, B A

    1990-04-01

    A detailed understanding of the structure and function of the human major histocompatibility complex (MHC) has ensued from studies by molecular biologists during the last decade. Virtually all of the HLA genes have now been cloned, and the nucleotide sequences of their different allelic forms have been determined. Typing for these HLA alleles is a fundamental prerequisite for tissue matching in allogeneic organ transplantation. Until very recently, typing procedures have been dominated by serological and cellular methods. The availability of cloned DNA from HLA genes has now permitted the technique of restriction fragment length polymorphism (RFLP) analysis to be applied, with remarkable success and advantage, to phenotyping of both HLA Class I and Class II determinants. For the HLA Class II genes DR and DQ, a simple two-stage RFLP analysis permits the accurate identification of all specificities defined by serology, and of many which are defined by cellular typing. At the present time, however, RFLP typing of HLA Class I genes is not as practicable or as informative as that for HLA Class II genes. The present clinical applications of HLA-DR and DQ RFLP typing are predominantly in phenotyping of living donors, including selection of HLA-matched volunteer bone marrow donors, in allograft survival studies, and in studies of HLA Class II-associated diseases. However, the time taken to perform RFLP analysis precludes its use for the typing of cadaveric kidney donors. Nucleotide sequence data for the alleles of HLA Class II genes have now permitted the development of allele-specific oligonucleotide (ASO) typing, a second category of DNA analysis. This has been greatly facilitated by the ability to amplify specific HLA Class II DNA 'target' sequences using the polymerase chain reaction (PCR) technique. The accuracy of DNA typing techniques should ensure that this methodology will eventually replace conventional HLA phenotyping.

  6. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  7. Using cluster analysis to organize and explore regional GPS velocities

    Science.gov (United States)

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

    2012-01-01

    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.

  8. Cluster Analysis of Gene Expression Data

    CERN Document Server

    Domany, E

    2002-01-01

    The expression levels of many thousands of genes can be measured simultaneously by DNA microarrays (chips). This novel experimental tool has revolutionized research in molecular biology and generated considerable excitement. A typical experiment uses a few tens of such chips, each dedicated to a single sample - such as tissue extracted from a particular tumor. The results of such an experiment contain several hundred thousand numbers, that come in the form of a table, of several thousand rows (one for each gene) and 50 - 100 columns (one for each sample). We developed a clustering methodology to mine such data. In this review I provide a very basic introduction to the subject, aimed at a physics audience with no prior knowledge of either gene expression or clustering methods. I explain what genes are, what is gene expression and how it is measured by DNA chips. Next I explain what is meant by "clustering" and how we analyze the massive amounts of data from such experiments, and present results obtained from a...

  9. Statistical fractal analysis of 25 young star clusters

    CERN Document Server

    Gregorio-Hetem, J; Santos-Silva, T; Fernandes, B

    2015-01-01

    A large sample of young stellar groups is analysed aiming to investigate their clustering properties and dynamical evolution. A comparison of the Q statistical parameter, measured for the clusters, with the fractal dimension estimated for the projected clouds shows that 52% of the sample has substructures and tends to follow the theoretically expected relation between clusters and clouds, according to calculations for artificial distribution of points. The fractal statistics was also compared to structural parameters revealing that clusters having radial density profile show a trend of parameter s increasing with mean surface stellar density. The core radius of the sample, as a function of age, follows a distribution similar to that observed in stellar groups of Milky Way and other galaxies. They also have dynamical age, indicated by their crossing time that is similar to unbound associations. The statistical analysis allowed us to separate the sample into two groups showing different clustering characteristi...

  10. Risk of re-report: A latent class analysis of infants reported for maltreatment.

    Science.gov (United States)

    Eastman, Andrea Lane; Mitchell, Michael N; Putnam-Hornstein, Emily

    2016-05-01

    A key challenge facing child protective services (CPS) is identifying children who are at greatest risk of future maltreatment. This analysis examined a cohort of children with a first report to CPS during infancy, a vulnerable population at high risk of future CPS reports. Birth records of all infants born in California in 2006 were linked to CPS records; 23,871 infants remaining in the home following an initial report were followed for 5 years to determine if another maltreatment report occurred. Latent class analysis (LCA) was used to identify subpopulations of infants based on varying risks of re-report. LCA model fit was examined using the Bayesian information criterion, a likelihood ratio test, and entropy. Statistical indicators and interpretability suggested the four-class model best fit the data. A second LCA included infant re-report as a distal outcome to examine the association between class membership and the likelihood of re-report. In Class 1 and Class 2 (lowest risk), the probability of a re-report was 44%; in contrast, the probability in Class 4 (highest risk) was 78%. Two birth characteristics clustered in the medium- and highest-risk classes: lack of established paternity and delayed or absent prenatal care. Two risk factors from the initial report of maltreatment emerged as predictors of re-report in the highest-risk class: an initial allegation of neglect and a family history of CPS involvement involving older siblings. Findings suggest that statistical techniques can be used to identify families with a heightened risk of experiencing later CPS contact. PMID:27082751
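The abstract above names the Bayesian information criterion as one of the fit measures used to choose the number of latent classes. The following is a minimal sketch of that comparison, not the authors' procedure: the log-likelihood values and the number of indicator items (`n_items`) are hypothetical, and the parameter count assumes binary indicators. Only the sample size (23,871) comes from the abstract.

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def lca_free_params(n_classes: int, n_binary_items: int) -> int:
    """Free parameters of an LCA with binary indicators:
    (n_classes - 1) class proportions plus one item probability per class and item."""
    return (n_classes - 1) + n_classes * n_binary_items

# Hypothetical log-likelihoods for 2- to 5-class models (illustrative only).
hypothetical_loglik = {2: -52000.0, 3: -51500.0, 4: -51300.0, 5: -51280.0}
n_obs = 23871   # sample size reported in the abstract
n_items = 10    # assumed number of binary indicators (not given in the abstract)

scores = {
    k: bic(ll, lca_free_params(k, n_items), n_obs)
    for k, ll in hypothetical_loglik.items()
}
best_k = min(scores, key=scores.get)  # class count with the lowest BIC
```

With these made-up inputs the small likelihood gain from a fifth class does not offset the penalty for its extra parameters, which is the kind of trade-off the criterion is meant to expose.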

  11. PARTIAL TRAINING METHOD FOR HEURISTIC ALGORITHM OF POSSIBLE CLUSTERIZATION UNDER UNKNOWN NUMBER OF CLASSES

    Directory of Open Access Journals (Sweden)

    D. A. Viattchenin

    2009-01-01

    A method for constructing a subset of labeled objects, which is used in a heuristic algorithm of possible clusterization with partial training, is proposed in the paper. The method is based on data preprocessing by the heuristic algorithm of possible clusterization using a transitive closure of a fuzzy tolerance. The efficiency of the method is demonstrated by an illustrative example.

  12. Contour Cluster Shape Analysis for Building Damage Detection from Post-earthquake Airborne LiDAR

    Directory of Open Access Journals (Sweden)

    HE Meizhang

    2015-04-01

    Detection of damaged buildings is an obligatory step prior to evaluating earthquake casualties and economic losses. It is very difficult to detect damaged buildings accurately based on the assumption that intact roofs appear in laser data as large planar segments whereas collapsed roofs are characterized by many small segments. This paper presents a contour cluster shape similarity analysis algorithm for reliable building damage detection from the post-earthquake airborne LiDAR point cloud. First we evaluate the entropies of shape similarities between all combinations of two contour lines within a building cluster, which quantitatively describe the shape diversity. Then the maximum entropy model is employed to divide all the clusters into intact and damaged classes. Tests on LiDAR data from the El Mayor-Cucapah earthquake rupture demonstrate the accuracy and reliability of the proposed method.

  13. Cluster analysis of WIBS single particle bioaerosol data

    Directory of Open Access Journals (Sweden)

    N. H. Robinson

    2012-09-01

    Hierarchical agglomerative cluster analysis was performed on single-particle multi-spatial datasets comprising optical diameter, asymmetry and three different fluorescence measurements, gathered using two dual Waveband Integrated Bioaerosol Sensors (WIBS). The technique is demonstrated on measurements of various fluorescent and non-fluorescent polystyrene latex spheres (PSL) before being applied to two separate contemporaneous ambient WIBS datasets recorded in a forest site in Colorado, USA as part of the BEACHON-RoMBAS project. Cluster analysis results between both datasets are consistent. Clusters are tentatively interpreted by comparison of concentration time series and cluster average measurement values to the published literature (of which there is a paucity) to represent: non-fluorescent accumulation mode aerosol; bacterial agglomerates; and fungal spores. To our knowledge, this is the first time cluster analysis has been applied to long term online PBAP measurements. The novel application of this clustering technique provides a means for routinely reducing WIBS data to discrete concentration time series which are more easily interpretable, without the need for any a priori assumptions concerning the expected aerosol types. It can reduce the level of subjectivity compared to the more standard analysis approaches, which are typically performed by simple inspection of various ensemble data products. It also has the advantage of potentially resolving less populous or subtly different particle types. This technique is likely to become more robust in the future as fluorescence-based aerosol instrumentation measurement precision, dynamic range and the number of available metrics are improved.

  14. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  15. Cluster analysis of Southeastern U.S. climate stations

    Science.gov (United States)

    Stooksbury, D. E.; Michaels, P. J.

    1991-09-01

    A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.

  16. Performance Analysis of Enhanced Clustering Algorithm for Gene Expression Data

    CERN Document Server

    Chandrasekhar, T; Elayaraja, E

    2011-01-01

    Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this paper we applied K-Means with Automatic Generations of Merge Factor for ISODATA (AGMFI). Though AGMFI has been applied for clustering of gene expression data, the proposed Enhanced Automatic Generations of Merge Factor for ISODATA (EAGMFI) algorithms overcome the drawbacks of AGMFI in terms of specifying the optimal number of clusters and initialization of good cluster centroids. Experimental results on gene expression data show that the proposed EAGMFI algorithms could identify compact clus...

  17. Variable cluster analysis method for building neural network model

    Institute of Scientific and Technical Information of China (English)

    王海东; 刘元东

    2004-01-01

    To address the problem that, in building a neural network model of a complicated system, input variables should be reduced as much as possible while still fully explaining the output variables, a variable selection method based on cluster analysis was investigated. A similarity coefficient that describes the mutual relation of variables was defined. The methods of highest contribution rate, part replacing whole, and variable replacement are put forward and derived from information theory. Software for neural networks based on cluster analysis, which provides several methods for defining the variable similarity coefficient, clustering system variables and evaluating variable clusters, was developed and applied to build a neural network forecast model of cement clinker quality. The results show that the network scale, training time and prediction accuracy are all perfect. The practical application demonstrates that the method of selecting variables for neural networks is feasible and effective.

  18. Analysis of questioning technique during classes in medical education

    Directory of Open Access Journals (Sweden)

    Cho Young

    2012-06-01

    Background: Questioning is one of the essential techniques used by lecturers to make lectures more interactive and effective. This study surveyed the perception of questioning techniques by medical school faculty members and analyzed how the questioning technique is used in actual classes. Methods: Data on the perceptions of the questioning skills used during lectures were collected using a self-questionnaire for faculty members (N = 33) during the second semester of 2008. The questionnaire consisted of 18 items covering the awareness and characteristics of questioning skills. Recorded video tapes were used to observe the faculty members' questioning skills. Results: Most faculty members regarded the questioning technique during classes as being important and expected positive outcomes in terms of the students' participation in class, concentration in class and understanding of the class contents. In the 99 classes analyzed, the median number of questions per class was 1 (0–29). Among them, 40 classes (40.4%) did not use questioning techniques. The frequency of questioning per lecture was similar regardless of the faculty members' perception. On the other hand, the faculty members perceived that their usual wait time after a question was approximately 10 seconds, compared to only 2.5 seconds measured from video analysis. More lecture-experienced faculty members tended to ask more questions in class. Conclusions: There were some discrepancies regarding the questioning technique between the faculty members' perceptions and reality, even though they had positive opinions of the technique. The questioning skills during a lecture need to be emphasized to faculty members.

  19. Spatial Data Mining using Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ch.N.Santhosh Kumar

    2012-09-01

    Data mining, also referred to as Knowledge Discovery in Databases (KDD), is the process of nontrivial extraction of implicit, previously unknown and useful information such as knowledge rules, descriptions, regularities, and major trends from large databases. Data mining has evolved into a multidisciplinary field, including database technology, machine learning, artificial intelligence, neural networks, information retrieval, and so on. In principle, data mining should be applicable to the different kinds of data and databases used in many different applications, including relational databases, transactional databases, data warehouses, object-oriented databases, and special application-oriented databases such as spatial databases, temporal databases, multimedia databases, and time-series databases. Spatial data mining, also called spatial mining, is data mining as applied to spatial data or spatial databases. Spatial data are data that have a spatial or location component, and they carry information that is more complex than classical data. A spatial database stores spatial data represented by spatial data types and the spatial relationships among the data. Spatial data mining encompasses various tasks. These include spatial classification, spatial association rule mining, spatial clustering, characteristic rules, discriminant rules, and trend detection. This paper presents how spatial data mining is achieved using clustering.

  20. Advanced Heat Map and Clustering Analysis Using Heatmap3

    OpenAIRE

    Shilin Zhao; Yan Guo; Quanhu Sheng; Yu Shyr

    2014-01-01

    Heat maps and clustering are used frequently in expression analysis studies for data visualization and quality control. Simple clustering and heat maps can be produced from the “heatmap” function in R. However, the “heatmap” function lacks certain functionalities and customizability, preventing it from generating advanced heat maps and dendrograms. To tackle the limitations of the “heatmap” function, we have developed an R package “heatmap3” which significantly improves the original “heatmap”...

  1. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the

  2. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Science.gov (United States)

    Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio-hydro-atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen-Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of
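The abstract above reports that Ward linkage combined with z-score normalisation performed best. The sketch below illustrates that combination in plain Python under stated assumptions: it is a naive cubic-time implementation for small inputs, not the authors' analysis package, and the six "particles" (size, asymmetry, fluorescence) are invented for illustration.

```python
from statistics import mean, pstdev

def zscore(rows):
    """Standardise each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    (na, ca), (nb, cb) = a, b
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * d2

def ward_clusters(rows, n_clusters):
    """Naive agglomerative clustering with Ward's criterion; returns lists of row indices."""
    members = [[i] for i in range(len(rows))]
    stats = [(1, list(r)) for r in rows]  # (size, centroid) for each cluster
    while len(members) > n_clusters:
        # Pick the pair whose merge increases the within-cluster variance least.
        i, j = min(
            ((p, q) for p in range(len(members)) for q in range(p + 1, len(members))),
            key=lambda pq: ward_cost(stats[pq[0]], stats[pq[1]]),
        )
        (ni, ci), (nj, cj) = stats[i], stats[j]
        merged = [(ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj)]
        members[i] += members[j]
        stats[i] = (ni + nj, merged)
        del members[j], stats[j]  # j > i, so index i is unaffected
    return members

# Hypothetical particles: (optical size, asymmetry factor, fluorescence).
particles = [
    [1.0, 10.0, 5.0], [1.1, 11.0, 6.0], [0.9, 9.5, 4.0],
    [3.0, 30.0, 200.0], [3.2, 29.0, 210.0], [2.9, 31.0, 190.0],
]
clusters = ward_clusters(zscore(particles), 2)
```

Normalising first matters here because the fluorescence column spans a far larger numeric range than the other two and would otherwise dominate the squared distances. For data sets of the size mentioned in the abstract, an optimised library implementation would be needed instead of this pairwise search.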

  3. Fuzzy clustering analysis to study geomagnetic coastal effects

    Directory of Open Access Journals (Sweden)

    M. Sridharan

    2005-06-01

    The utility of fuzzy set theory in cluster analysis and pattern recognition has been evolving since the mid 1960s, in conjunction with the emergence and evolution of computer technology. The classification of objects into categories is the subject of cluster analysis. The aim of this paper is to employ a fuzzy-clustering technique to examine the interrelationship of geomagnetic coastal and other effects at Indian observatories. Data from the observatories used for the present study are from Alibag on the West Coast, Visakhapatnam and Pondicherry on the East Coast, and Hyderabad and Nagpur as central inland stations located far from either coast; all of these stations are free from the influence of the daytime equatorial electrojet. It has been found that the Alibag and Pondicherry observatories form a separate cluster showing anomalous variations in the vertical (Z) component. The H- and D-components form different clusters. The results are compared with the graphical method. The analytical technique and the results of the fuzzy-clustering analysis are discussed here.
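The abstract above does not say which fuzzy-clustering algorithm was used. As an illustration of the general idea only, the sketch below implements fuzzy c-means, a common choice, in which each object receives a degree of membership in every cluster instead of a single hard label. The algorithm choice, the fuzzifier `m = 2`, the starting centres and the sample points are all assumptions.

```python
import math

def memberships(points, centers, m=2.0):
    """Membership of each point in each cluster (each row sums to 1)."""
    u = []
    for p in points:
        # Clamp distances away from zero so a point sitting on a centre is handled.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        u.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d) for di in d])
    return u

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and centre updates, starting from the given centres."""
    dims = range(len(points[0]))
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        # Each centre becomes the mean of all points weighted by membership**m.
        centers = [
            [
                sum(u[j][i] ** m * points[j][k] for j in range(len(points)))
                / sum(u[j][i] ** m for j in range(len(points)))
                for k in dims
            ]
            for i in range(len(centers))
        ]
    return memberships(points, centers, m), centers

# Hypothetical two-dimensional observations forming two loose groups.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
u, centers = fuzzy_c_means(points, centers=[(0.0, 1.0), (4.0, 4.0)])
```

Reading the rows of `u` gives the soft assignment of each object; a station with comparable membership in two clusters would be one whose behaviour is intermediate between the groups.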

  4. Increasing the number of thyroid lesions classes in microarray analysis improves the relevance of diagnostic markers.

    Directory of Open Access Journals (Sweden)

    Jean-Fred Fontaine

    BACKGROUND: Genetic markers for thyroid cancers identified by microarray analysis have offered limited predictive accuracy so far because of the few classes of thyroid lesions usually taken into account. To improve diagnostic relevance, we have simultaneously analyzed microarray data from six public datasets covering a total of 347 thyroid tissue samples representing 12 histological classes of follicular lesions and normal thyroid tissue. Our own dataset, containing about half the thyroid tissue samples, included all categories of thyroid lesions. METHODOLOGY/PRINCIPAL FINDINGS: Classifier predictions were strongly affected by similarities between classes and by the number of classes in the training sets. In each dataset, sample prediction was improved by separating the samples into three groups according to class similarities. The cross-validation of differential genes revealed four clusters with functional enrichments. The analysis of six of these genes (APOD, APOE, CLGN, CRABP1, SDHA and TIMP1) in 49 new samples showed consistent gene and protein profiles with the class similarities observed. Focusing on four subclasses of follicular tumor, we explored the diagnostic potential of 12 selected markers (CASP10, CDH16, CLGN, CRABP1, HMGB2, ALPL2, ADAMTS2, CABIN1, ALDH1A3, USP13, NR2F2, KRTHB5) by real-time quantitative RT-PCR on 32 other new samples. The gene expression profiles of follicular tumors were examined with reference to the mutational status of the Pax8-PPARgamma, TSHR, GNAS and NRAS genes. CONCLUSION/SIGNIFICANCE: We show that diagnostic tools defined on the basis of microarray data are more relevant when a large number of samples and tissue classes are used. Taking into account the relationships between the thyroid tumor pathologies, together with the main biological functions and pathways involved, improved the diagnostic accuracy of the samples. Our approach was particularly relevant for the classification of microfollicular adenomas.

  5. Application of Subspace Clustering in DNA Sequence Analysis.

    Science.gov (United States)

    Wallace, Tim; Sekmen, Ali; Wang, Xiaofei

    2015-10-01

    Identification and clustering of orthologous genes plays an important role in developing evolutionary models such as validating convergent and divergent phylogeny and predicting functional proteins in newly sequenced species of unverified nucleotide protein mappings. Here, we introduce an application of subspace clustering as applied to orthologous gene sequences and discuss the initial results. The working hypothesis is based upon the concept that genetic changes between nucleotide sequences coding for proteins among selected species and groups may lie within a union of subspaces for clusters of the orthologous groups. Estimates for the subspace dimensions were computed for a small population sample. A series of experiments was performed to cluster randomly selected sequences. The experimental design allows for both false positives and false negatives, and estimates for the statistical significance are provided. The clustering results are consistent with the main hypothesis. A simple random mutation binary tree model is used to simulate speciation events that show the interdependence of the subspace rank versus time and mutation rates. The simple mutation model is found to be largely consistent with the observed subspace clustering singular value results. Our study indicates that the subspace clustering method may be applied in orthology analysis. PMID:26162018

  6. Cluster analysis of radionuclide concentrations in beach sand

    NARCIS (Netherlands)

    de Meijer, R.J.; James, I.; Jennings, P.J.; Keoyers, J.E.

    2001-01-01

    This paper presents a method in which natural radionuclide concentrations of beach sand minerals are traced along a stretch of coast by cluster analysis. This analysis yields two groups of mineral deposit with different origins. The method deviates from standard methods of following dispersal of rad

  7. Bayesian Analysis of Two Stellar Populations in Galactic Globular Clusters III: Analysis of 30 Clusters

    CERN Document Server

    Wagner-Kaiser, R; Sarajedini, A; von Hippel, T; van Dyk, D A; Robinson, E; Stein, N; Jefferys, W H

    2016-01-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic Globular Clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in CNO are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster; we also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed g...

  8. Discourse Analysis and Cultivation of Conversational Competence in English Class

    Science.gov (United States)

    Zhang, Zheng

    2008-01-01

    The essay is to discuss in perspective of teaching how to apply the results of Discourse Analysis study to English class to train students for successful communication through taking turns, controlling turns, teaching exchange, organizing transaction, spreading topic and taking into account contextual factors as well in order to cultivate…

  9. Frontiers of performance analysis on leadership-class systems

    International Nuclear Information System (INIS)

    The number of cores in high-end systems for scientific computing is increasing rapidly. As a result, there is a pressing need for tools that can measure, model, and diagnose performance problems in highly-parallel runs. We describe two tools that employ complementary approaches for analysis at scale and we illustrate their use on DOE leadership-class systems.

  10. Multilevel Latent Class Analysis: Parametric and Nonparametric Models

    Science.gov (United States)

    Finch, W. Holmes; French, Brian F.

    2014-01-01

    Latent class analysis is an analytic technique often used in educational and psychological research to identify meaningful groups of individuals within a larger heterogeneous population based on a set of variables. This technique is flexible, encompassing not only a static set of variables but also longitudinal data in the form of growth mixture…

  11. Performance Analysis of Enhanced Clustering Algorithm for Gene Expression Data

    Directory of Open Access Journals (Sweden)

    T. Chandrasekhar

    2011-11-01

    Full Text Available Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this paper we applied K-Means with Automatic Generations of Merge Factor for ISODATA (AGMFI). Though AGMFI has been applied for clustering of gene expression data, the proposed Enhanced Automatic Generations of Merge Factor for ISODATA (EAGMFI) algorithms overcome the drawbacks of AGMFI in terms of specifying the optimal number of clusters and initializing good cluster centroids. Experimental results on gene expression data show that the proposed EAGMFI algorithms could identify compact clusters and perform well in terms of the Silhouette Coefficient cluster measure.
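
    The Silhouette Coefficient named in this abstract can be illustrated with a minimal sketch. It is not the authors' AGMFI/EAGMFI code; the function name `silhouette_scores` and the toy points are assumptions made for illustration. The sketch uses the standard definition s = (b - a) / max(a, b), where a is the mean distance to the other members of a point's own cluster and b is the smallest mean distance to the members of any other cluster. It assumes at least two clusters and no duplicate points.

```python
from math import dist  # Python 3.8+


def silhouette_scores(points, labels):
    """Per-point silhouette values for a given hard clustering (illustrative helper)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return scores


# Toy data (made up): two well-separated groups on a line.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
toy_labels = [0, 0, 1, 1]
toy_scores = silhouette_scores(toy_points, toy_labels)
```

    Values close to 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned points.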

  12. Clustering and Feature Selection using Sparse Principal Component Analysis

    CERN Document Server

    Luss, Ronny

    2007-01-01

    In this paper, we use sparse principal component analysis (PCA) to solve clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of variance in the data while having only a limited number of nonzero coefficients. PCA is often used as a simple clustering technique and sparse factors allow us here to interpret the clusters in terms of a reduced set of variables. We begin with a brief introduction and motivation on sparse PCA and detail our implementation of the algorithm in d'Aspremont et al. (2005). We finish by describing the application of sparse PCA to clustering and by a brief description of DSPCA, the numerical package used in these experiments.

  13. Cognitive analysis of multiple sclerosis utilizing fuzzy cluster means

    Directory of Open Access Journals (Sweden)

    Imianvan Anthony Agboizebeta

    2012-01-01

    Full Text Available Multiple sclerosis, often called MS, is a disease that affects the central nervous system (the brain and spinal cord). Myelin provides insulation for nerve cells, improves the conduction of impulses along the nerves, and is important for maintaining the health of the nerves. In multiple sclerosis, inflammation causes the myelin to disappear. Genetic factors, environmental issues and viral infection may also play a role in developing the disease. MS is characterized by life-threatening symptoms such as loss of balance, hearing problems and depression. The application of Fuzzy Cluster Means (FCM), or Fuzzy C-Means, analysis to the diagnosis of different forms of multiple sclerosis is the focal point of this paper. Application of cluster analysis involves a sequence of methodological and analytical decision steps that enhances the quality and meaning of the clusters produced. Uncertainties associated with analysis of multiple sclerosis test data are eliminated by the system.

  14. A clustering analysis of lipoprotein diameters in the metabolic syndrome

    Directory of Open Access Journals (Sweden)

    Frazier-Wood Alexis C

    2011-12-01

    Full Text Available Abstract Background: The presence of smaller low-density lipoproteins (LDL) has been associated with atherosclerosis risk and with the insulin resistance (IR) underlying the metabolic syndrome (MetS). In addition, some research has supported the association of very low-, low- and high-density lipoprotein (VLDL, LDL, HDL) particle diameters with components of the MetS, although this has been the focus of less research. We aimed to explore the relationship of VLDL, LDL and HDL diameters to MetS and its features and, by clustering individuals by their diameters of VLDL, LDL and HDL particles, to capture information across all three fractions of lipoprotein into a unified phenotype. Methods: We used nuclear magnetic resonance spectroscopy measurements on fasting plasma samples from a general population sample of 1,036 adults (mean ± SD, 48.8 ± 16.2 y of age). Using latent class analysis, the sample was grouped by the diameter of their fasting lipoproteins, and mixed effects models tested whether the distribution of MetS components varied across the groups. Results: Eight discrete groups were identified. Two groups (N = 251) were enriched with individuals meeting criteria for the MetS, and were characterized by the smallest LDL/HDL diameters. Of those two groups, one was additionally distinguished by large VLDL, and had significantly higher blood pressure, fasting glucose, triglycerides, and waist circumference (WC; P …). Conclusions: While small LDL diameters remain associated with IR and the MetS, the occurrence of these in conjunction with a shift to overall larger VLDL diameter may identify those with the highest fasting glucose, TG and WC within the MetS. If replicated, the association of this phenotype with more severe IR-features indicates that it may contribute to identifying those most at risk for incident type II diabetes and cardiometabolic disease.

  15. Mass spectrometric analysis with cluster projectiles and coincidence counting

    Energy Technology Data Exchange (ETDEWEB)

    Cox, B.D.

    1992-01-01

    Methods for maximizing the amount of secondary ion information, per primary projectile, are described. The method is based on time-of-flight mass spectrometry and event-by-event coincidence counting. The information obtained from coincidence counting time-of-flight mass spectrometry includes: (a) surface composition, (b) relative concentrations, and (c) degree of intermolecular mixing. The technique was applied to the study of an important new class of polymers: polymer blends. Secondary ion mass spectrometry, when applied to the analysis of synthetic polymers, induces backbone fragmentation which is characteristic of the homopolymer. The characteristic fingerprint peaks from polystyrene and poly(vinyl methyl ether) were used to identify the presence of these two polymers in a polymer blend. The percent coincidence between the characteristic secondary ions from each component of the blend was used to determine both the relative concentration and the degree of molecular mixing. Results indicate molecular segregation of the two polymers on the film surface. The largest degree of segregation was determined for the phase-separated blends. The performance of this technique depends on the desorption efficiency of the primary projectiles. In practice one seeks primary ions which are surface sensitive, have controllable parameters such as size, velocity, and charge state, and generate high secondary ion yields. Focus was placed on the use of keV organic cluster projectiles to meet these criteria. Of interest to this study were C18 (chrysene), C24 (coronene), and C60 (buckminsterfullerene). Results indicate enhanced secondary ion yields for C60. For example, when CsI is bombarded with 30 keV C60, the yields for I[sup [minus

  16. A Mid-Infrared Study of the Class 0 Cluster in LDN 1448

    CERN Document Server

    O'Linger, J A; Ressler, M E; Wolf-Chase, G A

    2005-01-01

    We present ground-based mid-infrared observations of Class 0 protostars in LDN 1448. Of the five known protostars in this cloud, we detected two, L1448N:A and L1448C, at 12.5, 17.9, 20.8, and 24.5 microns, and a third, L1448 IRS 2, at 24.5 microns. We present high-resolution images of the detected sources, and photometry or upper limits for all five Class 0 sources in this cloud. With these data, we are able to augment existing spectral energy distributions (SEDs) for all five objects and place them on an evolutionary status diagram.

  17. Traffic Accident, System Model and Cluster Analysis in GIS

    Directory of Open Access Journals (Sweden)

    Veronika Vlčková

    2015-07-01

    Full Text Available Traffic accidents are a frequently discussed topic, both in mainstream journalism and among professionals. This article directs attention to a less well-known context of accidents, with the help of constructive systems theory and its methods, cluster analysis and geoinformation engineering. A traffic accident is framed in space and time, and it can therefore be studied with the tools of geographic information systems technology. Applying a systems approach enables the formulation of a system model, which is then handled with the tools of geoinformation engineering and of multicriteria and cluster analysis.

  18. Fault Reactivation Analysis Using Microearthquake Clustering Based on Signal-to-Noise Weighted Waveform Similarity

    Science.gov (United States)

    Grund, Michael; Groos, Jörn C.; Ritter, Joachim R. R.

    2016-07-01

    The cluster formation of about 2000 induced microearthquakes (mostly M L correlation and a subsequent equivalence class approach. All events were detected within two separated but neighbouring seismic volumes close to the geothermal powerplants near Landau and Insheim in the Upper Rhine Graben, SW Germany between 2006 and 2013. Besides different sensors, sampling rates and individual data gaps, mainly low signal-to-noise ratios (SNR) of the recordings at most station sites provide a complication for the determination of a precise waveform similarity analysis of the microseismic events in this area. To include a large number of events for such an analysis, a newly developed weighting approach was implemented in the waveform similarity analysis which directly considers the individual SNRs across the whole seismic network. The application to both seismic volumes leads to event clusters with high waveform similarities within short (seconds to hours) and long (months to years) time periods covering two magnitude ranges. The estimated relative hypocenter locations are spatially concentrated for each single cluster and mirror the orientations of mapped faults as well as interpreted rupture planes determined from fault plane solutions. Depending on the waveform cross-correlation coefficient threshold, clusters can be resolved in space to as little as one dominant wavelength. The interpretation of these observations implies recurring fault reactivations by fluid injection with very similar faulting mechanisms during different time periods between 2006 and 2013.
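
    The combination of waveform cross-correlation with an equivalence class approach can be sketched as follows. This is a simplified illustration, not the authors' implementation: the signal-to-noise weighting described in the abstract is omitted, the function names are hypothetical, and the toy traces are made up. Two events are linked when their maximum normalized cross-correlation reaches a threshold, and clusters are the resulting equivalence classes (connected components), found here with a union-find structure.

```python
from itertools import combinations
from math import sqrt


def max_norm_xcorr(a, b):
    """Maximum normalized cross-correlation of two equal-length traces over all lags."""
    n = len(a)
    norm = sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    best = -1.0
    for lag in range(-(n - 1), n):
        s = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, s / norm)
    return best


def equivalence_classes(traces, threshold):
    """Group trace indices: i ~ j if a chain of pairs with similarity >= threshold links them."""
    parent = list(range(len(traces)))

    def find(i):
        # Follow parent links to the class representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(traces)), 2):
        if max_norm_xcorr(traces[i], traces[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy traces (made up): the second is a time-shifted copy of the first.
toy_traces = [
    [0, 1, 0, -1, 0, 0],
    [0, 0, 1, 0, -1, 0],
    [1, 1, 1, 1, 1, 1],
]
toy_clusters = equivalence_classes(toy_traces, threshold=0.9)
```

    Because membership is transitive, two events can share a cluster without exceeding the threshold directly, as long as intermediate events connect them. This is why the choice of threshold controls how finely clusters are resolved.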

  19. A novel PPGA-based clustering analysis method for business cycle indicator selection

    Institute of Scientific and Technical Information of China (English)

    Dabin ZHANG; Lean YU; Shouyang WANG; Yingwen SONG

    2009-01-01

    A new clustering analysis method based on the pseudo parallel genetic algorithm (PPGA) is proposed for business cycle indicator selection. In the proposed method, the category of each indicator is coded by real numbers, and some illegal chromosomes are repaired by the identification and restoration of empty class. Two mutation operators, namely the discrete random mutation operator and the optimal direction mutation operator, are designed to balance the local convergence speed and the global convergence performance, which are then combined with migration strategy and insertion strategy. For the purpose of verification and illustration, the proposed method is compared with the K-means clustering algorithm and the standard genetic algorithms via a numerical simulation experiment. The experimental result shows the feasibility and effectiveness of the new PPGA-based clustering analysis algorithm. Meanwhile, the proposed clustering analysis algorithm is also applied to select the business cycle indicators to examine the status of the macro economy. Empirical results demonstrate that the proposed method can effectively and correctly select some leading indicators, coincident indicators, and lagging indicators to reflect the business cycle, which is extremely operational for some macro economy administrative managers and business decision-makers.

  20. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering

    Directory of Open Access Journals (Sweden)

    Suraj

    2015-01-01

    Full Text Available Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal leads us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted and, relying on the concept of event related synchronization (ERS) and desynchronization (ERD), a feature vector is formed.

  1. Examination of European Union economic cohesion: A cluster analysis approach

    Directory of Open Access Journals (Sweden)

    Jiri Mazurek

    2014-01-01

    Full Text Available In recent years the majority of EU members experienced the deepest economic decline in their modern history, but the impacts of the global financial crisis were not distributed homogeneously across the continent. The aim of the paper is to examine the cohesion of the European Union (plus Norway and Iceland) in terms of the economic development of its members from the 1st of January 2008 to the 31st of December 2012. For the study five economic indicators were selected: GDP growth, unemployment, inflation, labour productivity and government debt. Annual data from Eurostat databases were averaged over the whole period and then used as an input for a cluster analysis. It was found that EU countries were divided into six different clusters. The most populated cluster, with 14 countries, covered Central and West Europe and reflected the relative homogeneity of this part of Europe. Countries of Southern Europe (Greece, Portugal and Spain) shared their own cluster of the countries most affected by the recent crisis, as did the Baltic and Balkan states in another cluster. On the other hand, Slovakia and Poland, the only two countries that escaped a recession, were classified in their own cluster of the most successful countries.

  2. A Geometric Analysis of Subspace Clustering with Outliers

    CERN Document Server

    Soltanolkotabi, Mahdi

    2011-01-01

    This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [Elhamifar09], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretica...

  3. Java Analysis Studio and the hep.lcd class library

    International Nuclear Information System (INIS)

    The Java Analysis Studio and the hep.lcd class library provide a general framework for performing Java-based Linear Collider Detector (LCD) studies. The package is being developed to fully reconstruct 500 GeV to 1.5 TeV e+e- annihilation events for analyzing detector options and performance. The current North American LCD reconstruction effort is aimed at comparing different detailed detector models by performing full detector simulation and reconstruction. This paper describes the JAS/hep.lcd distributed analysis framework and some aspects of the reconstruction and analysis object modeling

  4. A Cluster Analysis of Personality Style in Adults with ADHD

    Science.gov (United States)

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita

    2008-01-01

    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  5. Frailty phenotypes in the elderly based on cluster analysis

    DEFF Research Database (Denmark)

    Dato, Serena; Montesanto, Alberto; Lagani, Vincenzo;

    2012-01-01

    genetic background on the frailty status is still questioned. We investigated the applicability of a cluster analysis approach based on specific geriatric parameters, previously set up and validated in a southern Italian population, to two large longitudinal Danish samples. In both cohorts, we identified...

  6. K-means cluster analysis and seismicity partitioning for Pakistan

    Science.gov (United States)

    Rehman, Khaista; Burton, Paul W.; Weatherill, Graeme A.

    2014-07-01

    Pakistan and the western Himalaya is a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974 and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard; a crucial input in seismic design codes needed to begin to effectively mitigate the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study utilises a cluster analysis approach for the purposes of identifying spatial differences in seismicity, which can be utilised to form a basis for delineating seismogenic source regions. An effort is made to examine seismicity partitioning for Pakistan with respect to earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude homogenous earthquake catalogue has been compiled using various available earthquake data. 
The earthquake catalogue covers a time span from 1930 to 2007 and
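
    As a rough illustration of how K-means partitions a set of epicentre-like points, the following sketch runs Lloyd's algorithm from fixed starting centroids. It is not the procedure used in the study: the coordinates are made up, the function name `kmeans` is an assumption, and plain Euclidean distance on planar coordinates stands in for whatever distance measure and catalogue preparation the authors applied.

```python
from math import dist  # Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Made-up planar coordinates forming two groups; not real earthquake locations.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
toy_labels, toy_centroids = kmeans(toy_points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

    The result depends on the number of clusters and on the starting centroids, so in practice several values of k and several initializations are usually compared before a partition is interpreted.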

  7. Analysis of failure time data with multilevel clustering, with application to the child vitamin a intervention trial in Nepal.

    Science.gov (United States)

    Shih, Joanna H; Lu, Shou-En

    2007-09-01

    We consider the problem of estimating covariate effects in the marginal Cox proportional hazard model and multilevel associations for child mortality data collected from a vitamin A supplementation trial in Nepal, where the data are clustered within households and villages. For this purpose, a class of multivariate survival models that can be represented by a functional of marginal survival functions and accounts for hierarchical structure of clustering is exploited. Based on this class of models, an estimation strategy involving a within-cluster resampling procedure is proposed, and a model assessment approach is presented. The asymptotic theory for the proposed estimators and lack-of-fit test is established. The simulation study shows that the estimates are approximately unbiased, and the proposed test statistic is conservative under extremely heavy censoring but approaches the size otherwise. The analysis of the Nepal study data shows that the association of mortality is much greater within households than within villages. PMID:17825001

  8. Cluster analysis of movement patterns in multiarticular actions: a tutorial.

    Science.gov (United States)

    Rein, Robert; Button, Chris; Davids, Keith; Summers, Jeffery

    2010-04-01

    The present paper proposes a technical analysis method for extracting information about movement patterning in studies of motor control, based on a cluster analysis of movement kinematics. In a tutorial fashion, data from three different experiments are presented to exemplify and validate the technical method. When applied to three different basketball-shooting techniques, the method clearly distinguished between the different patterns. When applied to a cyclical wrist supination-pronation task, the cluster analysis provided the same results as an analysis using the conventional discrete relative phase measure. Finally, when analyzing throwing performance constrained by distance to target, the method grouped movement patterns together according to throwing distance. In conclusion, the proposed technical method provides a valuable tool to improve understanding of coordination and control in different movement models, including multiarticular actions. PMID:20484771

  9. Robustness analysis for a class of nonlinear descriptor systems

    Institute of Scientific and Technical Information of China (English)

    吴敏; 张凌波; 何勇

    2004-01-01

    The robustness analysis problem of a class of nonlinear descriptor systems is studied. Nonlinear matrix inequality which has the good computation property of convex feasibility is employed to derive some sufficient conditions to guarantee that the nonlinear descriptor systems have robust disturbance attenuation performance, which avoids the computational difficulties in conversing nonlinear matrix and Hamilton-Jacobi inequality. The computation property of convex feasibility of nonlinear matrix inequality makes it possible to apply the results of nonlinear robust control to practice.

  10. Multi-class texture analysis in colorectal cancer histology

    OpenAIRE

    Jakob Nikolas Kather; Cleo-Aron Weis; Francesco Bianconi; Melchers, Susanne M; Schad, Lothar R; Timo Gaiser; Alexander Marx; Frank Gerrit Zöllner

    2016-01-01

    Automatic recognition of different tissue types in histological images is an essential part in the digital pathology toolbox. Texture analysis is commonly used to address this problem; mainly in the context of estimating the tumour/stroma ratio on histological samples. However, although histological images typically contain more than two tissue types, only few studies have addressed the multi-class problem. For colorectal cancer, one of the most prevalent tumour types, there are in fact no pu...

  11. Cognitive analysis of multiple sclerosis utilizing fuzzy cluster means

    Directory of Open Access Journals (Sweden)

    Imianvan Anthony Agboizebeta

    2012-02-01

    Full Text Available Multiple sclerosis, often called MS, is a disease that affects the central nervous system (the brain and spinal cord). Myelin provides insulation for nerve cells, improves the conduction of impulses along the nerves, and is important for maintaining the health of the nerves. In multiple sclerosis, inflammation causes the myelin to disappear. Genetic factors, environmental issues and viral infection may also play a role in developing the disease. MS is characterized by life-threatening symptoms such as loss of balance, hearing problems and depression. The application of Fuzzy Cluster Means (FCM), or Fuzzy C-Means, analysis to the diagnosis of different forms of multiple sclerosis is the focal point of this paper. Application of cluster analysis involves a sequence of methodological and analytical decision steps that enhances the quality and meaning of the clusters produced. Uncertainties associated with analysis of multiple sclerosis test data are eliminated by the system.

  12. Latent class analysis of comorbidity patterns among women with generalized and localized vulvodynia: preliminary findings

    Directory of Open Access Journals (Sweden)

    Nguyen RHN

    2013-04-01

    Full Text Available Ruby HN Nguyen,1 Christin Veasley,2 Derek Smolenski1,3 1Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, 2National Vulvodynia Association, Silver Spring, MD, 3National Center for Telehealth and Technology, Defense Centers of Excellence, Department of Defense, Tacoma, WA, USA Background: The pattern and extent of clustering of comorbid pain conditions with vulvodynia is largely unknown. However, elucidating such patterns may improve our understanding of the underlying mechanisms involved in these common causes of chronic pain. We sought to describe the pattern of comorbid pain clustering in a population-based sample of women with diagnosed vulvodynia. Methods: A total of 1457 women with diagnosed vulvodynia self-reported their type of vulvar pain as localized, generalized, or both. Respondents were also surveyed about the presence of comorbid pain conditions, including temporomandibular joint and muscle disorders, interstitial cystitis, fibromyalgia, chronic fatigue syndrome, irritable bowel syndrome, endometriosis, and chronic headache. Age-adjusted latent class analysis modeled extant patterns of comorbidity by vulvar pain type, and a multigroup model was used to test for the equality of comorbidity patterns using a comparison of prevalence. A two-class model (no/single comorbidity versus multiple comorbidities) had the best fit in individual and multigroup models. Results: For the no/single comorbidity class, the posterior probability prevalence of item endorsement ranged from 0.9% to 24.4%, indicating a low probability of presence. Conversely, the multiple comorbidity class showed that at least two comorbid conditions were likely to be endorsed by at least 50% of women in that class, and irritable bowel syndrome and fibromyalgia were the most common comorbidities regardless of type of vulvar pain. Prevalence of the multiple comorbidity class differed by type of vulvar pain: both

  13. An Optical Analysis of the Merging Cluster Abell 3888

    CERN Document Server

    Shakouri, S; Dehghan, S

    2016-01-01

    In this paper we present new AAOmega spectroscopy of 254 galaxies within a 30' radius around Abell 3888. We combine these data with the existing redshifts measured in a one degree radius around the cluster and perform a substructure analysis. We confirm 71 member galaxies within the core of A3888 and determine a new average redshift and velocity dispersion for the cluster of 0.1535 ± 0.0009 and 1181 ± 197 km/s, respectively. The cluster is elongated along an East-West axis and we find the core is bimodal along this axis with two sub-groups of 26 and 41 members detected. Our results suggest that A3888 is a merging system, putting to rest the previous conjecture about the morphological status of the cluster derived from X-ray observations. In addition to the results on A3888 we also present six newly detected galaxy over-densities in the field, three of which we classify as new galaxy clusters.

  14. Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes

    KAUST Repository

    Cannistraci, Carlo

    2010-09-01

    Motivation: Nonlinear small datasets, which are characterized by low numbers of samples and very high numbers of measures, occur frequently in computational biology, and pose problems in their investigation. Unsupervised hybrid-two-phase (H2P) procedures, specifically dimension reduction (DR) coupled with clustering, provide valuable assistance, not only for unsupervised data classification, but also for visualization of the patterns hidden in high-dimensional feature space. Methods: 'Minimum Curvilinearity' (MC) is a principle that, for small datasets, suggests the approximation of curvilinear sample distances in the feature space by pair-wise distances over their minimum spanning tree (MST), and thus avoids the introduction of any tuning parameter. MC is used to design two novel forms of nonlinear machine learning (NML): Minimum Curvilinear embedding (MCE) for DR, and Minimum Curvilinear affinity propagation (MCAP) for clustering. Results: Compared with several other unsupervised and supervised algorithms, MCE and MCAP, whether individually or combined in H2P, overcome the limits of classical approaches. High performance was attained in the visualization and classification of: (i) pain patients (proteomic measurements) in peripheral neuropathy; (ii) human organ tissues (genomic transcription factor measurements) on the basis of their embryological origin. Conclusion: MC provides a valuable framework to estimate nonlinear distances in small datasets. Its extension to large datasets is prefigured for novel NMLs. Classification of neuropathic pain by proteomic profiles offers new insights for future molecular and systems biology characterization of pain. Improvements in tissue embryological classification refine results obtained in an earlier study, and suggest a possible reinterpretation of skin attribution as mesodermal. © The Author(s) 2010. Published by Oxford University Press.
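
    The core idea stated in the Methods, approximating curvilinear distances by pair-wise distances over the minimum spanning tree, can be sketched in a few lines. This is an illustrative reading of that one sentence, not the published MCE or MCAP code: the function names are hypothetical, Euclidean distance is assumed as the base metric, and the embedding and affinity propagation steps are not shown.

```python
from math import dist  # Python 3.8+


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, weight) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection of each node to the tree
    link = [-1] * n             # tree node providing that cheapest connection
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if link[u] >= 0:
            edges.append((link[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], link[v] = d, u
    return edges


def mst_path_distances(points):
    """Matrix of pair-wise distances measured along the unique MST path."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i, j, w in mst_edges(points):
        adj[i].append((j, w))
        adj[j].append((i, w))
    D = [[0.0] * n for _ in range(n)]
    for s in range(n):
        # Depth-first walk of the tree from s, accumulating path length.
        stack = [(s, -1, 0.0)]
        while stack:
            u, prev, d = stack.pop()
            D[s][u] = d
            for v, w in adj[u]:
                if v != prev:
                    stack.append((v, u, d + w))
    return D


# Made-up points lying along a bent curve.
toy_points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 3.0)]
toy_D = mst_path_distances(toy_points)
```

    For the toy points, the two ends of the curve are 3 units apart in a straight line but about 6.24 units apart along the tree, which is the kind of nonlinear distance the principle is meant to capture without a tuning parameter.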

  15. Bounded Delay Timing Analysis of a Class of CSP Programs

    DEFF Research Database (Denmark)

    Hulgaard, Henrik; Burns, Steven M.

    1997-01-01

    We describe an algebraic technique for performing timing analysis of a class of asynchronous circuits described as CSP programs (including Martin's probe operator) with the restrictions that there is no OR-causality and that guard selection is either completely free or mutually exclusive. Such a description is transformed into a safe Petri net with interval time delays specified on the places of the net. The timing analysis we perform determines the extreme separation in time between two communication actions of the CSP program for all possible timed executions of the system. We formally define...

  16. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko

    2016-05-01

    Domain Generation Algorithms (DGA) are a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate a DGA feed using reversed DGAs, third-party feeds, and smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  17. How frequently do clusters occur in hierarchical clustering analysis? A graph theoretical approach to studying ties in proximity

    OpenAIRE

    Leal, Wilmer; Llanos, Eugenio J.; Restrepo, Guillermo; Suárez, Carlos F.; Patarroyo, Manuel Elkin

    2016-01-01

    Background Hierarchical cluster analysis (HCA) is a widely used classificatory technique in many areas of scientific knowledge. Applications usually yield a dendrogram from an HCA run over a given data set, using a grouping algorithm and a similarity measure. However, even when such parameters are fixed, ties in proximity (i.e. two equidistant clusters from a third one) may produce several different dendrograms, having different possible clustering patterns (different classifications). This s...
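
    A tie in proximity of the kind described in the abstract above can be shown with a minimal sketch. The one-dimensional data, the absolute-difference distance and the helper name tied_closest_pairs are assumptions chosen for illustration; they are not taken from the paper.

```python
from itertools import combinations

def tied_closest_pairs(points, tol=1e-12):
    """Return every pair of points whose distance equals the minimum pairwise distance."""
    dists = {
        (i, j): abs(points[i] - points[j])
        for i, j in combinations(range(len(points)), 2)
    }
    dmin = min(dists.values())
    return sorted(pair for pair, d in dists.items() if abs(d - dmin) <= tol)

# Point 1 is equidistant from points 0 and 2, so the first merge is ambiguous.
points = [0.0, 1.0, 2.0]
ties = tied_closest_pairs(points)
print(ties)  # [(0, 1), (1, 2)]
```

    With more than one candidate pair for the first merge, an agglomerative procedure may start from either pair, and each choice can lead to a different dendrogram even though the data, the grouping algorithm and the similarity measure are fixed.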

  18. Class analysis and the reorientation of class theory: the case of persisting differentials in educational attainment. 1996.

    Science.gov (United States)

    Goldthorpe, John H

    2010-01-01

    In class analysis the main regularities that have been established by empirical research are not ones of long-term class formation or decomposition, as envisaged in Marxist or liberal theory, but rather ones that exhibit the powerful resistance to change of class relations and associated life-chances and patterns of social action. If these regularities are to be explained, theory needs to be correspondingly reoriented, and must abandon functionalist and teleological assumptions in favour of providing more secure micro-foundations. This argument is developed and illustrated in the course of an attempt to apply rational action theory to the explanation of persisting class differentials in educational attainment. PMID:20092500

  19. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    Science.gov (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  20. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm are introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.

  1. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    Science.gov (United States)

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-01

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (Pgait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners.

  2. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in Ambattur, Thirumalizai and Thirumudivakkam Industrial Estate, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology is to collect primary data from 100 ACI using a quantitative questionnaire and to analyze it using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows no significant difference among the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows no significant difference between industrial units with respect to Production, Infrastructure, Technology and Marketing costs and Net Profit. To conclude, the automotive industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting the CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of underdeveloped and developing countries for cost reduction and productivity

  3. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-05-01

    Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in Ambattur, Thirumalizai and Thirumudivakkam Industrial Estate, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology is to collect primary data from 100 ACI using a quantitative questionnaire and to analyze it using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows no significant difference among the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows no significant difference between industrial units with respect to Production, Infrastructure, Technology and Marketing costs and Net Profit. To conclude, the automotive industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting the CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of underdeveloped and developing countries for cost reduction and productivity

  4. CLUSTERING TECHNIQUES IN FINANCIAL DATA ANALYSIS APPLICATIONS ON THE U.S. FINANCIAL MARKET

    Directory of Open Access Journals (Sweden)

    ALEXANDRU BOGEANU

    2013-08-01

    In economic and financial analysis, the need frequently arises to classify companies into categories whose delimitation has to be clear and natural. Companies are differentiated into categories according to the economic and financial indicators associated with them. Clustering algorithms are a very powerful tool for identifying classes of companies based on the information provided by these indicators. Over the last decade, economic and financial practice has adopted economic value added as a synthetic indicator of a company's entire activity. Our study uses a sample of 106 companies in four different fields of activity; each company is identified by: Economic Value Added, Net Income, Current Sales, Equity and Stock Price. Using the ascending hierarchical classification methods and the partitioning classification methods, as well as Ward's method and the k-means algorithm, we identified in the considered sample an information structure consisting of 5 rating classes.

  5. Stellar variability in open clusters. I. A new class of variable stars in NGC 3766

    CERN Document Server

    Mowlavi, N; Saesen, S; Eyer, L

    2013-01-01

    Aims. We analyze the population of periodic variable stars in the open cluster NGC 3766 based on a 7-year multi-band monitoring campaign conducted on the 1.2 m Swiss Euler telescope at La Silla, Chile. Methods. The data reduction, light curve cleaning and period search procedures, combined with the long observation time line, allow us to detect variability amplitudes down to the milli-magnitude level. The variability properties are complemented with the positions in the color-magnitude and color-color diagrams to classify periodic variable stars into distinct variability types. Results. We find a large population (36 stars) of new variable stars between the red edge of slowly pulsating B (SPB) stars and the blue edge of delta Sct stars, a region in the Hertzsprung-Russell (HR) diagram where no pulsation is predicted to occur based on standard stellar models. The bulk of their periods ranges from 0.1 to 0.7 d, with amplitudes between 1 and 4 mmag for the majority of them. About 20% of stars in that region of t...

  6. Bayesian Analysis of Multiple Populations in Galactic Globular Clusters

    Science.gov (United States)

    Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan

    2016-01-01

    We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way. Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.

  7. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences, it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences, who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
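
    The contrast between hard and fuzzy assignment described in the abstract above can be sketched as follows. The abstract does not say which fuzzy algorithm was used; this sketch assumes the standard fuzzy c-means membership formula with fixed, hypothetical cluster centers and a fuzzifier m = 2, and the function names are illustrative only.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of a 1-D point x in each cluster (they sum to 1)."""
    dists = [abs(x - c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with a center: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

def hard_assignment(x, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

centers = [0.0, 10.0]   # hypothetical cluster centers
x = 4.0                 # a case lying between the two clusters
u = fuzzy_memberships(x, centers)
print(hard_assignment(x, centers), [round(v, 3) for v in u])
```

    The hard assignment places the case entirely in the first cluster, while the fuzzy memberships record that it belongs partly to both. Larger values of m spread the memberships more evenly, which is one way to read the abstract's remark about the amount of overlap allowed.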

  8. Segment clustering methodology for unsupervised Holter recordings analysis

    Science.gov (United States)

    Rodríguez-Sotelo, Jose Luis; Peluffo-Ordoñez, Diego; Castellanos Dominguez, German

    2015-01-01

    Cardiac arrhythmia analysis on Holter recordings is an important issue in clinical settings; however, it implicitly involves other problems related to the large amount of unlabelled data, which implies a high computational cost. In this work, an unsupervised methodology based on a segment framework is presented, which consists of dividing the raw data into a balanced number of segments in order to identify fiducial points and to characterize and cluster the heartbeats in each segment separately. The resulting clusters are merged or split according to an assumed criterion of homogeneity. This framework compensates for the high computational cost of Holter analysis, making its implementation in future real-time applications possible. The performance of the method is measured on the records from the MIT/BIH arrhythmia database and achieves high values of sensitivity and specificity, taking advantage of database labels, for a broad range of heartbeat types recommended by the AAMI.

  9. Data Preprocessing in Cluster Analysis of Gene Expression

    Institute of Scientific and Technical Information of China (English)

    杨春梅; 万柏坤; 高晓峰

    2003-01-01

    Considering that DNA microarray technology has generated an explosion of gene expression data and that efficient methods for analysing and visualizing such massive datasets are urgently needed, we investigate the data preprocessing methods used in cluster analysis, normalization or taking the logarithm of the matrix, by using hierarchical clustering, principal component analysis (PCA) and self-organizing maps (SOMs). The results illustrate that when Euclidean distance is used as the metric, the logarithm of the relative expression level is the best preprocessing method, while data preprocessed by normalization cannot attain the expected results because the data structure is ruined. If there are only a few principal components, PCA is an effective method to extract the frame structure, while SOMs are more suitable for a specific structure.
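
    One possible intuition for the finding above, offered as an assumption rather than as the authors' reasoning, is that a logarithm makes fold changes symmetric under Euclidean distance. The base-2 logarithm and the example ratios below are illustrative choices.

```python
import math

# Relative expression levels: doubled, unchanged, halved.
up, unchanged, down = 2.0, 1.0, 0.5

# Euclidean distance in one dimension is the absolute difference.
raw_up = abs(up - unchanged)
raw_down = abs(down - unchanged)

log_up = abs(math.log2(up) - math.log2(unchanged))
log_down = abs(math.log2(down) - math.log2(unchanged))

print(raw_up, raw_down)   # 1.0 0.5
print(log_up, log_down)   # 1.0 1.0
```

    On the raw ratios, a two-fold increase lies twice as far from the baseline as a two-fold decrease; after the log transform both changes are equally distant.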

  10. Euro area structural convergence? A multi-criterion cluster analysis.

    OpenAIRE

    Irac, D.; Lopez, J.

    2013-01-01

    This paper proposes a classification of the old member countries of the euro area in a structural data rich environment and runs a convergence analysis using the same framework. First, we use a clustering approach and identify two structurally distinct groups of countries that are not modified between 1995 and 2007: the South Countries Group (SCG) – composed of Greece, Italy, Portugal and Spain – and the Other Countries Group (OCG). Second, we propose a convergence metric and reach three key ...

  11. Customer-Classified Algorithm Based on Fuzzy Clustering Analysis

    Institute of Scientific and Technical Information of China (English)

    郭蕴华; 祖巧红; 陈定方

    2004-01-01

    A customer-classified evaluation system is described with the customization-supporting tree of evaluation indexes, in which users can determine any evaluation index independently. Based on this system, a customer-classified algorithm based on fuzzy clustering analysis is proposed to implement the customer-classified management. A numerical example is presented, which provides correct results, indicating that the algorithm can be used in the decision support system of CRM.

  12. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis.

    Directory of Open Access Journals (Sweden)

    Nan Lin

    Due to advances in sensor technology, the growing volume of medical image data makes it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis.

  13. Clustering analysis of ancient celadon based on SOM neural network

    Institute of Scientific and Technical Information of China (English)

    ZHOU ShaoHuai; FU Lue; LIANG BaoLiu

    2008-01-01

    In the study, chemical compositions of 48 fragments of ancient ceramics excavated in 4 archaeological kiln sites which were located in 3 cities (Hangzhou, Cixi and Longquan in Zhejiang Province, China) have been examined by energy-dispersive X-ray fluorescence (EDXRF) technique. Then the method of SOM was introduced into the clustering analysis based on the major and minor element compositions of the bodies, the results manifested that 48 samples could be perfectly distributed into 3 locations, Hangzhou, Cixi and Longquan. Because the major and minor element compositions of two Royal Kilns were similar to each other, the classification accuracy over them was merely 76.92%. In view of this, the authors have made a SOM clustering analysis again based on the trace element compositions of the bodies, the classification accuracy rose to 84.61%. These results indicated that discrepancies in the trace element compositions of the bodies of the ancient ceramics excavated in two Royal Kiln sites were more distinct than those in the major and minor element compositions, which was in accordance with the fact. We argued that SOM could be employed in the clustering analysis of ancient ceramics.

  14. Clustering analysis of ancient celadon based on SOM neural network

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    In the study, chemical compositions of 48 fragments of ancient ceramics excavated in 4 archaeological kiln sites which were located in 3 cities (Hangzhou, Cixi and Longquan in Zhejiang Province, China) have been examined by energy-dispersive X-ray fluorescence (EDXRF) technique. Then the method of SOM was introduced into the clustering analysis based on the major and minor element compositions of the bodies, the results manifested that 48 samples could be perfectly distributed into 3 locations, Hangzhou, Cixi and Longquan. Because the major and minor element compositions of two Royal Kilns were similar to each other, the classification accuracy over them was merely 76.92%. In view of this, the authors have made a SOM clustering analysis again based on the trace element compositions of the bodies, the classification accuracy rose to 84.61%. These results indicated that discrepancies in the trace element compositions of the bodies of the ancient ceramics excavated in two Royal Kiln sites were more distinct than those in the major and minor element compositions, which was in accordance with the fact. We argued that SOM could be employed in the clustering analysis of ancient ceramics.

  15. Coupled Two-Way Clustering Analysis of Gene Microarray Data

    CERN Document Server

    Getz, G; Domany, E

    2000-01-01

    We present a novel coupled two-way clustering approach to gene microarray data analysis. The main idea is to identify subsets of the genes and samples, such that when one of these is used to cluster the other, stable and significant partitions emerge. The search for such subsets is a computationally complex task: we present an algorithm, based on iterative clustering, which performs such a search. This analysis is especially suitable for gene microarray data, where the contributions of a variety of biological mechanisms to the gene expression levels are entangled in a large body of experimental data. The method was applied to two gene microarray data sets, on colon cancer and leukemia. By identifying relevant subsets of the data and focusing on them we were able to discover partitions and correlations that were masked and hidden when the full dataset was used in the analysis. Some of these partitions have clear biological interpretation; others can serve to identify possible directions for future research.

  16. A Review on Clustering and Outlier Analysis Techniques in Datamining

    Directory of Open Access Journals (Sweden)

    S. Koteeswaran

    2012-01-01

    Problem statement: The modern world relies on using physical, biological and social systems more effectively through advanced computerized techniques. The great amount of data generated by such systems leads to a paradigm shift from classical modeling and analyses based on basic principles to developing models and the corresponding analyses directly from data. The ability to extract useful hidden knowledge from these data and to act on that knowledge is becoming increasingly important in today's competitive world. Approach: The entire process of applying a computer-based methodology, including new techniques, for discovering knowledge from data is called data mining. Data mining has two primary goals: prediction and classification. The large volume of data involved in data mining requires clustering and outlier analysis to reduce the data and to collect only the useful data set. Results: This study reviews implementation techniques and recent research on clustering and outlier analysis. Conclusion: The study aims to provide a review of clustering and outlier analysis techniques, and the discussion will guide researchers in improving their research direction.

  17. Assessment of repeatability of composition of perfumed waters by high-performance liquid chromatography combined with numerical data analysis based on cluster analysis (HPLC UV/VIS - CA).

    Science.gov (United States)

    Ruzik, L; Obarski, N; Papierz, A; Mojski, M

    2015-06-01

    High-performance liquid chromatography (HPLC) with UV/VIS spectrophotometric detection combined with the chemometric method of cluster analysis (CA) was used for the assessment of repeatability of composition of nine types of perfumed waters. In addition, the chromatographic method of separating components of the perfume waters under analysis was subjected to an optimization procedure. The chromatograms thus obtained were used as sources of data for the chemometric method of cluster analysis (CA). The result was a classification of a set comprising 39 perfumed water samples with a similar composition at a specified level of probability (level of agglomeration). A comparison of the classification with the manufacturer's declarations reveals a good degree of consistency and demonstrates similarity between samples in different classes. A combination of the chromatographic method with cluster analysis (HPLC UV/VIS - CA) makes it possible to quickly assess the repeatability of composition of perfumed waters at selected levels of probability.

  18. Diagnostics of subtropical plants functional state by cluster analysis

    Directory of Open Access Journals (Sweden)

    Oksana Belous

    2016-05-01

    The article presents an example of applying statistical data analysis methods to the diagnosis of the adaptive capacity of subtropical plant varieties. We describe the selection indicators and basic physiological parameters that were defined as diagnostic. We evaluated a set of water regime parameters: the water deficit of the leaves, the fractional composition of water, and the concentration of cell sap (CCS) (for tea flushes). These parameters are characterized by high lability and high responsiveness to many abiotic factors, which called for particular care in selecting plant material for analysis and in considering the impact on sustainability. On the basis of the experimental data, we calculated pairwise correlation coefficients between climatic factors and the physiological indicators used. The result was a selection of physiological and biochemical indicators proposed for assessing adaptability and included in methodological recommendations on diagnosing the functional state of the studied crops. Analysis of complex studies involving a large number of indicators is quite difficult; in particular, it does not allow the similarity of new varieties in their adaptive responses to adverse factors to be identified quickly and, therefore, general requirements for cultivation conditions to be set. Using cluster analysis requires that only quantitative data be analyzed, that a set of variables for assessing varieties be defined (the larger the sample, the more accurate the clustering), and that a measure of similarity (or difference) between objects be established. It is shown that the identification of diagnostic features subjected to statistical processing affects the accuracy of variety classification. Selection in result of the mono-clusters analysis (variety tea Kolhida; hazelnut Lombardsky red; variety kiwi Monty

  19. Genomic cluster and network analysis for predictive screening for hepatotoxicity.

    Science.gov (United States)

    Fukushima, Tamio; Kikkawa, Rie; Hamada, Yoshimasa; Horii, Ikuo

    2006-12-01

    The present study was undertaken to estimate the usefulness of genomic approaches to predict hepatotoxicity. Male rats were treated with acetaminophen (APAP), carbon tetrachloride (CCL), amiodarone (AD) or tetracycline (TC) at toxic doses. Their livers were extracted 6 or 24 hr after the dosings and were used for subsequent examinations. At 6 hr there were no histological changes noted in any of the groups except for the CCL group, but at 24 hr, such changes were noted in all but the AD group. Regarding genomic analysis, we performed hierarchical cluster analysis using S-plus software. The individual microarray data were clearly classified into 5 treatment-related clusters at 24 hr as well as at 6 hr, even though no morphological changes were noted at 6 hr. In the gene expression analysis using GeneSpring, transcription factor and oxidative stress- and lipid metabolism-related genes were markedly affected in all treatment groups at both time points when compared with the corresponding control values. Finally, we investigated gene networks in the above-affected genes by using Ingenuity Pathway Analysis software. Down-regulation of lipid metabolism-related genes regulated by SREBP1 was observed in all treatment groups at both time points, and up-regulation of oxidative stress-related genes regulated by Nrf2 was observed in the APAP and CCL treatment groups. From the above findings, for the application of genomic approaches to predict hepatotoxicity, we considered that cluster analysis for classification and early prediction of hepatotoxicity and network analysis for investigation of toxicological biomarkers would be useful. PMID:17202758

  20. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-10-01

    Full Text Available One of the key performance indicators of an organization's quality management system is customer satisfaction. Monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  1. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-11-01

    Full Text Available One of the key performance indicators of an organization's quality management system is customer satisfaction. Monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  2. Integrated Data Collection Analysis (IDCA) Program - PETN Class 4 Standard

    Energy Technology Data Exchange (ETDEWEB)

    Sandstrom, Mary M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Brown, Geoffrey W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Preston, Daniel N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Pollard, Colin J. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Warner, Kirstin F. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Sorensen, Daniel N. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Remmers, Daniel L. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Shelley, Timothy J. [Air Force Research Lab. (AFRL), Tyndall AFB, FL (United States); Reyes, Jose A. [Applied Research Associates, Tyndall AFB, FL (United States); Phillips, Jason J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hsu, Peter C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Reynolds, John G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2012-08-01

    The Integrated Data Collection Analysis (IDCA) program is conducting a proficiency study for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of PETN Class 4. The PETN was found to have: 1) an impact sensitivity (DH50) range of 6 to 12 cm, 2) a BAM friction sensitivity (F50) range of 7 to 11 kg, TIL (0/10) of 3.7 to 7.2 kg, 3) an ABL friction sensitivity threshold of 5 or less psig at 8 fps, 4) an ABL ESD sensitivity threshold of 0.031 to 0.326 j/g, and 5) a thermal sensitivity of an endothermic feature with Tmin = ~141 °C, and an exothermic feature with Tmax = ~205 °C.

  3. A Latent Class Analysis of Dyadic Perfectionism in a College Sample

    Science.gov (United States)

    Lopez, Frederick G.; Fons-Scheyd, Alia; Bush-King, Imelda; McDermott, Ryon C.

    2011-01-01

    A latent class analysis of dyadic perfectionism scores within a college sample (N = 369) identified four classes of participants. Controlling for gender and current dating status, class membership was associated with significant differences on several measures of relationship attitudes. Gender and class membership also significantly interacted in…

  4. Emotional Psychological and Related Problems among Truant Youths: An Exploratory Latent Class Analysis

    Science.gov (United States)

    Dembo, Richard; Briones-Robinson, Rhissa; Ungaro, Rocio Aracelis; Gulledge, Laura M.; Karas, Lora M.; Winters, Ken C.; Belenko, Steven; Greenbaum, Paul E.

    2012-01-01

    Intervention Project. Results identified two classes of youths: Class 1(n=9) - youths with low levels of delinquency, mental health and substance abuse issues; and Class 2(n=37) - youths with high levels of these problems. Comparison of these two classes on their urine analysis test results and parent/guardian reports of traumatic events found…

  5. Cluster Analysis and Fuzzy Query in Ship Maintenance and Design

    Science.gov (United States)

    Che, Jianhua; He, Qinming; Zhao, Yinggang; Qian, Feng; Chen, Qi

    Cluster analysis and fuzzy query are widely applied in modern intelligent information processing. In view of the features of ship maintenance data, a variant of the hypergraph-based clustering algorithm, namely the Correlation Coefficient-based Minimal Spanning Tree (CC-MST), is proposed to analyze the bulky data arising from the ship maintenance process, discover unknown rules, and help ship maintainers decide on the causes of various device faults. At the same time, revising or renewing an existing design of a ship or device may be necessary to eliminate those device faults. To offer ship designers some valuable hints, a fuzzy query mechanism is designed to retrieve useful information from large-scale, complicated and redundant ship technical and testing data. Finally, two experiments based on a real ship device fault statistical dataset validate the flexibility and efficiency of the CC-MST algorithm. A fuzzy query prototype demonstrates the usability of our fuzzy query mechanism.
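
    A rough way to picture clustering on a correlation-based minimum spanning tree is sketched below. This is only an illustration under stated assumptions, not the authors' CC-MST algorithm: the distance 1 - |r|, the rule of cutting the heaviest tree edges, the function name `mst_clusters`, and the synthetic signals are all hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree


def mst_clusters(X, n_clusters):
    """Cluster the rows of X by cutting the heaviest edges of a correlation-distance MST."""
    corr = np.corrcoef(X)  # Pearson correlation between every pair of rows
    # Assumed distance: strongly (anti-)correlated rows are close. The small offset keeps
    # every off-diagonal entry positive, because csgraph treats 0 as "no edge".
    dist = 1.0 - np.abs(corr) + 1e-9
    np.fill_diagonal(dist, 0.0)

    mst = minimum_spanning_tree(dist).toarray()  # tree with n - 1 weighted edges
    rows, cols = np.nonzero(mst)
    weights = mst[rows, cols]

    # Removing the (n_clusters - 1) heaviest tree edges leaves n_clusters components.
    for idx in np.argsort(weights)[::-1][: n_clusters - 1]:
        mst[rows[idx], cols[idx]] = 0.0

    _, labels = connected_components(mst, directed=False)
    return labels


# Hypothetical data: three noisy copies of one fault signature and three of another.
t = np.linspace(0.0, 2.0 * np.pi, 50)
rng = np.random.default_rng(0)
signals = np.vstack([np.sin(t)] * 3 + [np.cos(t)] * 3) + 0.05 * rng.normal(size=(6, 50))
labels = mst_clusters(signals, n_clusters=2)
```

    Cutting tree edges rather than thresholding the full correlation matrix keeps the number of clusters under direct control, which is one plausible reason to prefer a spanning-tree formulation.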

  6. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  7. Analysis of breast cancer progression using principal component analysis and clustering

    Indian Academy of Sciences (India)

    G Alexe; G S Dalgin; S Ganesan; C DeLisi; G Bhanot

    2007-08-01

    We develop a new technique to analyse microarray data which uses a combination of principal components analysis and consensus ensemble clustering to find robust clusters and gene markers in the data. We apply our method to a public microarray breast cancer dataset which has expression levels of genes in normal samples as well as in three pathological stages of disease; namely, atypical ductal hyperplasia or ADH, ductal carcinoma in situ or DCIS and invasive ductal carcinoma or IDC. Our method averages over clustering techniques and data perturbation to find stable, robust clusters and gene markers. We identify the clusters and their pathways with distinct subtypes of breast cancer (Luminal, Basal and Her2+). We confirm that the cancer phenotype develops early (in early hyperplasia or ADH stage) and find from our analysis that each subtype progresses from ADH to DCIS to IDC along its own specific pathway, as if each was a distinct disease.

  8. Analysis & Prediction of Sales Data in SAP-ERP System using Clustering Algorithms

    OpenAIRE

    Sastry, S. Hanumanth; Babu, Prof. M. S. Prasada

    2013-01-01

    Clustering is an important data mining technique in which we are interested in minimizing intracluster distance and maximizing intercluster distance. We have utilized clustering techniques to detect deviations in product sales and to identify and compare sales over a particular period of time. Clustering is suited to grouping items that seem to fall naturally together when there is no specified class for any new item. We have utilized annual sales data of a steel major to analyze S...

  9. Modified K-means Algorithm for Clustering Analysis of Hainan Green Tangerine Peel

    OpenAIRE

    Luo, Ying; Fu, Haiyan

    2014-01-01

    K-means is a classic partition-based clustering algorithm suited to classifying globular data. With regard to the choice of initial cluster centers, this paper comprehensively considers the characteristics of various hierarchical clustering algorithms and chooses an appropriate one to improve K-means, and applies it to cluster analysis of Hainan Green Tangerine Peel data in comparative experiments. The...

  10. Cluster analysis application identifies muscle characteristics of importance for beef tenderness

    Directory of Open Access Journals (Sweden)

    Chriki Sghaier

    2012-12-01

    Full Text Available Abstract Background An important controversy in the relationship between beef tenderness and muscle characteristics including biochemical traits exists among meat researchers. The aim of this study is to explain variability in meat tenderness using muscle characteristics and biochemical traits available in the Integrated and Functional Biology of Beef (BIF-Beef database. The BIF-Beef data warehouse contains characteristic measurements from animal, muscle, carcass, and meat quality derived from numerous experiments. We created three classes for tenderness (high, medium, and low based on trained taste panel tenderness scores of all meat samples consumed (4,366 observations from 40 different experiments. For each tenderness class, the corresponding means for the mechanical characteristics, muscle fibre type, collagen content, and biochemical traits which may influence tenderness of the muscles were calculated. Results Our results indicated that lower shear force values were associated with more tender meat. In addition, muscles in the highest tenderness cluster had the lowest total and insoluble collagen contents, the highest mitochondrial enzyme activity (isocitrate dehydrogenase, the highest proportion of slow oxidative muscle fibres, the lowest proportion of fast-glycolytic muscle fibres, and the lowest average muscle fibre cross-sectional area. Results were confirmed by correlation analyses, and differences between muscle types in terms of biochemical characteristics and tenderness score were evidenced by Principal Component Analysis (PCA. When the cluster analysis was repeated using only muscle samples from m. Longissimus thoracis (LT, the results were similar; only contrasting previous results by maintaining a relatively constant fibre-type composition between all three tenderness classes. Conclusion Our results show that increased meat tenderness is related to lower shear forces, lower insoluble collagen and total collagen content, lower

  11. Multi-class texture analysis in colorectal cancer histology

    Science.gov (United States)

    Kather, Jakob Nikolas; Weis, Cleo-Aron; Bianconi, Francesco; Melchers, Susanne M.; Schad, Lothar R.; Gaiser, Timo; Marx, Alexander; Zöllner, Frank Gerrit

    2016-06-01

    Automatic recognition of different tissue types in histological images is an essential part in the digital pathology toolbox. Texture analysis is commonly used to address this problem; mainly in the context of estimating the tumour/stroma ratio on histological samples. However, although histological images typically contain more than two tissue types, only few studies have addressed the multi-class problem. For colorectal cancer, one of the most prevalent tumour types, there are in fact no published results on multiclass texture separation. In this paper we present a new dataset of 5,000 histological images of human colorectal cancer including eight different types of tissue. We used this set to assess the classification performance of a wide range of texture descriptors and classifiers. As a result, we found an optimal classification strategy that markedly outperformed traditional methods, improving the state of the art for tumour-stroma separation from 96.9% to 98.6% accuracy and setting a new standard for multiclass tissue separation (87.4% accuracy for eight classes). We make our dataset of histological images publicly available under a Creative Commons license and encourage other researchers to use it as a benchmark for their studies.

  12. Clustered Numerical Data Analysis Using Markov Lie Monoid Based Networks

    Science.gov (United States)

    Johnson, Joseph

    2016-03-01

    We have designed and built an optimal numerical standardization algorithm that links numerical values with their associated units, error level, and defining metadata, thus supporting automated data exchange and new levels of artificial intelligence (AI). The software manages all dimensional and error analysis and computational tracing. Tables of entities versus properties of these generalized numbers (called "metanumbers") support a transformation of each table into a network among the entities and another network among their properties, where the network connection matrix is based upon a proximity metric between the two items. We previously proved that every network is isomorphic to the Lie algebra that generates continuous Markov transformations. We have also shown that the eigenvectors of these Markov matrices provide an agnostic clustering of the underlying patterns. We will present this methodology and show how our new work on conversion of scientific numerical data through this process can reveal underlying information clusters ordered by the eigenvalues. We will also show how the linking of clusters from different tables can be used to form a "supernet" of all numerical information supporting new initiatives in AI.

  13. Cyber Profiling Using Log Analysis And K-Means Clustering

    Directory of Open Access Journals (Sweden)

    Muhammad Zulfadhilah

    2016-07-01

    Full Text Available The activities of Internet users are increasing from year to year and have had an impact on the behavior of the users themselves. Assessment of user behavior is often based only on interaction across the Internet, without knowledge of any other activities. Activity logs can be used as another way to study user behavior. Internet activity logs are a type of big data, so data mining with the K-Means technique can be used to analyse user behavior. In this study, clustering with the K-Means algorithm divided the data into three clusters: high, medium, and low. The results from the higher education institution show that each of these clusters contains frequently visited websites in the following order: search engines, social media, news, and information. This study also showed that the cyber profiling was strongly influenced by environmental factors and daily activities.
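
    The three-cluster split described above can be sketched with a plain K-Means loop. This is a minimal illustration, not the study's pipeline: the single feature (requests per day), the sample counts, the farthest-first initialisation (chosen here only to make the run deterministic), and the names `kmeans` and `activity_levels` are assumptions.

```python
import numpy as np


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means with a deterministic farthest-first initialisation."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(k - 1):
        # Next seed: the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points (kept if it has none).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def activity_levels(counts):
    """Label each user 'low', 'medium' or 'high' by the rank of its cluster center."""
    X = np.asarray(counts, dtype=float).reshape(-1, 1)
    labels, centers = kmeans(X, k=3)
    rank = np.empty(3, dtype=int)
    rank[np.argsort(centers[:, 0])] = np.arange(3)  # smallest center gets rank 0
    names = np.array(["low", "medium", "high"])
    return names[rank[labels]].tolist()


# Hypothetical daily request counts for nine users taken from an access log.
requests_per_day = [1, 2, 3, 50, 52, 51, 200, 210, 205]
levels = activity_levels(requests_per_day)
```

    Ranking the cluster centers after fitting is what turns the arbitrary cluster indices into ordered labels such as high, medium, and low.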

  14. Covariance analysis of differential drag-based satellite cluster flight

    Science.gov (United States)

    Ben-Yaacov, Ohad; Ivantsov, Anatoly; Gurfil, Pini

    2016-06-01

    One possibility for satellite cluster flight is to control relative distances using differential drag. The idea is to increase or decrease the drag acceleration on each satellite by changing its attitude, and use the resulting small differential acceleration as a controller. The most significant advantage of the differential drag concept is that it enables cluster flight without consuming fuel. However, any drag-based control algorithm must cope with significant aerodynamical and mechanical uncertainties. The goal of the current paper is to develop a method for examination of the differential drag-based cluster flight performance in the presence of noise and uncertainties. In particular, the differential drag control law is examined under measurement noise, drag uncertainties, and initial condition-related uncertainties. The method used for uncertainty quantification is the Linear Covariance Analysis, which enables us to propagate the augmented state and filter covariance without propagating the state itself. Validation using a Monte-Carlo simulation is provided. The results show that all uncertainties have relatively small effect on the inter-satellite distance, even in the long term, which validates the robustness of the used differential drag controller.
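
    The central idea of covariance analysis, propagating a covariance matrix through linearised dynamics instead of simulating many sample trajectories, can be illustrated with the recursion P <- Phi P Phi^T + Q. The sketch below is a toy under stated assumptions: a one-axis double integrator stands in for the relative orbital dynamics, there is no filter or augmented state, and the matrices and the name `propagate_covariance` are hypothetical.

```python
import numpy as np


def propagate_covariance(P0, Phi, Q, steps):
    """Apply P <- Phi P Phi^T + Q repeatedly and return the covariance after each step."""
    P = np.asarray(P0, dtype=float)
    history = [P]
    for _ in range(steps):
        P = Phi @ P @ Phi.T + Q
        history.append(P)
    return history


# Hypothetical one-axis relative motion: state = [relative position, relative velocity].
dt = 1.0
Phi = np.array([[1.0, dt],
                [0.0, 1.0]])   # state-transition matrix of a double integrator
Q = np.zeros((2, 2))           # no process noise in this toy case
P0 = np.diag([1.0, 0.01])      # initial position and velocity variances

history = propagate_covariance(P0, Phi, Q, steps=10)
position_sigma = [float(np.sqrt(P[0, 0])) for P in history]
```

    In this toy case the position variance after n steps is P0[0,0] + (n dt)^2 P0[1,1], so the growth of the inter-satellite distance uncertainty can be read off without any Monte-Carlo run; a sampled simulation would then serve only as a cross-check.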

  15. Dynamical analysis of galaxy cluster merger Abell 2146

    CERN Document Server

    White, J A; King, L J; Lee, B E; Russell, H R; Baum, S A; Clowe, D I; Coleman, J E; Donahue, M; Edge, A C; Fabian, A C; Johnstone, R M; McNamara, B R; ODea, C P; Sanders, J S

    2015-01-01

    We present a dynamical analysis of the merging galaxy cluster system Abell 2146 using spectroscopy obtained with the Gemini Multi-Object Spectrograph on the Gemini North telescope. As revealed by the Chandra X-ray Observatory, the system is undergoing a major merger and has a gas structure indicative of a recent first core passage. The system presents two large shock fronts, making it unique amongst these rare systems. The hot gas structure indicates that the merger axis must be close to the plane of the sky and that the two merging clusters are relatively close in mass, from the observation of two shock fronts. Using 63 spectroscopically determined cluster members, we apply various statistical tests to establish the presence of two distinct massive structures. With the caveat that the system has recently undergone a major merger, the virial mass estimate is M_vir = 8.5 (+4.3/-4.7) x 10^14 M_sol for the whole system, consistent with the mass determination in a previous study using the Sunyaev-Zeldovich signal....

  16. Elemental Abundance Analysis of the Early Type Members of the Open Cluster M6: Preliminary Results

    CERN Document Server

    Kilicoglu, T; Fossati, L

    2014-01-01

    Differences in chemical composition among main sequence stars within a given cluster are probably due to differences in their masses and other effects such as radiative diffusion, magnetic field, rotation, mixing mechanisms, mass loss, accretion and multiplicity. The early type main-sequence members of open clusters of different ages make it possible to study the competition between radiative diffusion and mixing mechanisms. We have analysed low and high resolution spectra covering the spectral range 4500 - 5840 Angs. of late B, A, and F type members of the open cluster M6 (age about 100 Myr). The spectra were obtained using the FLAMES/GIRAFFE spectrograph mounted at UT2, the 8 meter class VLT telescope. The effective temperatures, surface gravities and microturbulent velocities of the stars were derived using both photometric and spectral methods. We have also performed a chemical abundance analysis using synthetic spectra. The abundances of the elements were determined for C, O, Mg, Si, Ca, Sc, Ti, Cr, Mn, Fe, Ni, Y, ...

  17. Convergence Analysis of a Class of Computational Intelligence Approaches

    Directory of Open Access Journals (Sweden)

    Junfeng Chen

    2013-01-01

    Full Text Available Computational intelligence approaches are a relatively new interdisciplinary field of research with many promising application areas. Although computational intelligence approaches have gained huge popularity, their convergence is difficult to analyze. In this paper, a computational model is built for a class of computational intelligence approaches represented by the canonical forms of genetic algorithms, ant colony optimization, and particle swarm optimization in order to describe the common features of these algorithms. Two quantification indices, the variation rate and the progress rate, are then defined to indicate, respectively, the variety and the optimality of the solution sets generated in the search process of the model. Moreover, we give four types of probabilistic convergence for the solution set updating sequences and discuss their relations. Finally, sufficient conditions are derived for the almost sure weak convergence and the almost sure strong convergence of the model by introducing martingale theory into the Markov chain analysis.

  18. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.;

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...

  19. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A;

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing...

  20. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties differed significantly among the twenty-six corn varieties. The quality of corn flour was related to five principal components from the principal component analysis, and the contribution rate of starch pasting properties was important, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)

  1. [Clustering analysis applied to near-infrared spectroscopy analysis of Chinese traditional medicine].

    Science.gov (United States)

    Liu, Mu-qing; Zhou, De-cheng; Xu, Xin-yuan; Sun, Yao-jie; Zhou, Xiao-li; Han, Lei

    2007-10-01

    The present article discusses clustering analysis applied to near-infrared (NIR) spectroscopy of Chinese traditional medicines, which provides a new method for their classification. The samples selected for measurement were safrole, eucalypt oil, laurel oil, turpentine, clove oil and three samples of costmary oil from different suppliers; their absorption spectra were measured within seconds by a multi-channel NIR spectrometer developed in the authors' lab. The spectra in the range of 0.70-1.7 µm were measured with air as background, and the results indicated that they are quite distinct. A qualitative mathematical model was set up, and cluster analysis based on the spectra was carried out with different clustering methods for optimization, yielding a cluster correlation coefficient of 0.9742. This indicated that cluster analysis of this group of samples is practicable. The calculated classification of the 8 samples also agreed well with their characteristics; in particular, the three samples of costmary oil were placed closest together in the clustering. PMID:18306778
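
    Comparing several clustering methods by a correlation coefficient can be sketched as follows. The sketch assumes that the "cluster correlation coefficient" is a cophenetic correlation, which the abstract does not state; the synthetic Gaussian-band spectra, the Euclidean metric, the list of linkage methods, and the name `best_linkage` are likewise hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def best_linkage(spectra, methods=("single", "complete", "average")):
    """Return the linkage method with the highest cophenetic correlation, all scores, and its tree."""
    dists = pdist(spectra, metric="euclidean")  # condensed pairwise distances
    scores, trees = {}, {}
    for method in methods:
        Z = linkage(dists, method=method)
        c, _ = cophenet(Z, dists)               # agreement between tree heights and distances
        scores[method] = float(c)
        trees[method] = Z
    best = max(scores, key=scores.get)
    return best, scores, trees[best]


# Hypothetical spectra on a 0.70-1.70 micrometre grid: two absorption bands, four samples each.
wavelength = np.linspace(0.70, 1.70, 50)
band_a = np.exp(-((wavelength - 1.0) / 0.1) ** 2)
band_b = np.exp(-((wavelength - 1.4) / 0.1) ** 2)
spectra = np.vstack([band_a * (1 + 0.01 * i) for i in range(4)] +
                    [band_b * (1 + 0.01 * i) for i in range(4)])

method, scores, Z = best_linkage(spectra)
groups = fcluster(Z, t=2, criterion="maxclust")  # cut the chosen tree into two groups
```

    A coefficient close to 1 means the dendrogram preserves the original pairwise distances well, which is one common way to choose among linkage methods.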

  2. A climatology of surface ozone in the extra tropics: cluster analysis of observations and model results

    Directory of Open Access Journals (Sweden)

    O. A. Tarasova

    2007-08-01

    Full Text Available Important aspects of the seasonal variations of surface ozone are discussed. The underlying analysis is based on the long-term (1990–2004) ozone records of the Co-operative Programme for Monitoring and Evaluation of the Long-range Transmission of Air Pollutants in Europe (EMEP) and the World Data Center of Greenhouse Gases, which do have a strong Northern Hemisphere bias. Seasonal variations are pronounced at most of the 114 locations for any time of the day. Seasonal-diurnal variability classification using hierarchical agglomeration clustering reveals 5 distinct clusters: clean/rural, semi-polluted non-elevated, semi-polluted semi-elevated, elevated and polar/remote marine types. For the cluster "clean/rural" the seasonal maximum is observed in April, both for night and day. For those sites with a double maximum or a wide spring-summer maximum, the one in spring appears both for day and night, while the one in summer is more pronounced for daytime and hence can be attributed to photochemical processes. For the spring maximum photochemistry is a less plausible explanation as no dependence of the maximum timing is observed. More probably the spring maximum is caused by dynamical/transport processes. Using data from the 3-D atmospheric chemistry general circulation model ECHAM5/MESSy1 covering the period of 1998–2005 a comparison has been performed for the identified clusters. For the model data four distinct classes of variability are detected. The majority of cases are covered by the regimes with a spring seasonal maximum or with a broad spring-summer maximum (with prevailing summer). The regime with winter–early spring maximum is reproduced by the model for southern hemispheric locations. Background and semi-polluted sites appear in the model in the same cluster. The seasonality in this model cluster is characterized by a pronounced spring (May) maximum. For the model cluster that covers partly semi-elevated semi-polluted sites the role of the

  4. Variations in students' perceived reasons for, sources of, and forms of in-school discrimination: A latent class analysis.

    Science.gov (United States)

    Byrd, Christy M; Carter Andrews, Dorinda J

    2016-08-01

    Although there exists a healthy body of literature related to discrimination in schools, this research has primarily focused on racial or ethnic discrimination as perceived and experienced by students of color. Few studies examine students' perceptions of discrimination from a variety of sources, such as adults and peers, their descriptions of the discrimination, or the frequency of discrimination in the learning environment. Middle and high school students in a Midwestern school district (N=1468) completed surveys identifying whether they experienced discrimination from seven sources (e.g., peers, teachers, administrators), for seven reasons (e.g., gender, race/ethnicity, religion), and in eight forms (e.g., punished more frequently, called names, excluded from social groups). The sample was 52% White, 15% Black/African American, 14% Multiracial, and 17% Other. Latent class analysis was used to cluster individuals based on reported sources of, reasons for, and forms of discrimination. Four clusters were found, and ANOVAs were used to test for differences between clusters on perceptions of school climate, relationships with teachers, perceptions that the school was a "good school," and engagement. The Low Discrimination cluster experienced the best outcomes, whereas an intersectional cluster experienced the most discrimination and the worst outcomes. The results confirm existing research on the negative effects of discrimination. Additionally, the paper adds to the literature by highlighting the importance of an intersectional approach to examining students' perceptions of in-school discrimination. PMID:27425562

  5. Multivariate cluster analysis of forest fire events in Portugal

    Science.gov (United States)

    Tonini, Marj; Pereira, Mario; Vega Orozco, Carmen; Parente, Joana

    2015-04-01

    Portugal is one of the major fire-prone European countries, mainly due to its favourable climatic, topographic and vegetation conditions. Compared to the other Mediterranean countries, the number of events registered here from 1980 to the present is the highest; likewise, with respect to burnt area, Portugal is the third most affected country. Mapped burnt areas for Portugal are available from the website of the Institute for the Conservation of Nature and Forests (ICNF). This official geodatabase is the result of satellite measurements starting from the year 1990. The spatial information, delivered in shapefile format, provides a detailed description of the shape and size of the area burnt by each fire, while the date/time information related to the fire ignition is restricted to the year of occurrence. In statistical terms, wildfires can be treated as a stochastic point process, where events are analysed as a set of geographical coordinates corresponding, for example, to the centroid of each burnt area. Analysing the spatio-temporal pattern of stochastic point processes, including cluster analysis, is a basic procedure for discovering predisposing factors as well as for prevention and forecasting purposes. These kinds of studies are primarily focused on investigating the spatial clustering behaviour of environmental data sequences and/or mapping their distribution at different times. To include both dimensions (space and time), a comprehensive spatio-temporal analysis is needed. In the present study the authors attempt to verify whether, in the case of wildfires in Portugal, space and time act independently or whether, conversely, neighbouring events are also closer in time. We present an application of the spatio-temporal K-function to a long dataset (1990-2012) of mapped burnt areas. Moreover, the multivariate K-function allowed checking for a possible difference in distribution between small and large fires. The final objective is to elaborate a 3D

  6. The cosmological analysis of X-ray cluster surveys; III. Bypassing cluster mass measurements

    CERN Document Server

    Pierre, M; Faccioli, L; Clerc, N; Gastaud, R; Koulouridis, E; Pacaud, F

    2016-01-01

    Despite strong theoretical arguments, the use of clusters as cosmological probes is, in practice, frequently questioned because of the many uncertainties impinging on cluster mass estimates. Our aim is to develop a fully self-consistent cosmological approach of X-ray cluster surveys, exclusively based on observable quantities, rather than masses. This procedure is justified given the possibility to directly derive the cluster properties via ab initio modelling, either analytically or by using hydrodynamical simulations. In this third paper, we evaluate the method on cluster toy-catalogues. We model the population of detected clusters in the count-rate -- hardness-ratio -- angular size -- redshift space and compare the corresponding 4-dimensional diagram with theoretical predictions. The best cosmology+physics parameter configuration is determined using a simple minimisation procedure; errors on the parameters are derived by scanning the likelihood hyper-surfaces with a wide range of starting values. The metho...

  7. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    For the improvement of qualitative and quantitative traits, the existence of variability is of prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed by correlation, agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for future breeding programmes. Correlation analysis revealed a significant positive association between yield and yield components such as fruit diameter, single fruit weight and number of fruits per plant. Principal component (PC) analysis showed that the first three PCs, with eigenvalues higher than 1, contributed 81.72% of the total variability for the different traits. PC-I showed positive factor loadings for all traits except number of fruits per plant. The contribution of single fruit weight and fruit diameter was highest in PC-I. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D² statistic confirmed the greatest distance between cluster-III and cluster-V, while maximum similarity was observed between cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programmes. (author)

  8. Usage of a Responsible Gambling Tool: A Descriptive Analysis and Latent Class Analysis of User Behavior.

    Science.gov (United States)

    Forsström, David; Hesser, Hugo; Carlbring, Per

    2016-09-01

    Gambling is a common pastime around the world. Most gamblers can engage in gambling activities without negative consequences, but some run the risk of developing an excessive gambling pattern. Excessive gambling has severe negative economic and psychological consequences, which makes the development of responsible gambling strategies vital to protecting individuals from these risks. One such strategy is responsible gambling (RG) tools. These tools track an individual's gambling history and supply personalized feedback, and might be one way to decrease excessive gambling behavior. However, research is lacking in this area and little is known about the usage of these tools. The aim of this article is to describe user behavior and to investigate whether there are different subclasses of users by conducting a latent class analysis. The user behaviour of 9528 online gamblers who voluntarily used an RG tool was analysed. Number of visits to the site, self-tests made, and advice used were the observed variables included in the latent class analysis. Descriptive statistics show that overall the functions of the tool had a high initial usage and a low repeated usage. Latent class analysis yielded five distinct classes of users: self-testers, multi-function users, advice users, site visitors, and non-users. Multinomial regression revealed that classes were associated with different risk levels of excessive gambling. The self-testers and multi-function users used the tool to a higher extent and were found to have a greater risk of excessive gambling than the other classes.

  9. Reliability analysis of cluster-based ad-hoc networks

    Energy Technology Data Exchange (ETDEWEB)

    Cook, Jason L. [Quality Engineering and System Assurance, Armament Research Development Engineering Center, Picatinny Arsenal, NJ (United States); Ramirez-Marquez, Jose Emmanuel [School of Systems and Enterprises, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030 (United States)], E-mail: Jose.Ramirez-Marquez@stevens.edu

    2008-10-15

    The mobile ad-hoc wireless network (MAWN) is a new and emerging network scheme that is being employed in a variety of applications. The MAWN varies from traditional networks because it is a self-forming and dynamic network. The MAWN is free of infrastructure and, as such, only the mobile nodes comprise the network. Pairs of nodes communicate either directly or through other nodes. To do so, each node acts, in turn, as a source, destination, and relay of messages. The virtue of a MAWN is the flexibility this provides; however, the challenge for reliability analyses is also brought about by this unique feature. The variability and volatility of the MAWN configuration make typical reliability methods (e.g. reliability block diagram) inappropriate because no single structure or configuration represents all manifestations of a MAWN. For this reason, new methods are being developed to analyze the reliability of this new networking technology. New published methods adapt to this feature by treating the configuration probabilistically or by inclusion of embedded mobility models. This paper joins both methods together and expands upon these works by modifying the problem formulation to address the reliability analysis of a cluster-based MAWN. The cluster-based MAWN is deployed in applications with constraints on networking resources such as bandwidth and energy. This paper presents the problem's formulation, a discussion of applicable reliability metrics for the MAWN, and an illustration of a Monte Carlo simulation method through the analysis of several example networks.
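
    As a rough illustration of the Monte Carlo idea mentioned in this abstract (not the authors' formulation), the sketch below estimates two-terminal reliability for a small network in which each link is assumed to be up independently with a fixed probability. The function names, the example topology and the independence assumption are all hypothetical; a cluster-based MAWN model would also need mobility and cluster-head behaviour, which are omitted here.

```python
import random


def connected(nodes, links, source, target):
    """Return True if target is reachable from source over the given links."""
    neighbours = {node: set() for node in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return False


def estimate_reliability(nodes, link_prob, source, target, trials=10000, seed=0):
    """Monte Carlo estimate of P(source can reach target).

    link_prob maps (node_a, node_b) -> probability that the link is up.
    Links are sampled independently in every trial (an assumption).
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # Sample one random configuration of the network.
        up_links = [link for link, p in link_prob.items() if rng.random() < p]
        if connected(nodes, up_links, source, target):
            successes += 1
    return successes / trials


if __name__ == "__main__":
    # Hypothetical example: two nodes reach a cluster head "H" that relays to "D".
    nodes = ["S", "A", "H", "D"]
    link_prob = {("S", "A"): 0.9, ("A", "H"): 0.8, ("S", "H"): 0.5, ("H", "D"): 0.95}
    print(estimate_reliability(nodes, link_prob, "S", "D"))
```

    Sampling whole configurations and checking connectivity avoids committing to a single network structure, which is the difficulty the abstract attributes to block-diagram methods.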

  10. Time series clustering analysis of health-promoting behavior

    Science.gov (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis results can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and have been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.

  11. A new approach for computing a flood vulnerability index using cluster analysis

    Science.gov (United States)

    Fernandez, Paulo; Mourato, Sandra; Moreira, Madalena; Pereira, Luísa

    2016-08-01

    A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification, contrary to other aggregation methods in which the areas' classification determines their grouping. While other aggregation methods distribute the areas into classes in an artificial manner, by imposing a certain probability for an area to belong to a certain class, as determined by the assumption that the aggregation measure used is normally distributed, CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia, where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. Both the sum of component scores and the weighted sum of component scores have shown similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability, and finally the results obtained using CA show a distinct differentiation of the vulnerability where hot spots can be clearly identified. The information provided by records of previous flood events corroborates the results obtained with CA, because the inundated areas with greater damages are those that are identified as high and very high vulnerability areas by CA. This supports the conclusion that CA provides a reliable FloodVI.

  12. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    High Energy Physics (HEP) analyses are becoming more complex and demanding due to the large amount of data collected by current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use case for cloud computing. In this work we take advantage of the Cloud Infrastructure at IFCA to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analyses and offers transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  13. Selections of data preprocessing methods and similarity metrics for gene cluster analysis

    Institute of Scientific and Technical Information of China (English)

    YANG Chunmei; WAN Baikun; GAO Xiaofeng

    2006-01-01

    Clustering is one of the major exploratory techniques for gene expression data analysis. High-quality results can be obtained in cluster analysis only with suitable similarity metrics and properly preprocessed datasets. In this study, gene expression datasets with external evaluation criteria were preprocessed by normalization by line, normalization by column or base-2 logarithm transformation, and were subsequently clustered by hierarchical clustering, k-means clustering and self-organizing maps (SOMs) with the Pearson correlation coefficient or Euclidean distance as the similarity metric. Finally, the quality of the clusters was evaluated by the adjusted Rand index. The results illustrate that k-means clustering and SOMs have distinct advantages over hierarchical clustering in gene clustering, and that SOMs are slightly better than k-means when randomly initialized. They also show that hierarchical clustering works best with the Pearson correlation coefficient as the similarity metric and datasets normalized by line, whereas k-means clustering and SOMs produce better clusters with Euclidean distance and logarithm-transformed datasets. These results provide a valuable reference for the implementation of gene expression cluster analysis.
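
    The adjusted Rand index used in this abstract to score clusterings against external labels can be computed from the contingency table of the two labelings. A minimal pure-Python sketch follows; the function name is illustrative, and returning 1.0 in the degenerate cases (fewer than two items, or both partitions trivial) is a choice made here rather than something the abstract specifies.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on (convention assumed here)

    # Contingency table cells and its row/column totals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Count pairs of items that fall together in a cell, row or column.
    sum_cells = sum(comb(count, 2) for count in cells.values())
    sum_rows = sum(comb(count, 2) for count in rows.values())
    sum_cols = sum(comb(count, 2) for count in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # chance agreement
    maximum = 0.5 * (sum_rows + sum_cols)        # best possible agreement
    if maximum == expected:
        return 1.0  # both partitions are trivial
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    reference = [0, 0, 1, 2]
    clustering = [0, 0, 1, 1]
    print(adjusted_rand_index(reference, clustering))
```

    The index is 1 for identical partitions (up to label renaming) and close to 0 for random agreement, which is why it suits comparisons across preprocessing methods and clustering algorithms.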

  14. CLUSTER ANALYSIS OF NATURAL DISASTER LOSSES IN POLISH AGRICULTURE

    Directory of Open Access Journals (Sweden)

    Grzegorz STRUPCZEWSKI

    2015-04-01

    Agricultural production risk is of a special nature due to the great number of hazards, the relative weakness of production entities on the market and high ambiguity, which is greater than in industrial production. Natural disasters occur very frequently and, combined with the low percentage of insured farmers, cause damage on a scale that forces the state to organise current financial aid (for instance in the form of preferential natural disaster loans). This aid is usually not sufficient. On the other hand, regional diversity of the risk level does not positively affect the development of insurance. From the perspective of insurance companies and policymakers it becomes highly important to investigate the spatial structure of losses in agriculture caused by natural disasters. The purpose of the research is to classify the 16 Polish voivodeships into clusters in order to show differences between them according to the level of damage to agricultural farms caused by natural disasters. On the basis of the cluster analysis it was demonstrated that 11 voivodeships form quite a homogeneous group in terms of the size of damage in agriculture (the value of damage to crops and the acreage of destroyed crops are the two most important factors determining cluster membership); however, the profile of losses occurring in the other five voivodeships is highly individual and requires separate handling in the actuarial sense. It was also shown that a high absolute value of agricultural losses in a given voivodeship does not necessarily mean high vulnerability of its agricultural farms to natural risks.

  15. An Analysis of Social Class Classification Based on Linguistic Variables

    Institute of Scientific and Technical Information of China (English)

    QU Xia-sha

    2016-01-01

    Since language is an influential tool in social interaction, the relationship between speech and social factors such as social class, gender and even age is worth studying. People employ different linguistic variables to signal their social class, status and identity in social interaction. Linguistic variation thus involves vocabulary, sounds, grammatical constructions, dialects and so on. As a result, the classification of social class draws people's attention. Linguistic variables in speech interactions indicate the social relationships between people. This paper attempts to illustrate three main linguistic variables that influence social class, and suggests that further sociolinguistic studies should give them more attention.

  16. Maltreatment and Mental Health Outcomes among Ultra-Poor Children in Burkina Faso: A Latent Class Analysis

    Science.gov (United States)

    Ismayilova, Leyla; Gaveras, Eleni; Blum, Austin; Tô-Camier, Alexice; Nanema, Rachel

    2016-01-01

    Objectives: Research about the mental health of children in Francophone West Africa is scarce. This paper examines the relationships between adverse childhood experiences, including exposure to violence and exploitation, and mental health outcomes among children living in ultra-poverty in rural Burkina Faso. Methods: This paper utilizes baseline data collected from 360 children ages 10–15 and 360 of their mothers recruited from twelve impoverished villages in the Nord Region of Burkina, located near the Sahel Desert and affected by extreme food insecurity. We used a Latent Class Analysis to identify underlying patterns of maltreatment. Further, the relationships between latent classes and mental health outcomes were tested using mixed-effects regression models adjusted for clustering within villages. Results: About 15% of the children in the study scored above the clinical cut-off for depression, 17.8% for posttraumatic stress disorder (PTSD), and 6.4% for low self-esteem. The study identified five distinct sub-groups (or classes) of children based on their exposure to adverse childhood experiences. Children with the highest exposure to violence at home, at work and in the community (Abused and Exploited class) and children not attending school and working for other households, often away from their families (External Laborer class), demonstrated the highest symptoms of depression and trauma. Despite living in adverse conditions and working to assist families, the study also identified a class of children who were not exposed to any violence at home or at work (Healthy and Non-abused class). Children in this class demonstrated significantly higher self-esteem (b = 0.92, SE = 0.45, pfamily-level poverty and violence in the family. PMID:27764155

  17. Analysis of College Classes Based on U-CLASS System Using Personal Mobile Nodes

    OpenAIRE

    Chonggun Kim; Jeongmi Kim; Hohwan Park; Ilkyu Ha

    2015-01-01

    The increase in mobile communications has led to advanced educational methods and technologies through accepting new technologies. A lot of studies have tried to overcome temporal and spatial limits on using personal mobile devices and have tried to increase learning effectiveness using various efficient educational methods. In this paper, U-CLASS, an interactive learning management system that provides interactive communications between a professor and students, is proposed and implemented. ...

  18. Ranking and clustering of search results: Analysis of Similarity graph

    OpenAIRE

    Shevchuk, Ksenia Alexander

    2008-01-01

    Evaluate the clustering of the similarity matrix and confirm that it is high. Compare the results of the eigenvector ranking with those of the Link Popularity ranking, and confirm that the correlation between them is larger for the highly clustered graph than for the weakly clustered graph.
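
    To make the two rankings concrete, here is a small sketch under stated assumptions: "eigenvector ranking" is taken to mean ordering nodes by the principal eigenvector of the similarity graph's adjacency matrix, and "Link Popularity" is taken to mean ordering by the number (or total weight) of links. The abstract does not define either term, so both readings, along with all function names, are hypothetical.

```python
def link_popularity(adjacency):
    """Score each node by the total weight of its links (its degree)."""
    return [sum(row) for row in adjacency]


def eigenvector_scores(adjacency, iterations=200):
    """Approximate the principal eigenvector by power iteration.

    The update uses A + I instead of A: the shift keeps the same
    eigenvectors but avoids oscillation on bipartite graphs.
    """
    n = len(adjacency)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        total = sum(updated)
        scores = [value / total for value in updated]
    return scores


def ranking(scores):
    """Node indices ordered from highest to lowest score (ties by index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))


if __name__ == "__main__":
    # Hypothetical similarity graph: node 0 links to all others, 1 and 2 also link.
    graph = [
        [0, 1, 1, 1],
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
    ]
    print(ranking(link_popularity(graph)))
    print(ranking(eigenvector_scores(graph)))
```

    Comparing the two orderings (for example with a rank correlation coefficient) on graphs of different clustering levels would be the natural next step for the comparison the record describes.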

  19. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    By constructing a batch of differentiable entropy functions to uniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called the maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
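
    One common way to write a maximum-entropy relaxation of hard C-means gives each point a Gibbs (softmax) membership proportional to exp(-d²/T), where d is the distance to a center and T is a temperature; as T approaches zero the memberships approach the hard nearest-center assignment. The sketch below uses that assumed formulation purely for illustration. The abstract does not state the paper's update equations, and the names and the temperature parameter are hypothetical.

```python
from math import exp


def soft_memberships(point, centers, temperature):
    """Softmax memberships of one point over all centers."""
    squared = [
        sum((x - c) ** 2 for x, c in zip(point, center)) for center in centers
    ]
    # Subtract the minimum before exponentiating for numerical stability.
    smallest = min(squared)
    weights = [exp(-(d - smallest) / temperature) for d in squared]
    total = sum(weights)
    return [w / total for w in weights]


def update_centers(points, centers, temperature):
    """One soft C-means style step: membership-weighted means."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in centers]
    mass = [0.0] * len(centers)
    for point in points:
        memberships = soft_memberships(point, centers, temperature)
        for k, u in enumerate(memberships):
            mass[k] += u
            for d in range(dim):
                sums[k][d] += u * point[d]
    # Keep a center unchanged if it received no weight at all.
    return [
        [s / mass[k] for s in sums[k]] if mass[k] > 0 else list(centers[k])
        for k in range(len(centers))
    ]


if __name__ == "__main__":
    data = [[0.0], [0.2], [4.0], [4.2]]
    centers = [[1.0], [3.0]]
    for _ in range(20):
        centers = update_centers(data, centers, temperature=0.5)
    print(centers)
```

    Because the memberships are smooth functions of the centers, the objective is differentiable, which is the property the abstract relies on for its convergence analysis.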

  20. Differences in the expressed HLA class I alleles effect the differential clustering of HIV type 1-specific T cell responses in infected Chinese and Caucasians

    Institute of Scientific and Technical Information of China (English)

    Yu,XG; Addo,MM; Perkins,BA; Wej,FL; Rathod,A; Geer,SC; Parta,M; Cohen,D; Stone,DR; Russell,CJ; Tanzi,G; Mei,S; Wureel,AG; Frahm,N; Lichterfeld,M; Heath,L; Mullins,JI; Marincola,F; Goulder,PJR; Brander,C; Allen,T; Cao,YZ; Walker,BD; Altfeld,M

    2005-01-01

    China is a region of the world with a rapidly spreading HIV-1 epidemic. Studies providing insights into HIV-1 pathogenesis in infected Chinese are urgently needed to support the design and testing of an effective HIV-1 vaccine for this population. HIV-1-specific T cell responses were characterized in 32 HIV-1-infected individuals of Chinese origin and compared to 34 infected Caucasians using 410 overlapping peptides spanning the entire HIV-1 clade B consensus sequence in an IFN-gamma ELISpot assay. All HIV-1 proteins were targeted with similar frequency in both populations and all study subjects recognized at least one overlapping peptide. HIV-1-specific T cell responses clustered in seven different regions of the HIV-1 genome in the Chinese cohort and in nine different regions in the Caucasian cohort. The dominant HLA class I alleles expressed in the two populations differed significantly, and differences in epitope clustering pattern were shown to be influenced by differences in class I alleles that restrict immunodominant epitopes. These studies demonstrate that the clustering of HIV-1-specific T cell responses is influenced by the genetic HLA class I background in the study populations. The design and testing of candidate vaccines to fight the rapidly growing HIV-1 epidemic must therefore take the HLA genetics of the population into account, as specific regions of the virus can be expected to be differentially targeted in ethnically diverse populations.

  1. Optimum Metallic-Bond Scheme: A Quantitative Analysis of Mass Spectra of Sodium Clusters

    Institute of Scientific and Technical Information of China (English)

    苏长荣; 李家明

    2001-01-01

    Based on the results of the optimum metallic-bond scheme for sodium clusters, we present a quantitative analysis of the detailed features of the mass spectra of sodium clusters. We find that, in the generation of sodium clusters with various abundances, the quasi-steady processes through adding or losing a sodium atom dominate. The quasi-steady processes through adding or losing a sodium dimer are also important to understand the detailed features of mass spectra for small clusters.

  2. Substance use predictors of victimization profiles among homeless youth: a latent class analysis.

    Science.gov (United States)

    Bender, Kimberly; Thompson, Sanna; Ferguson, Kristin; Langenderfer, Lisa

    2014-02-01

    Although a substantial body of literature demonstrates high prevalence of street victimization among homeless youth, few studies have investigated the existence of victimization classes that differ on the type and frequency of victimization experienced. Nor do we know how substance use patterns relate to victimization classes. Using latent class analysis (LCA), we examined the existence of victimization classes of homeless youth and investigated substance use predictors of class membership utilizing a large purposive sample (N=601) recruited from homeless youth-serving host agencies in three disparate regions of the U.S. Results of the LCA suggest the presence of three distinct victimization profiles - youth fit into a low-victimization class, a witness class, or a high-victimization class. These three victimization classes demonstrated differences in their substance use, including rates of substance abuse/dependence on alcohol and/or drugs. The presence of distinct victimization profiles suggests the need for screening and referral for differential services. PMID:24439621

  3. Analysis on a General Class of Holographic Dark Energy Models

    CERN Document Server

    Huang, Zhuo-Peng

    2012-01-01

    We present a detailed analysis of a general class of holographic dark energy models characterized by the length scale $L=\frac1{a^n(t)}\int_0^t dt' a^m(t')$. We show that $n \geq 0$ is required by the recent cosmic accelerated expansion of the universe. In the early universe dominated by the constituent with constant equation of state $w_m$, we have $w_{de}\simeq -1-\frac{2n}{3}$ for $n \geq 0$ and $m m \geq 0$. The models with $n > m \geq 0$ become single-parameter models like the $\Lambda$CDM model due to the analytic feature $\Omega_{de}\simeq \frac{d^2}4(2m+3w_m+3)^2a^{2(n-m)}$ at the radiation- and matter-dominated epochs. The cases $n=m\geq 0$, in contrast, should be abandoned, as the dark energy cannot dominate the universe forever and there might be too large a fraction of dark energy in the early universe, and the cases $m> n \geq 0$ are forbidden by the self-consistency requirement $\Omega_{de}\ll1 $ in the early universe. Thus a detailed study on the single-parameter models corresponding to cases $n >m \geq 0$ is carried o...

  4. A New Class of Macrocyclic Chiral Selectors for Stereochemical Analysis

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-11

    This report summarizes the work accomplished in the authors' laboratories over the previous three years. During the funding period they have had 23 monographs published or in press, 1 book chapter and 1 patent issued, and have delivered 28 invited seminars or plenary lectures on DOE-sponsored research. This report covers the work that has been published (or accepted). The most notable aspect of this work involves the successful development and understanding of a new class of fused macrocyclic compounds as pseudophases and selectors in high performance separations (including high performance liquid chromatography, HPLC; capillary electrophoresis, CE; and thin layer chromatography, TLC). They have considerably extended their chiral biomarker work from amber to crude oil and coal. In the process, they developed several novel separation approaches. They finished their work on the new GSC-PLOT column, which is now being used by researchers worldwide for the analysis of gases, light hydrocarbons and halocarbons. Finally, they completed basic studies on immobilizing a cyclodextrin/oligosiloxane hybrid on the wall of fused silica, as well as a basic study on the separation behavior of buckminsterfullerene and higher fullerenes.

  5. A COMPARISON BETWEEN SINGLE LINKAGE AND COMPLETE LINKAGE IN AGGLOMERATIVE HIERARCHICAL CLUSTER ANALYSIS FOR IDENTIFYING TOURISTS SEGMENTS

    OpenAIRE

    Noor Rashidah Rashid

    2012-01-01

    Cluster Analysis is a multivariate method in statistics. Agglomerative Hierarchical Cluster Analysis is one of the approaches in Cluster Analysis. Two linkage methods in Agglomerative Hierarchical Cluster Analysis are Single Linkage and Complete Linkage. The purpose of this study is to compare Single Linkage and Complete Linkage in Agglomerative Hierarchical Cluster Analysis. The comparison of performance between these linkage methods was shown by using Kruskal-Wallis tes...
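
    The difference between the two linkage rules can be seen on a toy example. The sketch below is a naive agglomerative procedure written for illustration only (it is not the study's code, and the data are made up): single linkage measures the distance between two clusters by their closest pair of points, complete linkage by their farthest pair.

```python
def agglomerate(points, n_clusters, linkage="single"):
    """Naive agglomerative clustering; returns lists of point indices."""
    if linkage not in ("single", "complete"):
        raise ValueError("linkage must be 'single' or 'complete'")

    def distance(i, j):
        # Euclidean distance between two points.
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5

    def cluster_distance(first, second):
        pairwise = [distance(i, j) for i in first for j in second]
        return min(pairwise) if linkage == "single" else max(pairwise)

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters under the chosen linkage.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]


if __name__ == "__main__":
    # Hypothetical 1-D data with steadily growing gaps between neighbours.
    data = [[0], [10], [21], [33], [46]]
    print(agglomerate(data, 2, "single"))
    print(agglomerate(data, 2, "complete"))
```

    On data like this, single linkage tends to chain neighbouring points into one long cluster, while complete linkage favours compact groups, so the two rules can produce different segments from the same observations.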

  6. A Latent Class Analysis of Heterosexual Young Men's Masculinities.

    Science.gov (United States)

    Casey, Erin A; Masters, N Tatiana; Beadnell, Blair; Wells, Elizabeth A; Morrison, Diane M; Hoppe, Marilyn J

    2016-07-01

    Parallel bodies of research have described the diverse and complex ways that men understand and construct their masculine identities (often termed "masculinities") and, separately, how adherence to traditional notions of masculinity places men at risk for negative sexual and health outcomes. The goal of this analysis was to bring together these two streams of inquiry. Using data from a national, online sample of 555 heterosexually active young men, we employed latent class analysis (LCA) to detect patterns of masculine identities based on men's endorsement of behavioral and attitudinal indicators of "dominant" masculinity, including sexual attitudes and behaviors. LCA identified four conceptually distinct masculine identity profiles. Two groups, termed the Normative and Normative/Male Activities groups, respectively, constituted 88 % of the sample and were characterized by low levels of adherence to attitudes, sexual scripts, and behaviors consistent with "dominant" masculinity, but differed in their levels of engagement in male-oriented activities (e.g., sports teams). Only eight percent of the sample comprised a masculinity profile consistent with "traditional" ideas about masculinity; this group was labeled Misogynistic because of high levels of sexual assault and violence toward female partners. The remaining four percent constituted a Sex-Focused group, characterized by high numbers of sexual partners, but relatively low endorsement of other indicators of traditional masculinity. Follow-up analyses showed a small number of differences across groups on sexual and substance use health indicators. Findings have implications for sexual and behavioral health interventions and suggest that very few young men embody or endorse rigidly traditional forms of masculinity. PMID:26496914

  8. Definition of a family of tissue-protective cytokines using functional cluster analysis: a proof-of-concept study

    Directory of Open Access Journals (Sweden)

    Manuela eMengozzi

    2014-03-01

    The discovery of the tissue-protective activities of erythropoietin (EPO) has underlined the importance of some cytokines in tissue protection, repair and remodeling. As such activities have been reported for other cytokines, we asked whether we could define a class of tissue-protective cytokines. We therefore explored a novel approach based on functional clustering. In this pilot study, we started by analyzing a small number of cytokines (30). We functionally classified the 30 cytokines according to their interactions by using the bioinformatics tool STRING (Search Tool for the Retrieval of Interacting Genes), followed by hierarchical cluster analysis. The results of this functional clustering were different from those obtained by clustering cytokines simply according to their sequence. We previously reported that the protective activity of EPO in a model of cerebral ischemia was paralleled by an upregulation of synaptic plasticity genes, particularly early growth response 2 (EGR2). To assess the predictivity of functional clustering, we tested some of the cytokines clustering close to EPO (interleukin-11, IL-11; kit ligand, KITLG; leukemia inhibitory factor, LIF; thrombopoietin, THPO) in an in vitro model of human neuronal cells for their ability to induce EGR2. Two of these, LIF and IL-11, induced EGR2 expression. Although these data would need to be extended to a larger number of cytokines and the biological validation should be done using more robust in vivo models, rather than just one cell line, this study shows the feasibility of this approach. This type of functional cluster analysis could be extended to other fields of cytokine research and help design biological experiments.

  9. PIXE cluster analysis of ancient ceramics from North Syria

    Energy Technology Data Exchange (ETDEWEB)

    Kieft, I.E.; Jamieson, D.N.; Rout, B.; Szymanski, R.; Jamieson, A.S.

    2002-05-01

    Tell Ahmar is a place situated on the east bank of the Euphrates river, near the Turkish border. The site was well known as a major trade centre in the Iron Age. From the many potsherds excavated from the site, it is necessary to distinguish pottery imported from outside from that made locally. Therefore a sample of the Iron Age potsherds that were excavated from this site was analyzed with particle induced X-ray emission to identify the characteristic composition of the different sherds. Potsherds from four other places near Tell Ahmar were also analyzed. The samples were irradiated with a scanned 3 MeV proton beam in the Melbourne nuclear microprobe. The composition of all sherds measured by this method was similar. However, cluster analysis of the 12 most abundant elements, ranging from Mn to Ba, revealed that the samples known to be from Tell Ahmar could be distinguished from those known to be from elsewhere.

  10. Higgs pair production: choosing benchmarks with cluster analysis

    Science.gov (United States)

    Carvalho, Alexandra; Dall'Osso, Martino; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia

    2016-04-01

    New physics theories often depend on a large number of free parameters. The phenomenology they predict for fundamental physics processes is in some cases drastically affected by the precise value of those free parameters, while in other cases it is left essentially invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics predicted by different models; a clustering algorithm using that metric may allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmarks are then guaranteed to be sensitive to a large area of the parameter space.

  11. New analysis in the field of open cluster Collinder 223

    OpenAIRE

    Tadross, A. L.

    2003-01-01

    The present study of the open cluster Collinder 223 (Cr 223) relies mainly on the photoelectric data of Claria & Lapasset (1991; hereafter CL91). The CL91 data have been used together with the AAO-DSS image of the cluster in order to re-investigate and improve the main parameters of Cr 223. Star counts have been carried out to determine the stellar density, the cluster's center and the cluster's diameter. In addition, the luminosity function, mass function, and the total mass of the cluster ...

  12. AUTOMATED TEXT CLUSTERING OF NEWSPAPER AND SCIENTIFIC TEXTS IN BRAZILIAN PORTUGUESE: ANALYSIS AND COMPARISON OF METHODS

    Directory of Open Access Journals (Sweden)

    Alexandre Ribeiro Afonso

    2014-10-01

    This article reports the findings of an empirical study of Automated Text Clustering applied to scientific articles and newspaper texts in Brazilian Portuguese. The objective was to find the most effective computational method for clustering the input texts into their original groups. The study covered four experiments, each with four procedures: 1. Corpus Selections (a set of texts is selected for clustering); 2. Word Class Selections (nouns, verbs and adjectives are chosen from each text by using specific algorithms); 3. Filtering Algorithms (a set of terms is selected from the results of the previous stage, a semantic weight is assigned to each term, and an index is generated for each text); 4. Clustering Algorithms (the clustering algorithms Simple K-Means, sIB and EM are applied to the indexes). After those procedures, statistics on clustering correctness and clustering time were collected. The sIB clustering algorithm is the best choice for both the scientific and the newspaper corpus, with the caveat that it requires the number of clusters as input before running (for the newspaper corpus, 68.9% correctness in 1 minute; for the scientific corpus, 77.8% correctness in 1 minute). The EM clustering algorithm additionally guesses the number of clusters without user intervention, but its best case is less than 53% correctness. In the experiments carried out, the results of human text classification and automated clustering remain far apart; it was also observed that clustering correctness varies with the number of input texts and their topics.

  13. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre- and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.

  14. Clustered data analysis under miscategorized ordinal outcomes and missing covariates.

    Science.gov (United States)

    Roy, Surupa; Rana, Subrata; Das, Kalyan

    2016-08-15

    The primary objective of this article is to examine the analysis of a clustered ordinal model in which complete information on one or more covariates is unavailable. In addition, we focus on the analysis of miscategorized data, which occur in many situations because outcomes are often classified into a category that does not truly reflect their actual state. A general model structure is assumed to accommodate the information that is obtained via surrogate variables. The theoretical motivation arose while working with orthodontic data to investigate the effects of age, sex and food habit on the extent of plaque deposit. The model we propose is quite flexible and is capable of handling additional noise such as miscategorization and missingness, which occur frequently in such data. A new two-step approach is proposed to estimate the parameters of the model. A rigorous simulation study has also been carried out to justify the validity of the model. Copyright © 2015 John Wiley & Sons, Ltd. PMID:26215983

  15. Cluster analysis application in research on pork quality determinants

    Science.gov (United States)

    Przybylski, W.; Wasiewicz, P.; Zieliński, P.; Gromadzka-Ostrowska, J.; Olczak, E.; Jaworska, D.; Niemyjski, S.; Santé-Lhoutellier, V.

    2010-09-01

    In this paper data mining methods were applied to investigate features determining high quality pork meat. The aim of the study was to analyse how pork meat quality is conditioned by HDL and LDL cholesterol concentration, plasma leptin, triglycerides, plasma glucose and serum. The research was carried out on 54 pigs originating from the crossbreeding of Naima sows with boars of the P76-PenArLan hybrid line. Meat quality parameters were evaluated in samples derived from the Longissimus (LD) muscle taken behind the last rib on the basis of the pH value, meat colour, drip loss, the RTN, intramuscular fat and glycolytic potential. The results of this study were processed in the R environment and show that cluster and regression analysis can be useful tools for in-depth analysis of the determinants of the quality of pig meat in homogeneous populations of pigs. However, the question of determinants of the level of glycogen and fat in meat requires further research.

  16. Analysis of gene expression data from non-small cell lung carcinoma cell lines reveals distinct sub-classes from those identified at the phenotype level.

    Directory of Open Access Journals (Sweden)

    Andrew R Dalby

    Microarray data from cell lines of Non-Small Cell Lung Carcinoma (NSCLC) can be used to look for differences in gene expression between the cell lines derived from different tumour samples, and to investigate whether these differences can be used to cluster the cell lines into distinct groups. Dividing the cell lines into classes can help to improve diagnosis and the development of screens for new drug candidates. The microarray data are first subjected to quality control analysis and then normalised using three alternative methods to reduce the chance that differences are artefacts of the normalisation process. The final clustering into sub-classes was carried out in a conservative manner such that sub-classes were consistent across all three normalisation methods. If there is structure in the cell line population, it was expected to agree with histological classifications, but this was not found to be the case. To check the biological consistency of the sub-classes, the set of most strongly differentially expressed genes was identified for each pair of clusters to check whether the genes that most strongly define sub-classes have biological functions consistent with NSCLC.

  17. Classification of persons attempting suicide. A review of cluster analysis research

    Directory of Open Access Journals (Sweden)

    Wołodźko, Tymoteusz

    2014-08-01

    Aim: Review of conclusions from cluster analysis research on suicide risk factors published after the year 1993. Methods: Search and analysis of cluster analysis research papers on suicidal behaviour. Results: The following groups were distinguished: (1) persons with comorbid mental disorders or with severe symptoms, (2) persons without mental disorders or with mild symptoms, (3) persons with personality disorders and externalizing psychopathology, (4) socially withdrawn persons with a tendency to avoid social contacts, (5) depressive persons. Conclusions: Analysis of studies on characteristics of suicide attempters, with the application of cluster analysis, has indicated the possibility of differentiating several groups of persons with significantly increased risk of suicide attempt. The reviewed cluster analysis research had multiple methodological limitations. Studies employing cluster analysis on large, representative and homogeneous populations are needed.

  18. Profiles of exercise motivation, physical activity, exercise habit, and academic performance in Malaysian adolescents: A cluster analysis

    OpenAIRE

    Hairul Anuar Hashim; Freddy Golok; Rosmatunisah Ali

    2011-01-01

    Objectives: This study examined Malaysian adolescents’ profiles of exercise motivation, exercise habit strength, academic performance, and levels of physical activity (PA) using cluster analysis. Methods: The sample (n = 300) consisted of 65.6% males and 34.4% females with a mean age of 13.40 ± 0.49. Statistical analysis was performed using cluster analysis. Results: Cluster analysis revealed three distinct cluster groups. Cluster 1 is characterized by a moderate level of PA, relatively high in...

  19. Cluster analysis of intermediate deep events in the southeastern Aegean

    Science.gov (United States)

    Ruscic, Marija; Becker, Dirk; Brüstle, Andrea; Meier, Thomas

    2015-04-01

    The Hellenic subduction zone (HSZ) is the seismically most active region in Europe, where the oceanic African lithosphere is subducting beneath the continental Aegean plate. Although there are numerous studies of seismicity in the HSZ, very few focus on the eastern HSZ and the Wadati-Benioff zone of the subducting slab in that part of the HSZ. In order to gain a better understanding of the geodynamic processes in the region, a dense local seismic network is required. From September 2005 to March 2007, the temporary seismic network EGELADOS was deployed covering the entire HSZ. It consisted of 56 onshore and 23 offshore broadband stations, with the addition of 19 stations from GEOFON, NOA and MedNet to complete the network. Here, we focus on a cluster of intermediate deep seismicity recorded by the EGELADOS network within the subducting African slab in the region of the Nysiros volcano. The cluster consists of 159 events at 80 to 190 km depth with magnitudes between 0.2 and 4.1 that were located using the nonlinear location tool NonLinLoc. A double-difference earthquake relocation using the HypoDD software is performed with both manual readings of onset times and differential traveltimes obtained by separate cross correlation of P- and S-waveforms. Single event locations are compared to relative relocations. The event hypocenters fall into a thin zone close to the top of the slab, defining its geometry with an accuracy of a few kilometers. At intermediate depth the slab dips towards the NW at an angle of about 30°, which is steeper than in the western part of the HSZ. The edge of the slab is clearly defined by an abrupt disappearance of intermediate-depth seismicity towards the NE. It is found approximately beneath the Turkish coastline. Furthermore, results of a cluster analysis based on the cross correlation of three-component waveforms are shown as a function of frequency, and the spatio-temporal migration of the seismic activity is analysed.

  20. AVES: A Computer Cluster System approach for INTEGRAL Scientific Analysis

    Science.gov (United States)

    Federici, M.; Martino, B. L.; Natalucci, L.; Umbertini, P.

    The AVES computing system, based on a "Cluster" architecture, is a fully integrated, low-cost computing facility dedicated to the archiving and analysis of the INTEGRAL data. AVES is a modular system that uses the software resource manager (SLURM) and allows almost unlimited expandability (65,536 nodes and hundreds of thousands of processors); it is currently composed of 30 personal computers with quad-core CPUs, able to reach a computing power of 300 gigaflops (300 × 10^9 floating-point operations per second), with 120 GB of RAM and 7.5 terabytes (TB) of storage memory in UFS configuration plus 6 TB for the users' area. AVES was designed and built to solve the growing problems raised by the analysis of the large amount of data accumulated by the INTEGRAL mission (currently about 9 TB), which is due to increase every year. The analysis software used is the OSA package, distributed by the ISDC in Geneva. This is a very complex package consisting of dozens of programs that cannot be converted to parallel computing. To overcome this limitation we developed a series of programs to distribute the analysis workload over the various nodes, making AVES automatically divide the analysis into N jobs sent to N cores. This solution thus produces a result similar to that obtained by a parallel computing configuration. In support of this we have developed tools that allow flexible use of the scientific software and quality control of on-line data storing. The AVES software package consists of about 50 specific programs. As a result, the overall computing time, compared to that provided by a single-processor personal computer, has been improved by up to a factor of 70.
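    Illustrative sketch (not taken from the AVES software): the abstract describes dividing a serial analysis into N jobs sent to N cores. One simple way to do that, assuming the work can be expressed as a list of independent items, is to deal the items round-robin into N chunks and hand each chunk to a worker. The names `split_jobs`, `run_serial_analysis` and `run_all` are hypothetical, and the worker here is only a placeholder for launching the unmodified serial program.

```python
from concurrent.futures import ThreadPoolExecutor


def split_jobs(items, n_jobs):
    """Deal items round-robin into n_jobs lists of nearly equal size."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return [items[i::n_jobs] for i in range(n_jobs)]


def run_serial_analysis(chunk):
    """Placeholder worker (assumption): stands in for running the serial
    analysis program on one chunk. Here it only reports the chunk size."""
    return len(chunk)


def run_all(items, n_jobs):
    """Split the workload and process the chunks concurrently."""
    chunks = split_jobs(items, n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map preserves the order of the chunks in its results.
        return list(pool.map(run_serial_analysis, chunks))
```

    In a real deployment each chunk would more likely become a separate batch job submitted to the resource manager; the thread pool above only mimics that dispatch step locally.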

  1. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Background: Genes are not randomly distributed on a chromosome, as they were once thought to be, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
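    Illustrative sketch (not the authors' procedure): the claim that the number of clusters is higher than expected by chance can be checked with a permutation test. The sketch below assumes that genes along a chromosome are given as an ordered list of 0/1 flags (1 = tissue-specific) and that a "cluster" is a run of at least two adjacent flagged genes; both the definition and the function names are assumptions made for illustration.

```python
import random


def count_runs(flags, min_len=2):
    """Count runs of consecutive truthy flags that are at least min_len long."""
    runs, length = 0, 0
    for flag in flags:
        if flag:
            length += 1
        else:
            runs += length >= min_len
            length = 0
    return runs + (length >= min_len)


def permutation_p_value(flags, n_perm=2000, seed=0):
    """Return (observed run count, one-sided permutation p-value).

    The null model shuffles the flags along the chromosome, which keeps the
    number of flagged genes fixed but destroys any positional structure.
    """
    rng = random.Random(seed)
    observed = count_runs(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += count_runs(shuffled) >= observed
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

    A small p-value indicates more runs than random placement would typically produce; removing tandem duplicates before flagging, as the abstract mentions, would be a separate preprocessing step.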

  2. A generalized analysis of hydrophobic and loop clusters within globular protein sequences

    OpenAIRE

    Mornon Jean-Paul; Delettré Jean; Le Tuan Khanh; Eudes Richard; Callebaut Isabelle

    2007-01-01

    Background: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need of user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These spec...

  3. A functional clustering algorithm for the analysis of neural relationships

    CERN Document Server

    Feldt, S; Hetrick, V L; Berke, J D; Zochowski, M

    2008-01-01

    We formulate a novel technique for the detection of functional clusters in neural data. In contrast to prior network clustering algorithms, our procedure progressively combines spike trains and derives the optimal clustering cutoff in a simple and intuitive manner. To demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. We observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  4. A substructure analysis of the A3558 cluster complex

    OpenAIRE

    Bardelli, S.; Pisani, A; Ramella, M.; Zucca, E.; Zamorani, G.

    1998-01-01

    The "algorithm driven by the density estimate for the identification of clusters" (DEDICA, Pisani 1993, 1996) is applied to the A3558 cluster complex in order to find substructures. This complex, located at the center of the Shapley Concentration supercluster, is a chain formed by the ACO clusters A3556, A3558 and A3562 and the two poor clusters SC 1327-312 and SC 1329-313. We find a large number of clumps, indicating that strong dynamical processes are active. In particular, it is necessary ...

  5. Stochastic analysis of the extra clustering model for animal grouping.

    Science.gov (United States)

    Drmota, Michael; Fuchs, Michael; Lee, Yi-Wen

    2016-07-01

    We consider the extra clustering model which was introduced by Durand et al. (J Theor Biol 249(2):262-270, 2007) in order to describe the grouping of social animals and to test whether genetic relatedness is the main driving force behind the group formation process. Durand and François (J Math Biol 60(3):451-468, 2010) provided a first stochastic analysis of this model by deriving (amongst other things) asymptotic expansions for the mean value of the number of groups. In this paper, we will give a much finer analysis of the number of groups. More precisely, we will derive asymptotic expansions for all higher moments and give a complete characterization of the possible limit laws. In the most interesting case (neutral model), we will prove a central limit theorem with a surprising normalization. In the remaining cases, the limit law will be either a mixture of a discrete and continuous law or a discrete law. Our results show that, except in degenerate cases, strong concentration around the mean value takes place only for the neutral model, whereas in the remaining cases there is also mass concentration away from the mean. PMID:26520857

  6. Study on Cluster Analysis Used with Laser-Induced Breakdown Spectroscopy

    Science.gov (United States)

    He, Li'ao; Wang, Qianqian; Zhao, Yu; Liu, Li; Peng, Zhong

    2016-06-01

    Supervised learning methods (e.g., PLS-DA, SVM) have been widely used with laser-induced breakdown spectroscopy (LIBS) to classify materials; however, they may yield a low correct classification rate if a test sample type is not included in the training dataset. In this paper, unsupervised cluster analysis methods (hierarchical clustering analysis, K-means clustering analysis, and the iterative self-organizing data analysis technique, ISODATA) are investigated for plastics classification based on the line intensities of LIBS emission. The results of hierarchical clustering analysis using four different similarity measuring methods (single linkage, complete linkage, unweighted pair-group average, and weighted pair-group average) are compared. In K-means clustering analysis, four methods of choosing the initial centers are applied in our case and their results are compared. The classification results of hierarchical clustering analysis, K-means clustering analysis, and ISODATA are analyzed. The experimental results demonstrate that cluster analysis methods can be applied to plastics discrimination with LIBS. Supported by Beijing Natural Science Foundation of China (No. 4132063)
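    Illustrative sketch (not the authors' code): the four linkage rules named in the abstract differ only in how the distance from a merged cluster to every other cluster is updated. The naive agglomerative routine below, written with the standard library only, makes that difference explicit. The input is assumed to be a list of numeric feature vectors (for LIBS these would be line intensities; the function names and method labels are hypothetical).

```python
from math import dist


def merged_distance(method, d_ki, d_kj, n_i, n_j):
    """Distance from cluster k to the merger of clusters i and j."""
    if method == "single":    # nearest neighbour
        return min(d_ki, d_kj)
    if method == "complete":  # farthest neighbour
        return max(d_ki, d_kj)
    if method == "upgma":     # unweighted pair-group average (size-weighted mean)
        return (n_i * d_ki + n_j * d_kj) / (n_i + n_j)
    if method == "wpgma":     # weighted pair-group average (plain mean)
        return (d_ki + d_kj) / 2
    raise ValueError(f"unknown method: {method}")


def agglomerate(points, n_clusters, method="upgma"):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns the clusters as sorted lists of point indices.
    """
    clusters = {i: [i] for i in range(len(points))}
    d = {(i, j): dist(points[i], points[j])
         for i in clusters for j in clusters if i < j}
    while len(clusters) > n_clusters:
        i, j = min(d, key=d.get)  # closest pair; keys always satisfy i < j
        n_i, n_j = len(clusters[i]), len(clusters[j])
        for k in clusters:
            if k in (i, j):
                continue
            key_i = (min(k, i), max(k, i))
            key_j = (min(k, j), max(k, j))
            d[key_i] = merged_distance(method, d[key_i], d[key_j], n_i, n_j)
            del d[key_j]
        del d[(i, j)]
        clusters[i] = clusters[i] + clusters.pop(j)  # merged cluster keeps label i
    return sorted(sorted(members) for members in clusters.values())
```

    On well-separated data all four rules give the same partition; they diverge when clusters are elongated or unequal in size, which is where a comparison like the one in the abstract becomes informative.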

  8. Putting Bourdieu to work for class analysis: reflections on some recent contributions.

    Science.gov (United States)

    Flemmen, Magne

    2013-06-01

    Recent developments in class analysis, particularly associated with so-called 'cultural class analysis', have seen the works of Pierre Bourdieu take centre stage. Apart from the general influence of 'habitus' and 'cultural capital', some scholars have tried to reconstruct class analysis with concepts drawn from Bourdieu. This involves a theoretical reorientation, away from the conventional concerns of class analysis with property and market relations, towards an emphasis on the multiple forms of capital. Despite the significant potential of these developments, such a reorientation dismisses or neglects the relations of power and domination founded in the economic institutions of capitalism as a crucial element of what class is. Through a critique of some recent attempts by British authors to develop a 'Bourdieusian' class theory, the paper reasserts the centrality of the relations of power and domination that used to be the domain of class analysis. The paper suggests some elements central to a reworked class analysis that benefits from the power of Bourdieu's ideas while retaining a perspective on the fundamentals of class relations in capitalism. PMID:23713562

  9. The association between school exclusion, delinquency and subtypes of cyber- and F2F-victimizations: identifying and predicting risk profiles and subtypes using latent class analysis.

    Science.gov (United States)

    Barboza, Gia Elise

    2015-01-01

    The purpose of this paper is to identify risk profiles of youth who are victimized by on- and offline harassment and to explore the consequences of victimization on school outcomes. Latent class analysis is used to explore the overlap and co-occurrence of different clusters of victims and to examine the relationship between class membership and school exclusion and delinquency. Participants were a random sample of youth between the ages of 12 and 18 selected to participate in the 2011 National Crime Victimization Survey: School Supplement. The latent class analysis resulted in four categories of victims: approximately 3.1% of students were highly victimized by both bullying and cyberbullying behaviors; 11.6% of youth were classified as being victims of relational bullying, verbal bullying and cyberbullying; a third class of students were victims of relational bullying, verbal bullying and physical bullying but were not cyberbullied (8%); the fourth and final class, characteristic of the majority of students (77.3%), comprised non-victims. The inclusion of covariates in the latent class model indicated that gender, grade and race were significant predictors of at least one of the four victim classes. School delinquency measures were included as distal outcomes to test for both overall and pairwise associations between classes. With one exception, the results were indicative of a significant relationship between school delinquency and the victim subtypes. Implications of these findings are discussed. PMID:25194718

  10. An Analysis of the Nature of Classroom Activities: A Comparative Study of an Immersion English Class and a Non-Immersion English Class in the Mainland of China

    Science.gov (United States)

    Liang, Xiaohua

    2011-01-01

    This study was designed to investigate the nature of activities in an immersion English class and a non-immersion English class in the mainland of China, and to find out the differences between these two types of class through data gained from observation and interviews. Spoken discourse analysis was used to analyze the data, where Engestrom's…

  11. The Norma Cluster (ACO 3627): I. A Dynamical Analysis of the Most Massive Cluster in the Great Attractor

    CERN Document Server

    Woudt, P A; Lucey, J; Fairall, A P; Moore, S A W

    2007-01-01

    A detailed dynamical analysis of the nearby rich Norma cluster (ACO 3627) is presented. From radial velocities of 296 cluster members, we find a mean velocity of 4871 +/- 54 km/s and a velocity dispersion of 925 km/s. The mean velocity of the E/S0 population (4979 +/- 85 km/s) is offset with respect to that of the S/Irr population (4812 +/- 70 km/s) by Δv = 164 km/s in the cluster rest frame. This offset increases towards the core of the cluster. The E/S0 population is free of any detectable substructure and appears relaxed. Its shape is clearly elongated with a position angle that is aligned along the dominant large-scale structures in this region, the so-called Norma wall. The central cD galaxy has a very large peculiar velocity of 561 km/s which is most probably related to an ongoing merger at the core of the cluster. The spiral/irregular galaxies reveal a large amount of substructure; two dynamically distinct subgroups within the overall spiral-population have been identified, located along the Nor...
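    Worked check (an editorial illustration, not part of the paper): the quoted numbers are mutually consistent under two common assumptions, namely that the uncertainty on the mean velocity is the standard error (dispersion divided by the square root of the number of members) and that a velocity difference is converted to the cluster rest frame by dividing by (1 + z) with z approximated as v/c. The paper may have used different estimators; the variable names below are hypothetical.

```python
from math import sqrt

C_KM_S = 299_792.458  # speed of light in km/s

n_members = 296       # cluster members with radial velocities
v_mean = 4871.0       # mean cluster velocity, km/s
dispersion = 925.0    # velocity dispersion, km/s

# Standard error of the mean velocity (assumed estimator).
std_err = dispersion / sqrt(n_members)

# Offset between the E/S0 and S/Irr mean velocities, moved to the
# cluster rest frame by dividing by (1 + z), with z ~ v_mean / c.
v_early, v_late = 4979.0, 4812.0
z_cluster = v_mean / C_KM_S
delta_v_rest = (v_early - v_late) / (1.0 + z_cluster)

print(f"standard error of the mean: {std_err:.1f} km/s")
print(f"rest-frame offset: {delta_v_rest:.1f} km/s")
```

    Rounded to whole km/s, these reproduce the quoted uncertainty of 54 km/s and the rest-frame offset of 164 km/s.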

  12. Leukaemia clusters in childhood: geographical analysis in Britain

    Energy Technology Data Exchange (ETDEWEB)

    Knox, E.G.

    1994-08-01

    Study objective - To validate previously demonstrated spatial clustering of childhood leukaemias by showing relative proximities of selected map features to cluster locations, compared with control locations. If clusters are real, then they are likely to be close to a determining hazard. Design - Cluster postcode loci and partially matched control postcodes were compared in terms of distances to railways, main roads, churches, surface water, woodland areas, and railside industrial installations. Further supporting comparisons between non-clustered cases and random postcode controls with those map features representable as single grid points were made. Setting - England, Wales, and Scotland 1966-83. Subjects - Grid referenced registrations of 9406 childhood leukaemias and non-Hodgkin's lymphomas, including 264 pairs (or more) separated by <150 m, and grid references of random postcodes in equal numbers. Main results - The 264 clusters showed relative proximities (or the inverse) to several map features, of which the most powerful was an association with railways. The non-railway associations seemed to be statistically indirect. Some railside industrial installations, identified from a railway atlas, also showed relative proximities to leukaemia clusters, as well as to non-clustered cases, but did not "explain" the railway effect. These installations, with seemingly independent geographical associations, included oil refineries, petrochemical plants, oil storage and oil distribution depots, power stations, and steelworks. Conclusions - The previously shown childhood leukaemia clusters are confirmed to be non-random through their systematic associations with certain map features when compared with the control locations. The common patterns of close association of clustered and non-clustered cases imply a common aetiological component arising from a common environmental hazard - namely the use of fossil fuels, especially petroleum. (UK)
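    Illustrative sketch (not the study's software): the two geometric operations the abstract relies on are finding pairs of cases less than 150 m apart and measuring the distance from a location to the nearest map feature. The sketch assumes planar grid references in metres, given as (easting, northing) tuples, and uses a brute-force pair search; the function names and sample coordinates are hypothetical.

```python
from itertools import combinations
from math import hypot


def close_pairs(cases, max_dist_m=150.0):
    """Return ID pairs of cases whose grid references are < max_dist_m apart.

    `cases` maps a case ID to an (easting, northing) tuple in metres.
    """
    return [
        (id_a, id_b)
        for (id_a, (xa, ya)), (id_b, (xb, yb)) in combinations(cases.items(), 2)
        if hypot(xa - xb, ya - yb) < max_dist_m
    ]


def nearest_feature_distance(location, features):
    """Distance in metres from a location to the closest feature point."""
    x, y = location
    return min(hypot(x - fx, y - fy) for fx, fy in features)
```

    Comparing the distribution of `nearest_feature_distance` for cluster locations with that for control postcodes is the kind of proximity comparison the abstract describes; for thousands of cases a spatial index would replace the quadratic pair search.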

  13. Multidimensional cluster stability analysis from a Brazilian Bradyrhizobium sp. RFLP/PCR data set

    Science.gov (United States)

    Milagre, S. T.; Maciel, C. D.; Shinoda, A. A.; Hungria, M.; Almeida, J. R. B.

    2009-05-01

    The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of the phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is carried out by introducing perturbations through sub-sampling techniques. The method showed efficacy in the grouping of the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, and sub-clusters within each detected cluster.

  14. Modified distance in average linkage based on M-estimator and MADn criteria in hierarchical cluster analysis

    Science.gov (United States)

    Muda, Nora; Othman, Abdul Rahman

    2015-10-01

    The process of grouping a set of objects into classes of similar objects is called clustering. It divides a large group of observations into smaller groups so that the observations within each group are relatively similar and the observations in different groups are relatively dissimilar. In this study, an agglomerative method in hierarchical cluster analysis is chosen and clusters are constructed using an average linkage technique. Average linkage requires a distance between clusters, which is calculated as the average distance between all pairs of points, one from each group. This average distance is not robust when there is an outlier, so it needs to be modified to overcome the outlier problem. Outliers are therefore detected using the MADn criterion and the average distance is recalculated without them. Next, the distance in average linkage is calculated based on a modified one-step M-estimator (MOM). The resulting clusters are presented in a dendrogram. To evaluate the quality of the modified distance in average linkage clustering, a bootstrap analysis is conducted on the dendrogram and the bootstrap value (BP) is assessed for each branch that forms a group, to ensure the reliability of the branches constructed. This study found that the average linkage technique with modified distance is significantly superior to the usual average linkage technique when there is an outlier. The two techniques are said to be similar when there is no outlier.
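
    The outlier-trimmed average distance described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean metric, and the cut-off k = 2.24 (a value commonly used with the MAD-median rule) are all assumptions, since the abstract does not specify them.

```python
import math
from statistics import mean, median


def madn(values):
    """Normalized median absolute deviation: MAD / 0.6745."""
    med = median(values)
    return median([abs(v - med) for v in values]) / 0.6745


def trimmed_average_distance(group_a, group_b, k=2.24):
    """Average pairwise distance between two groups, ignoring outlying distances.

    A pairwise distance d is flagged as an outlier when
    |d - median| / MADn > k (k = 2.24 is an assumed cut-off).
    """
    dists = [math.dist(p, q) for p in group_a for q in group_b]
    med = median(dists)
    scale = madn(dists)
    if scale == 0:  # all distances (nearly) identical: nothing to trim
        return mean(dists)
    kept = [d for d in dists if abs(d - med) / scale <= k]
    return mean(kept)


def plain_average_distance(group_a, group_b):
    """Ordinary average-linkage distance, for comparison."""
    return mean(math.dist(p, q) for p in group_a for q in group_b)


# Hypothetical data: the point (30, 0) in group_b acts as an outlier.
group_a = [(0.0, 0.0), (0.0, 1.0)]
group_b = [(3.0, 0.0), (3.0, 1.0), (30.0, 0.0)]
print(plain_average_distance(group_a, group_b))
print(trimmed_average_distance(group_a, group_b))
```

    With the outlying point present, the trimmed distance stays close to 3 while the plain average is pulled far upward; without an outlier the two functions return the same value, which matches the behaviour the abstract reports.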

  15. Mesoscopic analysis of networks: applications to exploratory analysis and data clustering

    CERN Document Server

    Granell, Clara; Arenas, Alex

    2011-01-01

    We investigate the adaptation and performance of modularity-based algorithms, designed in the scope of complex networks, to analyze the mesoscopic structure of correlation matrices. Using a multi-resolution analysis we are able to describe the structure of the data in terms of clusters at different topological levels. We demonstrate the applicability of our findings in two different scenarios: to analyze the neural connectivity of the nematode Caenorhabditis elegans, and to automatically classify a typical benchmark of unsupervised clustering, the Iris data set, with considerable success.

  16. Analysis and Synthesis of pHEMT Class-E Amplifiers with Shunt Inductor including ON-State Active-Device Resistance Effects

    OpenAIRE

    Thian, Mury; Fusco, Vincent

    2006-01-01

    In this theoretical paper, the analysis of the effect that ON-state active-device resistance has on the performance of a Class-E tuned power amplifier using a shunt inductor topology is presented. The work is focused on the relatively unexplored area of design facilitation of Class-E tuned amplifiers where intrinsically low-output-capacitance monolithic microwave integrated circuit switching devices such as pseudomorphic high electron mobility transistors are used. In the paper, the switching...

  17. Sequential Combination Methods for Data Clustering Analysis

    Institute of Scientific and Technical Information of China (English)

    钱 涛; Ching Y.Suen; 唐远炎

    2002-01-01

    This paper proposes the use of more than one clustering method to improve clustering performance. Clustering is an optimization procedure based on a specific clustering criterion. Clustering combination can be regarded as a technique that constructs and processes multiple clustering criteria. Since the global and local clustering criteria are complementary rather than competitive, combining these two types of clustering criteria may enhance the clustering performance. In our past work, a multi-objective programming based simultaneous clustering combination algorithm has been proposed, which incorporates multiple criteria into an objective function by a weighting method, and solves this problem with constrained nonlinear optimization programming. But this algorithm has high computational complexity. Here a sequential combination approach is investigated, which first uses the global criterion based clustering to produce an initial result, then uses the local criterion based information to improve the initial result with a probabilistic relaxation algorithm or linear additive model. Compared with the simultaneous combination method, sequential combination has low computational complexity. Results on some simulated data and standard test data are reported. It appears that clustering performance improvement can be achieved at low cost through sequential combination.

  18. Alternatives to Multilevel Modeling for the Analysis of Clustered Data

    Science.gov (United States)

    Huang, Francis L.

    2016-01-01

    Multilevel modeling has grown in use over the years as a way to deal with the nonindependent nature of observations found in clustered data. However, other alternatives to multilevel modeling are available that can account for observations nested within clusters, including the use of Taylor series linearization for variance estimation, the design…

  19. Multilevel Analysis Methods for Partially Nested Cluster Randomized Trials

    Science.gov (United States)

    Sanders, Elizabeth A.

    2011-01-01

    This paper explores multilevel modeling approaches for 2-group randomized experiments in which a treatment condition involving clusters of individuals is compared to a control condition involving only ungrouped individuals, otherwise known as partially nested cluster randomized designs (PNCRTs). Strategies for comparing groups from a PNCRT in the…

  20. Molecular Reclassification of Crohn's Disease by Cluster Analysis of Genetic Variants

    Science.gov (United States)

    Cleynen, Isabelle; Mahachie John, Jestinah M.; Henckaerts, Liesbet; Van Moerkercke, Wouter; Rutgeerts, Paul; Van Steen, Kristel; Vermeire, Severine

    2010-01-01

    Background Crohn's Disease (CD) has a heterogeneous presentation, and is typically classified according to extent and location of disease. The genetic susceptibility to CD is well known and genome-wide association scans (GWAS) and meta-analysis thereof have identified over 30 susceptibility loci. Except for the association between ileal CD and NOD2 mutations, efforts in trying to link CD genetics to clinical subphenotypes have not been very successful. We hypothesized that the large number of confirmed genetic variants enables (better) classification of CD patients. Methodology/Principal Findings To look for genetic-based subgroups, genotyping results of 46 SNPs identified from CD GWAS were analyzed by Latent Class Analysis (LCA) in CD patients and in healthy controls. Six genetic-based subgroups were identified in CD patients, which were significantly different from the five subgroups found in healthy controls. The identified CD-specific clusters are therefore likely to contribute to disease behavior. We then looked at whether we could relate the genetic-based subgroups to the currently used clinical parameters. Although modest differences in prevalence of disease location and behavior could be observed among the CD clusters, Random Forest analysis showed that patients could not be allocated to one of the 6 genetic-based subgroups based on the typically used clinical parameters alone. This points to a poor relationship between the genetic-based subgroups and the used clinical subphenotypes. Conclusions/Significance This approach serves as a first step to reclassify Crohn's disease. The used technique can be applied to other common complex diseases as well, and will help to complete patient characterization, in order to evolve towards personalized medicine. PMID:20886065

  1. Molecular reclassification of Crohn's disease by cluster analysis of genetic variants.

    Directory of Open Access Journals (Sweden)

    Isabelle Cleynen

    Full Text Available BACKGROUND: Crohn's Disease (CD) has a heterogeneous presentation, and is typically classified according to extent and location of disease. The genetic susceptibility to CD is well known and genome-wide association scans (GWAS) and meta-analysis thereof have identified over 30 susceptibility loci. Except for the association between ileal CD and NOD2 mutations, efforts in trying to link CD genetics to clinical subphenotypes have not been very successful. We hypothesized that the large number of confirmed genetic variants enables (better) classification of CD patients. METHODOLOGY/PRINCIPAL FINDINGS: To look for genetic-based subgroups, genotyping results of 46 SNPs identified from CD GWAS were analyzed by Latent Class Analysis (LCA) in CD patients and in healthy controls. Six genetic-based subgroups were identified in CD patients, which were significantly different from the five subgroups found in healthy controls. The identified CD-specific clusters are therefore likely to contribute to disease behavior. We then looked at whether we could relate the genetic-based subgroups to the currently used clinical parameters. Although modest differences in prevalence of disease location and behavior could be observed among the CD clusters, Random Forest analysis showed that patients could not be allocated to one of the 6 genetic-based subgroups based on the typically used clinical parameters alone. This points to a poor relationship between the genetic-based subgroups and the used clinical subphenotypes. CONCLUSIONS/SIGNIFICANCE: This approach serves as a first step to reclassify Crohn's disease. The used technique can be applied to other common complex diseases as well, and will help to complete patient characterization, in order to evolve towards personalized medicine.

  2. MASSCLEAN - MASSive CLuster Evolution and ANalysis Package - Description and Tests

    CERN Document Server

    Popescu, Bogdan

    2008-01-01

    We present MASSCLEAN, a new, sophisticated and robust stellar cluster image and photometry simulation package. This package is able to create color-magnitude diagrams and standard FITS images in any of the traditional optical and near-infrared bands based on cluster characteristics input by the user, including but not limited to distance, age, mass, radius and extinction. At the limit of very distant, unresolved clusters, we have checked the integrated colors created in MASSCLEAN against those from other single stellar population models with consistent results. We have also tested models which provide a reasonable estimate of the field star contamination in images and color-magnitude diagrams. We demonstrate the package by simulating images and color-magnitude diagrams of well known massive Milky Way clusters and compare their appearance to real data. Because the algorithm populates the cluster with a discrete number of tenable stars, it can be used as part of a Monte Carlo Method to derive the probabilistic ...

  3. Application and research of fuzzy clustering analysis algorithm under “micro-lecture” English teaching mode

    Directory of Open Access Journals (Sweden)

    Shi Ying

    2016-01-01

    Full Text Available The fuzzy clustering algorithm classifies data or indicators by their degree of similarity, on the principle that individuals of the same type are more similar to one another while individuals of different types differ. It establishes clear category boundaries, forms relationship clusters of any shape in the solving process, and takes the research indicators as input in random order, so that the significance of the indicators in the algorithm can be analyzed accurately. The evaluation value of the clustering analysis can be obtained by establishing the fuzzy factor set based on the membership analysis, and the evaluation result can be analyzed with reference to the evaluation indicators of the fuzzy clustering analysis. The “micro-lecture” English teaching mode can be estimated and the analysis indicators can be rationally established based on the fuzzy clustering analysis algorithm, which shows good applicability.

  4. Analysis on Teacher Talk in College English Class

    Institute of Scientific and Technical Information of China (English)

    高莉

    2013-01-01

    This paper first briefly introduces the definition and function of Teacher Talk, then analyzes the characteristics of Teacher Talk in three respects, and finally offers some constructive suggestions for improving the quality of Teacher Talk in college English classes and for helping English teachers teach more effectively.

  5. A class of symmetric wavelets and regularity analysis

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Presents the generation of wavelet functions using the filter $H(\omega) = \left(\frac{1+e^{-i\omega}}{2}\right)^{N} |F(e^{-i\omega})|$ (where $F(z)$ is a rational function), then a class of symmetric wavelets including orthogonal B-spline wavelets, and finally the best estimate of the regularity order obtained by analysing their regularity.

  6. Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods

    Directory of Open Access Journals (Sweden)

    Xianrui Liang

    2013-01-01

    Full Text Available Background: A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Materials and Methods: 10 batches of Hibiscus mutabilis L. leaf samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. Results: The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in precision, repeatability and stability tests were less than 3%, and the method of fingerprint analysis was validated to be suitable for the Hibiscus mutabilis L. leaves. Conclusions: The chromatographic fingerprints showed abundant qualitative diversity of chemical constituents in the 10 batches of Hibiscus mutabilis L. leaf samples from different locations, as assessed by similarity analysis based on the correlation coefficients between each pair of fingerprints. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent with the SA result to some extent.
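
    The similarity analysis step, which compares fingerprints through pairwise correlation coefficients, might look like the sketch below. It assumes each fingerprint is reduced to a vector of relative peak areas and that the Pearson coefficient is the one used; the sample names and numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def similarity_matrix(fingerprints):
    """Map every ordered pair of sample names to its correlation coefficient."""
    names = list(fingerprints)
    return {
        (a, b): pearson(fingerprints[a], fingerprints[b])
        for a in names
        for b in names
    }


# Hypothetical relative peak areas for three samples (four peaks each).
fingerprints = {
    "S1": [10.0, 25.0, 5.0, 40.0],
    "S2": [20.0, 50.0, 10.0, 80.0],  # same profile as S1 at twice the intensity
    "S3": [40.0, 5.0, 25.0, 10.0],   # different peak pattern
}
sim = similarity_matrix(fingerprints)
for (a, b), r in sim.items():
    if a < b:
        print(a, b, round(r, 3))
```

    Because the correlation coefficient is insensitive to overall intensity, S1 and S2 come out as essentially identical profiles while S3 does not. A matrix like this (or one minus it, as a dissimilarity) is the kind of input a hierarchical clustering step could then work from.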

  7. Global stability analysis on a class of cellular neural networks

    Institute of Scientific and Technical Information of China (English)

    ZHANG; Yi

    2001-01-01

  8. RELIABILITY ANALYSIS OF RING, AGENT AND CLUSTER BASED DISTRIBUTED SYSTEMS

    Directory of Open Access Journals (Sweden)

    R.SEETHALAKSHMI

    2011-08-01

    Full Text Available The introduction of pervasive and mobile devices has led to immense growth in real-time distributed processing. In this context, the reliability of the computing environment is very important. Reliability is the probability that the devices, links, processes, programs and files work efficiently for a specified period of time under specified conditions. Distributed systems are available as conventional ring networks, clusters and agent-based systems, and the reliability of such systems is the focus of this work. These networks are heterogeneous and scalable in nature. Several factors must be considered for reliability estimation. These include application-related factors such as algorithms, data-set sizes, memory usage patterns, input-output, communication patterns, task granularity and load balancing; hardware-related factors such as processor architecture, memory hierarchy, input-output configuration and network; and software-related factors such as operating systems, compilers, communication protocols, libraries and preprocessor performance. Performance estimation is an important aspect of estimating the reliability of a system. Reliability analysis is approached using probability.
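
    As a rough illustration of what a probabilistic treatment of reliability involves, the sketch below uses the textbook series and parallel composition rules. It assumes components fail independently, and it is not the model used in the paper; the function names and the example probabilities are hypothetical.

```python
from math import prod


def series_reliability(reliabilities):
    """System works only if every component works (independent failures assumed)."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """System works if at least one redundant component works."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical example: a task needs a device, a link and a process in sequence,
# and the process is replicated on two nodes of a cluster.
device, link = 0.99, 0.95
replicated_process = parallel_reliability([0.90, 0.90])
print(series_reliability([device, link, replicated_process]))
```

    Under these assumptions, replication raises the reliability of the replicated stage above that of a single node, while every additional stage placed in series can only lower the overall figure.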

  9. Abundance analysis of the outer halo globular cluster Palomar 14

    CERN Document Server

    Caliskan, S; Grebel, K E

    2011-01-01

    We determine the elemental abundances of nine red giant stars belonging to Palomar 14 (Pal 14). Pal 14 is an outer halo globular cluster (GC) at a distance of ~70 kpc. Our abundance analysis is based on high-resolution spectra and one-dimensional stellar model atmospheres. We derived the abundances for the iron peak elements Sc, V, Cr, Mn, Co, Ni, the α-elements O, Mg, Si, Ca, Ti, the light odd element Na, and the neutron-capture elements Y, Zr, Ba, La, Ce, Nd, Eu, Dy, and Cu. Our data do not permit us to investigate light element (i.e., O to Mg) abundance variations. The neutron-capture elements show an r-process signature. We compare our measurements with the abundance ratios of inner and other outer halo GCs, halo field stars, GCs of recognized extragalactic origin, and stars in dwarf spheroidal galaxies (dSphs). The abundance pattern of Pal 14 is almost identical to those of Pal 3 and Pal 4, the next distant members of the outer halo GC population after Pal 14. The abundance pattern of Pal 14 is...

  10. Analysis and clustering of natural gas consumption data for thermal energy use forecasting

    Science.gov (United States)

    Franco, Alessandro; Fantozzi, Fabio

    2015-11-01

    In this paper, after a brief analysis of the connections between the uses of natural gas and thermal energy use, the natural gas consumption data for the Italian market are analyzed and suitably clustered in order to compute the typical consumption profile on different days of the week, in different seasons, and for the different classes of users: residential, tertiary and industrial. The analysis of the data shows that the natural gas consumption profile is mainly related to the seasonality pattern and to the weather conditions (outside temperature, humidity and wind chill). There is also an important daily pattern related to the industrial and civil sectors that, to a lesser degree than the previous one, affects the consumption profile and has to be taken into account when defining an effective short- and mid-term thermal energy forecasting method. A possible mathematical structure of the natural gas consumption profile is provided. Due to the strong link between thermal energy use and natural gas consumption, this analysis could be considered the first step in the development of a model for thermal energy forecasting.

  11. Cluster Computing For Real Time Seismic Array Analysis.

    Science.gov (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed of a dense distribution of seismic sensors that allows measurement of the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over recent years arrays have been widely used in different fields of seismological research. In particular they are applied in the investigation of seismic sources on volcanoes, where they can be successfully used for studying the volcanic microtremor and long period events which are critical for getting information on the evolution of volcanic systems. For this reason arrays could be usefully employed for volcano monitoring; however, the huge amount of data produced by this type of instrument and the quite time-consuming processing techniques have limited their potential for this application. In order to favor a direct application of array techniques to continuous volcano monitoring we designed and built a small PC cluster able to compute in near real time the kinematic properties of the wavefield (slowness or wavenumber vector) produced by local seismic sources. The cluster is composed of 8 Intel Pentium-III bi-processor PCs working at 550 MHz, and has 4 Gigabytes of RAM. It runs under the Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source implementation of the Message Passing Interface (MPI). The developed software system includes modules devoted to receiving data via the Internet and graphical applications for the continuous display of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999, when two dense seismic arrays were deployed on the northeast and the southeast flanks of this volcano. A real time continuous acquisition system has been simulated by

  12. The Study about the Analysis of Responsiveness Pair Clustering to Social Network Bipartite Graph

    OpenAIRE

    Otsuki, Akira; Kawamura, Masayoshi

    2013-01-01

    In this study, regional (cities, towns and villages) data and tweet data are obtained from Twitter, and purchase information (where and what was bought) is extracted from the tweet data by morphological analysis and rule-based dependency analysis. Then, the "regional information" and the "purchase history information (where and what was bought)" are captured as a bipartite graph, and Responsiveness Pair Clustering analysis (a clustering using correspondence analysis as simi...

  13. DNA splice site sequences clustering method for conservativeness analysis

    Institute of Scientific and Technical Information of China (English)

    Quanwei Zhang; Qinke Peng; Tao Xu

    2009-01-01

    DNA sequences near splice sites are remarkably conserved, and many researchers have contributed to the prediction of splice sites. In order to mine the underlying biological knowledge, we analyze the conservativeness of DNA splice site adjacent sequences by clustering. Firstly, we propose a DNA splice site sequence clustering method which is based on DBSCAN and uses four kinds of dissimilarity calculation methods. Then, we analyze the conservative features of the clustering results and the experimental data set.
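
    To make the DBSCAN-based approach concrete, here is a small self-contained sketch of DBSCAN over short sequences. The abstract does not say which four dissimilarity measures were used, so the Hamming distance below is an assumption, as are the parameter values and the toy sequences.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def dbscan(items, eps, min_pts, dist):
    """Plain DBSCAN. Returns one label per item: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(items)  # None means "not visited yet"

    def neighbors(i):
        # Indices within eps of item i (the item itself is included).
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(j_nbrs)
    return labels


# Hypothetical 8-base windows around splice sites.
sequences = [
    "AGGTAAGT", "AGGTAAGA", "AGGTGAGT",  # one conserved family
    "TTTCAGGC", "TTTCAGGT",              # a second family
    "CCCCCCCC",                          # unrelated sequence
]
print(dbscan(sequences, eps=1, min_pts=2, dist=hamming))
```

    Passing the dissimilarity in as the `dist` argument keeps the clustering logic independent of the measure, so alternative measures can be swapped in without touching `dbscan`.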

  14. Nonlinear analysis of nano-cluster doped fiber

    Institute of Scientific and Technical Information of China (English)

    LIU Gang; ZHANG Ru

    2007-01-01

    Semiconductor nano-cluster doped fiber is expected to have prominent nonlinear characteristics. The refractive index of the fiber core can be effectively changed by doping. This technology can provide a new method for developing photonic components. Because the semiconductor nano-cluster has quantum characteristics, we used first-order perturbation theory and the classical theory of fiber to deduce refractive index expressions for the core of a semiconductor nano-cluster doped fiber. Finally, an equation for the third-order nonlinear coefficient was obtained. Using this equation, we calculated the nonlinear coefficient of SMF-28 fiber. The equation shows that the new third-order coefficient is greater.

  15. Cluster Analysis in Patients with GOLD 1 Chronic Obstructive Pulmonary Disease.

    Directory of Open Access Journals (Sweden)

    Philippe Gagnon

    Full Text Available We hypothesized that heterogeneity exists within the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 1 spirometric category and that different subgroups could be identified within this GOLD category. Pre-randomization study participants from two clinical trials were symptomatic/asymptomatic GOLD 1 chronic obstructive pulmonary disease (COPD) patients and healthy controls. A hierarchical cluster analysis used pre-randomization demographics, symptom scores, lung function, peak exercise response and daily physical activity levels to derive population subgroups. Considerable heterogeneity existed for clinical variables among patients with GOLD 1 COPD. All parameters, except forced expiratory volume in 1 second (FEV1)/forced vital capacity (FVC), had considerable overlap between GOLD 1 COPD and controls. Three clusters were identified: cluster I (18 [15%] COPD patients; 105 [85%] controls); cluster II (45 [80%] COPD patients; 11 [20%] controls); and cluster III (22 [92%] COPD patients; 2 [8%] controls). Apart from reduced diffusion capacity and lower baseline dyspnea index versus controls, cluster I COPD patients had otherwise preserved lung volumes, exercise capacity and physical activity levels. Cluster II COPD patients had a higher smoking history and greater hyperinflation versus cluster I COPD patients. Cluster III COPD patients had reduced physical activity versus controls and clusters I and II COPD patients, and lower FEV1/FVC versus clusters I and II COPD patients. The results emphasize heterogeneity within GOLD 1 COPD, supporting an individualized therapeutic approach to patients. www.clinicaltrials.gov: NCT01360788 and NCT01072396.

  16. Identifying the Clusters within Nonmotor Manifestations in Early Parkinson's Disease by Using Unsupervised Cluster Analysis

    OpenAIRE

    Hui-Jun Yang; Young Eun Kim; Ji Young Yun; Han-Joon Kim; Beom Seok Jeon

    2014-01-01

    BACKGROUND: Classical and data-driven classifications of Parkinson's disease (PD) are based primarily on motor symptoms, with little attention being paid to the clustering of nonmotor manifestations. METHODS: Clinical data on demographic, motor and nonmotor features, including the Korean version of the sniffin' stick (KVSS) test results, and responses to the screening questionnaire of the nonmotor features were collected from 56 PD patients with disease onset within 3 years. Nonmotor subgroup...

  17. Stellar variability in open clusters. II. Discovery of a new period-luminosity relation in a class of fast-rotating pulsating stars in NGC 3766

    CERN Document Server

    Mowlavi, N; Semaan, T; Eggenberger, P; Barblan, F; Eyer, L; Ekström, S; Georgy, C

    2016-01-01

    Context. Pulsating stars are windows to the physics of stars enabling us to see glimpses of their interior. Not all stars pulsate, however. On the main sequence, pulsating stars form an almost continuous sequence in brightness, except for a magnitude range between δ Scuti and slowly pulsating B stars. Against all expectations, 36 periodic variables were discovered in 2013 in this luminosity range in the open cluster NGC 3766, the origin of which was a mystery. Aims. We investigate the properties of those new variability class candidates in relation to their stellar rotation rates and stellar multiplicity. Methods. We took multi-epoch spectra over three consecutive nights using ESO's Very Large Telescope. Results. We find that the majority of the new variability class candidates are fast-rotating pulsators that obey a new period-luminosity relation. We argue that the new relation discovered here has a different physical origin to the period-luminosity relations observed for Cepheids. Conclusio...

  18. Cluster analysis of Wisconsin Breast Cancer dataset using self-organizing maps.

    Science.gov (United States)

    Pantazi, Stefan; Kagolovsky, Yuri; Moehr, Jochen R

    2002-01-01

    This work deals with multidimensional data analysis, precisely cluster analysis applied to a very well known dataset, the Wisconsin Breast Cancer dataset. After the introduction of the topics of the paper the cluster analysis concept is shortly explained and different methods of cluster analysis are compared. Further, the Kohonen model of self-organizing maps is briefly described together with an example and with explanations of how the cluster analysis can be performed using the maps. After describing the data set and the methodology used for the analysis we present the findings using textual as well as visual descriptions and conclude that the approach is a useful complement for assessing multidimensional data and that this dataset has been overused for automated decision benchmarking purposes, without a thorough analysis of the data it contains. PMID:15460731

  19. caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

    OpenAIRE

    Xuan Jianhua; Wang Zuyi; Miller David J; Li Huai; Zhu Yitan; Clarke Robert; Hoffman Eric P; Wang Yue

    2008-01-01

    Abstract Background The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results In an ...

  20. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms

    Science.gov (United States)

    Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John

    2015-01-01

    Objective We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700

  1. First PPMXL photometric analysis of open cluster Ruprecht 15

    Institute of Scientific and Technical Information of China (English)

    Ashraf Latif Tadross

    2012-01-01

    We present the first in a series studying the astrophysical parameters of open clusters using the PPMXL* database whose data are applied to study Ruprecht 15. The astrophysical parameters of Ruprecht 15 have been estimated for the first time.

  2. Functional clustering algorithm for the analysis of dynamic network data

    OpenAIRE

    Feldt, S.; Waddell, J; Hetrick, V. L.; Berke, J. D.; Żochowski, M

    2009-01-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulat...

  3. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    Energy Technology Data Exchange (ETDEWEB)

    Lalonde, Michel, E-mail: mlalonde15@rogers.com; Wassenaar, Richard [Department of Physics, Carleton University, Ottawa, Ontario K1S 5B6 (Canada); Wells, R. Glenn; Birnie, David; Ruddy, Terrence D. [Division of Cardiology, University of Ottawa Heart Institute, Ottawa, Ontario K1Y 4W7 (Canada)

    2014-07-15

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster
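
    The ROC quantities named in this abstract (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. The scores and responder labels below are hypothetical and are not data from the study; the function names are assumptions made for illustration, and the AUC is computed with the standard rank-comparison (Mann-Whitney) identity rather than whatever software the authors used.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney identity)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # responder ranked above non-responder
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as a predicted responder."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical dyssynchrony scores and CRT response labels (1 = responder).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)  # 8 of 9 responder/non-responder pairs are ordered correctly
sens, spec = sensitivity_specificity(scores, labels, threshold=0.6)
```

    An AUC of 0.50 corresponds to the "equal chance" baseline the abstract compares against.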

  4. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J

    2010-03-01

    Full Text Available Abstract Background Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: 1) post-bronchodilator FEV1 percent predicted, 2) percent bronchodilator responsiveness, and quantitative CT measurements of 3) apical emphysema and 4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: 1) emphysema predominant; 2) bronchodilator responsive, with higher FEV1; 3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and 4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema-predominant) was associated with TGFB1 SNP rs1800470. Conclusions Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  5. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ

    2011-06-01

    Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with the K-means algorithm was applied in order to investigate whether the assignment of compounds was proper. The Euclidean distance and maximization of the initial distance, using a cross-validation with a v-fold of 10, were applied. Results: All five variables included in the analysis proved to have a statistically significant contribution to the identification of clusters. Three clusters were identified, each of them containing carboquinone derivatives belonging to both the training and the test sets. The observed activity of carboquinone derivatives proved to be normally distributed in every cluster. The presence of training and test sets in all clusters identified using generalized cluster analysis with the K-means algorithm, and the distribution of observed activity within clusters, sustain a proper assignment of compounds in training and test sets. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.
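
    The check described in this abstract (cluster the compounds, then confirm that every cluster contains both training and test compounds) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses plain Lloyd-style K-means with Euclidean distance and caller-supplied starting centers, it omits the v-fold cross-validation and the initial-distance maximization the abstract mentions, and the two-variable points are hypothetical rather than the study's five variables.

```python
import math


def kmeans(points, centers, iterations=20):
    """Plain Lloyd's algorithm with Euclidean distance.

    Returns the labels from the last assignment step and the final centers.
    """
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def mixed_in_every_cluster(labels, split):
    """True if each cluster holds at least one 'train' and one 'test' item."""
    seen = {}
    for lab, s in zip(labels, split):
        seen.setdefault(lab, set()).add(s)
    return all(members >= {"train", "test"} for members in seen.values())


# Hypothetical (activity, descriptor) pairs and their train/test assignment.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0), (9.2, 1.1)]
split = ["train", "test", "train", "test", "train", "test"]
labels, centers = kmeans(points, centers=[points[0], points[2], points[4]])
assignment_ok = mixed_in_every_cluster(labels, split)
```

    A cluster made up only of training (or only of test) compounds would make `mixed_in_every_cluster` return `False`, which is the situation the study's check is meant to rule out.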

  6. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    Science.gov (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  7. Using Latent Class Analysis to Identify Academic and Behavioral Risk Status in Elementary Students

    Science.gov (United States)

    King, Kathleen R.; Lembke, Erica S.; Reinke, Wendy M.

    2016-01-01

    Identifying classes of children on the basis of academic and behavior risk may have important implications for the allocation of intervention resources within Response to Intervention (RTI) and Multi-Tiered System of Support (MTSS) models. Latent class analysis (LCA) was conducted with a sample of 517 third grade students. Fall screening scores in…

  8. Latent Class Analysis of Peer Conformity: Who Is Yielding to Pressure and Why?

    Science.gov (United States)

    Kosten, Paul A.; Scheier, Lawrence M.; Grenard, Jerry L.

    2013-01-01

    This study used latent class analysis to examine typologies of peer conformity in a community sample of middle school students. Students responded to 31 items assessing diverse facets of conformity dispositions. The most parsimonious model produced three qualitatively distinct classes that differed on the basis of conformity to recreational…

  9. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters, which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that, similar to the protocadherin clusters in other vertebrates, the evolution of the anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles.

  10. Bohai crude oil identification by gas chromatogram fingerprinting quantitative analysis coupled with cluster analysis

    Institute of Scientific and Technical Information of China (English)

    SUN Peiyan; BAO Mutai; GAO Zhenhui; LI Mei; ZHAO Yuhui; WANG Xinping; ZHOU Qing; WANG Xiulin

    2006-01-01

    Six crude oils from four oilfields and four oil platforms were fingerprinted by gas chromatography, and the corresponding normal paraffin hydrocarbon (including pristane and phytane) concentrations were obtained by the internal standard method. The normal paraffin hydrocarbon distribution patterns of the six crude oils were built and compared. Cluster analysis of the normal paraffin hydrocarbon concentrations was conducted for classification, and selected ratios were used to compare the oils. The results indicated a clear difference between crude oils from different oil fields and a small difference between crude oils from the same oil platform. For crude oils that differ only slightly, the normal paraffin hydrocarbon distribution pattern and ratios, together with cluster analysis of the normal paraffin hydrocarbon concentrations, differentiate the oils better than the original gas chromatogram.

  11. Work Analysis of the nuclear power plant control room operators (II): The classes of situation

    International Nuclear Information System (INIS)

    This report presents a work analysis of nuclear power plant control room operators focused on the classes of situation they can meet during their job. Each class of situation is first described in terms of the process variables states. We then describe the goals of the operators and the variables they process in each class of situation. We report some of the most representative difficulties encountered by the operators in each class of situation. Finally, we conclude on different topics: the nature of the mental representations, the temporal dimension, the monitoring activity, and the role of the context in the work of controlling a nuclear power plant

  12. Dialogue Analysis and Its Application in English Language Class

    Institute of Scientific and Technical Information of China (English)

    李婧雅

    2014-01-01

    Dialogue is frequently employed as role-play in classroom activities. Students are encouraged to practice speaking skills in the context of a certain conversation. Conversation tasks in listening exercises also attract various interests in English lessons. This essay aims to analyze the functions of dialogues, followed by a discussion of how to apply the proper dialogues in the English classroom, and ending with suggestions of some possible activities adopted in English language class. A dialogue cited from Dellar and Walkley (2003, p. 125) is used as a sample to interpret in detail.

  13. Global stability analysis on a class of cellular neural networks

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    The existence, uniqueness, globally exponential stability and speed of exponential convergence for a class of cellular neural networks are investigated. The existence of a unique equilibrium is proved under very concise conditions, and theorems for estimating the global convergence speed approaching the equilibrium and criteria for its globally exponential stability are derived. Considering synapse time delay, by constructing an appropriate Lyapunov functional, the existence of a unique equilibrium and its global stability for the delayed network are also proved. The results, which do not require the cloning template to be symmetric, are easy to use in network design.

  14. Construction and Dimension Analysis for a Class of Fractal Functions

    Institute of Scientific and Technical Information of China (English)

    Hong-yong Wang; Zong-ben Xu

    2002-01-01

    In this paper, we construct a class of nowhere differentiable continuous functions by means of the Cantor series expression of real numbers. The constructed functions include some known nondifferentiable functions, such as Bush type functions. These functions are fractal functions since their graphs are in general fractal sets. Under certain conditions, we investigate the fractal dimensions of the graphs of these functions, compute the precise values of Box and Packing dimensions, and evaluate the Hausdorff dimension. Meanwhile, the Hölder continuity of such functions is also discussed.

  15. Two-dimensional ordered cluster analysis of component groups in self-organization

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2014-09-01

    Full Text Available An algorithm for two-dimensional cluster analysis of component groups, originally from Zhang et al. (2004), was introduced in this study. The algorithm consists of three procedures, i.e., calculation of distance measures, randomization statistic test, and ordered clustering of components.

  16. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    Science.gov (United States)

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on e-learning systems. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  17. Competitiveness Analysis of Processing Industry Cluster of Livestock Products in Inner Mongolia Based on "Diamond Model"

    OpenAIRE

    Yang, Xing-long; Ren, Ya-tong

    2012-01-01

    Using Michael Porter's "diamond model", based on regional development characteristics, we conduct analysis of the competitiveness of processing industry cluster of livestock products in Inner Mongolia from six aspects (the factor conditions, demand conditions, corporate strategy, structure and competition, related and supporting industries, government and opportunities). And we put forward the following rational recommendations for improving the competitiveness of processing industry cluster ...

  18. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  19. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  20. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Q. Zhou; F. Leng; L. Leydesdorff

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare the

  1. Identification and structural analysis of a novel snoRNA gene cluster from Arabidopsis thaliana

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    A Z2 snoRNA gene cluster, consisting of four antisense snoRNA genes, was identified from Arabidopsis thaliana. The sequence and structural analysis showed that the Z2 snoRNA gene cluster might be transcribed as a polycistronic precursor from an upstream promoter, and that the intergenic spacers of the gene cluster encode 'hairpin' structures similar to the processing recognition signals of the yeast Saccharomyces cerevisiae polycistronic snoRNA precursor. The results also revealed that multiple copies are a common characteristic of plant snoRNA genes, which provides a good system for further revealing the transcription and expression mechanism of plant snoRNA gene clusters.

  2. Analysis of cost data in a cluster-randomized, controlled trial: comparison of methods

    DEFF Research Database (Denmark)

    Sokolowski, Ineta; Ørnbøl, Eva; Rosendal, Marianne;

    in clusters of general practices.   There have been suggestions to apply different methods, e.g., the non-parametric bootstrap, to highly skewed data from pragmatic randomized trials without clusters, but there is very little information about how to analyse skewed data from cluster-randomized trials. Many...... studies have used non-valid analysis of skewed data. We propose two different methods to compare mean cost in two groups. Firstly, we use a non-parametric bootstrap method where the re-sampling takes place on two levels in order to take into account the cluster effect. Secondly, we proceed with a log...
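
    The first method mentioned in this abstract, a non-parametric bootstrap that re-samples on two levels, can be sketched as below. The abstract does not spell out the scheme, so this sketch assumes one common variant: practices (clusters) are drawn with replacement, then patients are drawn with replacement within each drawn practice. The cost figures, function names, and percentile interval are illustrative assumptions, not the authors' implementation.

```python
import random


def two_level_resample(clusters, rng):
    """Draw clusters with replacement, then patients within each drawn cluster."""
    drawn = [rng.choice(clusters) for _ in clusters]
    return [[rng.choice(c) for _ in c] for c in drawn]


def mean_cost(clusters):
    """Mean cost over all patients, pooled across clusters."""
    costs = [x for c in clusters for x in c]
    return sum(costs) / len(costs)


def bootstrap_mean_difference(arm_a, arm_b, n_boot=2000, seed=0):
    """Percentile 95% interval for mean(arm_a) - mean(arm_b)."""
    rng = random.Random(seed)
    diffs = sorted(
        mean_cost(two_level_resample(arm_a, rng)) - mean_cost(two_level_resample(arm_b, rng))
        for _ in range(n_boot)
    )
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Hypothetical, highly skewed costs per patient, grouped by general practice.
arm_a = [[120, 80, 950], [60, 75], [300, 40, 55, 2000]]
arm_b = [[50, 65, 70], [45, 500], [90, 30, 35]]
lower, upper = bootstrap_mean_difference(arm_a, arm_b)
```

    Resampling whole practices first is what lets the interval reflect between-practice variation; a one-level bootstrap over patients would treat all observations as independent.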

  3. A Latent Class Analysis of Smokeless Tobacco Use in the United States.

    Science.gov (United States)

    Fu, Qiang; Vaughn, Michael G

    2016-08-01

    While there has been an escalating trend in the number of smokeless tobacco uses, mainly snuff, in the United States, it is unclear whether smokeless tobacco users are a homogenous class. The present investigation examines this question and identifies subtypes of smokeless tobacco users in order to better understand the characteristics of these individuals and guide appropriate intervention. Data on smokeless tobacco users (N = 2504) derived from the National Epidemiologic Survey on Alcohol and Related Conditions was employed. A range of antisocial behaviors, from reflecting non-violent deviant acts, irresponsibility, and a disengaged lifestyle, to aggression and violence were used to estimate the number of subtypes of smokeless tobacco users using latent class analysis. Four latent classes emerged: Normative Class (50.2 %), Deviant Class (21.9 %), Disengaging Class (17.2 %), and Antisocial Class (10.5 %). Logistic regression shows that major depression, alcohol use disorder, and marijuana use disorder were associated with Deviant Class (OR's from 2.0 to 10.5). The same array of psychiatric disorders and general anxiety disorder were associated with greater odds of membership in the Disengaging Class (OR's from 2.6 to 7.4). Aforementioned psychiatric disorders and illicit drug use disorder were associated with the Antisocial Class (OR's from 3.8 to 38.1). Findings indicate that smokeless tobacco users are a heterogeneous population that may benefit from differential intervention strategies. PMID:26864269

  4. The intersectionality of discrimination attributes and bullying among youth: an applied latent class analysis.

    Science.gov (United States)

    Garnett, Bernice Raveche; Masyn, Katherine E; Austin, S Bryn; Miller, Matthew; Williams, David R; Viswanath, Kasisomayajula

    2014-08-01

    Discrimination is commonly experienced among adolescents. However, little is known about the intersection of multiple attributes of discrimination and bullying. We used a latent class analysis (LCA) to illustrate the intersections of discrimination attributes and bullying, and to assess the associations of LCA membership to depressive symptoms, deliberate self harm and suicidal ideation among a sample of ethnically diverse adolescents. The data come from the 2006 Boston Youth Survey where students were asked whether they had experienced discrimination based on four attributes: race/ethnicity, immigration status, perceived sexual orientation and weight. They were also asked whether they had been bullied or assaulted for these attributes. A total of 965 (78%) students contributed to the LCA analytic sample (45% Non-Hispanic Black, 29% Hispanic, 58% Female). The LCA revealed that a 4-class solution had adequate relative and absolute fit. The 4-classes were characterized as: low discrimination (51%); racial discrimination (33%); sexual orientation discrimination (7%); racial and weight discrimination with high bullying (intersectional class) (7%). In multivariate models, compared to the low discrimination class, individuals in the sexual orientation discrimination class and the intersectional class had higher odds of engaging in deliberate self-harm. Students in the intersectional class also had higher odds of suicidal ideation. All three discrimination latent classes had significantly higher depressive symptoms compared to the low discrimination class. Multiple attributes of discrimination and bullying co-occur among adolescents. Research should consider the co-occurrence of bullying and discrimination.

  6. A scoping review of spatial cluster analysis techniques for point-event data

    Directory of Open Access Journals (Sweden)

    Charles E. Fritz

    2013-05-01

    Full Text Available Spatial cluster analysis is a uniquely interdisciplinary endeavour, and so it is important to communicate and disseminate ideas, innovations, best practices and challenges across practitioners, applied epidemiology researchers and spatial statisticians. In this research we conducted a scoping review to systematically search peer-reviewed journal databases for research that has employed spatial cluster analysis methods on individual-level, address location, or x and y coordinate derived data. To illustrate the thematic issues raised by our results, methods were tested using a dataset where known clusters existed. Point pattern methods, spatial clustering and cluster detection tests, and a locally weighted spatial regression model were most commonly used for individual-level, address location data (n = 29). The spatial scan statistic was the most popular method for address location data (n = 19). Six themes were identified relating to the application of spatial cluster analysis methods and subsequent analyses, which we recommend researchers consider: exploratory analysis, visualization, spatial resolution, aetiology, scale and spatial weights. It is our intention that researchers seeking direction for using spatial cluster analysis methods consider the caveats and strengths of each approach, but also explore the numerous other methods available for this type of analysis. Applied spatial epidemiology researchers and practitioners should give special consideration to applying multiple tests to a dataset. Future research should focus on developing frameworks for selecting appropriate methods and the corresponding spatial weighting schemes.

  7. A Bayesian Analysis of the Ages of Four Open Clusters

    CERN Document Server

    Jeffery, Elizabeth J; van Dyk, David A; Stenning, David C; Robinson, Elliot; Stein, Nathan; Jefferys, W H

    2016-01-01

    In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional "by-eye" isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 +/- 0.05, 1.02 +/- 0.02, 1.64 +/- 0.04, and 0.860 +/- 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to higher precision than that offered by these traditional methods of isochrone fitting.

  8. Galaxy cluster mass estimation from stacked spectroscopic analysis

    Science.gov (United States)

    Farahi, Arya; Evrard, August E.; Rozo, Eduardo; Rykoff, Eli S.; Wechsler, Risa H.

    2016-08-01

    We use simulated galaxy surveys to study: i) how galaxy membership in redMaPPer clusters maps to the underlying halo population, and ii) the accuracy of a mean dynamical cluster mass, $M_\sigma(\lambda)$, derived from stacked pairwise spectroscopy of clusters with richness $\lambda$. Using $\sim\! 130,000$ galaxy pairs patterned after the SDSS redMaPPer cluster sample study of Rozo et al. (2015 RMIV), we show that the pairwise velocity PDF of central--satellite pairs with $m_i$ […] galaxy membership matching. We apply this approach, along with mis-centering and galaxy velocity bias corrections, to estimate the log-mean matched halo mass at $z=0.2$ of SDSS redMaPPer clusters. Employing the velocity bias constraints of Guo et al. (2015), we find $\langle \ln(M_{200c})|\lambda \rangle = \ln(M_{30}) + \alpha_m \ln(\lambda/30)$ with $M_{30} = 1.56 \pm 0.35 \times 10^{14} M_\odot$ and $\alpha_m = 1.31 \pm 0.06_{stat} \pm 0.13_{sys}$. Systematic uncertainty in the velocity bias of satellite galaxies overwhelmingly dominates the error budget.
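
    The mass-richness relation quoted in this abstract can be evaluated directly. The sketch below plugs in only the central values of M_30 and alpha_m and ignores the quoted statistical and systematic uncertainties; the function and constant names are assumptions made for illustration.

```python
import math

M30 = 1.56e14    # solar masses; central value quoted in the abstract
ALPHA_M = 1.31   # slope; central value quoted in the abstract


def log_mean_mass(richness, m30=M30, alpha=ALPHA_M):
    """Return exp(<ln M_200c | lambda>) in solar masses.

    Implements <ln M_200c | lambda> = ln(M_30) + alpha_m * ln(lambda / 30).
    """
    return math.exp(math.log(m30) + alpha * math.log(richness / 30.0))


# At the pivot richness of 30 the relation returns M_30 itself;
# doubling the richness scales the mass by 2 ** alpha_m.
mass_at_pivot = log_mean_mass(30.0)
mass_at_60 = log_mean_mass(60.0)
```

    Because the relation is a power law, the result is exp of the mean log mass, which is not the same as the mean mass when there is scatter at fixed richness.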

  9. Robust growing neural gas algorithm with application in cluster analysis.

    Science.gov (United States)

    Qin, A K; Suganthan, P N

    2004-01-01

    We propose a novel robust clustering algorithm within the Growing Neural Gas (GNG) framework, called Robust Growing Neural Gas (RGNG) network. The Matlab codes are available from . By incorporating several robust strategies, such as outlier resistant scheme, adaptive modulation of learning rates and cluster repulsion method into the traditional GNG framework, the proposed RGNG network possesses better robustness properties. The RGNG is insensitive to initialization, input sequence ordering and the presence of outliers. Furthermore, the RGNG network can automatically determine the optimal number of clusters by seeking the extreme value of the Minimum Description Length (MDL) measure during network growing process. The resulting center positions of the optimal number of clusters represented by prototype vectors are close to the actual ones irrespective of the existence of outliers. Topology relationships among these prototypes can also be established. Experimental results have shown the superior performance of our proposed method over the original GNG incorporating MDL method, called GNG-M, in static data clustering tasks on both artificial and UCI data sets. PMID:15555857

  10. Performance Analysis of Gender Clustering and Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Dr.K.Meena

    2012-03-01

    Full Text Available In speech processing, gender clustering and classification plays a major role. In both gender clustering and classification, selecting the feature is an important process, and the feature most often utilized for gender clustering and classification in speech processing is pitch. The pitch value of male speech differs much from that of female speech. Normally, there is a considerable frequency value difference between male and female speech. But in some cases the frequency of a male voice is almost equal to that of a female voice, or vice versa. In such situations, it is difficult to identify the exact gender. Considering this drawback, three features, namely energy entropy, zero crossing rate and short time energy, are used here for identifying the gender. Gender clustering and classification of the speech signal are estimated using the aforementioned three features. Here, gender clustering is computed using the Euclidean distance, Mahalanobis distance, Manhattan distance and Bhattacharyya distance methods, and gender classification is computed using combined fuzzy logic and neural network, neuro fuzzy and support vector machine, and their performance is analyzed.
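
    Two of the three features and two of the four distances named in this abstract have simple textbook definitions, sketched below. Energy entropy and the Mahalanobis and Bhattacharyya distances are left out to keep the example short. The frames, centroid values, cluster labels, and function names are hypothetical, and the nearest-centroid step is a stand-in for the clustering procedure, which the abstract does not describe in detail.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def nearest_centroid(features, centroids, distance=euclidean):
    """Return the label of the centroid closest to the feature vector."""
    return min(centroids, key=lambda label: distance(features, centroids[label]))


# Hypothetical speech frame and cluster centroids in (ZCR, energy) space.
frame = [1.0, -1.0, 1.0, -1.0]
features = (zero_crossing_rate(frame), short_time_energy(frame))
centroids = {"cluster_a": (0.0, 0.25), "cluster_b": (1.0, 1.0)}
assigned = nearest_centroid(features, centroids)
assigned_l1 = nearest_centroid(features, centroids, distance=manhattan)
```

    Swapping the `distance` argument is enough to compare how the choice of metric changes the assignment, which is the kind of comparison the abstract reports.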

  11. ERRORS ANALYSIS AND TEACHERS' STRATEGIES IN SPEAKING CLASSES

    Institute of Scientific and Technical Information of China (English)

    LiMinquan

    2004-01-01

    In oral classes, teachers are often faced with all sorts of errors made by students. Because of insufficient study of them, some correct all of the errors and some neglect them. The author in this paper, through investigation of real class situation and all the possible collections of errors in his past teaching work, studies the errors and finds out four causes of the errors, and then puts forward his suggestions for dealing with the different errors at different stages. In teaching students to speak English, teachers often find a lot of errors in their speech. How should these errors be dealt with properly? This is something many teachers are working at. Through investigation of real class situation and all the possible collections of errors in teaching work, it is believed that a teacher's knowledge of the learning law, careful observation of the errors being made by the students and proper attitudes toward the errors are very important. It has been found that when a child starts to learn his native language, he makes errors constantly, such as “This mammy chair” or “Mammy, apple eat”. But he can say them correctly without much correction when he grows up. This is because “a human infant is born with an innate predisposition to acquire language” (Richards, 21). When an adult learns a foreign language, it is even more difficult, for physiologically he has to train the muscles of his tongue and lips to get used to the new ways of pronouncing a word, and psychologically he has to receive new concepts of the language which are quite different from his native tongue. Therefore, he unavoidably makes errors in his speech. Even when he has mastered the language to a certain degree, he still makes errors because “he knows very well what he should have done, but owing to the nervousness, tiredness, pressure and the effects of inner translation (a kind of interference from home language), he just lapses and forgets for a moment what to do” (McArthur, 107-108). This doesn't mean that an

  12. Theoretical Analysis of Structures of Ga4N4 Clusters

    Institute of Scientific and Technical Information of China (English)

    宋斌; 曹培林

    2003-01-01

    The structures and energies of a Ga4N4 cluster have been calculated using a full-potential linear-muffin-tin-orbital molecular-dynamics (FP-LMTO MD) method. We obtained twenty-four structures for a Ga4N4 cluster. The most stable structure we obtained is a Cs three-dimensional structure, the energy of which is lower than that of the C2v symmetry structure proposed by Kandalam et al. [J. Phys. Chem. B 106 (2002) 1945]. The calculated results show that the isomer with an N3 subunit is preferred, supporting the previous result of Kandalam et al. We found, through calculation of the density of states, that the most stable structure of the Ga4N4 cluster presents semiconductor-like properties.

  13. Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis

    Directory of Open Access Journals (Sweden)

    D. Gaur

    2008-01-01

    Full Text Available Problem statement: In this study, the researchers propose a new fuzzy graph-theoretic construct, called the fuzzy metagraph, and a new clustering method that finds similar fuzzy nodes in a fuzzy metagraph. Approach: We adopted T-norm (triangular norm) functions and joined two or more T-norms to cluster the fuzzy nodes. A fuzzy metagraph is the fuzzification of a crisp metagraph using a fuzzy generating set and a fuzzy edge set. Applying fuzzy graph theory allowed us to analyze inexact information efficiently and to investigate the fuzzy relation. Results: The researchers suggest a new method of clustering for a new graph-theoretic structure, the fuzzy metagraph, and investigated the fuzzy metanode and fuzzy metagraph structure. Conclusion/Recommendations: Our future research will explore the useful operations on fuzzy metagraphs and give more application-based implementations of the fuzzy metagraph.
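
    For readers unfamiliar with T-norms, the sketch below shows three standard triangular norms (minimum, product and Łukasiewicz) and how one of them folds several membership degrees into a single value. It illustrates the general concept only; the helper `combine` and the sample membership values are assumptions, and the abstract does not say which T-norms the authors joined or how.

```python
def t_min(a, b):
    """Minimum (Goedel) t-norm."""
    return min(a, b)

def t_product(a, b):
    """Product t-norm."""
    return a * b

def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

def combine(t_norm, memberships):
    """Fold a t-norm over several membership degrees in [0, 1]."""
    result = 1.0  # 1 is the identity element of every t-norm
    for m in memberships:
        result = t_norm(result, m)
    return result

# Illustrative membership degrees of one node in three fuzzy sets (assumption).
degrees = [0.8, 0.5, 0.9]
for norm in (t_min, t_product, t_lukasiewicz):
    print(norm.__name__, combine(norm, degrees))
```

    The three norms give progressively smaller combined degrees on the same input, so the choice of T-norm controls how strictly nodes must agree before they are treated as similar.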

  14. Electronic stress tensor analysis of hydrogenated palladium clusters

    CERN Document Server

    Ichikawa, Kazuhide; Szarek, Pawel; Zhou, Chenggang; Cheng, Hansong; Tachibana, Akitomo

    2011-01-01

    We study the chemical bonds of small palladium clusters Pd_n (n=2-9) saturated by hydrogen atoms using the electronic stress tensor. Our calculation includes bond orders that were recently proposed based on the stress tensor. It is shown that our bond orders can classify the different types of chemical bonds in those clusters. In particular, we discuss Pd-H bonds associated with H atoms with high coordination numbers and the differences among H-H bonds in the different Pd clusters from the viewpoint of the electronic stress tensor. The notion of a "pseudo-spindle structure" is proposed as the region between two atoms where the largest eigenvalue of the electronic stress tensor is negative and the corresponding eigenvectors form a pattern which connects them.

  15. Clustering analysis of malware behavior using Self Organizing Map

    DEFF Research Database (Denmark)

    Pirscoveanu, Radu-Stefan; Stevanovic, Matija; Pedersen, Jens Myrup

    2016-01-01

    For the time being, malware behavioral classification is performed by means of Anti-Virus (AV) generated labels. The paper investigates the inconsistencies associated with current practices by evaluating the identified differences between current vendors. In this paper we rely on Self Organizing Map, an unsupervised machine learning algorithm, for generating clusters that capture the similarities between malware behavior. A data set of approximately 270,000 samples was used to generate the behavioral profile of malicious types in order to compare the outcome of the proposed clustering... accurate results based on the clusters created by competitive and cooperative algorithms like Self Organizing Map that better describe the behavioral profile of malware.
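
    The competitive and cooperative steps of a Self Organizing Map can be sketched in a few lines. The example below is a deliberately small one-dimensional SOM in plain Python, not the paper's pipeline: the node count, the linear decay schedules, the even initialization and the toy two-feature "behavior" vectors are all assumptions made for illustration.

```python
import math

def bmu(weights, x):
    """Index of the best-matching unit: the node whose weight vector is nearest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))

def train_som(data, n_nodes=4, epochs=30, lr0=0.5, sigma0=1.5):
    """Train a 1-D SOM; nodes start evenly spaced between the per-feature min and max."""
    dim = len(data[0])
    lo = [min(p[d] for p in data) for d in range(dim)]
    hi = [max(p[d] for p in data) for d in range(dim)]
    weights = [[lo[d] + (hi[d] - lo[d]) * i / (n_nodes - 1) for d in range(dim)]
               for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # decays linearly from 1 toward 0
        lr = lr0 * frac                       # learning rate for this epoch
        sigma = max(sigma0 * frac, 0.1)       # neighbourhood width for this epoch
        for x in data:
            winner = bmu(weights, x)          # competitive step: pick the winner
            for i, w in enumerate(weights):   # cooperative step: neighbours follow
                h = math.exp(-((i - winner) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Toy feature vectors forming two well-separated groups (assumption).
data = [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0), (9.0, 10.0), (0.0, 1.0), (10.0, 9.0)]
som = train_som(data)
print([bmu(som, x) for x in data])
```

    After training, samples that map to the same or adjacent nodes are treated as one cluster; a real malware study would use many more features and a two-dimensional grid.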

  16. A Systematic Analysis of Caustic Methods for Galaxy Cluster Masses

    CERN Document Server

    Gifford, Daniel; Kern, Nicholas

    2013-01-01

    We quantify the expected observed statistical and systematic uncertainties of the escape velocity as a measure of the gravitational potential and total mass of galaxy clusters. We focus our attention on low redshift (z 25, the scatter in the escape velocity mass is dominated by projections along the line-of-sight. Algorithmic uncertainties from the determination of the projected escape velocity profile are negligible. We quantify how target selection based on magnitude, color, and projected radial separation can induce small additional biases into the escape velocity masses. Using N_gal = 150 (25), the caustic technique has a per cluster scatter in ln(M|M_200) of 0.3 (0.5) and bias 1+/-3% (16+/-5%) for clusters with masses > 10^14M_solar at z<0.15.

  17. Molecular-dynamics analysis of mobile helium cluster reactions near surfaces of plasma-exposed tungsten

    International Nuclear Information System (INIS)

    We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (He_n, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He4 and He5 clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen recycling and tritium retention in fusion tokamaks.

  18. Genetic Diversity among Parents of Hybrid Rice Based on Cluster Analysis of Morphological Traits and Simple Sequence Repeat Markers

    Institute of Scientific and Technical Information of China (English)

    WANG Sheng-jun; LU Zuo-mei; WAN Jian-min

    2006-01-01

    The genetic diversity of 41 parental lines popularized in commercial hybrid rice production in China was studied by using cluster analysis of morphological traits and simple sequence repeat (SSR) markers. Based on morphological trait cluster analysis, the 41 entries were assigned to two clusters (i.e., an early or medium-maturing cluster and a medium or late-maturing cluster) and further assigned to six sub-clusters. The early or medium-maturing cluster was composed of 15 maintainer lines, four early-maturing restorer lines and two thermo-sensitive genic male sterile lines, and the medium or late-maturing cluster included 16 restorer lines and 4 medium or late-maturing maintainer lines. Moreover, the SSR cluster analysis classified the 41 entries into two clusters (i.e., a maintainer line cluster and a restorer line cluster) and seven sub-clusters. The maintainer line cluster consisted of all 19 maintainer lines and the two thermo-sensitive genic male sterile lines, while the restorer line cluster was composed of all 20 restorer lines. The SSR analysis fitted better with the pedigree information. From the viewpoint of hybrid rice breeding, the results suggest that SSR analysis might be a better method to study the diversity of parental lines in indica hybrid rice.

  19. THE METHODS OF TOTAL BODY BIOIMPEDANCE SPECTROSCOPY IN ANALYSIS THE FUNCTIONAL CLASS OF CONGESTIVE HEART FAILURE

    OpenAIRE

    Ivanov, G.; Dvornicov, V.; Niculina, L.; Kotlarova, L.; Bernshtein, Ju; Pavlovich, A.

    2004-01-01

    The article presents the results of studies that identified indicators from total body bioimpedance spectroscopy analysis for evaluating the functional class of chronic heart failure. Key words: bioimpedance, congestive heart failure.

  20. Learning regularized LDA by clustering.

    Science.gov (United States)

    Pang, Yanwei; Wang, Shuang; Yuan, Yuan

    2014-12-01

    As a supervised dimensionality reduction technique, linear discriminant analysis has a serious overfitting problem when the number of training samples per class is small. The main reason is that the between- and within-class scatter matrices computed from the limited number of training samples deviate greatly from the underlying ones. To overcome the problem without increasing the number of training samples, we propose making use of the structure of the given training data to regularize the between- and within-class scatter matrices by between- and within-cluster scatter matrices, respectively, and simultaneously. The within- and between-cluster matrices are computed from unsupervised clustered data. The within-cluster scatter matrix contributes to encoding the possible variations in intraclasses and the between-cluster scatter matrix is useful for separating extra classes. The contributions are inversely proportional to the number of training samples per class. The advantages of the proposed method become more remarkable as the number of training samples per class decreases. Experimental results on the AR and Feret face databases demonstrate the effectiveness of the proposed method.

  1. caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

    Directory of Open Access Journals (Sweden)

    Xuan Jianhua

    2008-09-01

    Full Text Available Abstract Background The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. 
Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample

  2. PERFORMANCE ANALYSIS OF CLUSTERING BASED IMAGE SEGMENTATION AND OPTIMIZATION METHODS

    Directory of Open Access Journals (Sweden)

    Jaskirat kaur

    2012-05-01

    Full Text Available Partitioning an image into its constituent components is called image segmentation. Myriad algorithms using different methods have been proposed for image segmentation, and many clustering algorithms and optimization techniques are also used for this purpose. A major challenge in segmentation evaluation comes from the fundamental conflict between generality and objectivity. With the glut of image segmentation techniques available today, the customer, who is the real user of these techniques, may become confused. To address this problem, this paper evaluates some image segmentation techniques based on their consistency in different applications. Different clustering algorithms are quantified based on the parameters used.

  3. Analysis of the Advantages of Creating Border Clusters

    Directory of Open Access Journals (Sweden)

    Liudmila Rosca-Sadurschi

    2015-08-01

    Full Text Available In a changing environment and under rapid globalization, the competitiveness of a country or region depends increasingly on effective innovation. The main challenge for research and innovation is to facilitate the networking of companies and research laboratories. These networks can take the form of a highly integrated cross-border economic group, but may also consist of actions to facilitate business and inter-laboratory linkages, or of cross-border clusters. Creating these clusters requires several conditions to be met, but brings significant benefits to all stakeholders.

  4. The endangered middle class? A comparative analysis of the role public redistribution plays

    OpenAIRE

    Dallinger, Ursula

    2011-01-01

    This article contributes to the debate on the decline of the middle class by engaging in a cross-national comparison of the role public income redistribution plays in the relative income position of the middle, and its change over time. The analysis distinguishes between the development of market incomes and that of disposable incomes, since different dynamics shape each. Moreover, the broad category of 'a middle class' is sub-divided into three groups. The analysis is based on a dataset, cove...

  5. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina;

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...
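
    To make "reacting to weights" concrete, the following sketch shows one common way of weighting the k-means objective: each point contributes to the centroid and to the cost in proportion to its weight. This is an illustrative assumption, not the framework or the algorithms analyzed in the paper, and the function names and toy points are hypothetical.

```python
def weighted_centroid(points, weights):
    """Weighted mean of the points: heavier points pull the centroid harder."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(sum(w * p[d] for p, w in zip(points, weights)) / total
                 for d in range(dim))

def weighted_kmeans_cost(points, weights, centers):
    """Sum of weight times squared distance to the nearest center."""
    cost = 0.0
    for p, w in zip(points, weights):
        cost += w * min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return cost

points = [(0.0, 0.0), (10.0, 0.0)]
print(weighted_centroid(points, [1, 1]))   # equal weights: the midpoint
print(weighted_centroid(points, [3, 1]))   # the heavier first point pulls the centroid
```

    With equal weights the centroid sits midway between the two points; tripling the weight of the first point moves it three quarters of the way toward that point, so an objective of this kind is sensitive to the weights.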

  6. Structure analysis of a class of fuzzy controllers using pseudo trapezoid shaped membership functions

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    An output expression of a class of dual-input single-output fuzzy controllers using pseudo trapezoid shaped membership function is given. By structure analysis it is proved that this class of fuzzy controllers is the sum of a global two-dimensional multi-level relay and a local linear or nonlinear proportional-integral or proportional-differential controller. And the output of this class of fuzzy controllers is a continuous, non-decreasing function of its input variables. These and other meaningful results derived from structure analysis based on the output expressions can guide the design of fuzzy controllers.

  7. Structure analysis of a class of fuzzy controllers using pseudo trapezoid shaped membership functions

    Institute of Scientific and Technical Information of China (English)

    曾珂; 张乃尧; 徐文立

    2000-01-01

    An output expression of a class of dual-input single-output fuzzy controllers using pseudo trapezoid shaped membership function is given. By structure analysis it is proved that this class of fuzzy controllers is the sum of a global two-dimensional multi-level relay and a local linear or nonlinear proportional-integral or proportional-differential controller. And the output of this class of fuzzy controllers is a continuous, non-decreasing function of its input variables. These and other meaningful results derived from structure analysis based on the output expressions can guide the design of fuzzy controllers.
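
    As background for the membership functions mentioned here, the sketch below evaluates an ordinary trapezoidal membership function with straight edges. It is offered only as the simplest special case; the pseudo trapezoid shaped functions of the paper are presumably more general, and the parameter names `a`, `b`, `c`, `d` are assumptions.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x for a trapezoid with feet a, d and shoulders b, c.

    Assumes a < b <= c < d. The degree rises linearly on [a, b], equals 1 on
    [b, c], falls linearly on [c, d], and is 0 elsewhere.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Degrees for a fuzzy set with feet at 0 and 3 and shoulders at 1 and 2.
for x in (-1.0, 0.5, 1.5, 2.5, 3.0):
    print(x, trapezoid(x, 0.0, 1.0, 2.0, 3.0))
```

    In a dual-input controller, one such function would be evaluated per linguistic term for each of the two inputs before the rule base combines them.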

  8. A SPECTROSCOPIC ANALYSIS OF THE GALACTIC GLOBULAR CLUSTER NGC 6273 (M19)

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Christian I.; Caldwell, Nelson [Harvard–Smithsonian Center for Astrophysics, 60 Garden Street, MS-15, Cambridge, MA 02138 (United States); Rich, R. Michael [Department of Physics and Astronomy, UCLA, 430 Portola Plaza, Box 951547, Los Angeles, CA 90095-1547 (United States); Pilachowski, Catherine A. [Astronomy Department, Indiana University Bloomington, Swain West 319, 727 East 3rd Street, Bloomington, IN 47405-7105 (United States); Mateo, Mario; Bailey, John I. III [Department of Astronomy, University of Michigan, Ann Arbor, MI 48109 (United States); Crane, Jeffrey D., E-mail: cjohnson@cfa.harvard.edu, E-mail: ncaldwell@cfa.harvard.edu, E-mail: rmr@astro.ucla.edu, E-mail: catyp@astro.indiana.edu, E-mail: mmateo@umich.edu, E-mail: baileyji@umich.edu, E-mail: crane@obs.carnegiescience.edu [The Observatories of the Carnegie Institution for Science, Pasadena, CA 91101 (United States)

    2015-08-15

    A combined effort utilizing spectroscopy and photometry has revealed the existence of a new globular cluster class. These “anomalous” clusters, which we refer to as “iron-complex” clusters, are differentiated from normal clusters by exhibiting large (≳0.10 dex) intrinsic metallicity dispersions, complex sub-giant branches, and correlated [Fe/H] and s-process enhancements. In order to further investigate this phenomenon, we have measured radial velocities and chemical abundances for red giant branch stars in the massive, but scarcely studied, globular cluster NGC 6273. The velocities and abundances were determined using high resolution (R ∼ 27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph on the Magellan–Clay 6.5 m telescope at Las Campanas Observatory. We find that NGC 6273 has an average heliocentric radial velocity of +144.49 km s^−1 (σ = 9.64 km s^−1) and an extended metallicity distribution ([Fe/H] = −1.80 to −1.30) composed of at least two distinct stellar populations. Although the two dominant populations have similar [Na/Fe], [Al/Fe], and [α/Fe] abundance patterns, the more metal-rich stars exhibit significant [La/Fe] enhancements. The [La/Eu] data indicate that the increase in [La/Fe] is due to almost pure s-process enrichment. A third more metal-rich population with low [X/Fe] ratios may also be present. Therefore, NGC 6273 joins clusters such as ω Centauri, M2, M22, and NGC 5286 as a new class of iron-complex clusters exhibiting complicated star formation histories.

  9. Segmentation of visitors to shopping centers based on their activities through factor analysis and cluster analysis

    Directory of Open Access Journals (Sweden)

    reza soleymani-damaneh

    2013-08-01

    Full Text Available Knowing the customers of shopping centers contributes greatly to increasing the profits of these centers, and customer segmentation is one of the most effective means of knowing them. The purpose of this study was to segment customers based on their activities in shopping centers. The participants were 157 visitors to Milad-e-Noor Shopping Center who completed a questionnaire. Data were analyzed in three steps. In the first step, factor analysis reduced the variables to four factors: entertainment activities, planned shopping, shopping information gathering and unplanned shopping. In the second step, these factors were entered into a K-means cluster analysis, which divided the visitors into four segments on the basis of their activities: traditionalists, shopping center enthusiasts, wandering customers and entertainment seekers. In the third step, demographic and behavioral variables were investigated in the identified clusters. The clusters differed significantly in age, academic status and accompanying persons in shopping centers. However, they were homogeneous with respect to sex, marital status, length of presence in the shopping centers, occupation and monthly salary.
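
    The second step described above, clustering respondents on their factor scores, can be sketched with a small K-means (Lloyd's algorithm) in plain Python. The factor scores below are invented for illustration, only two factors and two clusters are used instead of the study's four, and the deterministic "first k points" initialization is an assumption chosen to keep the example reproducible.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic start: the first k points are the centers."""
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical factor scores per visitor: (entertainment, planned shopping).
scores = [(2.0, 0.1), (0.2, 1.9), (1.8, 0.0), (0.0, 2.1), (2.2, 0.2), (0.1, 2.0)]
labels, centers = kmeans(scores, k=2)
print(labels)
```

    Each resulting label marks a segment; the third step of the study would then compare demographic variables across these labels.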

  10. An Empirical Comparison of Variable Standardization Methods in Cluster Analysis.

    Science.gov (United States)

    Schaffer, Catherine M.; Green, Paul E.

    1996-01-01

    The common marketing research practice of standardizing the columns of a persons-by-variables data matrix prior to clustering the entities corresponding to the rows was evaluated with 10 large-scale data sets. Results indicate that the column standardization practice may be problematic for some kinds of data that marketing researchers used for…
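
    Column standardization of a persons-by-variables matrix can take several forms. The sketch below shows two widely used ones, z-scores and range scaling, in plain Python; it does not reproduce the specific methods or data sets compared in the study, and the function names and toy matrix are assumptions.

```python
import statistics

def zscore_columns(rows):
    """Rescale each column to mean 0 and population standard deviation 1.

    Assumes no column is constant (otherwise the divisor is zero).
    """
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def range_columns(rows):
    """Rescale each column to the interval [0, 1] using its minimum and range.

    Assumes no column is constant (otherwise the divisor is zero).
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]

# Toy persons-by-variables matrix: the second variable has a much larger scale.
rows = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
print(zscore_columns(rows))
print(range_columns(rows))
```

    Either rescaling gives both variables comparable influence on distance-based clustering, which is exactly the effect whose usefulness the study questions for some kinds of data.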

  11. The XMM Cluster Survey: X-ray analysis methodology

    CERN Document Server

    Lloyd-Davies, E J; Hosmer, Mark; Mehrtens, Nicola; Davidson, Michael; Sabirli, Kivanc; Mann, Robert G; Hilton, Matt; Liddle, Andrew R; Viana, Pedro T P; Campbell, Heather C; Collins, Chris A; Dubois, E Naomi; Freeman, Peter; Hoyle, Ben; Kay, Scott T; Kuwertz, Emma; Miller, Christopher J; Nichol, Robert C; Sahlen, Martin; Stanford, S Adam; Stott, John P

    2010-01-01

    The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters using all publicly available data in the XMM-Newton Science Archive. Its main aims are to measure cosmological parameters and trace the evolution of X-ray scaling relations. In this paper we describe the data processing methodology applied to the 5776 XMM observations used to construct the current XCS source catalogue. A total of 3669 >4σ cluster candidates with >50 background-subtracted X-ray counts are extracted from a total non-overlapping area suitable for cluster searching of 410 deg^2. Of these, 1022 candidates are detected with >300 X-ray counts, and we demonstrate that robust temperature measurements can be obtained down to this count limit. We describe in detail the automated pipelines used to perform the spectral and surface brightness fitting for these sources, as well as to estimate redshifts from the X-ray data alone. A total of 517 (126) X-ray temperatures to a typical accuracy of <40 (<10) per cent have ...

  12. Galaxy Cluster Mass Estimation from Stacked Spectroscopic Analysis

    CERN Document Server

    Farahi, Arya; Rozo, Eduardo; Rykoff, Eli S; Wechsler, Risa H

    2016-01-01

    We use simulated galaxy surveys to study: i) how galaxy membership in redMaPPer clusters maps to the underlying halo population, and ii) the accuracy of a mean dynamical cluster mass, $M_\sigma(\lambda)$, derived from stacked pairwise spectroscopy of clusters with richness $\lambda$. Using $\sim 130,000$ galaxy pairs patterned after the SDSS redMaPPer cluster sample study of Rozo et al. (2015 RMIV), we show that the pairwise velocity PDF of central-satellite pairs with $m_i < 19$ in the simulation matches the form seen in RMIV. Through joint membership matching, we deconstruct the main Gaussian velocity component into its halo contributions, finding that the top-ranked halo contributes $\sim 60\%$ of the stacked signal. The halo mass scale inferred by applying the virial scaling of Evrard et al. (2008) to the velocity normalization matches, to within a few percent, the log-mean halo mass derived through galaxy membership matching. We apply this approach, along with mis-centering and galaxy velocity bias...

  13. DESIGN AND ANALYSIS OF MULTI-MODE CLUSTER SAR

    Institute of Scientific and Technical Information of China (English)

    Fan Luhong; Pi Yiming; Hou Yinming

    2004-01-01

    A cluster Synthetic Aperture Radar (SAR) system is composed of a group of spaceborne SAR systems. Because its elements can be combined flexibly, the system can work in several different modes. In this letter, the basic configuration and the working modes of the system are presented. The special performance of the system compared with a conventional SAR system is indicated.

  14. Dynamical analysis of the cluster pair: A3407 + A3408

    CERN Document Server

    Nascimento, R S; Trevisan, M; Carrasco, E R; Plana, H; Dupke, R

    2016-01-01

    We carried out a dynamical study of the galaxy cluster pair A3407 & A3408 based on a spectroscopic survey obtained with the 4 meter Blanco telescope at the CTIO, plus 6dF data, and ROSAT All-Sky-Survey. The sample consists of 122 member galaxies brighter than $m_R=20$. Our main goal is to probe the galaxy dynamics in this field and verify if the sample constitutes a single galaxy system or corresponds to an ongoing merging process. Statistical tests were applied to cluster members showing that both the composite system A3407 + A3408 as well as each individual cluster have Gaussian velocity distribution. A velocity gradient of $\sim 847\pm 114$ $\rm km\;s^{-1}$ was identified around the principal axis of the projected distribution of galaxies, indicating that the global field may be rotating. Applying the KMM algorithm to the distribution of galaxies we found that the solution with two clusters is better than the single unit solution at the 99% c.l. This is consistent with the X-ray distribution around ...

  15. High-Speed Video Analysis in a Conceptual Physics Class

    Science.gov (United States)

    Desbien, Dwain M.

    2011-01-01

    The use of probeware and computers has become quite common in introductory physics classrooms. Video analysis is also becoming more popular and is available to a wide range of students through commercially available and/or free software. Video analysis allows for the study of motions that cannot be easily measured in the traditional lab setting…

  16. Dynamical analysis of the cluster pair: A3407 + A3408

    Science.gov (United States)

    Nascimento, R. S.; Ribeiro, A. L. B.; Trevisan, M.; Carrasco, E. R.; Plana, H.; Dupke, R.

    2016-08-01

    We carried out a dynamical study of the galaxy cluster pair A3407 and A3408 based on a spectroscopic survey obtained with the 4 metre Blanco telescope at the Cerro Tololo Interamerican Observatory, plus 6dF data, and ROSAT All-Sky Survey. The sample consists of 122 member galaxies brighter than m_R = 20. Our main goal is to probe the galaxy dynamics in this field and verify if the sample constitutes a single galaxy system or corresponds to an ongoing merging process. Statistical tests were applied to cluster members showing that both the composite system A3407 + A3408 as well as each individual cluster have Gaussian velocity distribution. A velocity gradient of ~847 ± 114 km s^-1 was identified around the principal axis of the projected distribution of galaxies, indicating that the global field may be rotating. Applying the KMM algorithm to the distribution of galaxies, we found that the solution with two clusters is better than the single unit solution at the 99 per cent c.l. This is consistent with the X-ray distribution around this field, which shows no common X-ray halo involving A3407 and A3408. We also estimated virial masses and applied a two-body model to probe the dynamics of the pair. The more likely scenario is that in which the pair is gravitationally bound and probably experiences a collapse phase, with the cluster cores crossing in less than ~1 h^-1 Gyr, a pre-merger scenario. The complex X-ray morphology, the gas temperature, and some signs of galaxy evolution in A3408 suggest a post-merger scenario, with cores having crossed each other ~1.65 h^-1 Gyr ago, as an alternative solution.

  17. Latent Class Analysis of Substance Use among Adolescents Presenting to Urban Primary Care Clinics

    Science.gov (United States)

    Bohnert, Kipling M.; Walton, Maureen A.; Resko, Stella; Barry, Kristen T.; Chermack, Stephen T.; Zucker, Robert A.; Zimmerman, Marc A.; Booth, Brenda M.; Blow, Frederic C.

    2015-01-01

    Background: Polysubstance use during adolescence is a significant public health concern; however, few studies have investigated patterns of use during this developmental window within the primary care setting. Objectives: This study uses an empirical method to classify adolescents into polysubstance use groups, and examines correlates of the empirically-defined groups. Methods: Data come from patients, ages 12-18 years, presenting to urban, primary care community health clinics (Federally Qualified Health Centers) in two cities in the Midwestern United States (n=1664). Latent class analysis (LCA) was used to identify classes of substance users. Multinomial logistic regression was used to examine variables associated with class membership. Results: LCA identified three classes: Class 1 (64.5%) exhibited low probabilities of all types of substance use; Class 2 (24.6%) was characterized by high probabilities of cannabis use and consequences; Class 3 (10.9%) had the highest probabilities of polysubstance use, including heavy episodic drinking and misuse of prescription drugs. Those in Class 2 and Class 3 were more likely to be older, and have poorer grades, poorer health, higher levels of psychological distress, and more sexual partners than those in Class 1. Individuals in Class 3 were also less likely to be African-American than those in Class 1. Conclusion: Findings provide novel insight into the patterns of polysubstance use among adolescents presenting to low-income urban primary care clinics. Future research should examine the efficacy of interventions that address the complex patterns of substance use and concomitant health concerns among adolescents. PMID:24219231

  18. Social Class Differences in Social Integration among Students in Higher Education: A Meta-Analysis and Recommendations for Future Research

    Science.gov (United States)

    Rubin, Mark

    2012-01-01

    A meta-analysis of 35 studies found that social class (socioeconomic status) is related to social integration among students in higher education: Working-class students are less integrated than middle-class students. This relation generalized across students' gender and year of study, as well as type of social class measure (parental education and…

  19. Log Analysis as a way to assist Opera mini cluster management decisions

    OpenAIRE

    2008-01-01

    This thesis considers ways that analysis of Opera mini logs can assist decisions related to global and local load balancing of Opera mini clusters. The analysis is aimed to determine the distribution of traffic with respect to country of origin and server within the cluster over the period of 2 weeks by creating a system for extraction and analysis of log data. Findings show that a large part of traffic originates in Russia with India and Indonesia being second and third. ...
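
As a rough illustration of the kind of aggregation this abstract describes, the sketch below tallies traffic by country and by server. The log format, field order, and sample values are entirely hypothetical; the thesis does not specify them here.

```python
from collections import Counter

# Hypothetical log lines: "<timestamp> <country> <server> <bytes>"
SAMPLE_LOG = """\
2008-03-01T10:00:00 RU srv1 1200
2008-03-01T10:00:01 IN srv2 800
2008-03-01T10:00:02 RU srv1 400
2008-03-01T10:00:03 ID srv3 600
2008-03-01T10:00:04 RU srv2 1000
"""

def traffic_share(log_text):
    """Return the traffic share per country and total bytes per server."""
    by_country = Counter()
    by_server = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip malformed lines
        _, country, server, size = fields
        by_country[country] += int(size)
        by_server[server] += int(size)
    total = sum(by_country.values())
    shares = {c: b / total for c, b in by_country.items()}
    return shares, by_server

shares, by_server = traffic_share(SAMPLE_LOG)
```

With the sample above, `shares["RU"]` is 0.65 and `by_server["srv2"]` is 1800 bytes.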

  20. Functional Cluster Analysis of CT Perfusion Maps: A New Tool for Diagnosis of Acute Stroke?

    OpenAIRE

    Baumgartner, Christian; Gautsch, Kurt; Böhm, Christian; Felber, Stephan

    2005-01-01

    CT perfusion imaging constitutes an important contribution to the early diagnosis of acute stroke. Cerebral blood flow (CBF), cerebral blood volume (CBV) and time-to-peak (TTP) maps are used to estimate the severity of cerebral damage after acute ischemia. We introduce functional cluster analysis as a new tool to evaluate CT perfusion in order to identify normal brain, ischemic tissue and large vessels. CBF, CBV and TTP maps represent the basis for cluster analysis applying a partitioning (k-...

  1. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches.

  2. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches.

  3. Design and performance of an analysis-by-synthesis class of predictive speech coders

    Science.gov (United States)

    Rose, Richard C.; Barnwell, Thomas P., III

    1990-01-01

    The performance of a broad class of analysis-by-synthesis linear predictive speech coders is quantified experimentally. The class of coders includes a number of well-known techniques as well as a very large number of speech coders which have not been named or studied. A general formulation for deriving the parametric representation used in all of the coders in the class is presented. A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general. The results of a study comparing the performances of different members of this class are presented. The study takes the form of a series of formal subjective and objective speech quality tests performed on selected coders. The results of this study lead to some interesting and important observations concerning the controlling parameters for analysis-by-synthesis speech coders.

  4. Using cluster analysis to examine the combinations of motivation regulations of physical education students.

    Science.gov (United States)

    Ullrich-French, Sarah; Cox, Anne

    2009-06-01

    According to self-determination theory, motivation is multidimensional, with motivation regulations lying along a continuum of self-determination (Ryan & Deci, 2007). Accounting for the different types of motivation in physical activity research presents a challenge. This study used cluster analysis to identify motivation regulation profiles and examined their utility by testing profile differences in relative levels of self-determination (i.e., self-determination index), and theoretical antecedents (i.e., competence, autonomy, relatedness) and consequences (i.e., enjoyment, worry, effort, value, physical activity) of physical education motivation. Students (N= 386) in 6th- through 8th-grade physical education classes completed questionnaires of the variables listed above. Five profiles emerged, including average (n = 81), motivated (n = 82), self-determined (n = 91), low motivation (n = 73), and external (n = 59). Group difference analyses showed that students with greater levels of self-determined forms of motivation, regardless of non-self-determined motivation levels, reported the most adaptive physical education experiences.

  5. Multi-class ERP-based BCI data analysis using a discriminant space self-organizing map.

    Science.gov (United States)

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    Emotional and non-emotional image stimuli have recently been applied to event-related potential (ERP) based brain-computer interfaces (BCI). Although single-trial classification performance exceeds 80%, discrimination between those ERPs has not been considered. In this research we tried to clarify the discriminability of four-class ERP-based BCI target data elicited by desk, seal, and spider images and by letter intensifications. A conventional self-organizing map (SOM) and a newly proposed discriminant space SOM (ds-SOM) were applied, and the discriminabilities were then visualized. We also classified all pairs of those ERPs by stepwise linear discriminant analysis (SWLDA) to verify the visualization of discriminabilities. As a result, the ds-SOM produced an understandable visualization of the data with a shorter computational time than the traditional SOM. We also confirmed a clear boundary between the letter cluster and the other clusters. The result was consistent with the classification performance of SWLDA. The method might be helpful not only for developing new BCI paradigms but also for big data analysis.

  6. Analysis of luminescence spectra of substrate-free icosahedral and crystalline clusters of argon

    CERN Document Server

    Doronin, Yu S; Kamarchuk, G V; Tkachenko, A A; Samovarov, V N

    2016-01-01

    We propose a new approach to analysis of cathodoluminescence spectra of substrate-free nanoclusters of argon produced in a supersonic jet expanding into a vacuum. It is employed to analyze intensities of the luminescence bands of neutral and charged excimer complexes (Ar2)* and (Ar4+)* measured for clusters with an average size of 500 to 8900 atoms per cluster and diameters ranging from 32 to 87 Å. Concentration of the jet substance condensed into clusters, which determines the absolute values of the integrated band intensities, is shown to be proportional to the logarithm of the average cluster size. Analysis of reduced intensities of the (Ar2)* and (Ar4+)* bands in the spectra of crystalline clusters with an fcc structure allows us to conclude that emission of the neutral molecules (Ar2)* comes from within the whole volume of the cluster, while the charged complexes (Ar4+)* radiate from its near-surface layers. We find the cluster size range in which the jet is dominated by quasicrystalline clusters wit...

  7. Ranking and clustering countries and their products; a network analysis

    CERN Document Server

    Caldarelli, Guido; Gabrielli, Andrea; Pietronero, Luciano; Scala, Antonio; Tacchella, Andrea

    2011-01-01

    In this paper we analyze the network of countries and products from UN data on country production. We define the country-country and product-product networks and we introduce a novel method of community detection based on elements similarity. As a result we find that country clustering reveals unexpected socio-geographic links among the most competing countries. On the same footings the products clustering can be efficiently used for a bottom-up classification of produced goods. Furthermore we define a procedure to rank different countries and their products over the global market. These analyses are a good proxy of country GDP and therefore could be possibly used to determine the robustness of a country economy.

  8. Qualitative analysis of certain generalized classes of quadratic oscillator systems

    Energy Technology Data Exchange (ETDEWEB)

    Bagchi, Bijan, E-mail: bbagchi123@gmail.com; Ghosh, Samiran, E-mail: sran-g@yahoo.com; Pal, Barnali, E-mail: barrna.roo@gmail.com; Poria, Swarup, E-mail: swarupporia@gmail.com [Department of Applied Mathematics, University of Calcutta, 92 Acharya Prafulla Chandra Road, Kolkata 700009 (India)

    2016-02-15

    We carry out a systematic qualitative analysis of the two quadratic schemes of generalized oscillators recently proposed by Quesne [J. Math. Phys. 56, 012903 (2015)]. By performing a local analysis of the governing potentials, we demonstrate that the first potential admits a pair of equilibrium points: one is typically a center for both signs of the coupling strength λ, while the other is a center for λ < 0 but a saddle for λ > 0. On the other hand, a linear stability analysis of the second potential reveals only a center for both signs of λ. We extend Quesne’s scheme to include the effects of a linear dissipative term. An important outcome is a remarkable transition to chaos in the presence of a periodic force term f cos ωt.

  9. Insights into quasar UV spectra using unsupervised clustering analysis

    Science.gov (United States)

    Tammour, A.; Gallagher, S. C.; Daley, M.; Richards, G. T.

    2016-06-01

    Machine learning techniques can provide powerful tools to detect patterns in multidimensional parameter space. We use K-means - a simple yet powerful unsupervised clustering algorithm which picks out structure in unlabelled data - to study a sample of quasar UV spectra from the Quasar Catalog of the 10th Data Release of the Sloan Digital Sky Survey (SDSS-DR10) of Paris et al. Detecting patterns in large data sets helps us gain insights into the physical conditions and processes giving rise to the observed properties of quasars. We use K-means to find clusters in the parameter space of the equivalent width (EW), the blue- and red-half-width at half-maximum (HWHM) of the Mg II 2800 Å line, the C IV 1549 Å line, and the C III] 1908 Å blend in samples of broad absorption line (BAL) and non-BAL quasars at redshift 1.6-2.1. Using this method, we successfully recover correlations well-known in the UV regime such as the anti-correlation between the EW and blueshift of the C IV emission line and the shape of the ionizing spectral energy distribution (SED) probed by the strength of He II and the Si III]/C III] ratio. We find this to be particularly evident when the properties of C III] are used to find the clusters, while those of Mg II proved to be less strongly correlated with the properties of the other lines in the spectra such as the width of C IV or the Si III]/C III] ratio. We conclude that unsupervised clustering methods (such as K-means) are powerful methods for finding 'natural' binning boundaries in multidimensional data sets and discuss caveats and future work.

  10. Analysis of X-ray Structures of Matrix Metalloproteinases via Chaotic Map Clustering

    Directory of Open Access Journals (Sweden)

    Gargano Gianfranco

    2010-10-01

    Full Text Available Abstract Background Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. With this in mind, the perception of the intimate relationships among diverse MMPs could be a solid basis for accelerated learning in designing new selective MMP inhibitors. In this regard, decrypting the latent molecular reasons in order to elucidate similarity among MMPs is a key challenge. Results We describe a pairwise variant of the non-parametric chaotic map clustering (CMC) algorithm and its application to 104 X-ray MMP structures. In this analysis electrostatic potentials are computed and used as input for the CMC algorithm. It was shown that differences between proteins reflect genuine variation of their electrostatic potentials. In addition, the analysis has been also extended to analyze the protein primary structures and the molecular shapes of the MMP co-crystallised ligands. Conclusions The CMC algorithm was shown to be a valuable tool in knowledge acquisition and transfer from MMP structures. Based on the variation of electrostatic potentials, CMC was successful in analysing the MMP target family landscape and different subsites. The first investigation resulted in rational figure interpretation of both domain organization as well as of substrate specificity classifications. The second made it possible to distinguish the MMP classes, demonstrating the high specificity of the S1' pocket, to detect both the occurrence of punctual mutations of ionisable residues and different side-chain conformations that likely account for induced-fit phenomena. In addition, CMC demonstrated a potential comparable to the most popular UPGMA (Unweighted Pair Group Method with Arithmetic mean) method that, at present, represents a standard clustering bioinformatics approach.
Interestingly, CMC and

  11. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd [Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; all authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory]

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  12. SSR Cluster and Fertility Loci Analysis of GC13

    Institute of Scientific and Technical Information of China (English)

    NONG Bao-xuan; XIA Xiu-zhong; LIANG Yao-mao; LU Gang; ZHANG Zong-qiong; LI Dan-ting

    2011-01-01

    [Objective] The research aimed to clarify the genetic mechanism of the special wide compatibility of GC13. [Method] Clustering analyses of GC13, five indica, five japonica and five wide compatibility varieties were carried out using 70 SSR primers. [Result] GC13 was clustered into the japonica group and was genetically distant from the indica and wide compatibility varieties. Two fertility loci were detected in GC13, one of which was closely linked to RM225 on chromosome 6. From its position on the chromosome, this locus was inferred to be allelic to S5. GC13 carried the allele S5-n at this locus. The other locus was closely linked to RM408 on chromosome 8 and was provisionally designated Sg(t). At this locus, GC13 carried the Sg(t)-i allele, consistent with IR36. The effect of the S5 locus was stronger than that of Sg(t). [Conclusion] The research laid a good foundation for using the wide compatibility line GC13 to breed hybrids between subspecies.

  13. X-ray Analysis of Filaments in Galaxy Clusters

    CERN Document Server

    Walker, S A; Fabian, A C; Sanders, J S

    2015-01-01

    We perform a detailed X-ray study of the filaments surrounding the brightest cluster galaxies in a sample of nearby galaxy clusters using deep Chandra observations, namely the Perseus, Centaurus and Virgo clusters, and Abell 1795. We compare the X-ray properties and spectra of the filaments in all of these systems, and find that their Chandra X-ray spectra are all broadly consistent with an absorbed two temperature thermal model, with temperature components at 0.75 and 1.7 keV. We find that it is also possible to model the Chandra ACIS filament spectra with a charge exchange model provided a thermal component is also present, and the abundance of oxygen is suppressed relative to the abundance of Fe. In this model, charge exchange provides the dominant contribution to the spectrum in the 0.5-1.0 keV band. However, when we study the high spectral resolution RGS spectrum of the filamentary plume seen in X-rays in Centaurus, the opposite appears to be the case. The properties of the filaments in our sample of clu...

  14. Graph partitioning advance clustering technique

    CERN Document Server

    Madhulatha, T Soni

    2012-01-01

    Clustering is a common technique for statistical data analysis. It is the process of grouping data into classes or clusters so that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects, often using distance measures. Clustering is an unsupervised learning technique, in which interesting patterns and structures can be found directly from very large data sets with little or no background knowledge. This paper also considers the partitioning of m-dimensional lattice graphs using Fiedler's approach, which requires the determination of the eigenvector belonging to the second smallest eigenvalue of the Laplacian, together with the K-means partitioning algorithm.
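
Fiedler's approach mentioned in this abstract can be sketched on a small example. The code below builds the Laplacian of a two-dimensional 2 x 6 lattice graph and splits the vertices by the sign of the eigenvector for the second smallest eigenvalue. The helper names are hypothetical, the lattice size is an arbitrary choice, and a simple sign split stands in for the K-means step the paper combines with it.

```python
import numpy as np

def lattice_laplacian(rows, cols):
    """Graph Laplacian L = D - A of a rows x cols grid (lattice) graph."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                 # edge to the right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < rows:                 # edge to the neighbour below
                A[i, i + cols] = A[i + cols, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def fiedler_bisection(L):
    """Split vertices by the sign of the Fiedler vector."""
    vals, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    fiedler = vecs[:, 1]                     # second smallest eigenvalue
    return fiedler >= 0, vals[1]

# A non-square lattice keeps the second eigenvalue non-degenerate
L = lattice_laplacian(2, 6)
part, lambda2 = fiedler_bisection(L)
grid = part.reshape(2, 6)                    # True/False marks the two halves
```

For this lattice the cut falls across the long dimension, giving two halves of six vertices each. Which half is labelled `True` depends on the arbitrary sign of the eigenvector.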

  15. k-Means clustering as tool for multivariate geophysical data analysis. An application to shallow fault zone imaging

    Science.gov (United States)

    Di Giuseppe, Maria Giulia; Troiano, Antonio; Troise, Claudia; De Natale, Giuseppe

    2014-02-01

    We present the results of an integrated imaging approach for two-dimensional high-resolution magnetotelluric and seismic profiles. These were carried out in the seismically active intermontane basin of Pantano di San Gregorio Magno (southern Italy), along a line across the surface rupture of the 1980, M 6.9, earthquake. We focus on the application of the post-inversion k-means clustering technique to the univariate resistivity and P-wave velocity models, which were obtained previously through independent inversions. Five cluster classes are recognized, allowing a joint two-dimensional section to be imaged in terms of homogeneous zones from a geo-structural point of view. Two distinct local relationships between electrical resistivity and seismic velocities are inferred. In this way, the hanging and footwall zones have been retrieved, and are characterized according to the different fracturing degrees. The case dealt with here can be viewed as a successful example of how cluster analysis can be a promising auxiliary tool that provides bridging towards the integration of distinct geophysical methods.
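
The post-inversion clustering idea can be sketched as follows: each model cell contributes a pair of values (log resistivity and P-wave velocity), the two attributes are standardized, and k-means groups cells into zones. The data are synthetic, the function `kmeans` is a plain Lloyd's-algorithm sketch with a deterministic farthest-point start, and two clusters are used for brevity where the study recognizes five.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(1, k):
        # squared distance of every point to its nearest chosen center
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-ins for co-located model cells (hypothetical values)
rng = np.random.default_rng(0)
log_rho = np.concatenate([rng.normal(1.0, 0.1, 50), rng.normal(3.0, 0.1, 50)])
vp = np.concatenate([rng.normal(2.0, 0.1, 50), rng.normal(4.5, 0.1, 50)])
features = np.column_stack([log_rho, vp])

# Standardize so that neither attribute dominates the distance
z = (features - features.mean(axis=0)) / features.std(axis=0)
labels, centers = kmeans(z, k=2)
```

Mapping `labels` back onto the cell positions would give the joint section of homogeneous zones that the abstract describes.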

  16. Future Expectations Among Adolescents: A Latent Class Analysis

    OpenAIRE

    Heather L Sipsma; Ickovics, Jeannette R.; Lin, Haiqun; Kershaw, Trace S.

    2012-01-01

    Future expectations have been important predictors of adolescent development and behavior. Its measurement, however, has largely focused on single dimensions and misses potentially important components. This analysis investigates whether an empirically-driven, multidimensional approach to conceptualizing future expectations can substantively contribute to our understanding of adolescent risk behavior. We use data from the National Longitudinal Survey of Youth 1997 to derive subpopulations of ...

  17. Can galaxy clusters, type Ia supernovae and cosmic microwave background ruled out a class of modified gravity theories?

    CERN Document Server

    Holanda, R F L

    2016-01-01

    In this paper we study cosmological signatures of modified gravity theories that can be written as a coupling between an extra scalar field and the electromagnetic part of the usual Lagrangian for the matter fields. In these frameworks the entire electromagnetic sector of the theory is affected, and variations of fundamental constants, of the cosmic distance duality relation and of the evolution law of the cosmic microwave background radiation (CMB) are expected and are related to each other. To search for these variations we perform joint analyses with angular diameter distances of galaxy clusters, luminosity distances of type Ia supernovae and $T_{CMB}(z)$ measurements. We obtain tight constraints with no indication of violation of the standard framework.

  18. Distinguishing PTSD, Complex PTSD, and Borderline Personality Disorder: A latent class analysis

    Directory of Open Access Journals (Sweden)

    Marylène Cloitre

    2014-09-01

    Full Text Available Background: There has been debate regarding whether Complex Posttraumatic Stress Disorder (Complex PTSD) is distinct from Borderline Personality Disorder (BPD) when the latter is comorbid with PTSD. Objective: To determine whether the patterns of symptoms endorsed by women seeking treatment for childhood abuse form classes that are consistent with diagnostic criteria for PTSD, Complex PTSD, and BPD. Method: A latent class analysis (LCA) was conducted on an archival dataset of 280 women with histories of childhood abuse assessed for enrollment in a clinical trial for PTSD. Results: The LCA revealed four distinct classes of individuals: a Low Symptom class characterized by low endorsements on all symptoms; a PTSD class characterized by elevated symptoms of PTSD but low endorsement of symptoms that define the Complex PTSD and BPD diagnoses; a Complex PTSD class characterized by elevated symptoms of PTSD and self-organization symptoms that defined the Complex PTSD diagnosis but low on the symptoms of BPD; and a BPD class characterized by symptoms of BPD. Four BPD symptoms were found to greatly increase the odds of being in the BPD compared to the Complex PTSD class: frantic efforts to avoid abandonment, unstable sense of self, unstable and intense interpersonal relationships, and impulsiveness. Conclusions: Findings supported the construct validity of Complex PTSD as distinguishable from BPD. Key symptoms that distinguished between the disorders were identified, which may aid in differential diagnosis and treatment planning.

  19. How Teachers Use and Manage Their Blogs? A Cluster Analysis of Teachers' Blogs in Taiwan

    Science.gov (United States)

    Liu, Eric Zhi-Feng; Hou, Huei-Tse

    2013-01-01

    The development of Web 2.0 has ushered in a new set of web-based tools, including blogs. This study focused on how teachers use and manage their blogs. A sample of 165 teachers' blogs in Taiwan was analyzed by factor analysis, cluster analysis and qualitative content analysis. First, the teachers' blogs were analyzed according to six criteria…

  20. Abundance analysis of an extended sample of open clusters: A search for chemical inhomogeneities

    Science.gov (United States)

    Reddy, Arumalla B. S.; Giridhar, Sunetra; Lambert, David L.

    We have initiated a program to explore the presence of chemical inhomogeneities in the Galactic disk using the open clusters as ideal probes. We have analyzed high-dispersion echelle spectra (R ≥ 55,000) of red giant members for eleven open clusters to derive abundances for many elements. The membership to the cluster has been confirmed through their radial velocities and proper motions. The spread in temperatures and gravities being very small among the red giants, nearly the same stellar lines were employed thereby reducing the random errors. The errors of average abundance for the cluster were generally in the 0.02 to 0.07 dex range. Our present sample covers galactocentric distances of 8.3 to 11.3 kpc and an age range of 0.2 to 4.3 Gyr. Our earlier analysis of four open clusters (Reddy A.B.S. et al., 2012, MNRAS, 419, 1350) indicates that abundances relative to Fe for elements from Na to Eu are equal within measurement uncertainties to published abundances for thin disk giants in the field. This supports the view that field stars come from disrupted open clusters. In the enlarged sample of eleven open clusters we find cluster to cluster abundance variations for some s- and r- process elements, with certain elements such as Zr and Ba showing large variation. These differences mark the signatures that these clusters had formed under different environmental conditions (Type II SN, Type Ia SN, AGB stars or a mixture of any of these) unique to the time and site of formation. These eleven clusters support the widely held impression that there is an abundance gradient such that the metallicity [Fe/H] at the solar galactocentric distance decreases outwards at about -0.1 dex per kpc.

  1. OCAAT: automated analysis of star cluster colour-magnitude diagrams for gauging the local distance scale

    Science.gov (United States)

    Perren, Gabriel I.; Vázquez, Ruben A.; Piatti, Andrés E.; Moitinho, André

    2014-05-01

    Star clusters are among the fundamental astrophysical objects used in setting the local distance scale. Despite its crucial importance, the accurate determination of the distances to the Magellanic Clouds (SMC/LMC) remains a fuzzy step in the cosmological distance ladder. The exquisite astrometry of the recently launched ESA Gaia mission is expected to deliver extremely accurate statistical parallaxes, and thus distances, to the SMC/LMC. However, an independent SMC/LMC distance determination via main sequence fitting of star clusters provides an important validation check point for the Gaia distances. This has been a valuable lesson learnt from the famous Hipparcos Pleiades distance discrepancy problem. Current observations will allow hundreds of LMC/SMC clusters to be analyzed in this light. Today, the most common approach for star cluster main sequence fitting is still by eye. The process is intrinsically subjective and affected by large uncertainties, especially when applied to poorly populated clusters. It is also, clearly, not an efficient route for addressing the analysis of hundreds, or thousands, of star clusters. These concerns, together with a new attitude towards advanced statistical techniques in astronomy and the availability of powerful computers, have led to the emergence of software packages designed for analyzing star cluster photometry. With a few rare exceptions, those packages are not publicly available. Here we present OCAAT (Open Cluster Automated Analysis Tool), a suite of publicly available open source tools that fully automatises cluster isochrone fitting. The code will be applied to a large set of hundreds of open clusters observed in the Washington system, located in the Milky Way and the Magellanic Clouds. This will allow us to generate an objective and homogeneous catalog of distances up to ~ 60 kpc along with its associated reddening, ages and metallicities and uncertainty estimates.

  2. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Directory of Open Access Journals (Sweden)

    Katrien Moens

    Full Text Available Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean distance as the similarity measure to determine the clusters. Contingency tables, χ² tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217), the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, p<0.001); being on ART (highest proportions for clusters two and three, p=0.012); global distress (F=26.8, p<0.001), physical distress (F=36.3, p<0.001) and psychological distress subscale (F=21.8, p<0.001) (all subscales worst for cluster one, best for cluster four). The greatest burden is associated with cluster one, and should be prioritised in clinical management.
Further symptom cluster research in people living with HIV with longitudinally collected symptom data to

  3. Protocol analysis of the correspondence of verbal behavior and equivalence class formation.

    OpenAIRE

    Wulfert, E; Dougher, M J; Greenway, D E

    1991-01-01

    In two equivalence experiments, a "think aloud" procedure modeled after Ericsson and Simon's (1980) protocol analysis was implemented to examine subjects' covert verbal responses during matching to sample. The purpose was to identify variables that might explain individual differences in equivalence class formation. The results from Experiment 1 suggested that subjects who formed equivalence classes described the relations among stimuli, whereas those not showing equivalence described sample ...

  4. Heavy minerals clustering analysis in application of provenance analysis of Kong 2 Member in Kongnan area

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    The main task of provenance analysis is to determine the source of sediments and the position of parent rocks. Provenance analysis can reveal the relationship between erosion districts and sediment zones, and between uplift and depression in the process of basin development. The authors use heavy mineral clustering analysis to estimate the provenance direction of the Huanghua Depression in the Paleogene Kong 2 Member. The research shows that there were five provenance areas of the Kong 2 Member in the Kongnan area. They are western (Shenusi), northwestern (Cangzhou), eastern (Ganhuatun), northeastern and southeastern. The main provenance areas were northwestern and western, while a southern provenance could not be ruled out. These areas are consistent with the known provenance areas.

  5. Clinical heterogeneity in patients with early-stage Parkinson's disease: a cluster analysis

    Institute of Scientific and Technical Information of China (English)

    Ping LIU; Tao FENG; Yong-jun WANG; Xuan ZHANG; Biao CHEN

    2011-01-01

    The aim of this study was to investigate the clinical heterogeneity of Parkinson's disease (PD) among a cohort of Chinese patients in early stages. Clinical data on demographics, motor variables, motor phenotypes, disease progression, global cognitive function, depression, apathy, sleep quality, constipation, fatigue, and L-dopa complications were collected from 138 Chinese PD subjects in early stages (Hoehn and Yahr stages 1-3). The PD subject subtypes were classified using k-means cluster analysis according to the clinical data from five- to three-cluster consecutively. Kappa statistical analysis was performed to evaluate the consistency among different subtype solutions. The cluster analysis indicated four main subtypes: the non-tremor dominant subtype (NTD, n=28, 20.3%), rapid disease progression subtype (RDP, n=7, 5.1%), young-onset subtype (YO, n=50, 36.2%), and tremor dominant subtype (TD, n=53, 38.4%). Overall, 78.3% (108/138) of subjects were always classified between the same three groups (52 always in TD, 7 in RDP, and 49 in NTD), and 98.6% (136/138) between five- and four-cluster solutions. However, subjects classified as NTD in the four-cluster analysis were dispersed into different subtypes in the three-cluster analysis, with low concordance between four- and three-cluster solutions (kappa value=-0.139, P=0.001). This study defines clinical heterogeneity of PD patients in early stages using a data-driven approach. The subtypes generated by the four-cluster solution appear to exhibit ideal internal cohesion and external isolation.
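
    The kappa comparison mentioned in the abstract above can be illustrated with a minimal sketch of Cohen's kappa for two labelings of the same subjects. The function name and the six subtype assignments are hypothetical and are not taken from the study; the sketch also assumes that labels have already been matched between the two cluster solutions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two labelings of the same subjects."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of subjects given the same label in both solutions.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, computed from the marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both labelings use a single identical label
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical subtype assignments for six subjects (illustrative only).
four_cluster = ["TD", "TD", "YO", "NTD", "RDP", "YO"]
three_cluster = ["TD", "TD", "YO", "YO", "RDP", "TD"]
kappa = cohen_kappa(four_cluster, three_cluster)
print(round(kappa, 2))
```

    A value near 1 indicates that two solutions assign subjects consistently, a value near 0 indicates agreement no better than chance, and a negative value indicates agreement below chance.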

  6. Cluster analysis of particulate matter (PM10) and black carbon (BC) concentrations

    Science.gov (United States)

    Žibert, Janez; Pražnikar, Jure

    2012-09-01

    The monitoring of air-pollution constituents like particulate matter (PM10) and black carbon (BC) can provide information about air quality and the dynamics of emissions. Air quality depends on natural and anthropogenic sources of emissions as well as the weather conditions. For a one-year period the diurnal concentrations of PM10 and BC in the Port of Koper were analysed by clustering days into similar groups according to the similarity of the BC and PM10 hourly derived day-profiles without any prior assumptions about working and non-working days, weather conditions or hot and cold seasons. The analysis was performed by using k-means clustering with the squared Euclidean distance as the similarity measure. The analysis showed that 10 clusters in the BC case produced 3 clusters with just one member day and 7 clusters that encompass more than one day with similar BC profiles. Similar results were found in the PM10 case, where one cluster has a single-member day, while 7 clusters contain several member days. The clustering analysis revealed that the clusters with less pronounced bimodal patterns and low hourly and average daily concentrations for both types of measurements include the most days in the one-year analysis. A typical day profile of the BC measurements includes a bimodal pattern with morning and evening peaks, while the PM10 measurements reveal a less pronounced bimodality. There are also clusters with single-peak day-profiles. The BC data in such cases exhibit morning peaks, while the PM10 data consist of noon or afternoon single peaks. Single pronounced peaks can be explained by appropriate cluster wind speed profiles. The analysis also revealed some special day-profiles. The BC cluster with a high midnight peak on 30/04/2010 and the PM10 cluster with the highest observed concentration of PM10 on 01/05/2010 (208.0 μg m⁻³) coincide with 1 May, which is a national holiday in Slovenia and has a very strong tradition of bonfire parties. The clustering of
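
    The clustering step described in the abstract above (k-means on day-profiles with squared Euclidean distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the four-value profiles are hypothetical stand-ins for 24-value hourly profiles, and the initial centroids are simply the first k profiles so that the result is deterministic.

```python
def squared_euclidean(p, q):
    """Squared Euclidean distance between two equally long profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(profiles, k, max_iter=100):
    """Lloyd's k-means; the first k profiles serve as initial centroids."""
    if not 1 <= k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")
    centroids = [list(p) for p in profiles[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each day goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_euclidean(p, centroids[j]))
            for p in profiles
        ]
        if new_assignment == assignment:
            break  # converged: no day changed cluster
        assignment = new_assignment
        # Update step: each centroid becomes the mean profile of its members.
        for j in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical day-profiles (illustrative values, not measured concentrations).
days = [[1, 5, 1, 5], [9, 9, 9, 9], [1, 6, 1, 4], [8, 9, 10, 9]]
labels, centers = kmeans(days, k=2)
print(labels)
```

    With real data, a cluster containing a single day, as reported in the abstract, would show up as a label that occurs only once in the returned assignment.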

  7. Functional cluster analysis of CT perfusion maps: a new tool for diagnosis of acute stroke?

    Science.gov (United States)

    Baumgartner, Christian; Gautsch, Kurt; Böhm, Christian; Felber, Stephan

    2005-09-01

    CT perfusion imaging constitutes an important contribution to the early diagnosis of acute stroke. Cerebral blood flow (CBF), cerebral blood volume (CBV) and time-to-peak (TTP) maps are used to estimate the severity of cerebral damage after acute ischemia. We introduce functional cluster analysis as a new tool to evaluate CT perfusion in order to identify normal brain, ischemic tissue and large vessels. CBF, CBV and TTP maps represent the basis for cluster analysis applying a partitioning (k-means) and density-based (density-based spatial clustering of applications with noise, DBSCAN) paradigm. In patients with transient ischemic attack and stroke, cluster analysis identified brain areas with distinct hemodynamic properties (gray and white matter) and segmented territorial ischemia. CBF, CBV and TTP values of each detected cluster were displayed. Our preliminary results indicate that functional cluster analysis of CT perfusion maps may become a helpful tool for the interpretation of perfusion maps and provide a rapid means for the segmentation of ischemic tissue. PMID:15827821

  8. A Latent Class Analysis of Risk Factors for Acquiring HIV Among Men Who Have Sex with Men: Implications for Implementing Pre-Exposure Prophylaxis Programs.

    Science.gov (United States)

    Chan, Philip A; Rose, Jennifer; Maher, Justine; Benben, Stacey; Pfeiffer, Kristen; Almonte, Alexi; Poceta, Joanna; Oldenburg, Catherine E; Parker, Sharon; Marshall, Brandon Dl; Lally, Mickey; Mayer, Kenneth; Mena, Leandro; Patel, Rupa; Nunn, Amy S

    2015-11-01

    Current Centers for Disease Control and Prevention (CDC) guidelines for prescribing pre-exposure prophylaxis (PrEP) to prevent HIV transmission are broad. In order to better characterize groups who may benefit most from PrEP, we reviewed demographics, behaviors, and clinical outcomes for individuals presenting to a publicly-funded sexually transmitted diseases (STD) clinic in Providence, Rhode Island, from 2012 to 2014. Latent class analysis (LCA) was used to identify subgroups of men who have sex with men (MSM) at highest risk for contracting HIV. A total of 1723 individuals presented for testing (75% male; 31% MSM). MSM were more likely to test HIV positive than heterosexual men or women. Among 538 MSM, we identified four latent classes. Class 1 had the highest rates of incarceration (33%), forced sex (24%), but had no HIV infections. Class 2 had 10 anal sex partners in the previous 12 months (69%), anonymous partners (100%), drug/alcohol use during sex (76%), and prior STDs (40%). Class 4 had similar characteristics and HIV prevalence as Class 2. In this population, MSM who may benefit most from PrEP include those who have >10 sexual partners per year, anonymous partners, drug/alcohol use during sex and prior STDs. LCA is a useful tool for identifying clusters of characteristics that may place individuals at higher risk for HIV infection and who may benefit most from PrEP in clinical practice. PMID:26389735

  9. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Full Text Available Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  10. Doing Class Analysis in Singapore's Elite Education: Unravelling the Smokescreen of "Meritocratic Talk"

    Science.gov (United States)

    Koh, Aaron

    2014-01-01

    This paper examines the specificity of the education-class nexus in an elite independent school in Singapore. It seeks to unravel the puzzle that meritocracy is dogmatically believed in Singapore in spite of evidence that points to the contrary. The paper draws on discursive (analysis of media materials) and institutional (analysis of interview…

  11. Chaotic Artificial Bee Colony Used for Cluster Analysis

    Science.gov (United States)

    Zhang, Yudong; Wu, Lenan; Wang, Shuihua; Huo, Yuankai

    A new approach based on artificial bee colony (ABC) with chaotic theory was proposed to solve the partitional clustering problem. We first investigate the optimization model including both the encoding strategy and the variance ratio criterion (VRC). Second, a chaotic ABC algorithm was developed based on the Rossler attractor. Experiments on three types of artificial data of different degrees of overlapping all demonstrate the CABC is superior to both genetic algorithm (GA) and combinatorial particle swarm optimization (CPSO) in terms of robustness and computation time.

  12. A Spectroscopic Analysis of the Galactic Globular Cluster NGC 6273 (M19)

    CERN Document Server

    Johnson, Christian I; Pilachowski, Catherine A; Caldwell, Nelson; Mateo, Mario; Bailey, John I; Crane, Jeffrey D

    2015-01-01

    A combined effort utilizing spectroscopy and photometry has revealed the existence of a new globular cluster class. These "anomalous" clusters, which we refer to as "iron-complex" clusters, are differentiated from normal clusters by exhibiting large (>0.10 dex) intrinsic metallicity dispersions, complex sub-giant branches, and correlated [Fe/H] and s-process enhancements. In order to further investigate this phenomenon, we have measured radial velocities and chemical abundances for red giant branch stars in the massive, but scarcely studied, globular cluster NGC 6273. The velocities and abundances were determined using high resolution (R~27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph on the Magellan-Clay 6.5m telescope at Las Campanas Observatory. We find that NGC 6273 has an average heliocentric radial velocity of +144.49 km s^-1 (sigma=9.64 km s^-1) and an extended metallicity distribution ([Fe/H]=-1.80 to -1.30) composed of at least two distinct stellar popul...

  13. Work Disability among Employees with Diabetes: Latent Class Analysis of Risk Factors in Three Prospective Cohort Studies.

    Directory of Open Access Journals (Sweden)

    Marianna Virtanen

    Full Text Available Studies of work disability in diabetes have examined diabetes as a homogeneous disease. We sought to identify subgroups among persons with diabetes based on potential risk factors for work disability. Participants were 2,445 employees with diabetes from three prospective cohorts (the Finnish Public Sector study, the GAZEL study, and the Whitehall II study). Work disability was ascertained via linkage to registers of sickness absence and disability pensions during a follow-up of 4 years. Study-specific latent class analysis was used to identify subgroups according to prevalent comorbid disease and health-risk behaviours. Study-specific associations with work disability at follow-up were pooled using fixed-effects meta-analysis. Separate latent class analyses for men and women in each cohort supported a two-class solution with one subgroup (total n = 1,086; 44.4%) having high prevalence of chronic somatic diseases, psychological symptoms, obesity, physical inactivity and abstinence from alcohol and the other subgroup (total n = 1,359; 55.6%) low prevalence of these factors. In the adjusted meta-analyses, participants in the 'high-risk' group had more work disability days (pooled rate ratio = 1.66, 95% CI 1.38-1.99) and more work disability episodes (pooled rate ratio = 1.33, 95% CI 1.21-1.46). These associations were similar in men and women, younger and older participants, and across occupational groups. Diabetes is not a homogeneous disease in terms of work disability risk. Approximately half of people with diabetes are assigned to a subgroup characterised by clustering of comorbid health conditions, obesity, physical inactivity, abstinence of alcohol, and associated high risk of work disability; the other half to a subgroup characterised by a more favourable risk profile.
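
    The pooling step named in the abstract above (fixed-effects meta-analysis of study-specific rate ratios) is commonly done by inverse-variance weighting on the log scale. The sketch below assumes that each study reports a rate ratio with a 95% confidence interval from which the standard error can be recovered; the function name and the input values are hypothetical and are not the cohort estimates.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(studies):
    """Pool (rate_ratio, ci_lower, ci_upper) tuples by inverse-variance weighting."""
    weighted_sum = 0.0
    total_weight = 0.0
    for rate_ratio, lower, upper in studies:
        log_rr = math.log(rate_ratio)
        # Standard error recovered from the width of the CI on the log scale.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_rr
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical study-specific estimates (illustrative only).
studies = [(1.5, 1.0, 2.25), (1.5, 1.0, 2.25), (1.5, 1.0, 2.25)]
pooled_rr, pooled_lower, pooled_upper = pool_fixed_effect(studies)
print(round(pooled_rr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

    Pooling three identical estimates leaves the point estimate unchanged and narrows the interval, which is the expected behaviour of a fixed-effect model.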

  14. Stellar variability in open clusters . II. Discovery of a new period-luminosity relation in a class of fast-rotating pulsating stars in NGC 3766

    Science.gov (United States)

    Mowlavi, N.; Saesen, S.; Semaan, T.; Eggenberger, P.; Barblan, F.; Eyer, L.; Ekström, S.; Georgy, C.

    2016-10-01

    Context. Pulsating stars are windows to the physics of stars enabling us to see glimpses of their interior. Not all stars pulsate, however. On the main sequence, pulsating stars form an almost continuous sequence in brightness, except for a magnitude range between δ Scuti and slowly pulsating B stars. Against all expectations, 36 periodic variables were discovered in 2013 in this luminosity range in the open cluster NGC 3766, the origin of which was a mystery. Aims: We investigate the properties of those new variability class candidates in relation to their stellar rotation rates and stellar multiplicity. Methods: We took multi-epoch spectra over three consecutive nights using ESO's Very Large Telescope. Results: We find that the majority of the new variability class candidates are fast-rotating pulsators that obey a new period-luminosity relation. We argue that the new relation discovered here has a different physical origin to the period-luminosity relations observed for Cepheids. Conclusions: We anticipate that our discovery will boost the relatively new field of stellar pulsation in fast-rotating stars, will open new doors for asteroseismology, and will potentially offer a new tool to estimate stellar ages or cosmic distances. Based on observations made with the FLAMES instruments on the VLT/UT2 telescope at the Paranal Observatory, Chile, under the program ID 69.A-0123(A).

  15. Insights into Quasar UV Spectra Using Unsupervised Clustering Analysis

    CERN Document Server

    Tammour, Aycha; Daley, Mark; Richards, Gordon T

    2016-01-01

    Machine learning can provide powerful tools to detect patterns in multi-dimensional parameter space. We use K-means, a simple yet powerful unsupervised clustering algorithm which picks out structure in unlabeled data, to study a sample of quasar UV spectra from the Quasar Catalog of the 10th Data Release of the Sloan Digital Sky Survey of Paris et al. (2014). Detecting patterns in large datasets helps us gain insights into the physical conditions and processes giving rise to the observed properties of quasars. We use K-means to find clusters in the parameter space of the equivalent width (EW), the blue- and red-half-width at half-maximum (HWHM) of the Mg II 2800 Å line, the C IV 1549 Å line, and the C III] 1908 Å blend in samples of Broad Absorption-Line (BAL) and non-BAL quasars at redshift 1.6-2.1. Using this method, we successfully recover correlations well-known in the UV regime such as the anti-correlation between the EW and blueshift of the C IV emission line and the shape of the ionizing Spectra Energy...

  16. A critical cluster analysis of 44 indicators of author-level performance

    DEFF Research Database (Denmark)

    Wildgaard, Lorna Elizabeth

    2015-01-01

    , followed by a risk analysis and ordinal logistic regression to explore cluster membership. Indicator scores were contextualized using the individual researcher's curriculum vitae. Four different clusters based on indicator scores ranked researchers as low, middle, high and extremely high performers...... useful in identifying disciplinary appropriate indicators providing the preliminary data preparation was thorough but needed to be supplemented by other analyses to validate the results. A general disconnection between the performance of the researcher on their curriculum vitae and the performance...

  17. A cluster randomized-controlled trial of a classroom-based drama workshop program to improve mental health outcomes among immigrant and refugee youth in special classes.

    Directory of Open Access Journals (Sweden)

    Cécile Rousseau

    Full Text Available The aim of this cluster randomized trial was to evaluate the effectiveness of a school-based theatre intervention program for immigrant and refugee youth in special classes for improving mental health and academic outcomes. The primary hypothesis was that students in the theatre intervention group would report a greater reduction in impairment from symptoms compared to students in the control and tutoring groups. Special classrooms in five multiethnic high schools were randomly assigned to theatre intervention (n = 10), tutoring (n = 10) or control status (n = 9), for a total of 477 participants. Students and teachers were non-blinded to group assignment. The primary outcome was impairment from emotional and behavioural symptoms assessed by the Impact Supplement of the Strengths and Difficulties Questionnaire (SDQ) completed by the adolescents. The secondary outcomes were the SDQ global scores (teacher and youth reports), impairment assessed by teachers and school performance. The effect of the interventions was assessed through linear mixed effect models which incorporate the correlation between students in the same class, due to the nature of the randomization of the interventions by classroom. The theatre intervention was not associated with a greater reduction in self-reported impairment and symptoms in youth placed in special class because of learning, emotional and behavioural difficulties than a tutoring intervention or a non-active control group. The estimates of the different models show a non-significant decrease in both self-reported and impairment scores in the theatre intervention group for the overall group, but the impairment score decreased significantly for first generation adolescents while it increased for second generation adolescents. The difference between the population of immigrant and refugee youth newcomers studied previously and the sample of this trial may explain some of the differences in the observed impact of

  18. Screening for personality disorders: A new questionnaire and its validation using Latent Class Analysis

    Directory of Open Access Journals (Sweden)

    Julia Lange

    2012-12-01

    Full Text Available Background: We evaluated a new screening instrument for personality disorders. The Personality Disorder Screening (PDS) is a self-administered screening questionnaire that includes 12 items from the Personality Self Portrait (Oldham & Morris, 1990). Sampling and methods: The data of n = 966 participants recruited from the non-clinical population and from different clinical settings were analyzed using latent class analysis. Results: A 4-class model fitted the data best. It confirmed a classification model for personality disorders proposed by Gunderson (1984) and showed high reliability and validity. One class corresponded to “healthy” individuals (40.6 %), and one class to individuals with personality disorders (17.2 %). Two additional classes represented individuals with specific personality styles. Evidence for convergent validity was found in terms of strong associations of the classification with the Structured Clinical Interview (SCID-II) for diagnosing personality disorders. The latent classes also showed theoretically expected associations with membership in different subsamples. Conclusions: The PDS shows promise as a new instrument for identifying different classes of personality disorder severity already at the screening stage of the diagnostic process.

  19. The application of cluster analysis in the intercomparison of loop structures in RNA.

    Science.gov (United States)

    Huang, Hung-Chung; Nagaswamy, Uma; Fox, George E

    2005-04-01

    We have developed a computational approach for the comparison and classification of RNA loop structures. Hairpin or interior loops identified in atomic resolution RNA structures were intercompared by conformational matching. The root-mean-square deviation (RMSD) values between all pairs of RNA fragments of interest, even if from different molecules, are calculated. Subsequently, cluster analysis is performed on the resulting matrix of RMSD distances using the unweighted pair group method with arithmetic mean (UPGMA). The cluster analysis objectively reveals groups of folds that resemble one another. To demonstrate the utility of the approach, a comprehensive analysis of all the terminal hairpin tetraloops that have been observed in 15 RNA structures that have been determined by X-ray crystallography was undertaken. The method found major clusters corresponding to the well-known GNRA and UNCG types. In addition, two tetraloops with the unusual primary sequence UMAC (M is A or C) were successfully assigned to the GNRA cluster. Larger loop structures were also examined and the clustering results confirmed the occurrence of variations of the GNRA and UNCG tetraloops in these loops and provided a systematic means for locating them. Nineteen examples of larger loops that closely resemble either the GNRA or UNCG tetraloop were found in the large ribosomal RNAs. When the clustering approach was extended to include all structures in the SCOR database, novel relationships were detected including one between the ANYA motif and a less common folding of the GAAA tetraloop sequence.
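
    The clustering step described in the abstract above (UPGMA applied to a matrix of pairwise RMSD values) can be sketched in a few lines. This is an illustrative average-linkage implementation, not the authors' code; the function name and the 4 x 4 distance matrix are hypothetical.

```python
def upgma(dist):
    """Average-linkage (UPGMA) clustering of a symmetric distance matrix.

    Returns the merges in order as (members_a, members_b, distance) tuples.
    """
    n = len(dist)
    clusters = {i: (i,) for i in range(n)}  # cluster id -> member indices
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), height = min(d.items(), key=lambda item: item[1])
        size_a, size_b = len(clusters[a]), len(clusters[b])
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances (unweighted pair group).
        new_d = {}
        for k in clusters:
            if k in (a, b):
                continue
            d_ak = d[(min(a, k), max(a, k))]
            d_bk = d[(min(b, k), max(b, k))]
            new_d[(k, next_id)] = (size_a * d_ak + size_b * d_bk) / (size_a + size_b)
        # Drop distances that involve the two merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new_d)
        merges.append((clusters[a], clusters[b], height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical pairwise RMSD values for four loop fragments (illustrative only).
rmsd = [
    [0.0, 0.5, 2.0, 2.2],
    [0.5, 0.0, 2.4, 2.0],
    [2.0, 2.4, 0.0, 0.8],
    [2.2, 2.0, 0.8, 0.0],
]
merges = upgma(rmsd)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 2))
```

    In this toy matrix, fragments 0 and 1 merge first, then 2 and 3, and the two pairs join last at the mean of the four cross-pair distances, which mirrors how groups of similar folds emerge from an RMSD matrix.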

  20. Evaluation of socio-economic patterns of SHG members in Kerala using clustering analysis

    Directory of Open Access Journals (Sweden)

    Sajeev B. U

    2012-03-01

    Full Text Available Abstract In the matter of social development, though Kerala stands ahead of all other states in India, the pattern of distribution of social and economic opportunities within the state is highly inequitable among different social groups. Self help groups (SHG) are vehicles for social, political and financial intermediation of the state. Clustering analysis is one of the main analytical methods in data mining; the method of clustering algorithm will influence the clustering results directly. K-means and Fuzzy C-Means Algorithms are popular methods in cluster analysis. In this paper we have evaluated the socioeconomic developments of SHG in various districts in Kerala state using cluster analysis. The data were collected by field survey and interviews. The parameters considered for the study include the regularity of the members in attending meetings and training, social and economic benefits gained by the members in personal level, cluster level and society level, rate of employment and earning members in the family and literacy and educational level of SHG members.

  1. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2009-10-01

    Full Text Available Abstract Background The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

  2. Validity Index and number of clusters

    Directory of Open Access Journals (Sweden)

    Mohamed Fadhel Saad

    2012-01-01

    Full Text Available Clustering (or cluster analysis) has been used widely in pattern recognition, image processing, and data analysis. It aims to organize a collection of data items into c clusters, such that items within a cluster are more similar to each other than they are to items in the other clusters. The number of clusters c is the most important parameter, in the sense that the remaining parameters have less influence on the resulting partition. To determine the best number of classes, several methods, called validity indexes, have been proposed. This paper presents a new validity index for fuzzy clustering called the Modified Partition Coefficient And Exponential Separation (MPCAES) index. The efficiency of the proposed MPCAES index is compared with several popular validity indexes. More information about these indexes is obtained from a series of numerical comparisons and from the real Iris data set.
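
    To make the idea of a fuzzy validity index concrete, the sketch below computes two classical indexes: Bezdek's partition coefficient and Dave's modified partition coefficient. It does not implement the MPCAES index proposed in the paper, whose definition is not given in the abstract; the function names and membership matrices are hypothetical.

```python
def partition_coefficient(u):
    """Bezdek's partition coefficient.

    u[i][k] is the membership of item i in cluster k; each row sums to 1.
    The value ranges from 1/c (completely fuzzy) to 1 (crisp partition).
    """
    n = len(u)
    return sum(m * m for row in u for m in row) / n


def modified_partition_coefficient(u):
    """Dave's modified partition coefficient, rescaled to the range [0, 1]."""
    c = len(u[0])
    if c < 2:
        raise ValueError("at least two clusters are required")
    pc = partition_coefficient(u)
    return 1.0 - (c / (c - 1)) * (1.0 - pc)


# Hypothetical membership matrices for four items and two clusters.
crisp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
print(modified_partition_coefficient(crisp))
print(modified_partition_coefficient(fuzzy))
```

    In practice such an index is evaluated for several candidate values of c, and the value of c with the best score is taken as the number of clusters.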

  3. Clustering the lexicon in the brain: a meta-analysis of the neurofunctional evidence on noun and verb processing

    Science.gov (United States)

    Crepaldi, Davide; Berlingeri, Manuela; Cattinelli, Isabella; Borghese, Nunzio A.; Luzzatti, Claudio; Paulesu, Eraldo

    2013-01-01

    Although it is widely accepted that nouns and verbs are functionally independent linguistic entities, it is less clear whether their processing recruits different brain areas. This issue is particularly relevant for those theories of lexical semantics (and, more in general, of cognition) that suggest the embodiment of abstract concepts, i.e., based strongly on perceptual and motoric representations. This paper presents a formal meta-analysis of the neuroimaging evidence on noun and verb processing in order to address this dichotomy more effectively at the anatomical level. We used a hierarchical clustering algorithm that grouped fMRI/PET activation peaks solely on the basis of spatial proximity. Cluster specificity for grammatical class was then tested on the basis of the noun-verb distribution of the activation peaks included in each cluster. Thirty-two clusters were identified: three were associated with nouns across different tasks (in the right inferior temporal gyrus, the left angular gyrus, and the left inferior parietal gyrus); one with verbs across different tasks (in the posterior part of the right middle temporal gyrus); and three showed verb specificity in some tasks and noun specificity in others (in the left and right inferior frontal gyrus and the left insula). These results do not support the popular tenets that verb processing is predominantly based in the left frontal cortex and noun processing relies specifically on temporal regions; nor do they support the idea that verb lexical-semantic representations are heavily based on embodied motoric information. Our findings suggest instead that the cerebral circuits deputed to noun and verb processing lie in close spatial proximity in a wide network including frontal, parietal, and temporal regions. The data also indicate a predominant—but not exclusive—left lateralization of the network. PMID:23825451

  4. Clustering the Lexicon in the Brain: A Meta‑Analysis of the Neurofunctional Evidence on Noun and Verb Processing

    Directory of Open Access Journals (Sweden)

    Davide eCrepaldi

    2013-06-01

    Full Text Available Although it is widely accepted that nouns and verbs are functionally independent linguistic entities, it is less clear whether their processing recruits different brain areas. This issue is particularly relevant for those theories of lexical semantics (and, more in general, of cognition) that suggest the embodiment of abstract concepts, i.e., based strongly on perceptual and motoric representations. This paper presents a formal meta-analysis of the neuroimaging evidence on noun and verb processing in order to address this dichotomy more effectively at the anatomical level. We used a hierarchical clustering algorithm that grouped fMRI/PET activation peaks solely on the basis of spatial proximity. Cluster specificity for grammatical class was then tested on the basis of the noun-verb distribution of the activation peaks included in each cluster. 32 clusters were identified: three were associated with nouns across different tasks (in the right inferior temporal gyrus, the left angular gyrus, and the left inferior parietal gyrus); one with verbs across different tasks (in the posterior part of the right middle temporal gyrus); and three showed verb specificity in some tasks and noun specificity in others (in the left and right inferior frontal gyrus and the left insula). These results do not support the popular tenets that verb processing is predominantly based in the left frontal cortex and noun processing relies specifically on temporal regions; nor do they support the idea that verb lexical-semantic representations are heavily based on embodied motoric information. Our findings suggest instead that the cerebral circuits deputed to noun and verb processing lie in close spatial proximity in a wide network including frontal, parietal, and temporal regions. The data also indicate a predominant – but not exclusive – left lateralization of the network.

  5. Identifying differences in the experience of (in)authenticity: a latent class analysis approach.

    Science.gov (United States)

    Lenton, Alison P; Slabu, Letitia; Bruder, Martin; Sedikides, Constantine

    2014-01-01

    Generally, psychologists consider state authenticity - that is, the subjective sense of being one's true self - to be a unitary and unidimensional construct, such that (a) the phenomenological experience of authenticity is thought to be similar no matter its trigger, and (b) inauthenticity is thought to be simply the opposing pole (on the same underlying construct) of authenticity. Using latent class analysis, we put this conceptualization to a test. In order to avoid over-reliance on a Western conceptualization of authenticity, we used a cross-cultural sample (N = 543), comprising participants from Western, South-Asian, East-Asian, and South-East Asian cultures. Participants provided either a narrative in which they described when they felt most like being themselves or one in which they described when they felt least like being themselves. The analysis identified six distinct classes of experiences: two authenticity classes ("everyday" and "extraordinary"), three inauthenticity classes ("self-conscious," "deflated," and "extraordinary"), and a class representing convergence between authenticity and inauthenticity. The classes were phenomenologically distinct, especially with respect to negative affect, private and public self-consciousness, and self-esteem. Furthermore, relatively more interdependent cultures were less likely to report experiences of extraordinary (in)authenticity than relatively more independent cultures. Understanding the many facets of (in)authenticity may enable researchers to connect different findings and explain why the attainment of authenticity can be difficult.

  6. Provenience study of medieval Bulgarian glasses by NAA and cluster analysis

    International Nuclear Information System (INIS)

    The neutron activation analysis results from 30 glass samples were subjected to cluster analysis. The reliable localization of part of the medieval glass finds from Preslav enabled the evaluation of the variety of the production of a medieval glass workshop (ninth-tenth century), allowing conclusions to be made about the technological level of glass-making in Bulgaria during the Middle Ages. The work proved that NAA followed by cluster analysis is a successful approach to finding the local and chronological features of the investigated glasses. (author)

  7. ANALYSIS OF RISING TUITION RATES IN THE UNITED STATES BASED ON CLUSTERING ANALYSIS AND REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Long Cheng

    2016-05-01

    Higher education is one of the major driving forces of national development and social prosperity, and tuition plays a significant role in determining whether a person can afford it, so rising tuition is a topic of great concern today. It is therefore essential to understand which factors affect tuition and how they increase or decrease it. Many existing studies on rising tuition either lack large amounts of real data and proper quantitative models to support their conclusions, or focus on only a few factors that might affect tuition, and so fail to provide a comprehensive analysis. In this paper, we explore a wide variety of factors that might affect the tuition growth rate, using large amounts of authentic data and different quantitative methods such as clustering analysis and regression models.

  8. Displacement of Building Cluster Using Field Analysis Method

    Institute of Scientific and Technical Information of China (English)

    Al Tinghua

    2003-01-01

    This paper presents a field-based method for handling the displacement of building clusters driven by street widening. Compression of the street boundary produces a force that pushes the buildings inward, and the propagation of this force is a decay process. To describe this phenomenon, field theory is introduced with an isoline representation model. On the basis of the skeleton of the Delaunay triangulation, a displacement field is built in which the propagated force is related to the degree of adjacency to the street boundary. The study offers the computation of displacement direction and offset distance for the building displacement. The vector operation is performed on the basis of grade and other field concepts.

  9. Cluster analysis for the probability of DSB site induced by electron tracks

    International Nuclear Information System (INIS)

    To clarify the influence of bio-cells exposed to ionizing radiations, the densely populated pattern of the ionization in the cell nucleus is of importance because it governs the extent of DNA damage which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort out the events into the groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that this size of clustering events has a high possibility to cause lesions in DNA within the chromatin fiber site.

  10. Mapping informative clusters in a hierarchical [corrected] framework of FMRI multivariate analysis.

    Directory of Open Access Journals (Sweden)

    Rui Xu

    Pattern recognition methods, which are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states, have become increasingly popular in fMRI data analysis. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach was more robust in functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped substantially for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies.

  11. Cluster analysis for the probability of DSB site induced by electron tracks

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Y. [Biological Research, Education and Instrumentation Center, Sapporo Medical University, Sapporo 060-8556 (Japan); Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Sasaki, K. [Faculty of Health Sciences, Hokkaido University of Science, Sapporo 006-8585 (Japan); Matsuya, Y. [Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Date, H., E-mail: date@hs.hokudai.ac.jp [Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)

    2015-05-01

    To clarify the influence of bio-cells exposed to ionizing radiations, the densely populated pattern of the ionization in the cell nucleus is of importance because it governs the extent of DNA damage which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort out the events into the groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that this size of clustering events has a high possibility to cause lesions in DNA within the chromatin fiber site.
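
    The density rule described in this abstract (a cluster is grown from any event that has a minimum number of neighbors within a given radius) can be sketched in a few lines. This is a minimal pure-Python illustration of the generic DBSCAN procedure, not the authors' code; the names `dbscan`, `eps` and `min_pts` and the toy coordinates are assumptions made for the example.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:     # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical event positions (arbitrary units): two dense groups and one stray event.
events = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(events, eps=1.5, min_pts=3))   # [0, 0, 0, 0, 1, 1, 1, -1]
```

    The neighbor search here is a brute-force scan, which is adequate for a sketch but would normally be replaced by a spatial index for large simulated tracks.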

  12. Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis.

    Science.gov (United States)

    Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai

    2015-08-01

    This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction. PMID:24754648

  13. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh

    2015-06-01

    This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using wavelet transform, and are used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, the objective function reaches its maximum value, 11.58, with two centroid clusters and its minimum value, 8.06, with five centroid clusters. Because the objective function is lowest when the number of clusters is five, this is taken as the best clustering for the dataset.
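
    The comparison of objective values across different numbers of clusters can be illustrated with a small sketch. The code below is a plain Lloyd-style K-means in pure Python; the farthest-point initialisation, the reading of the objective as the mean point-to-centroid distance, and the toy data are all assumptions for illustration, since the abstract does not specify them.

```python
from math import dist

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:         # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

def objective(points, centroids, labels):
    """Mean distance from each point to its assigned centroid."""
    return sum(dist(p, centroids[l]) for p, l in zip(points, labels)) / len(points)

# Hypothetical feature vectors forming two well-separated groups.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in (1, 2, 3):
    cents, labs = kmeans(data, k)
    print(k, round(objective(data, cents, labs), 3))
```

    An objective of this kind generally keeps shrinking as the number of clusters grows, so it is often read together with an elbow or similar criterion rather than minimised on its own.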

  14. Event-by-event cluster analysis of final states from heavy ion collisions

    OpenAIRE

    Fialkowski, K.; Wit, R.

    1999-01-01

    We present an event-by-event analysis of the cluster structure of final multihadron states resulting from heavy ion collisions. A comparison of experimental data with the states obtained from Monte Carlo generators is shown. The analysis of the first available experimental events suggests that the method is suitable for selecting some different types of events.

  15. Standardized Effect Size Measures for Mediation Analysis in Cluster-Randomized Trials

    Science.gov (United States)

    Stapleton, Laura M.; Pituch, Keenan A.; Dion, Eric

    2015-01-01

    This article presents 3 standardized effect size measures to use when sharing results of an analysis of mediation of treatment effects for cluster-randomized trials. The authors discuss 3 examples of mediation analysis (upper-level mediation, cross-level mediation, and cross-level mediation with a contextual effect) with demonstration of the…

  16. Cluster analysis of flow cytometric list mode data on a personal computer

    NARCIS (Netherlands)

    Bakker Schut, Tom C.; Grooth, de Bart G.; Greve, Jan

    1993-01-01

    A cluster analysis algorithm, dedicated to analysis of flow cytometric data is described. The algorithm is written in Pascal and implemented on an MS-DOS personal computer. It uses k-means, initialized with a large number of seed points, followed by a modified nearest neighbor technique to reduce th

  17. Identifying patterns in treatment response profiles in acute bipolar mania: a cluster analysis approach

    Directory of Open Access Journals (Sweden)

    Houston John P

    2008-07-01

    Background: Patients with acute mania respond differentially to treatment and, in many cases, fail to obtain or sustain symptom remission. The objective of this exploratory analysis was to characterize response in bipolar disorder by identifying groups of patients with similar manic symptom response profiles. Methods: Patients (n = 222) were selected from a randomized, double-blind study of treatment with olanzapine or divalproex in bipolar I disorder, manic or mixed episode, with or without psychotic features. Hierarchical clustering based on Ward's distance was used to identify groups of patients based on Young-Mania Rating Scale (YMRS) total scores at each of 5 assessments over 7 weeks. Logistic regression was used to identify baseline predictors for clusters of interest. Results: Four distinct clusters of patients were identified: Cluster 1 (n = 64): patients did not maintain a response (YMRS total scores ≤ 12); Cluster 2 (n = 92): patients responded rapidly (within less than a week) and response was maintained; Cluster 3 (n = 36): patients responded rapidly but relapsed soon afterwards (YMRS ≥ 15); Cluster 4 (n = 30): patients responded slowly (≥ 2 weeks) and response was maintained. Predictive models using baseline variables found YMRS Item 10 (Appearance) and psychosis to be significant predictors for Clusters 1 and 4 vs. Clusters 2 and 3, but none of the baseline characteristics allowed discriminating between Clusters 1 vs. 4. Experiencing a mixed episode at baseline predicted membership in Clusters 2 and 3 vs. Clusters 1 and 4. Treatment with divalproex, larger number of previous manic episodes, lack of disruptive-aggressive behavior, and more prominent depressive symptoms at baseline were predictors for Cluster 3 vs. 2. Conclusion: Distinct treatment response profiles can be predicted by clinical features at baseline. The presence of these features as potential risk factors for relapse in patients who have responded to treatment

  18. O 18 Brumário e a análise de classe contemporânea [The Eighteenth Brumaire and the contemporary class analysis]

    Directory of Open Access Journals (Sweden)

    Renato Monseff Perissinotto

    2007-01-01

    This article considers The Eighteenth Brumaire of Louis Bonaparte a kind of summary that condenses all the difficulties inherent in the class analysis of politics. The article is divided into five parts. The first analyzes the passages of The Eighteenth Brumaire that state some fundamental propositions about the class analysis of politics; the second shows that contemporary Marxist literature has not solved the problems identified in relation to the Marxian propositions; the third and fourth discuss some perspectives alternative to Marxism (class-based and non-class-based); finally, by way of conclusion, the article offers some reflections on possible ways of operationalizing the class analysis of politics and on the problems to be faced in such cases.

  19. Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

    Directory of Open Access Journals (Sweden)

    Ma Jinhui

    2013-01-01

    Background: The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE)) and cluster-specific (i.e. random-effects logistic regression (RELR)) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses. Methods: In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI) and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability. Results: GEE performs well on all four measures, provided the downward bias of the standard error (when the number of clusters per arm is small) is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with variance inflation factor (VIF 50. RELR performs well only when a small amount of data was missing, and complete case analysis was applied. Conclusion: GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.

  20. Fault detection of flywheel system based on clustering and principal component analysis

    Directory of Open Access Journals (Sweden)

    Wang Rixin

    2015-12-01

    Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.

  1. Fault detection of flywheel system based on clustering and principal component analysis

    Institute of Scientific and Technical Information of China (English)

    Wang Rixin; Gong Xuebing; Xu Minqiang; Li Yuqing

    2015-01-01

    Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.

  2. A COMPARISON BETWEEN SINGLE LINKAGE AND COMPLETE LINKAGE IN AGGLOMERATIVE HIERARCHICAL CLUSTER ANALYSIS FOR IDENTIFYING TOURISTS SEGMENTS

    Directory of Open Access Journals (Sweden)

    Noor Rashidah Rashid

    2012-02-01

    Cluster Analysis is a multivariate method in statistics. Agglomerative Hierarchical Cluster Analysis is one of the approaches in Cluster Analysis. Two of its linkage methods are Single Linkage and Complete Linkage. The purpose of this study is to compare Single Linkage and Complete Linkage in Agglomerative Hierarchical Cluster Analysis. The performance of the two linkage methods was compared using the Kruskal-Wallis test. The result of the comparison was used for segmenting tourists of Kapas Island. The statistical software SPSS was used to analyze the data of this research. The result of the Kruskal-Wallis test shows that Complete Linkage is more useful in identifying tourist segments. Keywords: Agglomerative Hierarchical Cluster Analysis, Single Linkage, Complete Linkage, Kruskal-Wallis test, tourists
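
    The difference between the two linkage rules compared in this record can be seen on a toy example. The sketch below is a naive pure-Python agglomerative procedure written for illustration only (the study itself used SPSS); the function name `agglomerate` and the sample points are hypothetical. Single linkage measures the gap between two clusters by their closest pair of points, complete linkage by their farthest pair.

```python
from itertools import combinations
from math import dist

def agglomerate(points, n_clusters, linkage):
    """Merge the two closest clusters until n_clusters remain.

    Pass linkage=min for single linkage or linkage=max for complete linkage.
    Returns clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Inter-cluster distance under the chosen linkage rule.
        return linkage(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)

# Hypothetical points on a line with slowly widening gaps.
pts = [(0.0, 0), (1.0, 0), (2.1, 0), (3.3, 0), (4.6, 0), (6.0, 0)]
print(agglomerate(pts, 2, min))   # single linkage:   [[0, 1, 2, 3, 4], [5]]
print(agglomerate(pts, 2, max))   # complete linkage: [[0, 1, 2, 3], [4, 5]]
```

    On this chain-like data single linkage keeps absorbing the nearest neighbor and leaves one point isolated, while complete linkage produces more balanced groups, which is one common reason the two methods yield different segments.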

  3. Identification and structural analysis of a novel snoRNA gene cluster from Arabidopsis thaliana

    Institute of Scientific and Technical Information of China (English)

    周惠; 孟清; 屈良鹄

    2000-01-01

    A 22 snoRNA gene cluster, consisting of four antisense snoRNA genes, was identified from Arabidopsis thaliana. The sequence and structural analysis showed that the 22 snoRNA gene cluster might be transcribed as a polycistronic precursor from an upstream promoter, and the intergenic spacers of the gene cluster encode the 'hairpin' structures similar to the processing recognition signals of yeast Saccharomyces cerevisiae polycistronic snoRNA precursor. The results also revealed that plant snoRNA gene with multiple copies is a characteristic in common, and provides a good system for further revealing the transcription and expression mechanism of plant snoRNA gene cluster.

  4. Design and Analysis of SD_DWCA - A Mobility Based Clustering of Homogeneous MANETs

    Directory of Open Access Journals (Sweden)

    T.N. Janakiraman

    2011-05-01

    This paper deals with the design and analysis of the distributed weighted clustering algorithm SD_DWCA proposed for homogeneous mobile ad hoc networks. It is a connectivity, mobility and energy based clustering algorithm which is suitable for scalable ad hoc networks. The algorithm uses a new graph parameter called strong degree defined based on the quality of neighbours of a node. The parameters are so chosen to ensure high connectivity, cluster stability and energy efficient communication among nodes of high dynamic nature. This paper also includes the experimental results of the algorithm implemented using the network simulator NS2. The experimental results show that the algorithm is suitable for high speed networks and generate stable clusters with less maintenance overhead.

  5. PERFORMANCE EVALUATION OF CLUSTERING IN WEB-LOG ANALYSIS BASED ON AGENT

    Directory of Open Access Journals (Sweden)

    Himani

    2012-06-01

    Web mining is the use of data mining techniques to automatically discover and extract information from web documents. When a user searches for goods, the management agent receives the order from the graphical user interface. The management agent receives information, updates the agent information storehouse, and feeds the mining result back to the user. Intelligent agents can help make computer systems easier to use and enable finding and filtering information. The mining agent is the analytical center of the whole agent system. It mainly adopts two kinds of analytical method: related rule mining and cluster analysis. Clusters of objects are formed so that objects within a cluster have high similarity. The aim of this paper is to analyze web log data. To achieve this, a clustering tool is used. It works in two phases: first it captures the web-log data, then it analyzes the data and discovers the hidden pattern. Agents require an agent communication language to describe and process agent requests. The future internet will use PERL to encode information with meaningful structure and semantics.

  6. Functional Interference Clusters in Cancer Patients With Bone Metastases: A Secondary Analysis of RTOG 9714

    International Nuclear Information System (INIS)

    Purpose: To explore the relationships (clusters) among the functional interference items in the Brief Pain Inventory (BPI) in patients with bone metastases. Methods: Patients enrolled in the Radiation Therapy Oncology Group (RTOG) 9714 bone metastases study were eligible. Patients were assessed at baseline and 4, 8, and 12 weeks after randomization for the palliative radiotherapy with the BPI, which consists of seven functional items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. Principal component analysis with varimax rotation was used to determine the clusters between the functional items at baseline and the follow-up. Cronbach's alpha was used to determine the consistency and reliability of each cluster at baseline and follow-up. Results: There were 448 male and 461 female patients, with a median age of 67 years. There were two functional interference clusters at baseline, which accounted for 71% of the total variance. The first cluster (physical interference) included normal work and walking ability, which accounted for 58% of the total variance. The second cluster (psychosocial interference) included relations with others and sleep, which accounted for 13% of the total variance. The Cronbach's alpha statistics were 0.83 and 0.80, respectively. The functional clusters changed at week 12 in responders but persisted through week 12 in nonresponders. Conclusion: Palliative radiotherapy is effective in reducing bone pain. Functional interference component clusters exist in patients treated for bone metastases. These clusters changed over time in this study, possibly attributable to treatment. Further research is needed to examine these effects.
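
    The internal-consistency statistic reported for each cluster can be computed directly from item scores. The sketch below uses the standard formula for Cronbach's alpha, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$ and $\sigma_T^2$ the variance of the summed score. The scores shown are made-up numbers for illustration, not data from RTOG 9714.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k item-score lists, each with one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # summed score per respondent
    item_var = sum(variance(col) for col in items)     # sum of sample variances
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical scores of four patients on two interference items.
normal_work = [1, 2, 3, 4]
walking = [1, 2, 4, 3]
print(round(cronbach_alpha([normal_work, walking]), 3))   # 0.889
```

    The function assumes at least two items and a non-constant summed score; otherwise the formula is undefined.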

  7. Vertical Migrating and Cluster Analysis of Soil Mesofauna at Dongying Halophytes Garden in Yellow River Delta

    Institute of Scientific and Technical Information of China (English)

    He Fu-xia; Xie Tong-yin; Xie Gui-lin; Fu Rong-shu

    2014-01-01

    For the first time, we used the Tullgren method to study the vertical migration and cluster analysis of the soil mesofauna in Dongying Halophytes Garden in the Yellow River Delta (YRD), Shandong Province. The results showed that the soil mesofauna tended to gather on the soil surface in most samples at most times, but the vertical migration varied greatly with season and environmental conditions. Acari was the dominant group. The diversity index of the soil fauna was correlated with the evenness index. The number of Acari individuals affected other species and their numbers. The dominant group, Acari, made the greater contribution to the result of the cluster analysis, and cluster analysis with both the Bray-Curtis and the Jaccard similarity coefficients showed significant differences between communities in different habitats.
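
    The two similarity coefficients named in this abstract differ in what they use: Bray-Curtis compares abundances, while Jaccard compares only presence or absence. A minimal sketch using the standard definitions follows; the count vectors are invented for illustration and do not come from the study.

```python
def bray_curtis(a, b):
    """Bray-Curtis similarity between two abundance vectors (1 = identical)."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2 * shared / (sum(a) + sum(b))

def jaccard(a, b):
    """Jaccard similarity on presence/absence of each group."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    return len(present_a & present_b) / len(present_a | present_b)

# Hypothetical counts of four soil-fauna groups at two sampling sites.
site_1 = [10, 0, 5, 3]
site_2 = [6, 2, 0, 3]
print(round(bray_curtis(site_1, site_2), 3))   # 0.621
print(jaccard(site_1, site_2))                 # 0.5
```

    Because a dominant group contributes large abundances, it weighs heavily in Bray-Curtis but counts no more than any other group in Jaccard.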

  8. Preliminary Cluster Analysis For Several Representatives Of Genus Kerivoula (Chiroptera: Vespertilionidae) in Borneo

    Science.gov (United States)

    Hasan, Noor Haliza; Abdullah, M. T.

    2008-01-01

    The aim of the study is to use cluster analysis on morphometric parameters within the genus Kerivoula to produce a dendrogram and to determine the suitability of this method to describe the relationship among species within this genus. A total of 15 adult male individuals from genus Kerivoula taken from sampling trips around Borneo and specimens kept at the zoological museum of Universiti Malaysia Sarawak were examined. A total of 27 characters using dental, skull and external body measurements were recorded. Clustering analysis illustrated the grouping and morphometric relationships between the species of this genus. It has clearly separated each species from each other despite the overlapping of measurements of some species within the genus. Cluster analysis provides an alternative approach to make a preliminary identification of a species.

  9. Application of cluster analysis to preventive maintenance scheme design of pavement

    Institute of Scientific and Technical Information of China (English)

    ZENG Feng; ZHANG Xiao-ning

    2009-01-01

    To quantitatively identify the maintenance demand for each highway segment in pavement maintenance scheme design, a mathematical model of uniform segment division was established, and an approach applying cluster analysis theory to uniform segment division and to the evaluation of pavement maintenance demand was proposed. An actual highway maintenance project carried out in Guangdong province was cited as an example to demonstrate the validity of the proposed method. It is shown that cluster analysis can eliminate human factors in classification without being constrained by the number of samples, while considering multiple pavement distress indexes and the continuity of samples. Thus cluster analysis is an efficient analytical tool for uniform segment division and the evaluation of maintenance demand.

  10. Parallelization and scheduling of data intensive particle physics analysis jobs on clusters of PCs

    CERN Document Server

    Ponce, S

    2004-01-01

    Summary form only given. Scheduling policies are proposed for parallelizing data intensive particle physics analysis applications on computer clusters. Particle physics analysis jobs require the analysis of tens of thousands of particle collision events, each event typically requiring 200 ms of processing time and 600 KB of data. Many jobs are launched concurrently by a large number of physicists. At first glance, particle physics jobs seem easy to parallelize, since particle collision events can be processed independently of one another. However, since large amounts of data need to be accessed, the real challenge resides in making efficient use of the underlying computing resources. We propose several job parallelization and scheduling policies aiming at reducing job processing times and at increasing the sustainable load of a cluster server. Since particle collision events are usually reused by several jobs, cache based job splitting strategies considerably increase cluster utilization and reduce job ...

  11. Fuzzy C-means clustering for chromatographic fingerprints analysis: A gas chromatography-mass spectrometry case study.

    Science.gov (United States)

    Parastar, Hadi; Bazrafshan, Alisina

    2016-03-18

    Fuzzy C-means clustering (FCM) is proposed as a promising method for the clustering of chromatographic fingerprints of complex samples, such as essential oils. As an example, secondary metabolites of 14 citrus leaf samples are extracted and analyzed by gas chromatography-mass spectrometry (GC-MS). The obtained chromatographic fingerprints are divided into the desired number of chromatographic regions. Because chromatographic problems such as elution time shift and peak overlap can significantly affect the clustering results, each chromatographic region is analyzed using multivariate curve resolution-alternating least squares (MCR-ALS) to address these problems. Then, the resolved elution profiles are used to build a new data matrix, based on peak areas of pure components, to be clustered by FCM. The FCM clustering parameters (i.e., fuzziness coefficient and number of clusters) are optimized by two different methods: partial least squares (PLS) as a conventional method and minimization of the FCM objective function as our new idea. The results showed that minimization of the FCM objective function is an easier and better way to optimize FCM clustering parameters. Then, the optimized FCM clustering algorithm is used to cluster samples and variables to figure out the similarities and dissimilarities among samples and to find discriminant secondary metabolites in each cluster (chemotype). Finally, the FCM clustering results are compared with those of principal component analysis (PCA), hierarchical cluster analysis (HCA) and Kohonen maps. The results confirmed that FCM outperforms these frequently used clustering algorithms.
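
    The FCM step described above can be sketched in a few lines of plain Python. This is a minimal textbook version, not the authors' implementation: it omits the MCR-ALS preprocessing, the input rows stand in for hypothetical peak-area vectors, and the initial centers are chosen by hand. The `objective` function is the standard FCM objective whose value the abstract proposes to minimize when tuning the fuzziness coefficient `m` and the number of clusters.

```python
from math import dist


def memberships_for(points, centers, m):
    """Membership of each point in each cluster: proportional to d ** (-2 / (m - 1))."""
    eps = 1e-12  # avoids division by zero when a point coincides with a center
    rows = []
    for x in points:
        weights = [(dist(x, c) + eps) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates; returns final centers and memberships."""
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        # Each center is the mean of all points, weighted by membership ** m.
        centers = [
            tuple(
                sum(row[k] ** m * x[d] for row, x in zip(u, points))
                / sum(row[k] ** m for row in u)
                for d in range(len(points[0]))
            )
            for k in range(len(centers))
        ]
    return centers, memberships_for(points, centers, m)


def objective(points, centers, u, m=2.0):
    """Standard FCM objective: sum of membership ** m times squared distance."""
    return sum(
        u[i][k] ** m * dist(x, c) ** 2
        for i, x in enumerate(points)
        for k, c in enumerate(centers)
    )


# Hypothetical peak-area vectors for six samples (not data from the study).
samples = [(1.0, 0.9), (1.1, 1.0), (0.9, 1.1), (4.0, 4.1), (4.2, 3.9), (3.9, 4.0)]
centers, u = fuzzy_c_means(samples, [(0.0, 0.0), (5.0, 5.0)])
print(centers)
print(objective(samples, centers, u))
```

    Unlike hard clustering, every sample keeps a graded membership in every cluster, and each row of `u` sums to one.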

  12. A latent class analysis of parental bipolar disorder: Examining associations with offspring psychopathology.

    Science.gov (United States)

    Freed, Rachel D; Tompson, Martha C; Otto, Michael W; Nierenberg, Andrew A; Henin, Aude

    2015-12-15

    Bipolar disorder (BD) is highly heterogeneous, and course variations are associated with patient outcomes. This diagnostic complexity challenges identification of patients in greatest need of intervention. Additionally, course variations have implications for offspring risk. First, latent class analysis (LCA) categorized parents with BD based on salient illness characteristics: BD type, onset age, polarity of index episode, pole of majority of episodes, rapid cycling, psychosis, anxiety comorbidity, and substance dependence. Fit indices favored three parental classes with some substantively meaningful patterns. Two classes, labeled "Earlier-Onset Bipolar-I" (EO-I) and "Earlier-Onset Bipolar-II" (EO-II), comprised parents who had a mean onset age in mid-adolescence, with EO-I primarily BD-I parents and EO-II entirely BD-II parents. The third class, labeled "Later-Onset BD" (LO) had an average onset age in adulthood. Classes also varied on probability of anxiety comorbidity, substance dependence, psychosis, rapid cycling, and pole of majority of episodes. Second, we examined rates of disorders in offspring (ages 4-33, Mage=13.46) based on parental latent class membership. Differences emerged for offspring anxiety disorders only such that offspring of EO-I and EO-II parents had higher rates, compared to offspring of LO parents, particularly for daughters. Findings may enhance understanding of BD and its nosology. PMID:26394919

  13. CLUSTER ANALYSIS UNTUK MEMPREDIKSI TALENTA PEMAIN BASKET MENGGUNAKAN JARINGAN SARAF TIRUAN SELF ORGANIZING MAPS (SOM)

    Directory of Open Access Journals (Sweden)

    Gregorius Satia Budhi

    2008-01-01

    Full Text Available Basketball has grown rapidly over time, as shown by the many competitions and games held all over the world. As a result, there are many basketball players with different playing characteristics. A coach or scout is expected to find great players in order to build a solid team. This application helps a coach or scout with analysis and decision making. It uses the Self-Organizing Maps (SOM) algorithm for cluster analysis. Real NBA player data are used for the competitive learning (training) process, and real player data from Indonesian or Petra Christian University basketball players are used for the testing process. The NBA player data are prepared through a cleaning process and then transformed into a form that can be processed by the SOM algorithm. The data are then clustered with the SOM algorithm. The resulting clusters are displayed in a form that is easy to view and analyze, and the result can be saved to a text file. Using the output of the application, namely the clusters of NBA players, the user can see the statistics of each cluster. With these cluster statistics, a coach or scout can predict the statistics and the position of a test player who falls in the same cluster. This information can support the coach or scout in making a decision.

  14. Differentiating Procrastinators from Each Other: A Cluster Analysis.

    Science.gov (United States)

    Rozental, Alexander; Forsell, Erik; Svensson, Andreas; Forsström, David; Andersson, Gerhard; Carlbring, Per

    2015-01-01

    Procrastination refers to the tendency to postpone the initiation and completion of a given course of action. Approximately one-fifth of the adult population and half of the student population perceive themselves as being severe and chronic procrastinators. Albeit not a psychiatric diagnosis, procrastination has been shown to be associated with increased stress and anxiety, exacerbation of illness, and poorer performance in school and work. However, despite being severely debilitating, little is known about the population of procrastinators in terms of possible subgroups, and previous research has mainly investigated procrastination among university students. The current study examined data from a screening process recruiting participants to a randomized controlled trial of Internet-based cognitive behavior therapy for procrastination (Rozental et al., in press). In total, 710 treatment-seeking individuals completed self-report measures of procrastination, depression, anxiety, and quality of life. The results suggest that there might exist five separate subgroups, or clusters, of procrastinators: "Mild procrastinators" (24.93%), "Average procrastinators" (27.89%), "Well-adjusted procrastinators" (13.94%), "Severe procrastinators" (21.69%), and "Primarily depressed" (11.55%). Hence, there seem to be marked differences among procrastinators in terms of levels of severity, as well as a possible subgroup for which procrastinatory problems are primarily related to depression. Tailoring the treatment interventions to the specific procrastination profile of the individual could thus become important, as well as screening for comorbid psychiatric diagnoses in order to target difficulties associated with, for instance, depression. PMID:26178164

  15. Analysis of cardiac tissue by gold cluster ion bombardment

    Science.gov (United States)

    Aranyosiova, M.; Chorvatova, A.; Chorvat, D.; Biro, Cs.; Velic, D.

    2006-07-01

    Specific molecules in cardiac tissue of spontaneously hypertensive rats are studied by using time-of-flight secondary ion mass spectrometry (TOF-SIMS). The investigation determines phospholipids, cholesterol, fatty acids and their fragments in the cardiac tissue, with special focus on cardiolipin. Cardiolipin is a unique phospholipid typical for cardiomyocyte mitochondrial membrane and its decrease is involved in pathologic conditions. In the positive polarity, the fragments of phosphatidylcholine are observed in the mass region of 700-850 u. Peaks over mass 1400 u correspond to intact and cationized molecules of cardiolipin. In animal tissue, cardiolipin contains almost exclusively 18-carbon fatty acids, mostly linoleic acid. Linoleic acid at 279 u, other fatty acids, and phosphatidylglycerol fragments, as precursors of cardiolipin synthesis, are identified in the negative polarity. These data demonstrate that the SIMS technique along with an Au3+ cluster primary ion beam is a good tool for detection of higher mass biomolecules, providing approximately 10 times higher yield in comparison with Au+.

  16. Remote sensing clustering analysis based on object-based interval modeling

    Science.gov (United States)

    He, Hui; Liang, Tianheng; Hu, Dan; Yu, Xianchuan

    2016-09-01

    In object-based clustering, image data are segmented into objects (groups of pixels) and then clustered based on the objects' features. This method can be used to automatically classify high-resolution, remote sensing images, but requires accurate descriptions of object features. In this paper, we ascertain that interval-valued data model is appropriate for describing clustering prototype features. With this in mind, we developed an object-based interval modeling method for high-resolution, multiband, remote sensing data. We also designed an adaptive interval-valued fuzzy clustering method. We ran experiments utilizing images from the SPOT-5 satellite sensor, for the Pearl River Delta region and Beijing. The results indicate that the proposed algorithm considers both the anisotropy of the remote sensing data and the ambiguity of objects. Additionally, we present a new dissimilarity measure for interval vectors, which better separates the interval vectors generated by features of the segmentation units (objects). This approach effectively limits classification errors caused by spectral mixing between classes. Compared with the object-based unsupervised classification method proposed earlier, the proposed algorithm improves the classification accuracy without increasing computational complexity.

  17. Latent Class Analysis of DSM-5 Alcohol Use Disorder Criteria Among Heavy-Drinking College Students.

    Science.gov (United States)

    Rinker, Dipali Venkataraman; Neighbors, Clayton

    2015-10-01

    The DSM-5 has created significant changes in the definition of alcohol use disorders (AUDs). Limited work has considered the impact of these changes in specific populations, such as heavy-drinking college students. Latent class analysis (LCA) is a person-centered approach that divides a population into mutually exclusive and exhaustive latent classes, based on observable indicator variables. The present research was designed to examine whether there were distinct classes of heavy-drinking college students who met DSM-5 criteria for an AUD and whether gender, perceived social norms, use of protective behavioral strategies (PBS), drinking refusal self-efficacy (DRSE), self-perceptions of drinking identity, psychological distress, and membership in a fraternity/sorority would be associated with class membership. Three-hundred and ninety-four college students who met DSM-5 criteria for an AUD were recruited from three different universities. Two distinct classes emerged: Less Severe (86%), the majority of whom endorsed both drinking more than intended and tolerance, as well as met criteria for a mild AUD; and More Severe (14%), the majority of whom endorsed at least half of the DSM-5 AUD criteria and met criteria for a severe AUD. Relative to the Less Severe class, membership in the More Severe class was negatively associated with DRSE and positively associated with self-identification as a drinker. There is a distinct class of heavy-drinking college students with a more severe AUD and for whom intervention content needs to be more focused and tailored. Clinical implications are discussed.
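
    To make the idea of mutually exclusive and exhaustive latent classes concrete, here is a minimal expectation-maximization sketch for a latent class model with binary indicators. It is not the software or model specification used in the study: the four indicators, the response patterns, and the starting values are all hypothetical, and a real analysis would also compare fit indices across different numbers of classes.

```python
def latent_class_em(data, prevalence, item_probs, iterations=100):
    """EM for a latent class model with binary (0/1) indicators.

    data: list of 0/1 response vectors
    prevalence: starting class proportions
    item_probs: item_probs[c][j] = starting P(indicator j endorsed | class c)
    """
    n_classes, n_items = len(prevalence), len(data[0])
    for _ in range(iterations):
        # E-step: posterior probability of each class for every respondent.
        posteriors = []
        for row in data:
            joint = []
            for c in range(n_classes):
                p = prevalence[c]
                for j, y in enumerate(row):
                    p *= item_probs[c][j] if y else 1.0 - item_probs[c][j]
                joint.append(p)
            total = sum(joint)
            posteriors.append([p / total for p in joint])
        # M-step: update class proportions and endorsement probabilities.
        sizes = [sum(post[c] for post in posteriors) for c in range(n_classes)]
        prevalence = [s / len(data) for s in sizes]
        item_probs = [
            [
                sum(post[c] * row[j] for post, row in zip(posteriors, data)) / sizes[c]
                for j in range(n_items)
            ]
            for c in range(n_classes)
        ]
    # The posteriors come from the last E-step.
    return prevalence, item_probs, posteriors


# Hypothetical endorsement patterns on four criteria (not data from the study).
responses = (
    [[1, 1, 0, 0]] * 6 + [[1, 0, 0, 0]] * 3 + [[0, 1, 0, 0]] * 2
    + [[1, 1, 1, 1]] * 3 + [[1, 1, 1, 0]] + [[1, 1, 0, 1]]
)
start_prevalence = [0.5, 0.5]
start_item_probs = [[0.6, 0.6, 0.2, 0.2], [0.8, 0.8, 0.7, 0.7]]

prevalence, item_probs, posteriors = latent_class_em(responses, start_prevalence, start_item_probs)
print([round(p, 2) for p in prevalence])
```

    Each respondent receives a posterior probability for every class. Assigning each person to the most probable class gives a partition of the sample of the kind reported above as "Less Severe" and "More Severe".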

  18. The Swift X-ray Telescope Cluster Survey II. X-ray spectral analysis

    OpenAIRE

    P. Tozzi (INAF, Osservatorio Astrofisico di Firenze); A. Moretti (Fermilab, Batavia, IL, USA); Tundo, E.; Liu, T.; Rosati, P.; Borgani, S.; Tagliaferri, G.; S. Campana; Fugazza, D.; Avanzo, P. D.

    2014-01-01

    (Abridged) We present a spectral analysis of a new, flux-limited sample of 72 X-ray selected clusters of galaxies identified with the X-ray Telescope (XRT) on board the Swift satellite down to a flux limit of ~10^-14 erg/s/cm^2 (SWXCS, Tundo et al. 2012). We carry out a detailed X-ray spectral analysis with the twofold aim of measuring redshifts and characterizing the properties of the Intra-Cluster Medium (ICM). Optical counterparts and spectroscopic or photometric redshifts ...

  19. Alcohol Use as Risk Factors for Older Adults’ Emergency Department Visits: A Latent Class Analysis

    Directory of Open Access Journals (Sweden)

    Namkee G. Choi, PhD

    2015-12-01

    Full Text Available Introduction: Late middle-aged and older adults’ share of emergency department (ED) visits is increasing more than that of other age groups. ED visits by individuals with substance-related problems are also increasing. This paper was intended to identify subgroups of individuals aged 50+ by their risk for ED visits by examining their health/mental health status and alcohol use patterns. Methods: Data came from the 2013 National Health Interview Survey’s Sample Adult file (n=15,713). Following descriptive analysis of sample characteristics by alcohol use patterns, latent class analysis (LCA) modeling was fit using alcohol use pattern (lifetime abstainers, ex-drinkers, current infrequent/light/moderate drinkers, and current heavy drinkers), chronic health and mental health status, and past-year ED visits as indicators. Results: LCA identified a four-class model. All members of Class 1 (35% of the sample; lowest-risk group) were infrequent/light/moderate drinkers and exhibited the lowest probabilities of chronic health/mental health problems; Class 2 (21%; low-risk group) consisted entirely of lifetime abstainers and, despite being the oldest group, exhibited low probabilities of health/mental health problems; Class 3 (37%; moderate-risk group) was evenly divided between ex-drinkers and heavy drinkers; and Class 4 (7%; high-risk group) included all four groups of drinkers but more ex-drinkers. In addition, Class 4 had the highest probabilities of chronic health/mental problems, unhealthy behaviors, and repeat ED visits, with the highest proportion of Blacks and the lowest proportions of college graduates and employed persons, indicating significant roles of these risk factors. Conclusion: Alcohol nonuse/use (and quantity of use) and chronic health conditions are significant contributors to varying levels of ED visit risk. Clinicians need to help heavy-drinking older adults reduce unhealthy alcohol consumption and help both heavy drinkers and ex

  20. Likelihood analysis of spatial capture-recapture models for stratified or class structured populations

    Science.gov (United States)

    Royle, J. Andrew; Sutherland, Christopher S.; Fuller, Angela K.; Sun, Catherine C.

    2015-01-01

    We develop a likelihood analysis framework for fitting spatial capture-recapture (SCR) models to data collected on class structured or stratified populations. Our interest is motivated by the necessity of accommodating the problem of missing observations of individual class membership. This is particularly problematic in SCR data arising from DNA analysis of scat, hair or other material, which frequently yields individual identity but fails to identify the sex. Moreover, this can represent a large fraction of the data and, given the typically small sample sizes of many capture-recapture studies based on DNA information, utilization of the data with missing sex information is necessary. We develop the class structured likelihood for the case of missing covariate values, and then we address the scaling of the likelihood so that models with and without class structured parameters can be formally compared regardless of missing values. We apply our class structured model to black bear data collected in New York in which sex could be determined for only 62 of 169 uniquely identified individuals. The models containing sex-specificity of both the intercept of the SCR encounter probability model and the distance coefficient, and including a behavioral response are strongly favored by log-likelihood. Estimated population sex ratio is strongly influenced by sex structure in model parameters illustrating the importance of rigorous modeling of sex differences in capture-recapture models.

  1. A first passage problem and its applications to the analysis of a class of stochastic models

    Directory of Open Access Journals (Sweden)

    Lev Abolnikov

    1992-01-01

    Full Text Available A problem of the first passage of a cumulative random process with generally distributed discrete or continuous increments over a fixed level is considered in the article as an essential part of the analysis of a class of stochastic models (bulk queueing systems, inventory control and dam models).
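
    The first passage quantity itself is easy to state in code. The sketch below is only an illustration of the definition, not the article's analysis: given one realized sequence of increments, it returns the first index at which the running sum exceeds a fixed level, together with the value reached and the overshoot. Reading "over a fixed level" as strictly greater than the level is an assumption, and the increments are made-up numbers.

```python
from itertools import accumulate


def first_passage(increments, level):
    """Return (n, total, overshoot) for the first n whose running sum exceeds `level`.

    Returns None if the level is never exceeded by the given increments.
    """
    for n, total in enumerate(accumulate(increments), start=1):
        if total > level:
            return n, total, total - level
    return None


# Hypothetical batch sizes arriving at a bulk queue, with a capacity level of 5.
print(first_passage([2, 3, 1, 4], level=5))
```

    In a bulk queueing reading, the index is the number of arriving batches needed before the accumulated work passes the threshold, and the overshoot is the excess carried over at that moment.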

  2. Class separation of buildings with high and low prevalence of SBS by principal component analysis

    DEFF Research Database (Denmark)

    Pommer, L.; Fick, J.; Andersson, B.;

    2002-01-01

    This method was able to separate buildings with high and low prevalence of SBS into two different classes using principal component analysis (PCA). Data from the Northern Swedish Office Illness Study describing the presence and level of chemical compounds in outdoor, supply and room air, respective...

  3. Analysis of Apprenticeship Training from the National Longitudinal Study of the High School Class of 1972.

    Science.gov (United States)

    Cook, Robert F.; And Others

    A study investigated effects of on-the-job or "hands-on" vocational training relative to standard classroom vocational instruction on subsequent employment, earnings, wages, and job satisfaction. The data used were from the National Longitudinal Study of the High School Class of 1972 and five follow-up surveys of this population. An analysis of…

  4. Diagnostic Performance Tests for Suspected Scaphoid Fractures Differ with Conventional and Latent Class Analysis

    NARCIS (Netherlands)

    G.A. Buijze; W.H. Mallee; F.J.P. Beeres; T.E. Hanson; W.O. Johnson; D. Ring

    2011-01-01

    Evaluation of the diagnostic performance characteristics of radiographic tests for diagnosing a true fracture among suspected scaphoid fractures is hindered by the lack of a consensus reference standard. Latent class analysis is a statistical method that takes advantage of unobserved, or latent, cla

  5. Prestige, Centrality, and Learning: A Social Network Analysis of an Online Class

    Science.gov (United States)

    Russo, Tracy C.; Koesten, Joy

    2005-01-01

    This study explored relations between social network characteristics in an online graduate class and two learning outcomes: affective and cognitive learning. The social network analysis data were compiled by entering the number of one-to-one postings sent by each student to each other student in a course web site discussion space into a specially…

  6. The Oropharyngeal Airway in Young Adults with Skeletal Class II and Class III Deformities: A 3-D Morphometric Analysis.

    Directory of Open Access Journals (Sweden)

    Yasas Shri Nalaka Jayaratne

    Full Text Available (1) To determine the accuracy and reliability of automated anthropometric measurement software for the oropharyngeal airway and (2) to compare the anthropometric dimensions of the oropharyngeal airway in skeletal class II and III deformity patients. Cone-beam CT (CBCT) scans of 62 patients with skeletal class II or III deformities were used for this study. Volumetric, linear and surface area measurements of the retroglossal (RG) and retropalatal (RP) compartments of the oropharyngeal airway were made with the 3dMDVultus software. Accuracy of automated anthropometric pharyngeal airway measurements was assessed using an airway phantom. The software was found to be reasonably accurate for measuring dimensions of air passages. The total oropharyngeal volume was significantly greater in the skeletal class III deformity group (16.7 ± 9.04 mm3) compared with class II subjects (11.87 ± 4.01 mm3). The average surface area of both the RG and RP compartments was significantly larger in the class III deformity group. The most constricted area in the RG and RP airway was significantly larger in individuals with skeletal class III deformity. The anterior-posterior (AP) length of this constriction was significantly greater in skeletal class III individuals in both compartments, whereas the width of the constriction was not significantly different between the two groups in both compartments. The RP compartment was larger but less uniform than the RG compartment in both skeletal deformities. Significant differences were observed in morphological characteristics of the oropharyngeal airway in individuals with skeletal class II and III deformities. This information may be valuable for surgeons in orthognathic treatment planning, especially for mandibular setback surgery that might compromise the oropharyngeal patency.

  7. Comprehensive behavioral analysis of cluster of differentiation 47 knockout mice.

    Directory of Open Access Journals (Sweden)

    Hisatsugu Koshimizu

    Full Text Available Cluster of differentiation 47 (CD47) is a member of the immunoglobulin superfamily which functions as a ligand for the extracellular region of signal regulatory protein α (SIRPα), a protein which is abundantly expressed in the brain. Previous studies, including ours, have demonstrated that both CD47 and SIRPα fulfill various functions in the central nervous system (CNS), such as the modulation of synaptic transmission and neuronal cell survival. We previously reported that CD47 is involved in the regulation of depression-like behavior of mice in the forced swim test through its modulation of tyrosine phosphorylation of SIRPα. However, other potential behavioral functions of CD47 remain largely unknown. In this study, in an effort to further investigate functional roles of CD47 in the CNS, CD47 knockout (KO) mice and their wild-type littermates were subjected to a battery of behavioral tests. CD47 KO mice displayed decreased prepulse inhibition, while the startle response did not differ between genotypes. The mutants exhibited slightly but significantly decreased sociability and social novelty preference in Crawley's three-chamber social approach test, whereas in social interaction tests in which experimental and stimulus mice have direct contact with each other in a freely moving setting in a novel environment or home cage, there were no significant differences between the genotypes. While previous studies suggested that CD47 regulates fear memory in the inhibitory avoidance test in rodents, our CD47 KO mice exhibited normal fear and spatial memory in the fear conditioning and the Barnes maze tests, respectively. These findings suggest that CD47 is potentially involved in the regulation of sensorimotor gating and social behavior in mice.

  8. Analyzing Developing Country Market Integration with Incomplete Price Data Using Cluster Analysis

    OpenAIRE

    Ansah, I.G.; Gardebroek, C.; Ihle, R.; Jaletac, M.

    2014-01-01

    Recent global food price developments have spurred renewed interest in analyzing integration of local markets to global markets. A popular approach to quantify market integration is cointegration analysis. However, local market price data often has missing values, outliers, or short and incomplete series, making cointegration analysis impossible. Instead of imputing missing data, this paper proposes cluster analysis as an alternative methodological approach for analyzing market integration. I...

  9. Arabic Text Summarization Based on Latent Semantic Analysis to Enhance Arabic Documents Clustering

    Directory of Open Access Journals (Sweden)

    Hanane Froud

    2013-02-01

    Full Text Available Arabic document clustering is an important task for obtaining good results with traditional Information Retrieval (IR) systems, especially given the rapid growth in the number of online documents in Arabic. Document clustering aims to automatically group similar documents into one cluster using different similarity/distance measures. This task is often affected by document length: useful information in the documents is often accompanied by a large amount of noise, so it is necessary to eliminate this noise while keeping the useful information in order to boost clustering performance. In this paper, we propose to evaluate the impact of text summarization using the Latent Semantic Analysis model on Arabic document clustering in order to solve the problems cited above, using five similarity/distance measures: Euclidean Distance, Cosine Similarity, Jaccard Coefficient, Pearson Correlation Coefficient and Averaged Kullback-Leibler Divergence, in two settings: without and with stemming. Our experimental results indicate that the proposed approach effectively solves the problems of noisy information and document length, and thus significantly improves clustering performance.
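
    The five measures named in the abstract can be sketched for a pair of term-weight vectors as follows. The abstract gives no formulas, so these are common textbook forms: the Jaccard coefficient is written in its extended (Tanimoto) form for real-valued vectors, and the averaged Kullback-Leibler divergence is taken here as the equally weighted mean of each distribution's divergence from their midpoint, which is an assumption and may differ from the weighting used in the paper. The example vectors are hypothetical.

```python
from math import log, sqrt


def euclidean(p, q):
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q)))


def jaccard(p, q):
    # Extended (Tanimoto) form: shared weight divided by total weight minus shared weight.
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sum(a * a for a in p) + sum(b * b for b in q) - dot)


def pearson(p, q):
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    return cov / sqrt(sum((a - mean_p) ** 2 for a in p) * sum((b - mean_q) ** 2 for b in q))


def kl(p, q):
    # Kullback-Leibler divergence; terms with zero probability in p contribute nothing.
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)


def averaged_kl(p, q):
    # Assumed form: mean divergence of p and q from their midpoint (symmetric, always finite).
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


# Hypothetical normalized term weights for two short documents.
doc_a = [0.5, 0.3, 0.2, 0.0]
doc_b = [0.4, 0.1, 0.2, 0.3]
for measure in (euclidean, cosine_similarity, jaccard, pearson, averaged_kl):
    print(measure.__name__, round(measure(doc_a, doc_b), 4))
```

    Euclidean distance and averaged KL divergence are dissimilarities (zero for identical documents), while cosine, Jaccard and Pearson are similarities (one for identical documents), so a clustering routine has to convert between the two conventions consistently.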

  10. ARABIC TEXT SUMMARIZATION BASED ON LATENT SEMANTIC ANALYSIS TO ENHANCE ARABIC DOCUMENTS CLUSTERING

    Directory of Open Access Journals (Sweden)

    Hanane Froud

    2013-01-01

    Full Text Available Arabic document clustering is an important task for obtaining good results with traditional Information Retrieval (IR) systems, especially given the rapid growth in the number of online documents in Arabic. Document clustering aims to automatically group similar documents into one cluster using different similarity/distance measures. This task is often affected by document length: useful information in the documents is often accompanied by a large amount of noise, so it is necessary to eliminate this noise while keeping the useful information in order to boost clustering performance. In this paper, we propose to evaluate the impact of text summarization using the Latent Semantic Analysis model on Arabic document clustering in order to solve the problems cited above, using five similarity/distance measures: Euclidean Distance, Cosine Similarity, Jaccard Coefficient, Pearson Correlation Coefficient and Averaged Kullback-Leibler Divergence, in two settings: without and with stemming. Our experimental results indicate that the proposed approach effectively solves the problems of noisy information and document length, and thus significantly improves clustering performance.

  11. X-ray view on a Class using Conceptual Analysis in Java Environment

    Directory of Open Access Journals (Sweden)

    Gulshan Kumar

    2011-09-01

    Full Text Available Modularity is one of the most important principles in software engineering and a necessity for any practical software system. Since the design space of software is generally quite large, it is valuable to provide automatic means to help modularize it. One automatic technique for software modularization is object-oriented concept analysis (OOCA). The X-ray view of a class is one aspect of object-oriented concept analysis. We apply this concept in a Java environment.

  12. Clustering of the Parameters of Rhythmographic Analysis of Man’s Electrocardiogram

    Directory of Open Access Journals (Sweden)

    Ekaterina A. Filippova

    2014-12-01

    Full Text Available The article considers the clustering of the parameters of man’s heart rate variability. The technique for calculating the parameters and constructing the diagrams of rhythmographic analysis is presented. The conceptual clustering algorithm Cobweb, modified for quantitative data, is used for clustering the parameters. The experimental results demonstrate the efficiency of dividing the training set of electrocardiograms into groups that are similar in terms of rhythmographic parameters. Applying the proposed method as part of software for electrocardiogram analysis will enable prompt evaluation of the rhythmographic nature of heart function during screening examinations or in emergency medicine, for diagnosis and prediction.

  13. Cluster analysis in kinetic modelling of the brain: A noninvasive alternative to arterial sampling

    DEFF Research Database (Denmark)

    Liptrot, Matthew George; Adams, K.H.; Martiny, L.;

    2004-01-01

    In emission tomography, quantification of brain tracer uptake, metabolism or binding requires knowledge of the cerebral input function. Traditionally, this is achieved with arterial blood sampling. We propose a noninvasive alternative via the use of a blood vessel time-activity curve (TAC......) extracted directly from dynamic positron emission tomography (PET) scans by cluster analysis. Five healthy subjects were injected with the 5HT2A- receptor ligand [18F]-altanserin and blood samples were subsequently taken from the radial artery and cubital vein. Eight regions-of-interest (ROI) TACs were...... by the 'within-variance' measure and by 3D visual inspection of the homogeneity of the determined clusters. The cluster-determined input curve was then used in Logan plot analysis and compared with the arterial and venous blood samples, and additionally with one of the currently used alternatives to arterial...

  14. Evaluation of Portland cement from X-ray diffraction associated with cluster analysis

    International Nuclear Information System (INIS)

    The Brazilian cement industry produced 64 million tons of cement in 2012, with noteworthy contribution of CP-II (slag), CP-III (blast furnace) and CP-IV (pozzolanic) cements. The industrial pole comprises about 80 factories that utilize raw materials of different origins and chemical compositions that require enhanced analytical technologies to optimize production in order to gain space in the growing consumer market in Brazil. This paper assesses the sensitivity of mineralogical analysis by X-ray diffraction associated with cluster analysis to distinguish different kinds of cements with different additions. This technique can be applied, for example, in the prospection of different types of limestone (calcitic, dolomitic and siliceous) as well as in the qualification of different clinkers. The cluster analysis does not require any specific knowledge of the mineralogical composition of the diffractograms to be clustered; rather, it is based on their similarity. The materials tested for addition have different origins: fly ashes from different power stations from South Brazil and slag from different steel plants in the Southeast. Cement with different additions of limestone and white Portland cement were also used. The Rietveld method of qualitative and quantitative analysis was used for measuring the results generated by the cluster analysis technique. (author)

  15. Analysis of the class E amplifier used as electronic ballast with dimming capability for photovoltaic applications

    Energy Technology Data Exchange (ETDEWEB)

    Ponce, M.; Arau, J. [CENIDET-Electro' , Cuernavaca (Mexico); Alonso, J.M.; Rico-Secades, M. [ATE, Universidad de Oviedo, Campus de Viesques s/n Gijon (Spain)

    2001-07-01

    The analysis and design of a dimmable electronic ballast based on the class E amplifier and fed from solar cells with 12V backup batteries is described. The class E amplifier uses a capacitive impedance inverter as resonant tank and one diode antiparallel with the switch; these elements allow implementation of a dimming feature for the ballast and ignition of the lamp while maintaining zero voltage commutations in the switch. The designed electronic ballast drives a 21W lamp and operates at a switching frequency of 370kHz. Dimming is implemented using an SG3524 in a voltage-controlled oscillator fashion. (Author)

  16. FEATURE-MODEL-BASED COMMONALITY AND VARIABILITY ANALYSIS FOR VIRTUAL CLUSTER DISK PROVISIONING

    Directory of Open Access Journals (Sweden)

    Nayun Cho

    2016-01-01

    Full Text Available The rapid growth of networking and storage capacity allows massive amounts of data to be collected and analyzed by relying increasingly on scalable, flexible, and on-demand provisioned large-scale computing resources. Virtualization is one feasible solution for providing large amounts of computational power with dynamic provisioning of the underlying computing resources. Typically, distributed scientific applications for analyzing data run on cluster nodes to perform the same task in parallel. However, on-demand virtual disk provisioning for a set of virtual machines, called a virtual cluster, is not a trivial task. This paper presents a feature-model-based commonality and variability analysis system for virtual cluster disk provisioning that categorizes the types of virtual disks to be provisioned. We also present a case study that analyzes common and variant software features between two different subgroups of a big data processing virtual cluster. Using the analysis system, the virtual disk creation process can be accelerated by reducing duplicate software installation on the virtual disks that need to be provisioned in the same virtual cluster.

  17. Performance Analysis of a Cluster-Based MAC Protocol for Wireless Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Jesús Alonso-Zárate

    2010-01-01

    Full Text Available An analytical model to evaluate the non-saturated performance of the Distributed Queuing Medium Access Control Protocol for Ad Hoc Networks (DQMAN) in single-hop networks is presented in this paper. DQMAN comprises a spontaneous, temporary, and dynamic clustering mechanism integrated with a near-optimum distributed queuing Medium Access Control (MAC) protocol. Clustering is executed in a distributed manner using a mechanism inspired by the Distributed Coordination Function (DCF) of IEEE 802.11. Once a station seizes the channel, it becomes the temporary clusterhead of a spontaneous cluster and coordinates the peer-to-peer communications between the cluster members. Within each cluster, a near-optimum distributed queuing MAC protocol is executed. The theoretical performance analysis of DQMAN in single-hop networks under non-saturation conditions is presented in this paper. The approach integrates the analysis of the clustering mechanism into the MAC layer model. To the best of the authors' knowledge, this approach is novel in the literature. In addition, the performance of an ad hoc network using DQMAN is compared with that obtained using the DCF of IEEE 802.11 as a benchmark reference.

  18. Profitability and efficiency of Italian utilities: cluster analysis of financial statement ratios

    International Nuclear Information System (INIS)

    The last ten years have witnessed conspicuous changes in European and Italian regulation of public utility services and in the strategies of the major players in these fields. In response to these changes Italian utilities have made a variety of choices regarding size, presence in more or less capital-intensive stages of different value chains, and diversification. These choices have been implemented both through internal growth and by means of mergers and acquisitions. In this context it is interesting to try to establish whether there is a nexus between these choices and the performance of Italian utilities in terms of profitability and efficiency. Therefore statistical multivariate analysis techniques (cluster analysis and factor analysis) have been applied to several ratios obtained from the 2005 financial statement of 34 utilities. First, a hierarchical cluster analysis method has been applied to financial statement data in order to identify homogeneous groups based on several indicators of the incidence of costs (external costs, personnel costs, depreciation and amortization), profitability (return on sales, return on assets, return on equity) and efficiency (in the utilization of personnel, of total assets, of property, plant and equipment). Five clusters have been found. Then the clusters have been characterized in terms of the aforementioned indicators, the presence in different stages of the energy value chains (electricity and gas) and other descriptive variables (such as turnover, number of employees, assets, percentage of property, plant and equipment on total assets, sales revenues from electricity, gas, water supply and sanitation, waste collection and treatment and other services). In a second round cluster analysis has been preceded by factor analysis, in order to find a smaller set of variables. 
    This procedure has revealed three not directly observable factors that can be interpreted as follows: i) efficiency in ordinary and financial management

  19. [The craniofacial architecture of class III malocclusion using the Coben analysis].

    Science.gov (United States)

    Vallée-Cussac, V

    1991-01-01

    In this study, longitudinal tracings of a group with dental and skeletal Class III malocclusion are compared to tracings of the COBEN analysis standard values. Cephalometric measurements and superimpositions illustrate the dynamic variations of Class III craniofacial architecture for two age ranges: 8 years +/- 1 year and 16 years +/- 1 year. In children aged 8 years +/- 1 year, the Class III pathology is characterized by alterations in the size and position of the tracings, with excessive length of the craniofacial components and rotation of the cranial base into a more vertical position. At the age of 16 years +/- 1 year, after growth, a deficiency in the rate of growth in length, with variable individual adaptation, is shown for the cranial structures except the mandible.

  20. Is there a nonadherent subtype of hypertensive patient? A latent class analysis approach

    Directory of Open Access Journals (Sweden)

    Ranak B Trivedi

    2010-07-01

    Full Text Available Ranak B Trivedi (1), Brian J Ayotte (2), Carolyn T Thorpe (3), David Edelman (4), Hayden B Bosworth (5). (1) Northwest Health Services Research and Development Service Center of Excellence, VA Puget Sound Health Care System, Seattle, Washington; (2) Boston VA Health Care System, Boston, Massachusetts; (3) Department of Population Health Sciences, University of Wisconsin, Madison, Wisconsin; (4) Department of Medicine, Duke University Medical Center, Durham, North Carolina; (5) Research Professor, Department of Medicine, Duke University Medical Center, Durham, North Carolina, USA

    Abstract: To determine subtypes of adherence, 636 hypertensive patients (48% White, 34% male) reported adherence to medications, diet, exercise, smoking, and home blood pressure monitoring. A latent class analysis approach was used to identify subgroups that adhere to these five self-management behaviors. Fit statistics suggested two latent classes. The first class (labeled “more adherent”) included patients with greater probability of adhering to recommendations compared with the second class (labeled “less adherent”) with regard to nonsmoking (97.7% versus 76.3%), medications (75.5% versus 49.5%), diet (70.7% versus 46.9%), exercise (63.4% versus 27.2%), and blood pressure monitoring (32% versus 3.4%). Logistic regression analyses used to characterize the two classes showed that “more adherent” participants were more likely to report full-time employment, adequate income, and better emotional and physical well-being. Results suggest the presence of a less adherent subtype of hypertensive patients. Behavioral interventions designed to improve adherence might best target these at-risk patients for greater treatment efficiency.

    Keywords: adherence, hypertension, latent class analysis, self-management
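
As an illustration of how the two latent classes in this record could be used, the sketch below computes the posterior probability that a patient belongs to the "more adherent" class from a pattern of five yes/no answers. The class-conditional probabilities are the percentages quoted in the abstract; the class prevalence of 0.5 and the function names are assumptions made for the example, since the abstract reports neither, and the calculation relies on the usual latent class assumption of local independence.

```python
# Class-conditional adherence probabilities quoted in the abstract, in the order
# nonsmoking, medications, diet, exercise, home blood pressure monitoring.
MORE_ADHERENT = [0.977, 0.755, 0.707, 0.634, 0.32]
LESS_ADHERENT = [0.763, 0.495, 0.469, 0.272, 0.034]


def likelihood(responses, probs):
    """P(response pattern | class), assuming local independence of the items."""
    result = 1.0
    for adheres, p in zip(responses, probs):
        result *= p if adheres else 1.0 - p
    return result


def posterior_more_adherent(responses, prevalence=0.5):
    """Posterior probability of the "more adherent" class via Bayes' rule.

    prevalence is an ASSUMED class share; the abstract does not report it.
    """
    more = prevalence * likelihood(responses, MORE_ADHERENT)
    less = (1.0 - prevalence) * likelihood(responses, LESS_ADHERENT)
    return more / (more + less)
```

Under these assumptions, a patient who adheres to all five behaviors is assigned to the "more adherent" class with high probability, and a patient who adheres to none is assigned to the "less adherent" class.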

  1. Improved Detection of Time Windows of Brain Responses in fMRI Using Modified Temporal Clustering Analysis

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Temporal clustering analysis (TCA) has been proposed recently as a method to detect time windows of brain responses in functional MRI (fMRI) studies when the timing and location of the activation are completely unknown. Modifications to the TCA technique are introduced in this report to further improve the sensitivity in detecting brain activation.

  2. Validation of an ANN Flow Prediction Model Using a Multi-Station Cluster Analysis

    NARCIS (Netherlands)

    Demirel, M.C.; Booij, M.J.; Kahya, E.

    2012-01-01

    The objective of this study is to validate a flow prediction model for a hydrometric station using a multistation criterion in addition to standard single-station performance criteria. In this contribution we used cluster analysis to identify the regional flow height, i.e., water-level patterns and

  3. Classification of shoulder complaints in general practice by means of cluster analysis

    NARCIS (Netherlands)

    Winters, JC; Groenier, KH; Sobel, JS; Arendzen, HH; Meyboom-de Jong, B

    1997-01-01

    Objective: To determine if a classification of shoulder complaints in general practice can be made with a cluster analysis of variables of medical history and physical examination. Method: One hundred one patients with shoulder complaints were examined upon inclusion (week 0) and after 2 weeks. Elev

  4. Spectral analysis of A and F dwarf members of the open cluster M6: preliminary results

    Science.gov (United States)

    Kılıçoǧlu, T.; Monier, R.; Fossati, L.

    2010-12-01

    We present the first abundance analysis of CD-32 13109 (NGC 6405 47), member of the M6 open cluster. The photospheric abundances of 14 chemical elements were determined by comparing synthetic spectra and observed spectra of the star. Findings show that this star should be an Am star.

  5. Cluster Analysis of Assessment in Anatomy and Physiology for Health Science Undergraduates

    Science.gov (United States)

    Brown, Stephen; White, Sue; Power, Nicola

    2016-01-01

    Academic content common to health science programs is often taught to a mixed group of students; however, content assessment may be consistent for each discipline. This study used a retrospective cluster analysis on such a group, first to identify high and low achieving students, and second, to determine the distribution of students within…

  6. Addressing preference heterogeneity in public health policy by combining Cluster Analysis and Multi-Criteria Decision Analysis

    DEFF Research Database (Denmark)

    Kaltoft, Mette Kjer; Turner, Robin; Cunich, Michelle;

    2015-01-01

    of the population in relation to the importance assigned to relevant criteria. It involves combining Cluster Analysis (CA), to generate the subgroup sets of preferences, with Multi-Criteria Decision Analysis (MCDA), to provide the policy framework into which the clustered preferences are entered. We employ three...... preferences as valid determinants of public policy, a transparent analytical procedure is needed. In this proof of method study we show how public preferences could be incorporated into policy decisions in a way that respects both the multi-criterial nature of those decisions, and the heterogeneity...

  7. Cluster Analysis of Polyphenols and Organic Acids in 11 Different Brand Cigarette Samples at Home and Abroad

    Institute of Scientific and Technical Information of China (English)

    Lan MI; Bilong DAI; Yu QIN; Wenjun ZHANG; Zhen XIONG; Yanhong WANG; Ting ZHU

    2015-01-01

    The objective of this research was to investigate the differences between local and foreign cigarettes and to provide a basis for improving cigarette quality. Different kinds of polyphenols and organic acids in 11 different brand cigarette samples from home and abroad were classified by cluster analysis. The results indicated that the 11 samples could be classified into 2 classes. Suyan, Furongwang, Chinese, Baisha, Dihao, Yunyan and Hongtashan belonged to type 1; the foreign cigarettes, represented by Marlboro, Blue pacific and Brazil cigarettes, belonged to type 2. The content of malic acid and citric acid was higher in type 1 than in type 2, the content of malonic acid was higher in type 2, and there was no difference in polyphenol content between type 1 and type 2. In conclusion, the content of malic acid and citric acid in Chinese cigarettes was higher than in foreign cigarettes, but the content of malonic acid was lower. There was no difference in polyphenol content between Chinese and foreign cigarettes.

  8. Unraveling the dha cluster in Citrobacter werkmanii: comparative genomic analysis of bacterial 1,3-propanediol biosynthesis clusters.

    Science.gov (United States)

    Maervoet, Veerle E T; De Maeseneire, Sofie L; Soetaert, Wim K; De Mey, Marjan

    2014-04-01

    In natural 1,3-propanediol (PDO) producing microorganisms such as Klebsiella pneumoniae, Citrobacter freundii and Clostridium sp., the genes coding for PDO producing enzymes are grouped in a dha cluster. This article describes the dha cluster of a novel candidate for PDO production, Citrobacter werkmanii DSM17579 and compares the cluster to the currently known PDO clusters of Enterobacteriaceae and Clostridiaceae. Moreover, we attribute a putative function to two previously unannotated ORFs, OrfW and OrfY, both in C. freundii and in C. werkmanii: both proteins might form a complex and support the glycerol dehydratase by converting cob(I)alamin to the glycerol dehydratase cofactor coenzyme B12. Unraveling this biosynthesis cluster revealed high homology between the deduced amino acid sequence of the open reading frames of C. werkmanii DSM17579 and those of C. freundii DSM30040 and K. pneumoniae MGH78578, i.e., 96 and 87.5 % identity, respectively. On the other hand, major differences between the clusters have also been discovered. For example, only one dihydroxyacetone kinase (DHAK) is present in the dha cluster of C. werkmanii DSM17579, while two DHAK enzymes are present in the cluster of K. pneumoniae MGH78578 and Clostridium butyricum VPI1718.

  9. Cluster analysis of breeding values for milk yield and lactation persistency in Guzerá cattle

    OpenAIRE

    Diego Augusto Campos da Cruz; Rodrigo Pelicioni Savegnago; Annaíza Braga Bignardi Santana; Maria Gabriela Campolina Diniz Peixoto; Frank Angelo Tomita Bruneli; Lenira El Faro

    2016-01-01

    ABSTRACT: The aim of this study was to explore the pattern of genetic lactation curves of Guzerá cattle using cluster analysis. Test-day milk yields of 5,274 first-lactation Guzerá cows were recorded in a progeny test. A total of 34,193 monthly records were analyzed with a random regression animal model using Legendre polynomials to fit additive genetic and permanent environmental random effects and mean trends. Hierarchical and non-hierarchical cluster analyses were performed based on the EB...

  10. Independent Component Analysis to Detect Clustered Microcalcification Breast Cancers

    Directory of Open Access Journals (Sweden)

    R. Gallardo-Caballero

    2012-01-01

    current reproducible studies on the same mammogram set. This proposal is mainly based on the use of extracted image features obtained by independent component analysis, but we also study the inclusion of the patient’s age as a nonimage feature which requires no human expertise. Our system achieves an average of 2.55 false positives per image at a sensitivity of 81.8% and 4.45 at a sensitivity of 91.8% in diagnosing the BCRP_CALC_1 subset of DDSM.

  11. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice

    2012-01-01

    ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. CONCLUSIONS: This study...

  12. Degradation Assessment and Fault Diagnosis for Roller Bearing Based on AR Model and Fuzzy Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Lingli Jiang

    2011-01-01

    Full Text Available This paper proposes a new approach combining an autoregressive (AR) model and fuzzy cluster analysis for bearing fault diagnosis and degradation assessment. The AR model is an effective approach for extracting fault features, but it is generally applied to stationary signals, whereas the fault vibration signals of a roller bearing are non-stationary and non-Gaussian. To address this problem, the parameters of the AR model are estimated based on higher-order cumulants. The AR parameters are then taken as the feature vectors, and fuzzy cluster analysis is applied to perform classification and pattern recognition. Experimental results show that the proposed method can identify various types and severities of bearing faults. This study is significant for non-stationary and non-Gaussian signal analysis, fault diagnosis and degradation assessment.
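
The abstract does not say which fuzzy clustering algorithm was used. Assuming fuzzy c-means, one of the most common choices, the sketch below shows how a feature vector (standing in for a vector of AR parameters) would be given a degree of membership in each cluster. The function name, the fuzzifier value m = 2 and the idea of fixed cluster centers are assumptions for illustration, not details taken from the paper.

```python
import math


def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of feature vector x in each cluster center.

    m > 1 is the fuzzifier; m = 2 is a common default, assumed here.
    """
    dists = [math.dist(x, c) for c in centers]
    # A vector lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # Standard update: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)).
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in dists)
        for d_i in dists
    ]
```

The memberships always sum to one, so a feature vector lying between two fault classes receives a graded assignment and is not forced into a single class, which suits degradation assessment where severity changes gradually.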

  13. Class-first analysis in a continuum: an approach to the complexities of schools, society, and insurgent science

    Science.gov (United States)

    Valdiviezo, Laura Alicia

    2010-06-01

    This essay addresses Katherine Richardson Bruna's paper, Mexican Immigrant Transnational Social Capital and Class Transformation: Examining the Role of Peer Mediation in Insurgent Science, through five main points. First, I offer a comparison between the traditional analysis of classism in Latin America and Richardson Bruna's call for a class-first analysis in the North American social sciences, where there has been a tendency to obviate the specific examination of class relations and class issues. Second, I argue that a class-first analysis alone cannot suffice to depict the complex dimensions in the relations of schools and society. Thus, I suggest a continuum in the class-first analysis. Third, I argue that social constructions surrounding issues of language, ethnicity, and gender necessarily intersect with issues of class and that, in fact, those other constructions offer compatible epistemologies that aid in representing the complexity of social and institutional practices in capitalist society. Richardson Bruna's analysis of Augusto's interactions with his teacher and peers in the science class provides a fourth point of discussion. As a final point, I discuss Richardson Bruna's idea of making class-first analysis knowledge accessible to educators and especially to science teachers.

  14. Deep observations of the Super-CLASS super-cluster at 325 MHz with the GMRT: the low-frequency source catalogue

    CERN Document Server

    Riseley, C J; Hales, C A; Harrison, I; Birkinshaw, M; Battye, R A; Beswick, R J; Brown, M L; Casey, C M; Chapman, S C; Demetroullas, C; Hung, C -L; Jackson, N J; Muxlow, T; Watson, B

    2016-01-01

    We present the results of 325 MHz GMRT observations of a super-cluster field, known to contain five Abell clusters at redshift $z \sim 0.2$. We achieve a nominal sensitivity of $34\,\mu$Jy beam$^{-1}$ toward the phase centre. We compile a catalogue of 3257 sources with flux densities in the range $183\,\mu\rm{Jy}\,-\,1.5\,\rm{Jy}$ within the entire $\sim 6.5$ square degree field of view. Subsequently, we use available survey data at other frequencies to derive the spectral index distribution for a sub-sample of these sources, recovering two distinct populations -- a dominant population which exhibit spectral index trends typical of steep-spectrum synchrotron emission, and a smaller population of sources with typically flat or rising spectra. We identify a number of sources with ultra-steep spectra or rising spectra for further analysis, finding two candidate high-redshift radio galaxies and three gigahertz-peaked-spectrum radio sources. Finally, we derive the Euclidean-normalised differential source counts us...
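
For readers unfamiliar with the quantity, a two-point spectral index can be computed from flux densities at two frequencies as sketched below. The sign convention (flux density proportional to frequency raised to the power alpha, so steep-spectrum sources have negative alpha) and the function name are assumptions; the abstract does not state which convention or which fitting method the authors used.

```python
import math


def spectral_index(s1, nu1, s2, nu2):
    """Two-point spectral index alpha, ASSUMING S is proportional to nu**alpha.

    s1 and s2 are flux densities in the same units; nu1 and nu2 are the
    corresponding frequencies in the same units.
    """
    return math.log(s1 / s2) / math.log(nu1 / nu2)
```

With this convention, a hypothetical source that is four times brighter at 100 MHz than at 400 MHz has an index of about -1, while a source that brightens toward higher frequency has a positive index (a rising spectrum).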

  15. Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis

    OpenAIRE

    Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M. Margaret; Butte, Atul J; Manley, Geoffrey T.

    2010-01-01

    Introduction Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enabl...

  16. Market segmentation for multiple option healthcare delivery systems--an application of cluster analysis.

    Science.gov (United States)

    Jarboe, G R; Gates, R H; McDaniel, C D

    1990-01-01

    Healthcare providers of multiple option plans may be confronted with special market segmentation problems. This study demonstrates how cluster analysis may be used for discovering distinct patterns of preference for multiple option plans. The availability of metric, as opposed to categorical or ordinal, data provides the ability to use sophisticated analysis techniques which may be superior to frequency distributions and cross-tabulations in revealing preference patterns.

  17. Consumer Acceptance of Genetically Modified Foods in South Korea: Factor and Cluster Analysis

    OpenAIRE

    Onyango, Benjamin M.; Govindasamy, Ramu; Hallman, William K.; Jang, Ho-Min; Puduri, Venkata S.

    2006-01-01

    This study extends biotechnology discourse to cover South Korea in the Asian sub-continent showing a marked difference in perceptions between traditional and GM foods. Factor analysis suggests South Koreans may treat foods that are locally produced and those with no artificial flavors or colorings preferentially to GM foods. Additionally, South Koreans have concerns about perceived risks related to biotechnology, and, given a choice, they may pay more to avoid GM foods. Cluster analysis resul...

  18. Competitiveness Analysis of Processing Industry Cluster of Livestock Products in Inner Mongolia Based on "Diamond Model"

    Institute of Scientific and Technical Information of China (English)

    YANG Xing-long; REN Ya-tong

    2012-01-01

    Using Michael Porter’s "diamond model", based on regional development characteristics, we conduct analysis of the competitiveness of processing industry cluster of livestock products in Inner Mongolia from six aspects (the factor conditions, demand conditions, corporate strategy, structure and competition, related and supporting industries, government and opportunities). And we put forward the following rational recommendations for improving the competitiveness of processing industry cluster of livestock products in Inner Mongolia: (i) The government should increase capital input, focus on supporting processing industry of livestock products, and give play to the guidance and aggregation effect of financial funds; (ii) In terms of enterprises, it is necessary to vigorously develop leading enterprises, to give full play to the cluster effect of the leading enterprises.

  19. Automation of Large-scale Computer Cluster Monitoring Information Analysis

    Science.gov (United States)

    Magradze, Erekle; Nadal, Jordi; Quadt, Arnulf; Kawamura, Gen; Musheghyan, Haykuhi

    2015-12-01

    High-throughput computing platforms consist of a complex infrastructure and provide a number of services that are prone to failures. To mitigate the impact of failures on the quality of the provided services, constant monitoring and timely reaction are required, which is impossible without automating the system administration processes. This paper introduces a way of automating the analysis of monitoring information to provide long- and short-term predictions of the service response time (SRT) for mass storage and batch systems and to identify the status of a service at a given time. The approach for the SRT predictions is based on the Adaptive Neuro Fuzzy Inference System (ANFIS). The approaches are evaluated on real monitoring data from the WLCG Tier 2 center GoeGrid. Ten-fold cross-validation results demonstrate high efficiency of both approaches in comparison to known methods.
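
The ten-fold cross-validation mentioned in this record can be sketched as an index split like the one below. Contiguous, unshuffled folds are an assumption chosen because monitoring data are time-ordered; the abstract does not describe how the folds were formed, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

Each sample lands in exactly one test fold, so a model trained on the remaining nine folds is always scored on data it has not seen.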

  20. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo

    2014-01-01

    Full Text Available To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members of a miRNA group always had varying expression levels, and some even showed larger expression divergence. Despite the dynamic expression and individual differences, these miRNAs always showed consistent or similar deregulation patterns. This consistent deregulation may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families play important biological roles, and their specific distribution and expression further enrich and ensure a flexible and robust regulatory network.

  1. Dynamical analysis of NGC 110: cluster of fainter stars or data fluctuation?

    CERN Document Server

    Joshi, Gireesh C

    2016-01-01

    The stellar enhancement of the cluster NGC 110 is investigated in various optical and infrared (IR) bands. The radial density profile in the IR does not show a stellar enhancement in the central region of the cluster. This stellar deficiency may be caused by fainter stars going undetected because of contamination by massive stars. Since our analysis does not indicate a stellar enhancement below 16.5 mag in the I band, the cluster is assumed to be a group of fainter stars. The proposed magnitude scatter factor would be an excellent tool for understanding the colour scattering of stars. The most probable members do not coincide with the model isochrone fit in the optical bands, owing to the poor data quality of the PPMXL catalogue. Different values of the mean proper motion are found for the fainter stars of the cluster and field regions, whereas similar values are obtained for the radial zones of the cluster. The symmetrical distribution of fainter stars of the core are found aro...

  2. Links between patterns of racial socialization and discrimination experiences and psychological adjustment: a cluster analysis.

    Science.gov (United States)

    Ajayi, Alex A; Syed, Moin

    2014-10-01

    This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust its members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications on psychological adjustment. PMID:25124381

  3. Chemical abundance analysis of the old, rich open cluster Trumpler 20

    CERN Document Server

    Carraro, Giovanni; Monaco, Lorenzo; Beccari, Giacomo; Ahumada, Javier; Boffin, Henri

    2014-01-01

    Trumpler 20 is an open cluster located at low Galactic longitude, just beyond the great Carina spiral arm, and whose metallicity and fundamental parameters were very poorly known until now. As it is most likely a rare example of an old, rich open cluster -- possibly a twin of NGC 7789 -- it is useful to characterize it. To this end, we determine here the abundance of several elements and their ratios in a sample of stars in the clump of Trumpler 20. The primary goal is to measure Trumpler 20 metallicity, so far very poorly constrained, and revise the cluster's fundamental parameters. We present high-resolution spectroscopy of eight clump stars. Based on their radial velocities, we identify six bona fide cluster members, and for five of them (the sixth being a fast rotator) we perform a detailed abundance analysis. We find that Trumpler 20 is slightly more metal-rich than the Sun, having [Fe/H]=+0.09$\pm$0.10. The abundance ratios of alpha-elements are generally solar. In line with recent studies of clusters a...

  4. A clustering analysis of eddies' spatial distribution in the South China Sea

    Directory of Open Access Journals (Sweden)

    J. Yi

    2012-11-01

    Full Text Available Spatial variation is important for studying the mesoscale eddies in the South China Sea (SCS). To investigate such spatial variations, this study made a clustering analysis of the eddies' distribution using the K-means approach. Results showed that the clustering tendency of anticyclonic eddies (AEs) and cyclonic eddies (CEs) was weak but not random, and the number of clusters was shown to be greater than four. Finer clustering results showed 10 regions where AEs are densely populated and 6 regions for CEs in the SCS. Previous studies confirmed these partitions, and possible generation mechanisms were related. Comparisons between AEs and CEs revealed that patterns of AEs are relatively more aggregated than those of CEs, and specific distinctions were summarized: (1) to the southwest of Luzon Island, AEs and CEs are generated spatially apart; AEs are likely located north of 14° N and closer to shore, while CEs are to the south and further offshore; (2) the Central SCS and Nansha Trough are mostly dominated by AEs; (3) along 112° E, clusters of AEs and CEs are located sequentially apart, and the pair off Vietnam represents the dipole eddies; (4) to the southwest of Dongsha Islands, AEs are concentrated to the east of CEs. Overlaps of AEs and CEs in the northeastern and Southern SCS were further examined considering seasonal variations. The northeastern overlap represented near-concentric distributions, while the southern one was a mixed effect of seasonal variations, complex circulations and topography influences.

  5. A clustering analysis of eddies' spatial distribution in the South China Sea

    Directory of Open Access Journals (Sweden)

    J. Yi

    2013-02-01

    Spatial variation is important for studying the mesoscale eddies in the South China Sea (SCS). To investigate such spatial variations, this study performed a clustering analysis of the eddies' distribution using the K-means approach. Results showed that the clustering tendency of anticyclonic eddies (AEs) and cyclonic eddies (CEs) was weak but not random, and the number of clusters was shown to be greater than four. Finer clustering results showed 10 regions where AEs were densely populated and 6 such regions for CEs in the SCS. Previous studies confirmed these partitions and possible generation mechanisms were related. Comparisons between AEs and CEs revealed that patterns of AEs are relatively more aggregated than those of CEs, and specific distinctions were summarized: (1) To the southwest of Luzon Island, AEs and CEs are generated spatially apart; AEs are likely located north of 14° N and closer to shore, while CEs are to the south and further offshore. (2) The central SCS and Nansha Trough are mostly dominated by AEs. (3) Along 112° E, clusters of AEs and CEs are located sequentially apart, and the pairs off Vietnam represent the dipole structures. (4) To the southwest of the Dongsha Islands, AEs are concentrated to the east of CEs. Overlaps of AEs and CEs in the northeastern and southern SCS were further examined considering seasonal variations. The northeastern overlap represented near-concentric distributions while the southern one was a mixed effect of seasonal variations, complex circulations and topography influences.

  6. Latent class analysis of lifestyle characteristics and health risk behaviors among college youth.

    Science.gov (United States)

    Laska, Melissa Nelson; Pasch, Keryn E; Lust, Katherine; Story, Mary; Ehlinger, Ed

    2009-12-01

    Few studies have examined the context of a wide range of risk behaviors among emerging adults (ages 18-25 years), approximately half of whom in the USA enroll in post-secondary educational institutions. The objective of this research was to examine behavioral patterning in weight behaviors (diet and physical activity), substance use, sexual behavior, stress, and sleep among undergraduate students. Health survey data were collected among undergraduates attending a large, public US university (n = 2,026). Latent class analysis was used to identify homogeneous, mutually exclusive "classes" (patterns) of ten leading risk behaviors. Resulting classes differed for males and females. Female classes were defined as: (1) poor lifestyle (diet, physical activity, sleep), yet low-risk behaviors (e.g., smoking, binge drinking, sexual risk, drunk driving; 40.0% of females), (2) high risk (high substance use, intoxicated sex, drunk driving, poor diet, inadequate sleep) (24.3%), (3) moderate lifestyle, few risk behaviors (20.4%), (4) "health conscious" (favorable diet/physical activity with some unhealthy weight control; 15.4%). Male classes were: (1) poor lifestyle, low risk (with notably high stress, insufficient sleep, 9.2% of males), (2) high risk (33.6% of males, similar to class 2 in females), (3) moderate lifestyle, low risk (51.0%), and (4) "classic jocks" (high physical activity, binge drinking, 6.2%). To our knowledge, this is among the first research to examine complex lifestyle patterning among college youth, particularly with emphasis on the role of weight-related behaviors. These findings have important implications for targeting much needed health promotion strategies among emerging adults and college youth. PMID:19499339

  7. Validation of hierarchical cluster analysis for identification of bacterial species using 42 bacterial isolates

    Science.gov (United States)

    Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.

    2015-03-01

    Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates by Gram status and by genus and species. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and subsequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.
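
The two external evaluation criteria named in this abstract, purity and the Rand index, can be illustrated with a minimal Python sketch. The toy labels below are hypothetical and are not data from the study; the function names are likewise chosen here for illustration only.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority class of their cluster."""
    members = {}
    for label, cid in zip(true_labels, cluster_ids):
        members.setdefault(cid, []).append(label)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_ids):
    """Fraction of sample pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (cluster_ids[i] == cluster_ids[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: six isolates from two genera, grouped into two clusters.
genus = ["A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]

print(purity(genus, clusters))      # 5 of 6 samples match their cluster majority
print(rand_index(genus, clusters))  # 10 of 15 pairs are treated consistently
```

Both measures lie between 0 and 1, with 1 meaning that the clustering reproduces the reference taxonomy exactly.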

  8. Assessing Consistency of Consumer Confidence Data using Dynamic Latent Class Analysis

    OpenAIRE

    Sunil Kumar; Zakir Husain; Diganta Mukherjee

    2015-01-01

    In many countries, information on expectations collected through consumer confidence surveys is used in macroeconomic policy formulation. Unfortunately, the consistency of responses is often not taken into account beforehand, allowing biases to creep in and affect the reliability of the resulting indices. This paper describes how latent class analysis may be used to check the consistency of responses and ensure a parsimonious questionnaire. In particular, we examine how tempora...

  9. Patterns and predictors of violence against children in Uganda: a latent class analysis

    OpenAIRE

    Clarke, K.; Patalay, P; Van Allen, E; Knight, L; Naker, D.; DeVries, K.

    2016-01-01

    Objective: To explore patterns of physical, emotional and sexual violence against Ugandan children. Design: Latent class and multinomial logistic regression analysis of cross-sectional data. Setting: Luwero District, Uganda. Participants: In all, 3706 primary 5, 6 and 7 students attending 42 primary schools. Main outcome and measure: To measure violence, we used the International Society for the Prevention of Child Abuse and Neglect Child Abuse Screening Tool—Child Institutional. We used the Stren...

  10. Identifying differences in the experience of (in)authenticity: a latent class analysis approach

    OpenAIRE

    Alison P. Lenton; Slabu, Letitia; Bruder, Martin; Sedikides, Constantine

    2014-01-01

    Generally, psychologists consider state authenticity – that is, the subjective sense of being one’s true self – to be a unitary and unidimensional construct, such that (a) the phenomenological experience of authenticity is thought to be similar no matter its trigger, and (b) inauthenticity is thought to be simply the opposing pole (on the same underlying construct) of authenticity. Using latent class analysis, we put this conceptualization to a test. In order to avoid over-reliance on a Weste...

  11. A tensor analysis to evaluate the effect of high-pull headgear on Class II malocclusions.

    Science.gov (United States)

    Ngan, P; Scheick, J; Florman, M

    1993-03-01

    The inaccuracies inherent in cephalometric analysis of treatment effects are well known. The objective of this article is to present a more reliable research tool in the analysis of cephalometric data. Bookstein introduced a dilation function by means of a homogeneous deformation tensor as a method of describing changes in cephalometric data. His article gave an analytic description of the deformation tensor that permits the rapid and highly accurate calculation of it on a desktop computer. The first part of this article describes the underlying ideas and mathematics. The second part uses the tensor analysis to analyze the cephalometric results of a group of patients treated with high-pull activator (HPA) to demonstrate the application of this research tool. Eight patients with Class II skeletal open bite malocclusions in the mixed dentition were treated with HPA. A control sample consisting of eight untreated children with Class II who were obtained from The Ohio State University Growth Study was used as a comparison group. Lateral cephalograms taken before and at the completion of treatment were traced, digitized, and analyzed with the conventional method and tensor analysis. The results showed that HPA had little or no effect on maxillary skeletal structures. However, reduction in growth rate was found with the skeletal triangle S-N-A, indicating a posterior tipping and torquing of the maxillary incisors. The treatment also induced additional deformation on the mandible in a downward and slightly forward direction. Together with the results from the conventional cephalometric analysis, HPA seemed to provide the vertical and rotational control of the maxilla during orthopedic Class II treatment by inhibiting the downward and forward eruptive path of the upper posterior teeth. The newly designed computer software permits rapid analysis of cephalometric data with the tensor analysis on a desktop computer. This tool may be useful in analyzing growth changes for

  12. Profiles of Community Violence Exposure Among African American Youth: An Examination of Desensitization to Violence Using Latent Class Analysis.

    Science.gov (United States)

    Gaylord-Harden, Noni K; Dickson, Daniel; Pierre, Cynthia

    2016-07-01

    The current study employed latent class analysis (LCA) to identify distinct profiles of community violence exposure and their associations to desensitization outcomes in 241 African American early adolescents (M age = 12.86, SD = 1.28) in the sixth through eighth grade from under-resourced urban communities. Participants self-reported on their exposure to community violence, as well as on depressive and anxiety symptoms. The LCA revealed three distinct classes: a class exposed to low levels of violence (low exposure class), a class exposed to moderately high levels of victimization (victimization class), and a class exposed to high levels of all types of violence (high exposure class). Consistent with predictions, the high exposure class showed the lowest levels of depressive symptoms, suggesting a desensitization outcome. Gender and age were also examined in relation to the classes, and age was significantly associated with an increased risk of being a member of the high exposure class relative to the low exposure class. Using person-based analyses to examine desensitization outcomes provides useful information for prevention and intervention efforts, as it helps to identify a specific subgroup of youth that may be more likely to show desensitization outcomes in the context of community violence. PMID:25716195

  13. CLUSTERING ANALYSIS OF OFFICER'S BEHAVIOURS IN LONDON POLICE FOOT PATROL ACTIVITIES

    Directory of Open Access Journals (Sweden)

    J. Shen

    2015-07-01

    In this short paper we present a framework for the conceptual representation and clustering analysis of police officers’ patrol patterns obtained by mining their raw movement trajectory data. This has been achieved with a model developed to account for the spatio-temporal dynamics of human movements by incorporating both the behaviour features of the travellers and the semantic meaning of the environment they are moving in. Hence, the similarity metric of traveller behaviours is jointly defined according to the stay time allocation in each spatio-temporal region of interest (ST-ROI) to support clustering analysis of patrol behaviours. The proposed framework enables the analysis of behaviour and preferences at a higher level based on raw movement trajectories. The model is first applied to police patrol data provided by the Metropolitan Police and will be tested on other types of datasets afterwards.

  14. Cluster analysis of midlatitude oceanic cloud regimes: mean properties and temperature sensitivity

    Directory of Open Access Journals (Sweden)

    N. D. Gordon

    2010-07-01

    Clouds play an important role in the climate system by reducing the amount of shortwave radiation reaching the surface and the amount of longwave radiation escaping to space. Accurate simulation of clouds in computer models remains elusive, however, pointing to a lack of understanding of the connection between large-scale dynamics and cloud properties. This study uses a k-means clustering algorithm to group 21 years of satellite cloud data over midlatitude oceans into seven clusters, and demonstrates that the cloud clusters are associated with distinct large-scale dynamical conditions. Three clusters correspond to low-level cloud regimes with different cloud fraction and cumuliform or stratiform characteristics, but all occur under large-scale descent and a relatively dry free troposphere. Three clusters correspond to vertically extensive cloud regimes with tops in the middle or upper troposphere, and they differ according to the strength of large-scale ascent and enhancement of tropospheric temperature and humidity. The final cluster is associated with a lower troposphere that is dry and an upper troposphere that is moist and experiencing weak ascent and horizontal moist advection.

    Since the present balance of reflection of shortwave and absorption of longwave radiation by clouds could change as the atmosphere warms from increasing anthropogenic greenhouse gases, we must also better understand how increasing temperature modifies cloud and radiative properties. We therefore undertake an observational analysis of how midlatitude oceanic clouds change with temperature when dynamical processes are held constant (i.e., partial derivative with respect to temperature). For each of the seven cloud regimes, we examine the difference in cloud and radiative properties between warm and cold subsets. To avoid misinterpreting a cloud response to large-scale dynamical forcing as a cloud response to temperature, we require horizontal and vertical

  15. Cluster analysis of European surface ozone observations for evaluation of MACC reanalysis data

    Science.gov (United States)

    Lyapina, Olga; Schultz, Martin G.; Hense, Andreas

    2016-06-01

    The high density of European surface ozone monitoring sites provides unique opportunities for the investigation of regional ozone representativeness and for the evaluation of chemistry climate models. The regional representativeness of European ozone measurements is examined through a cluster analysis (CA) of 4 years of 3-hourly ozone data from 1492 European surface monitoring stations in the Airbase database; the time resolution corresponds to the output frequency of the model that is compared to the data in this study. K-means clustering is implemented for seasonal-diurnal variations (i) in absolute mixing ratio units and (ii) normalized by the overall mean ozone mixing ratio at each site. Statistical tests suggest that each CA can distinguish between four and five different ozone pollution regimes. The individual clusters reveal differences in seasonal-diurnal cycles, showing typical patterns of the ozone behavior for more polluted stations or more rural background. The robustness of the clustering was tested with a series of k-means runs decreasing randomly the size of the initial data set or lengths of the time series. Except for the Po Valley, the clustering does not provide a regional differentiation, as the member stations within each cluster are generally distributed all over Europe. The typical seasonal, diurnal, and weekly cycles of each cluster are compared to the output of the multi-year global reanalysis produced within the Monitoring of Atmospheric Composition and Climate (MACC) project. While the MACC reanalysis generally captures the shape of the diurnal cycles and the diurnal amplitudes, it is not able to reproduce the seasonal cycles very well and it exhibits a high bias of up to 12 nmol mol$^{-1}$. The bias decreases from more polluted clusters to cleaner ones. Also, the seasonal and weekly cycles and frequency distributions of ozone mixing ratios are better described for clusters with relatively clean signatures. Due to relative sparsity of CO and NOx
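
The two feature variants described in this abstract, (i) absolute mixing ratios and (ii) values normalized by each site's overall mean, can be contrasted with a small Python sketch. The array layout (one row per site, one column per seasonal-diurnal bin) and the numbers are assumptions made here for illustration, not the study's data or code.

```python
import numpy as np


def site_features(profiles, normalize=False):
    """Return per-site feature vectors for clustering.

    profiles: array of shape (n_sites, n_bins), hypothetical mean ozone
    values per seasonal-diurnal bin. With normalize=True, each row is
    divided by that site's overall mean, so only the shape of the cycle
    remains.
    """
    profiles = np.asarray(profiles, dtype=float)
    if normalize:
        return profiles / profiles.mean(axis=1, keepdims=True)
    return profiles


# Two hypothetical sites with the same cycle shape but different ozone levels.
profiles = np.array([[20.0, 40.0, 30.0],
                     [40.0, 80.0, 60.0]])

absolute = site_features(profiles)
normalized = site_features(profiles, normalize=True)
print(normalized)
```

In absolute units the two sites are far apart, whereas after normalization their feature vectors coincide. This is one way to see why the two variants of the cluster analysis can group stations differently.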

  16. Galaxy Cluster Pressure Profiles as Determined by Sunyaev Zel'dovich Effect Observations with MUSTANG and Bolocam II: Joint Analysis of Fourteen Clusters

    CERN Document Server

    Romero, Charles; Sayers, Jack; Mroczkowski, Tony; Sarazin, Craig; Donahue, Megan; Baldi, Alessandro; Clarke, Tracy E; Young, Alexander; Sievers, Jonathan; Dicker, Simon; Reese, Erik; Czakon, Nicole; Devlin, Mark; Korngut, Phillip; Golwala, Sunil

    2016-01-01

    We present pressure profiles of galaxy clusters determined from high resolution Sunyaev-Zel'dovich (SZ) effect observations of fourteen clusters, which span the redshift range $0.25 < z < 0.89$. The procedure simultaneously fits spherical cluster models to MUSTANG and Bolocam data. In this analysis, we adopt the generalized NFW parameterization of pressure profiles to produce our models. Our constraints on ensemble-average pressure profile parameters, in this study $\gamma$, $C_{500}$, and $P_0$, are consistent with those in previous studies, but for individual clusters we find discrepancies with the X-ray derived pressure profiles from the ACCEPT2 database. We investigate potential sources of these discrepancies, especially cluster geometry, electron temperature of the intracluster medium, and substructure. We find that the ensemble mean profile for all clusters in our sample is described by the parameters: $[\gamma,C_{500},P_0] = [0.3_{-0.1}^{+0.1}, 1.3_{-0.1}^{+0.1}, 8.6_{-2.4}^{+2.4}]$, for cool co...
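
For orientation, the generalized NFW pressure profile is commonly written in the SZ literature in the form below. The abstract does not spell out the formula, so the convention shown here is an assumption rather than the authors' stated parameterization. In particular, the outer slopes $\alpha$ and $\beta$, the scaled radius $x$, and the normalization $P_{500}$ do not appear in the abstract.

```latex
\frac{P(r)}{P_{500}}
  = \frac{P_0}
         {\left(C_{500}\,x\right)^{\gamma}
          \left[1 + \left(C_{500}\,x\right)^{\alpha}\right]^{(\beta-\gamma)/\alpha}},
\qquad x = \frac{r}{R_{500}}
```

In this form $\gamma$ sets the inner logarithmic slope, $C_{500}$ the concentration, and $P_0$ the overall normalization, which are the three parameters the abstract reports.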

  17. Subclassification of Recursive Partitioning Analysis Class II Patients With Brain Metastases Treated Radiosurgically

    Energy Technology Data Exchange (ETDEWEB)

    Yamamoto, Masaaki, E-mail: BCD06275@nifty.com [Katsuta Hospital Mito GammaHouse, Hitachi-naka (Japan); Department of Neurosurgery, Tokyo Women's Medical University Medical Center East, Tokyo (Japan); Sato, Yasunori [Clinical Research Center, Chiba University Graduate School of Medicine, Chiba (Japan); Serizawa, Toru [Tokyo Gamma Unit Center, Tsukiji Neurologic Clinic, Tokyo (Japan); Kawabe, Takuya [Katsuta Hospital Mito GammaHouse, Hitachi-naka (Japan); Department of Neurosurgery, Kyoto Prefectural University of Medicine Graduate School of Medical Sciences, Kyoto (Japan); Higuchi, Yoshinori [Department of Neurosurgery, Chiba University Graduate School of Medicine, Chiba (Japan); Nagano, Osamu [Gamma Knife House, Chiba Cardiovascular Center, Ichihara (Japan); Barfod, Bierta E. [Katsuta Hospital Mito GammaHouse, Hitachi-naka (Japan); Ono, Junichi [Gamma Knife House, Chiba Cardiovascular Center, Ichihara (Japan); Kasuya, Hidetoshi [Department of Neurosurgery, Tokyo Women's Medical University Medical Center East, Tokyo (Japan); Urakawa, Yoichi [Katsuta Hospital Mito GammaHouse, Hitachi-naka (Japan)

    2012-08-01

    Purpose: Although the recursive partitioning analysis (RPA) class is generally used for predicting survival periods of patients with brain metastases (METs), the majority of such patients are Class II and clinical factors vary quite widely within this category. This prompted us to divide RPA Class II patients into three subclasses. Methods and Materials: This was a two-institution, institutional review board-approved, retrospective cohort study using two databases: the Mito series (2,000 consecutive patients, comprising 787 women and 1,213 men; mean age, 65 years [range, 19-96 years]) and the Chiba series (1,753 patients, comprising 673 female and 1,080 male patients; mean age, 65 years [range, 7-94 years]). Both patient series underwent Gamma Knife radiosurgery alone, without whole-brain radiotherapy, for brain METs during the same 10-year period, July 1998 through June 2008. The Cox proportional hazard model with a step-wise selection procedure was used for multivariate analysis. Results: In the Mito series, four factors were identified as favoring longer survival: Karnofsky Performance Status (90% to 100% vs. 70% to 80%), tumor numbers (solitary vs. multiple), primary tumor status (controlled vs. not controlled), and non-brain METs (no vs. yes). This new index is the sum of scores (0 and 1) of these four factors: RPA Class II-a, score of 0 or 1; RPA Class II-b, score of 2; and RPA Class II-c, score of 3 or 4. Next, using the Chiba series, we tested whether our index is valid for a different patient group. This new system showed highly statistically significant differences among subclasses in both the Mito series and the Chiba series (p < 0.001 for all subclasses). In addition, this new index was confirmed to be applicable to Class II patients with four major primary tumor sites, that is, lung, breast, alimentary tract, and urogenital organs. Conclusions: Our new grading system should be considered when designing future clinical trials involving brain MET
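
The scoring rule summarized in this abstract (four binary factors summed, then mapped to three subclasses) can be sketched in Python as follows. The abstract does not state which level of each factor scores 1; this sketch assumes the unfavorable level does, so that a higher sum means a worse prognosis. The function and parameter names are hypothetical, and the sketch is illustrative only, not a clinical tool.

```python
def rpa_class2_subclass(kps, tumor_count, primary_controlled, non_brain_mets):
    """Map four factors to RPA Class II-a, II-b or II-c.

    Assumption (not stated in the abstract): each unfavorable level
    contributes 1 point and each favorable level contributes 0.
    """
    score = 0
    score += 1 if kps < 90 else 0             # KPS 70-80% vs. 90-100%
    score += 1 if tumor_count > 1 else 0      # multiple vs. solitary tumor
    score += 0 if primary_controlled else 1   # primary tumor not controlled
    score += 1 if non_brain_mets else 0       # non-brain metastases present

    if score <= 1:
        return "II-a"
    if score == 2:
        return "II-b"
    return "II-c"


# Hypothetical patients, for illustration only.
print(rpa_class2_subclass(100, 1, True, False))   # score 0
print(rpa_class2_subclass(80, 1, True, True))     # score 2
print(rpa_class2_subclass(70, 3, False, True))    # score 4
```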

  18. Patterns of client behavior with their most recent male escort: an application of latent class analysis.

    Science.gov (United States)

    Grov, Christian; Starks, Tyrel J; Wolff, Margaret; Smith, Michael D; Koken, Juline A; Parsons, Jeffrey T

    2015-05-01

    Research examining interactions between male escorts and clients has relied heavily on data from escorts, men working on the street, and behavioral data aggregated over time. In the current study, 495 clients of male escorts answered questions about sexual behavior with their last hire. Latent class analysis identified four client sets based on these variables. The largest (n = 200, 40.4 %, labeled Typical Escort Encounter) included men endorsing behavior prior research found typical of paid encounters (e.g., oral sex and kissing). The second largest class (n = 157, 31.7 %, Typical Escort Encounter + Erotic Touching) included men reporting similar behaviors, but with greater variety along a spectrum of touching (e.g., mutual masturbation and body worship). Those classed BD/SM and Kink (n = 76, 15.4 %) reported activity along the kink spectrum (BD/SM and role play). Finally, men classed Erotic Massage Encounters (n = 58, 11.7 %) primarily engaged in erotic touch. Clients reporting condomless anal sex were in the minority (12.2 % overall). Escorts who engage in anal sex with clients might be appropriate to train in HIV prevention and other harm reduction practices-adopting the perspective of "sex workers as sex educators." PMID:24777440

  19. Going beyond clustering in MD trajectory analysis: an application to villin headpiece folding.

    Directory of Open Access Journals (Sweden)

    Aruna Rajan

    Recent advances in computing technology have enabled microsecond-long all-atom molecular dynamics (MD) simulations of biological systems. Methods that can distill the salient features of such large trajectories are now urgently needed. Conventional clustering methods used to analyze MD trajectories suffer from various setbacks, namely (i) they are not data driven, (ii) they are unstable to noise and changes in cut-off parameters such as cluster radius and cluster number, and (iii) they do not reduce the dimensionality of the trajectories, and hence are unsuitable for finding collective coordinates. We advocate the application of principal component analysis (PCA) and a non-metric multidimensional scaling (nMDS) method to reduce MD trajectories and overcome the drawbacks of clustering. To illustrate the superiority of nMDS over other methods in reducing data and reproducing salient features, we analyze three complete villin headpiece folding trajectories. Our analysis suggests that the folding process of the villin headpiece is structurally heterogeneous.
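
As a rough illustration of the dimensionality reduction this abstract advocates, the Python sketch below applies PCA, via a singular value decomposition, to a synthetic "trajectory". The data layout (one row per frame, one column per flattened coordinate) and the synthetic numbers are assumptions made here; the nMDS step is not shown, and this is not the authors' implementation.

```python
import numpy as np


def pca_project(frames, n_components=2):
    """Project frames onto their leading principal components.

    frames: array of shape (n_frames, n_coords). Returns the projected
    coordinates and the fraction of variance explained by each component.
    """
    X = np.asarray(frames, dtype=float)
    X = X - X.mean(axis=0)                        # center each coordinate
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return X @ Vt[:n_components].T, explained[:n_components]


# Synthetic trajectory: motion along one collective direction plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
direction = np.array([3.0, -2.0, 1.0, 0.5, 0.0, 0.0])
frames = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 6))

proj, explained = pca_project(frames)
print(proj.shape, explained)
```

Because the synthetic motion follows a single direction, the first component carries nearly all of the variance. Real folding trajectories would generally need more components.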

  20. Bayesian Analysis of Two Stellar Populations in Galactic Globular Clusters. II. NGC 5024, NGC 5272, and NGC 6352

    Science.gov (United States)

    Wagner-Kaiser, R.; Stenning, D. C.; Robinson, E.; von Hippel, T.; Sarajedini, A.; van Dyk, D. A.; Stein, N.; Jefferys, W. H.

    2016-07-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival Advanced Camera for Surveys Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from ∼0.05 to 0.11 for these three clusters. Model grids with solar α-element abundances ([α/Fe] = 0.0) and enhanced α-elements ([α/Fe] = 0.4) are adopted.

  2. Bayesian Analysis of Two Stellar Populations in Galactic Globular Clusters II: NGC 5024, NGC 5272, and NGC 6352

    CERN Document Server

    Wagner-Kaiser, R; Robinson, E; von Hippel, T; Sarajedini, A; van Dyk, D A; Stein, N; Jefferys, W H

    2016-01-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from $\sim$0.05 to 0.11 for these three clusters. Model grids with solar $\alpha$-element abundances ([$\alpha$/Fe] = 0.0) and enhanced $\alpha$-elements ([$\alpha$/Fe] = 0.4) are adopted.

  3. Clustering of drinker prototype characteristics: what characterizes the typical drinker?

    Science.gov (United States)

    van Lettow, Britt; Vermunt, Jeroen K; de Vries, Hein; Burdorf, Alex; van Empelen, Pepijn

    2013-08-01

    Prototypes (social images) have been shown to influence behaviour, which is likely to depend on the type of image. Prototype evaluation is based on (un)desirable characteristics related to that image. Using an elicitation procedure, we examined which adjectives are attributed to specific drinker prototypes. In total 149 young Dutch adults (18-25 years of age) provided adjectives for five drinker prototypes: abstainer, moderate drinker, heavy drinker, tipsy, and drunk person. Twenty-three unique adjectives were found. Multilevel latent class cluster analysis revealed six adjective clusters, each with unique and minor overlapping adjectives: 'negative, excessive drinker,' 'moderate, responsible drinker,' 'funny tipsy drinker,' 'determined abstainer cluster,' 'uncontrolled excessive drinker,' and 'elated tipsy cluster.' In addition, four respondent classes were identified. Respondent classes showed differences in their focus on specific adjective clusters. Classes could be labelled 'focus-on-control class,' 'focus-on-hedonism class,' 'contrasting-extremes-prototypes class,' and 'focus-on-elation class.' Respondent classes differed in gender, educational level and drinking behaviour. The results underscore the importance of differentiating between various prototypes and between prototype adjectives among young adults: subgroup differences in prototype salience and relevance are possibly due to differences in adjective labelling. The results provide insights into explaining differences in drinking behaviour and could potentially be used to target and tailor interventions aimed at lowering alcohol consumption among young adults via prototype alteration. PMID:23848388

  4. Cephalometric-radiographic study, in lateral norm, considering the established standards of white Brazilian teenagers who presented normal occlusions and mal-occlusions of Class I and Class II, 1st Division and the ones from Ricketts' analysis

    International Nuclear Information System (INIS)

    In the present work, our purpose was to make a cephalometric-radiographic study comparing white Brazilian teenagers who presented normal occlusion with those who presented malocclusions of Class I and Class II, according to Ricketts' analysis (1960). (author)

  5. Characterization of Local Wind Patterns around the Kori Nuclear Power Plant using Cluster Analysis and WRF meteorological modeling

    International Nuclear Information System (INIS)

    To accurately predict the atmospheric diffusion of radioactive effluent, detailed analysis of local wind patterns near nuclear power plants is necessary. In this study, the characteristics of typical local winds around the Kori Nuclear Power Plant (Kori NPP) were investigated using cluster analysis and Weather Research and Forecasting (WRF) meteorological modeling. As a result of the cluster analysis, four wind patterns around the Kori NPP were selected. The model study indicated the possibility that the local winds in the target area can largely contribute to the atmospheric diffusion of radioactive effluents.

  6. Accuracy of a class of concurrent algorithms for transient finite element analysis

    Science.gov (United States)

    Ortiz, Michael; Sotelino, Elisa D.; Nour-Omid, Bahram

    1988-01-01

    The accuracy of a new class of concurrent procedures for transient finite element analysis is examined. A phase error analysis is carried out which shows that wave retardation leading to unacceptable loss of accuracy may occur if a Courant condition based on the dimensions of the subdomains is violated. Numerical tests suggest that this Courant condition is conservative for typical structural applications and may lead to a marked increase in accuracy as the number of subdomains is increased. Theoretical speed-up ratios are derived which suggest that the algorithms under consideration can be expected to exhibit a performance superior to that of globally implicit methods when implemented on parallel machines.

  7. Study and Analysis of K-Means Clustering Algorithm Using Rapidminer

    Directory of Open Access Journals (Sweden)

    Abhinn Pandey

    2014-12-01

    An institution is a place where the teacher explains and the student understands and learns the lesson. Every student has his own definition of toughness and easiness, and there is no absolute scale for measuring knowledge, but examination scores indicate student performance. In this case study, knowledge of data mining is combined with educational strategies to improve students’ performance. Generally, data mining (sometimes called data or knowledge discovery) is the process of analysing data from different perspectives and summarizing it into useful information. Data mining software is one of a number of analytical tools for data. It allows users to analyse data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). This project describes the use of a clustering data mining technique to improve the efficiency of academic performance in educational institutions. In this project, a live experiment was conducted on students: an exam was given to computer science majors using MOODLE (LMS), the data generated were analysed using RapidMiner (data mining software), and clustering was then performed on the data. This method helps to identify the students who need special advising or counselling by the teacher in order to provide a high quality of education.
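
    The grouping step described in this abstract can be illustrated with a minimal sketch of k-means (Lloyd's algorithm) on one-dimensional exam scores. This is not the RapidMiner workflow used in the study; the function name `kmeans_1d`, the scores, and the choice of two clusters are all hypothetical and serve only to show the assign-then-update loop.

```python
import random


def kmeans_1d(values, k, iterations=100, seed=0):
    """Lloyd's algorithm on a list of numbers; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct observed values (assumes at least k distinct values).
    centroids = rng.sample(sorted(set(values)), k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


# Hypothetical exam scores: four low scorers and four high scorers.
scores = [35, 88, 40, 92, 38, 95, 42, 90]
centroids, labels = kmeans_1d(scores, k=2)
print(sorted(centroids))  # [38.75, 91.25]
```

    In a setting like the one described, the students assigned to the low-score centroid would be the candidates for additional advising. Real exam data would normally have several features per student and would call for a multi-dimensional distance.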

  8. WHY DO SOME NATIONS SUCCEED AND OTHERS FAIL IN INTERNATIONAL COMPETITION? FACTOR ANALYSIS AND CLUSTER ANALYSIS AT EUROPEAN LEVEL

    Directory of Open Access Journals (Sweden)

    Popa Ion

    2015-07-01

    As stated by Michael Porter (1998: 57), 'this is perhaps the most frequently asked economic question of our times.' However, a widely accepted answer is still missing. The aim of this paper is not to provide the BIG answer for such a BIG question, but rather to provide a different perspective on competitiveness at the national level. In this respect, we followed a two-step procedure called “tandem analysis” (OECD, 2008). First we employed a Factor Analysis in order to reveal the underlying factors of the initial dataset, followed by a Cluster Analysis which aims to classify the 35 countries according to the main characteristics of competitiveness resulting from the Factor Analysis. The findings revealed that clustering the 35 states on the first two factors, Smart Growth and Market Development, which together recover almost 76% of the common variability of the twelve original variables, highlights four clusters and yields useful information for analyzing and discussing the characteristics of those clusters.

  9. Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

    Science.gov (United States)

    Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

    2012-12-01

    The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive Python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain-specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/SciPy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores that provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously, or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system.

  10. Fuzzy Clustering

    DEFF Research Database (Denmark)

    Berks, G.; Keyserlingk, Diedrich Graf von; Jantzen, Jan;

    2000-01-01

    -mean clustering is an easy and well improved tool, which has been applied in many medical fields. We used c-mean fuzzy clustering after feature extraction from an aphasia database. Factor analysis was applied on a correlation matrix of 26 symptoms of language disorders and led to five factors. The factors...

  11. Principal component cluster analysis of ECG time series based on Lyapunov exponent spectrum

    Institute of Scientific and Technical Information of China (English)

    WANG Nai; RUAN Jiong

    2004-01-01

    In this paper we propose an approach of principal component cluster analysis based on Lyapunov exponent spectrum (LES) to analyze the ECG time series. Analysis results of 22 sample-files of ECG from the MIT-BIH database confirmed the validity of our approach. Another technique named improved teacher selecting student (TSS) algorithm is presented to analyze unknown samples by means of some known ones, which is of better accuracy. This technique combines the advantages of both statistical and nonlinear dynamical methods and is shown to be significant to the analysis of nonlinear ECG time series.

  12. Clustering by Pattern Similarity

    Institute of Scientific and Technical Information of China (English)

    Hai-xun Wang; Jian Pei

    2008-01-01

    The task of clustering is to identify classes of similar objects among a set of objects. The definition of similarity varies from one clustering model to another. However, in most of these models the concept of similarity is often based on such metrics as Manhattan distance, Euclidean distance or other Lp distances. In other words, similar objects must have close values in at least a set of dimensions. In this paper, we explore a more general type of similarity. Under the pCluster model we proposed, two objects are similar if they exhibit a coherent pattern on a subset of dimensions. The new similarity concept models a wide range of applications. For instance, in DNA microarray analysis, the expression levels of two genes may rise and fall synchronously in response to a set of environmental stimuli. Although the magnitude of their expression levels may not be close, the patterns they exhibit can be very much alike. Discovery of such clusters of genes is essential in revealing significant connections in gene regulatory networks. E-commerce applications, such as collaborative filtering, can also benefit from the new model, because it is able to capture not only the closeness of values of certain leading indicators but also the closeness of (purchasing, browsing, etc.) patterns exhibited by the customers. In addition to the novel similarity model, this paper also introduces an effective and efficient algorithm to detect such clusters, and we perform tests on several real and synthetic data sets to show its performance.
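
    The idea of a "coherent pattern on a subset of dimensions" can be made concrete with a small sketch. The abstract does not give a formula, so the score below is an assumption: it follows the 2x2 submatrix score commonly used to define the pCluster model, where two objects x and y are coherent on dimensions a and b if |(x_a - x_b) - (y_a - y_b)| is at most a threshold delta. The function names, the threshold, and the gene expression values are hypothetical.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """Score of the 2x2 submatrix formed by objects x, y on dimensions a, b."""
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(rows, dims, delta):
    """True if every pair of rows and pair of dims has a score <= delta."""
    return all(
        p_score(x, y, a, b) <= delta
        for x, y in combinations(rows, 2)
        for a, b in combinations(dims, 2)
    )


# Hypothetical expression levels: gene_2 tracks gene_1 shifted by +10 on the
# first three conditions, so the values are far apart but the pattern is the same.
gene_1 = [1.0, 5.0, 3.0, 9.0]
gene_2 = [11.0, 15.0, 13.0, 2.0]

print(is_delta_pcluster([gene_1, gene_2], dims=[0, 1, 2], delta=0.5))     # True
print(is_delta_pcluster([gene_1, gene_2], dims=[0, 1, 2, 3], delta=0.5))  # False
```

    The two genes are never close under a Euclidean or Manhattan distance, yet they form a coherent cluster on the first three conditions. This exhaustive check only verifies a candidate cluster; it is not the efficient detection algorithm the paper introduces.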

  13. Analysis of Decision Trees in Context Clustering of Hidden Markov Model Based Thai Speech Synthesis

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Problem statement: In Thai speech synthesis using a Hidden Markov Model (HMM) based synthesis system, the tonal speech quality is degraded due to tone distortion. This major problem must be treated appropriately to preserve the tone characteristics of each syllable unit. Since tone contributes to the intelligibility of the synthesized speech, tone questions and other phonetic questions need to be established accordingly in the tree-based context clustering process. Approach: This study describes the analysis of questions in the tree-based context clustering process of an HMM-based speech synthesis system for the Thai language. In the system, spectrum, pitch (F0) and state duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using a decision-tree based context clustering technique. The contextual factors which affect spectrum, pitch and duration, i.e., part of speech, position and number of phones in a syllable, position and number of syllables in a word, position and number of words in a sentence, phone type and tone type, are taken into account for constructing the questions of the decision tree. All in all, thirteen sets of questions are analyzed in comparison. Results: In the experiment, we analyzed the decision trees by counting the number of questions in each node coming from those thirteen sets and by calculating the dominance score given to each question as the reciprocal of the distance from the root node to the question node. The highest number and dominance score belong to the phonetic type set, while the second and third highest belong to the part of speech and tone type sets. Conclusion: By counting the number of questions in each node and calculating the dominance score, we can set the priority of each question set. All in all, the analysis results bring about further development of Thai speech synthesis with efficient context clustering process in
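
    The two measures described in the Results can be sketched as a single tree traversal. Several details here are assumptions rather than facts from the abstract: the root is taken to be at distance 1 (so its reciprocal is defined), the per-question scores are summed within each question set, and the `Node` class, the `analyze` function, and the tiny example tree are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    question_set: Optional[str] = None  # None marks a leaf (no question asked)
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def analyze(root):
    """Return (question counts, summed dominance scores) per question set."""
    counts = defaultdict(int)
    scores = defaultdict(float)
    stack = [(root, 1)]  # assumption: the root node is at distance 1
    while stack:
        node, distance = stack.pop()
        if node is None or node.question_set is None:
            continue
        counts[node.question_set] += 1
        # Dominance score of one question: reciprocal of its distance from the root.
        scores[node.question_set] += 1.0 / distance
        stack.append((node.yes, distance + 1))
        stack.append((node.no, distance + 1))
    return dict(counts), dict(scores)


# Hypothetical tree: a phone-type question at the root, one more further down.
tree = Node(
    "phone type",
    yes=Node("tone type", yes=Node(), no=Node("phone type", Node(), Node())),
    no=Node("part of speech", Node(), Node()),
)
counts, scores = analyze(tree)
print(counts)
print(scores)
```

    Under these assumptions, questions near the root contribute more to a set's score than questions deep in the tree, which matches the stated goal of ranking question sets by priority.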

  14. Lifestyle health behaviors of Hong Kong Chinese: results of a cluster analysis.

    Science.gov (United States)

    Chan, Choi Wan; Leung, Sau Fong

    2015-04-01

    Sociodemographics affect health through pathways of lifestyle choices. Using data from a survey of 467 Hong Kong Chinese, this study aims to examine the prevalence of their lifestyle behaviors, identify profiles based on their sociodemographic and lifestyle variables, and compare differences among the profile groups. Two-step cluster analysis was used to identify natural profile groups within the data set: only 37% of the participants engaged in regular physical exercises, and less than 50% monitored their dietary intake carefully. The analysis yields 2 clusters, representing a "healthy" and a "less-healthy" lifestyle group. The "less-healthy" group was predominantly male, younger, employed, and had high-to-middle levels of education. The findings reveal the lifestyle behavior patterns and sociodemographic characteristics of a high-risk group, which are essential to provide knowledge for the planning of health promotion activities. PMID:25296668

  15. Joint Analysis of Galaxy-Galaxy Lensing and Galaxy Clustering: Methodology and Forecasts for DES

    CERN Document Server

    Park, Y; Dodelson, S; Jain, B; Amara, A; Becker, M R; Bridle, S L; Clampitt, J; Crocce, M; Fosalba, P; Gaztanaga, E; Honscheid, K; Rozo, E; Sobreira, F; Sánchez, C; Wechsler, R H; Abbott, T; Abdalla, F B; Allam, S; Benoit-Lévy, A; Bertin, E; Brooks, D; Buckley-Geer, E; Burke, D L; Rosell, A Carnero; Kind, M Carrasco; Carretero, J; Castander, F J; da Costa, L N; DePoy, D L; Desai, S; Dietrich, J P; Gerdes, D W; Gruen, D; Gruendl, R A; Gutierrez, G; James, D J; Kent, S; Kuehn, K; Kuropatkin, N; Lima, M; Maia, M A G; Marshall, J L; Melchior, P; Miller, C J; Sanchez, E; Scarpine, V; Schubnell, M; Sevilla-Noarbe, I; Soares-Santos, M; Suchyta, E; Swanson, M E C; Tarle, G; Thaler, J; Vikram, V; Walker, A R; Weller, J; Zuntz, J

    2015-01-01

    The joint analysis of galaxy-galaxy lensing and galaxy clustering is a promising method for inferring the growth function of large scale structure. This analysis will be carried out on data from the Dark Energy Survey (DES), with its measurements of both the distribution of galaxies and the tangential shears of background galaxies induced by these foreground lenses. We develop a practical approach to modeling the assumptions and systematic effects affecting small scale lensing, which provides halo masses, and large scale galaxy clustering. Introducing parameters that characterize the halo occupation distribution (HOD), photometric redshift uncertainties, and shear measurement errors, we study how external priors on different subsets of these parameters affect our growth constraints. Degeneracies within the HOD model, as well as between the HOD and the growth function, are identified as the dominant source of complication, with other systematic effects sub-dominant. The impact of HOD parameters and their degen...

  16. Design of the optimum insulator gate bipolar transistor using response surface method with cluster analysis

    CERN Document Server

    Wang, Chi Ling; Huang Sy Ruen; Yeh Chao Yu

    2004-01-01

    In this paper, a statistical methodology that can be used for the optimization of Insulator Gate Bipolar Transistor (IGBT) devices is proposed. This is achieved by integrating the response surface method (RSM) with cluster analysis, the weighted composite method and a genetic algorithm (GA). The device characteristics of the IGBT were simulated using the fabrication simulator ATHENA and the device simulator ATLAS. This methodology yields another way to investigate the IGBT device and to decide on the tradeoff between the breakdown voltage and the on-resistance. We also show how to use cluster analysis to determine the dominant factors that are not visible in the screening of all experiments. 20 Refs.

  17. Review of design criteria and safety analysis of safety class electric building for fuel test loop

    Energy Technology Data Exchange (ETDEWEB)

    Kim, J. Y.

    1998-02-01

    A steady state fuel test loop will be installed in HANARO to support the development and improvement of advanced fuel and materials through irradiation tests. The HANARO fuel test loop was designed for CANDU and PWR fuel testing. The safety-related systems of the Fuel Test Loop (FTL), such as the emergency cooling water system, component cooling water system, safety ventilation system, high energy line break mitigation system and remote control room, require a Class 1E electric supply to ensure safe operation in accordance with the related codes. Therefore, the FTL electric building was designed for the construction and installation of the related equipment based on seismic category I. The objective of this study is to review the design criteria and analyze the safety function of the safety class electric building for the fuel test loop; the results will serve as guidance for future irradiation testing. (author). 10 refs., 6 tabs., 30 figs.

  18. Needs Analysis of English Culture Background Introduction in Spoken English Classes

    Institute of Scientific and Technical Information of China (English)

    王丛丛

    2013-01-01

    With globalization there is an increasing need for spoken English among Chinese college students. However, the teaching system for spoken English in most universities in China is not satisfactory. Under the influence of the exam-oriented teaching system, great importance has been placed on English grammar and the skills of reading and writing. As a result, students’ enthusiasm for improving their spoken English is diminished. In fact, students need some knowledge of English cultural background before they can communicate effectively. It is important for teachers to find out the students’ wants and lacks during their spoken English study. In addition, both teachers and students should pay attention to English cultural background in spoken English classes. This paper mainly aims to find out the needs of students for English cultural background in spoken English classes, based on the theory of needs analysis.

  19. Emergent team roles in organizational meetings: Identifying communication patterns via cluster analysis.

    OpenAIRE

    Lehmann-Willenbrock, N.K.; Beck, S.J.; Kauffeld, S.

    2016-01-01

    Previous team role taxonomies have largely relied on self-report data, focused on functional roles, and described individual predispositions or personality traits. Instead, this study takes a communicative approach and proposes that team roles are produced, shaped, and sustained in communicative behaviors. To identify team roles communicatively, 59 regular organizational meetings were videotaped and analyzed. Cluster analysis revealed five emergent roles: the solution seeker, the problem anal...

  20. Network analysis identifies protein clusters of functional importance in juvenile idiopathic arthritis

    OpenAIRE

    Stevens, Adam; Meyer, Stefan; Hanson, Daniel; Clayton, Peter; Donn, Rachelle

    2014-01-01

    Introduction: Our objective was to utilise network analysis to identify protein clusters of greatest potential functional relevance in the pathogenesis of oligoarticular and rheumatoid factor negative (RF-ve) polyarticular juvenile idiopathic arthritis (JIA). Methods: JIA genetic association data were used to build an interactome network model in BioGRID 3.2.99. The top 10% of this protein:protein JIA Interactome was used to generate a minimal essential network (MEN). Reactome FI Cytoscape 2.83...