WorldWideScience

Sample records for environmental cluster analysis

  1. Fuzzy Clustering Analysis in Environmental Impact Assessment--A Complement Tool to Environmental Quality Index.

    Science.gov (United States)

    Kung, Hsiang-Te; And Others

    1993-01-01

    In spite of rapid progress achieved in the methodological research underlying environmental impact assessment (EIA), the problem of weighting various parameters has not yet been solved. This paper presents a new approach, fuzzy clustering analysis, which is illustrated with an EIA case study on Baoshan-Wusong District in Shanghai, China. (Author)

  2. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name for a group of multivariate techniques whose principal purpose is to group similar entities on the basis of the characteristics they possess. Several algorithms can be used for this analysis. This paper therefore discusses similarity measures and hierarchical clustering, which includes the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
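
    The three linkage rules named in this abstract differ only in how the distance between two clusters is computed from the pairwise point distances. The following is a minimal sketch in plain Python, not code from the paper; the function names, the Euclidean distance and the sample points are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (lists of point indices)."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all pairs of members
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, n_clusters, method="single"):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage_distance(
                clusters[ab[0]], clusters[ab[1]], points, method
            ),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical 2-D observations, for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
three = agglomerate(points, 3, method="average")
two = agglomerate(points, 2, method="single")
```

    This brute-force version recomputes every cluster distance at each merge, which is adequate for small examples but not for large data sets.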

  3. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  4. Environmental technology strongholds. A business analysis of cluster creation; Miljoeteknologiske styrkepositioner. En erhvervsanalyse af klyngedannelse

    Energy Technology Data Exchange (ETDEWEB)

    Rosted, J.; Andersen, Torsten; Degn Bertelsen, M. [FORA (Denmark)

    2006-08-31

    Global focus on environmental responsibility has increased interest in new environmental technology solutions, and environmental technologies will see impressive global growth rates in the coming decades. Environmental technologies make important contributions to solving global environmental challenges. But they are only part of the solution. The development of ground-breaking environmental technology solutions should go hand in hand with political decisions on binding environmental goals, public environmental regulation and economic incentives that promote an appropriate behaviour among companies and consumers. The environmental technology market is a highly competitive market that focuses on utilising new and emerging technologies. A large number of Danish companies are active participants in the global competition. There are several examples of government institutions taking an active part in the competition. More and more, new environmental technologies are developed in a binding and strategic collaboration involving companies, universities, research laboratories and government authorities. The level of Danish government authority participation is a critical element. However, this is not the focus of this analysis. The purpose of the analysis is to identify environmental technology areas where Denmark potentially could create new strongholds, if strategic and binding collaboration involving companies, knowledge institutions and government authorities is carried out. The actual level of co-operation should be decided among the relevant stakeholders. (au)

  5. Detection of major climatic and environmental predictors of liver fluke exposure risk in Ireland using spatial cluster analysis.

    Science.gov (United States)

    Selemetas, Nikolaos; de Waal, Theo

    2015-04-30

    Fasciolosis caused by Fasciola hepatica (liver fluke) can cause significant economic and production losses in dairy cow farms. The aim of the current study was to identify important weather and environmental predictors of the exposure risk to liver fluke by detecting clusters of fasciolosis in Ireland. During autumn 2012, bulk-tank milk samples from 4365 dairy farms were collected throughout Ireland. Using an in-house antibody-detection ELISA, the analysis of BTM samples showed that 83% (n=3602) of dairy farms had been exposed to liver fluke. The Getis-Ord Gi* statistic identified 74 high-risk and 130 low-risk significant (Pclimatic variables (monthly and seasonal mean rainfall and temperatures, total wet days and rain days) and environmental datasets (soil types, enhanced vegetation index and normalised difference vegetation index) were used to investigate dissimilarities in the exposure to liver fluke between clusters. Rainfall, total wet days and rain days, and soil type were the significant classes of climatic and environmental variables explaining the differences between significant clusters. A discriminant function analysis was used to predict the exposure risk to liver fluke using 80% of data for modelling and the remaining subset of 20% for post hoc model validation. The most significant predictors of the model risk function were total rainfall in August and September and total wet days. The risk model presented 100% sensitivity and 91% specificity and an accuracy of 95% correctly classified cases. A risk map of exposure to liver fluke was constructed with higher probability of exposure in western and north-western regions. The results of this study identified differences between clusters of fasciolosis in Ireland regarding climatic and environmental variables and detected significant predictors of the exposure risk to liver fluke. Copyright © 2015 Elsevier B.V. All rights reserved.
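
    The sensitivity, specificity and accuracy reported for the risk model are standard ratios taken from a two-by-two confusion matrix on the validation subset. The sketch below shows how they are computed; the counts are hypothetical and are not the study's data, and the function name is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp -- exposed farms correctly classified as exposed
    fn -- exposed farms wrongly classified as unexposed
    tn -- unexposed farms correctly classified as unexposed
    fp -- unexposed farms wrongly classified as exposed
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical validation counts, for illustration only.
sensitivity, specificity, accuracy = classification_metrics(tp=8, fn=2, tn=9, fp=1)
```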

  6. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  7. Marketing research cluster analysis

    Directory of Open Access Journals (Sweden)

    Marić Nebojša

    2002-01-01

    One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.

  8. A novel exploratory chemometric approach to environmental monitorring by combining block clustering with Partial Least Square (PLS) analysis.

    Science.gov (United States)

    Nica, Dragos V; Bordean, Despina Maria; Pet, Ioan; Pet, Elena; Alda, Simion; Gergen, Iosif

    2013-08-30

    Given the serious threats posed to terrestrial ecosystems by industrial contamination, environmental monitoring is a standard procedure used for assessing the current status of an environment or trends in environmental parameters. Measurement of metal concentrations at different trophic levels followed by their statistical analysis using exploratory multivariate methods can provide meaningful information on the status of environmental quality. In this context, the present paper proposes a novel chemometric approach to standard statistical methods by combining block clustering with partial least squares (PLS) analysis to investigate the accumulation patterns of metals in anthropized terrestrial ecosystems. The present study focused on copper, zinc, manganese, iron, cobalt, cadmium, nickel, and lead transfer along a soil-plant-snail food chain, and the hepatopancreas of the Roman snail (Helix pomatia) was used as a biological end-point of metal accumulation. Block clustering delineates between the areas exposed to industrial and vehicular contamination. The toxic metals have similar distributions in the nettle leaves and snail hepatopancreas. PLS analysis showed that (1) zinc and copper concentrations at the lower trophic levels are the most important latent factors that contribute to metal accumulation in land snails; (2) cadmium and lead are the main determinants of pollution pattern in areas exposed to industrial contamination; (3) at the sites located near roads lead is the most threatening metal for terrestrial ecosystems. Applying block clustering with PLS to process the obtained data had three major benefits. First, it helped in grouping sites depending on the type of contamination. Second, it was valuable for identifying the latent factors that contribute the most to metal accumulation in land snails. Finally, it optimized the number and type of data that are best for monitoring the status of metallic contamination in terrestrial ecosystems.

  9. CLEAN: CLustering Enrichment ANalysis

    Science.gov (United States)

    Freudenberg, Johannes M; Joshi, Vineet K; Hu, Zhen; Medvedovic, Mario

    2009-01-01

    Background: Integration of biological knowledge encoded in various lists of functionally related genes has become one of the most important aspects of analyzing genome-wide functional genomics data. In the context of cluster analysis, functional coherence of clusters established through such analyses has been used to identify biologically meaningful clusters, compare clustering algorithms and identify biological pathways associated with the biological process under investigation. Results: We developed a computational framework for analytically and visually integrating knowledge-based functional categories with the cluster analysis of genomics data. The framework is based on the simple, conceptually appealing, and biologically interpretable gene-specific functional coherence score (CLEAN score). The score is derived by correlating the clustering structure as a whole with functional categories of interest. We directly demonstrate that integrating biological knowledge in this way improves the reproducibility of conclusions derived from cluster analysis. The CLEAN score differentiates between the levels of functional coherence for genes within the same cluster based on their membership in enriched functional categories. We show that this aspect results in higher reproducibility across independent datasets and produces more informative genes for distinguishing different sample types than the scores based on the traditional cluster-wide analysis. We also demonstrate the utility of the CLEAN framework in comparing clusterings produced by different algorithms. CLEAN was implemented as an add-on R package and can be downloaded at . The package integrates routines for calculating gene-specific functional coherence scores and the open source interactive Java-based viewer Functional TreeView (FTreeView). Conclusion: Our results indicate that using the gene-specific functional coherence score improves the reproducibility of the conclusions made about clusters of co

  10. Multilevel functional clustering analysis.

    Science.gov (United States)

    Serban, Nicoleta; Jiang, Huijing

    2012-09-01

    In this article, we investigate clustering methods for multilevel functional data, which consist of repeated random functions observed for a large number of units (e.g., genes) at multiple subunits (e.g., bacteria types). To describe the within- and between variability induced by the hierarchical structure in the data, we take a multilevel functional principal component analysis (MFPCA) approach. We develop and compare a hard clustering method applied to the scores derived from the MFPCA and a soft clustering method using an MFPCA decomposition. In a simulation study, we assess the estimation accuracy of the clustering membership and the cluster patterns under a series of settings: small versus moderate number of time points; various noise levels; and varying number of subunits per unit. We demonstrate the applicability of the clustering analysis to a real data set consisting of expression profiles from genes activated by immunity system cells. Prevalent response patterns are identified by clustering the expression profiles using our multilevel clustering analysis. © 2012, The International Biometric Society.

  11. Identifying built environmental patterns using cluster analysis and GIS: relationships with walking, cycling and body mass index in French adults.

    Science.gov (United States)

    Charreire, Hélène; Weber, Christiane; Chaix, Basile; Salze, Paul; Casey, Romain; Banos, Arnaud; Badariotti, Dominique; Kesse-Guyot, Emmanuelle; Hercberg, Serge; Simon, Chantal; Oppert, Jean-Michel

    2012-05-23

    Socio-ecological models suggest that both individual and neighborhood characteristics contribute to facilitating health-enhancing behaviors such as physical activity. Few European studies have explored relationships between local built environmental characteristics, recreational walking and cycling and weight status in adults. The aim of this study was to identify built environmental patterns in a French urban context and to assess associations with recreational walking and cycling behaviors as performed by middle-aged adult residents. We used a two-step procedure based on cluster analysis to identify built environmental patterns in the region surrounding Paris, France, using measures derived from Geographic Information Systems databases on green spaces, proximity facilities (destinations) and cycle paths. Individual data were obtained from participants in the SU.VI.MAX cohort; 1,309 participants residing in the Ile-de-France in 2007 were included in this analysis. Associations between built environment patterns, leisure walking/cycling data (h/week) and measured weight status were assessed using multinomial logistic regression with adjustment for individual and neighborhood characteristics. Based on accessibility to green spaces, proximity facilities and availability of cycle paths, seven built environmental patterns were identified. The geographic distribution of built environmental patterns in the Ile-de-France showed that a pattern characterized by poor spatial accessibility to green spaces and proximity facilities and an absence of cycle paths was found only in neighborhoods in the outer suburbs, whereas patterns characterized by better spatial accessibility to green spaces, proximity facilities and cycle paths were more evenly distributed across the region. 
    Compared to the reference pattern (poor accessibility to green areas and facilities, absence of cycle paths), subjects residing in neighborhoods characterized by high accessibility to green areas and local

  12. Cluster analysis of fasciolosis in dairy cow herds in Munster province of Ireland and detection of major climatic and environmental predictors of the exposure risk.

    Science.gov (United States)

    Selemetas, Nikolaos; Phelan, Paul; O'Kiely, Padraig; de Waal, Theo

    2015-03-19

    Fasciolosis caused by Fasciola hepatica is a widespread parasitic disease in cattle farms. The aim of this study was to detect clusters of fasciolosis in dairy cow herds in Munster Province, Ireland and to identify significant climatic and environmental predictors of the exposure risk. In total, 1,292 dairy herds across Munster were sampled in September 2012, each providing a single bulk tank milk (BTM) sample. The analysis of samples by an in-house antibody-detection enzyme-linked immunosorbent assay (ELISA) showed that 65% of the dairy herds (n = 842) had been exposed to F. hepatica. Using the Getis-Ord Gi* statistic, 16 high-risk and 24 low-risk (P <0.01) clusters of fasciolosis were identified. The spatial distribution of high-risk clusters was more dispersed and mainly located in the northern and western regions of Munster compared to the low-risk clusters that were mostly concentrated in the southern and eastern regions. The most significant classes of variables that could reflect the difference between high-risk and low-risk clusters were the total number of wet-days and rain-days, rainfall, the normalized difference vegetation index (NDVI), temperature and soil type. There was a larger proportion of well-drained soils among the low-risk clusters, whereas poorly drained soils were more common among the high-risk clusters. These results stress the role of precipitation, grazing, temperature and drainage on the life cycle of F. hepatica in the temperate Irish climate. The findings of this study highlight the importance of cluster analysis for identifying significant differences in climatic and environmental variables between high-risk and low-risk clusters of fasciolosis in Irish dairy herds.
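
    The Getis-Ord Gi* statistic used here is a z-score that compares the weighted sum of values around a location (including the location itself) with what the global mean would predict. The sketch below implements the commonly cited Ord and Getis (1995) form in plain Python. The binary neighbour weights, the sample values and the function name are assumptions for illustration; the abstract does not state which weights or distance band the study used.

```python
from math import sqrt


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values  -- list of n observations, one per location
    weights -- n x n spatial weights; weights[i][i] must be nonzero,
               because the starred form includes the location itself
    """
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * x for wij, x in zip(w, values)) - mean * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical example: five locations on a line, each weighted 1 with
# itself and its immediate neighbours and 0 with everything else.
values = [10.0, 10.0, 2.0, 1.0, 2.0]
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
scores = getis_ord_gi_star(values, weights)
```

    Large positive scores mark candidate high-value clusters and large negative scores mark low-value clusters; significance is then judged against the normal distribution, usually with a correction for multiple testing.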

  13. Cluster analysis of fasciolosis in dairy cow herds in Munster province of Ireland and detection of major climatic and environmental predictors of the exposure risk

    Directory of Open Access Journals (Sweden)

    Nikolaos Selemetas

    2015-03-01

    Fasciolosis caused by Fasciola hepatica is a widespread parasitic disease in cattle farms. The aim of this study was to detect clusters of fasciolosis in dairy cow herds in Munster Province, Ireland and to identify significant climatic and environmental predictors of the exposure risk. In total, 1,292 dairy herds across Munster were sampled in September 2012, each providing a single bulk tank milk (BTM) sample. The analysis of samples by an in-house antibody-detection enzyme-linked immunosorbent assay (ELISA) showed that 65% of the dairy herds (n = 842) had been exposed to F. hepatica. Using the Getis-Ord Gi* statistic, 16 high-risk and 24 low-risk (P <0.01) clusters of fasciolosis were identified. The spatial distribution of high-risk clusters was more dispersed and mainly located in the northern and western regions of Munster compared to the low-risk clusters that were mostly concentrated in the southern and eastern regions. The most significant classes of variables that could reflect the difference between high-risk and low-risk clusters were the total number of wet-days and rain-days, rainfall, the normalized difference vegetation index (NDVI), temperature and soil type. There was a larger proportion of well-drained soils among the low-risk clusters, whereas poorly drained soils were more common among the high-risk clusters. These results stress the role of precipitation, grazing, temperature and drainage on the life cycle of F. hepatica in the temperate Irish climate. The findings of this study highlight the importance of cluster analysis for identifying significant differences in climatic and environmental variables between high-risk and low-risk clusters of fasciolosis in Irish dairy herds.

  14. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  15. Hydrologic classification of rivers based on cluster analysis of dimensionless hydrologic signatures: Applications for environmental instream flows

    Science.gov (United States)

    Praskievicz, S. J.; Luo, C.

    2017-12-01

    Classification of rivers is useful for a variety of purposes, such as generating and testing hypotheses about watershed controls on hydrology, predicting hydrologic variables for ungaged rivers, and setting goals for river management. In this research, we present a bottom-up (based on machine learning) river classification designed to investigate the underlying physical processes governing rivers' hydrologic regimes. The classification was developed for the entire state of Alabama, based on 248 United States Geological Survey (USGS) stream gages that met criteria for length and completeness of records. Five dimensionless hydrologic signatures were derived for each gage: slope of the flow duration curve (indicator of flow variability), baseflow index (ratio of baseflow to average streamflow), rising limb density (number of rising limbs per unit time), runoff ratio (ratio of long-term average streamflow to long-term average precipitation), and streamflow elasticity (sensitivity of streamflow to precipitation). We used a Bayesian clustering algorithm to classify the gages, based on the five hydrologic signatures, into distinct hydrologic regimes. We then used classification and regression trees (CART) to predict each gaged river's membership in different hydrologic regimes based on climatic and watershed variables. Using existing geospatial data, we applied the CART analysis to classify ungaged streams in Alabama, with the National Hydrography Dataset Plus (NHDPlus) catchment (average area 3 km2) as the unit of classification. The results of the classification can be used for meeting management and conservation objectives in Alabama, such as developing statewide standards for environmental instream flows. Such hydrologic classification approaches are promising for contributing to process-based understanding of river systems.
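
    Two of the five signatures listed in this abstract can be computed directly from the definitions it gives in parentheses. The sketch below follows those parenthetical definitions only; the study's exact formulas may differ, the function names and sample series are hypothetical, and streamflow and precipitation are assumed to be in the same depth units per time step.

```python
def runoff_ratio(streamflow, precipitation):
    """Long-term average streamflow divided by long-term average precipitation."""
    mean_q = sum(streamflow) / len(streamflow)
    mean_p = sum(precipitation) / len(precipitation)
    return mean_q / mean_p


def rising_limb_density(streamflow):
    """Number of rising limbs per time step of record.

    A rising limb is counted each time the flow starts to increase
    after a step in which it was falling or flat.
    """
    limbs = 0
    rising = False
    for previous, current in zip(streamflow, streamflow[1:]):
        if current > previous:
            if not rising:
                limbs += 1
            rising = True
        else:
            rising = False
    return limbs / len(streamflow)


# Hypothetical daily series, for illustration only.
q = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0, 4.0, 2.0]
p = [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0]
ratio = runoff_ratio(q, p)
density = rising_limb_density(q)
```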

  16. Relation chain based clustering analysis

    Science.gov (United States)

    Zhang, Cheng-ning; Zhao, Ming-yang; Luo, Hai-bo

    2011-08-01

    Clustering analysis is currently one of the well-developed branches of data mining; it aims to find hidden structures in a multidimensional space called the feature or pattern space. A datum in this space usually takes the form of a vector whose elements represent specifically selected features, and these features are often chosen to suit the problem at hand. Clustering analysis generally falls into two divisions: one based on agglomerative clustering and the other on divisive clustering. The former is a bottom-up process that starts by regarding each datum as a singleton cluster, while the latter is a top-down process that starts by regarding the entire data set as one cluster. The collected literature shows that divisive clustering currently dominates both application and research. Although some well-known divisive clustering methods have been designed and well developed, clustering problems are still far from solved. The k-means algorithm is the original divisive clustering method; it requires some important values to be assigned in advance, such as the number of clusters and the initial positions of the cluster prototypes, and these choices may not be reasonable in some situations. Beyond the initialization problem, the k-means algorithm may also fall into a local optimum, clusters in a rigid way, and is not suited to non-Gaussian distributions. Seeking a good or natural clustering result in fact originates from one's understanding of the concept of clustering, so confusion about or misunderstanding of the definition of clustering always leads to unsatisfactory clustering results. One should consider the definition deeply and seriously. This paper demonstrates the nature of clustering, gives a way of understanding clustering, discusses the methodology of designing a clustering algorithm, and proposes a new clustering method based on relation chains among 2D patterns. In
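
    The sensitivity of k-means to its initial prototype positions, and its tendency to stop in a local optimum, can be seen in a few lines. The sketch below is a plain Lloyd-style k-means, not the relation-chain method the paper proposes; the function name, the four sample points and both sets of starting centroids are assumptions chosen to make the effect visible.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """Lloyd-style k-means from the given starting centroids.

    Returns (labels, centroids) once the centroids stop moving
    or max_iter iterations have run.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: two tight pairs, ten units apart horizontally.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# A poor start converges to a top/bottom split (a local optimum).
poor_labels, poor_centroids = k_means(points, [(0.0, 0.0), (0.0, 2.0)])

# A better start recovers the left/right split.
good_labels, good_centroids = k_means(points, [(0.0, 0.0), (10.0, 0.0)])
```

    Both runs satisfy the stopping rule, yet only the second finds the compact left/right grouping, which is why practical implementations usually restart from several initializations.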

  17. Comprehensive Meta-analysis of Ontology Annotated 16S rRNA Profiles Identifies Beta Diversity Clusters of Environmental Bacterial Communities.

    Directory of Open Access Journals (Sweden)

    Andreas Henschel

    2015-10-01

    Comprehensive mapping of environmental microbiomes in terms of their compositional features remains a great challenge in understanding the microbial biosphere of the Earth. It bears promise to identify the driving forces behind the observed community patterns and whether community assembly happens deterministically. Advances in Next Generation Sequencing allow large community profiling studies, exceeding sequencing data output of conventional methods in scale by orders of magnitude. However, appropriate collection systems are still in a nascent state. We here present a database of 20,427 diverse environmental 16S rRNA profiles from 2,426 independent studies, which forms the foundation of our meta-analysis. We conducted a sample size adaptive all-against-all beta diversity comparison while also respecting phylogenetic relationships of Operational Taxonomic Units (OTUs). After conventional hierarchical clustering we systematically test for enrichment of Environmental Ontology terms and their abstractions in all possible clusters. This post-hoc algorithm provides a novel formalism that quantifies to what extent compositional and semantic similarity of microbial community samples coincide. We automatically visualize significantly enriched subclusters on a comprehensive dendrogram of microbial communities. As a result we obtain the hitherto most differentiated and comprehensive view on global patterns of microbial community diversity. We observe strong clusterability of microbial communities in ecosystems such as human/mammal-associated, geothermal, fresh water, plant-associated, soils and rhizosphere microbiomes, whereas hypersaline and anthropogenic samples are less homogeneous. Moreover, saline samples appear less cohesive in terms of compositional properties than previously reported.

  18. Remodularization Analysis Using Semantic Clustering

    OpenAIRE

    Santos, Gustavo; Tulio Valente, Marco; Anquetil, Nicolas

    2014-01-01

    In this paper, we report an experience on using and adapting Semantic Clustering to evaluate software remodularizations. Semantic Clustering is an approach that relies on information retrieval and clustering techniques to extract sets of similar classes in a system, according to their vocabularies. We adapted Semantic Clustering to support remodularization analysis. We evaluate our adaptation using six real-world remodularizations of four software systems. We report th...

  19. Unsupervised statistical clustering of environmental shotgun sequences

    Directory of Open Access Journals (Sweden)

    Bhatnagar Srijak

    2009-10-01

    Background: The development of effective environmental shotgun sequence binning methods remains an ongoing challenge in algorithmic analysis of metagenomic data. While previous methods have focused primarily on supervised learning involving extrinsic data, a first-principles statistical model combined with a self-training fitting method has not yet been developed. Results: We derive an unsupervised, maximum-likelihood formalism for clustering short sequences by their taxonomic origin on the basis of their k-mer distributions. The formalism is implemented using a Markov Chain Monte Carlo approach in a k-mer feature space. We introduce a space transformation that reduces the dimensionality of the feature space and a genomic fragment divergence measure that strongly correlates with the method's performance. Pairwise analysis of over 1000 completely sequenced genomes reveals that the vast majority of genomes have sufficient genomic fragment divergence to be amenable to binning using the present formalism. Using a high-performance implementation, the binner is able to classify fragments as short as 400 nt with accuracy over 90% in simulations of low-complexity communities of 2 to 10 species, given sufficient genomic fragment divergence. The method is available as an open source package called LikelyBin. Conclusion: An unsupervised binning method based on statistical signatures of short environmental sequences is a viable stand-alone binning method for low complexity samples. For medium and high complexity samples, we discuss the possibility of combining the current method with other methods as part of an iterative process to enhance the resolving power of sorting reads into taxonomic and/or functional bins.
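
    The k-mer distribution on which this binning approach rests is simply the relative frequency of every length-k substring in a sequence fragment. The sketch below shows only that feature-extraction step; it is not the LikelyBin implementation, the function name is hypothetical, and skipping windows that contain ambiguous bases is an assumption.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequency of each k-mer in a DNA sequence.

    Windows containing characters other than A, C, G or T are skipped.
    """
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical fragment, for illustration only.
frequencies = kmer_frequencies("ACGTAC", 2)
```

    Fragments from the same genome tend to have similar frequency vectors, which is what allows them to be grouped without reference sequences.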

  20. Environmental Analysis

    Science.gov (United States)

    1980-01-01

    Burns & McDonnell Engineering's environmental control study is assisted by NASA's Computer Software Management and Information Center's programs in environmental analyses. Company is engaged primarily in design of such facilities as electrical utilities, industrial plants, wastewater treatment systems, dams and reservoirs and aviation installations. Company also conducts environmental engineering analyses and advises clients as to the environmental considerations of a particular construction project. Company makes use of many COSMIC computer programs which have allowed substantial savings.

  1. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  2. A facility for using cluster research to study environmental problems

    International Nuclear Information System (INIS)

    1991-11-01

    This report begins by describing the general application of cluster-based research to environmental chemistry and the development of a Cluster Structure and Dynamics Research Facility (CSDRF). Next, four important areas of cluster research are described in more detail, including how they can impact environmental problems. These are: surface-supported clusters, water and contaminant interactions, time-resolved dynamic studies in clusters, and cluster structures and reactions. The facilities and equipment required for each area of research are then presented. The appendices contain the workshop agenda and a list of the researchers who participated in the workshop discussions that led to this report.

  3. Toward understanding environmental effects in SDSS clusters

    Energy Technology Data Exchange (ETDEWEB)

    Einasto, Jaan; Tago, E.; Einasto, M.; Saar, E.; Suhhonenko, I.; /Tartu Observ.; Heinamaki, P.; /Tartu Observ. /Tuorla Observ.; Hutsi, G.; /Tartu Observ. /Garching, Max; Tucker, D.L.; /Fermilab

    2004-11-01

    We find clusters and superclusters of galaxies using the Data Release 1 of the Sloan Digital Sky Survey. We determine the luminosity function of clusters and find that clusters in a high-density environment have a luminosity a factor of approximately 5 higher than in a low-density environment. We also study clusters and superclusters in numerical simulations. Simulated clusters in a high-density environment are also more massive than those in a low-density environment. Comparison of the density distribution at various epochs in simulations shows that in large low-density regions (voids) dynamical evolution is very slow and stops at an early epoch. In contrast, in large regions of higher density (superclusters) dynamical evolution starts early and continues until the present; here particles cluster early, and by merging of smaller groups very rich systems of galaxies form.

  4. The SMART CLUSTER METHOD - adaptive earthquake cluster analysis and declustering

    Science.gov (United States)

    Schaefer, Andreas; Daniell, James; Wenzel, Friedemann

    2016-04-01

    Earthquake declustering is an essential part of almost any statistical analysis of the spatial and temporal properties of seismic activity, with typical applications including probabilistic seismic hazard assessments (PSHAs) and earthquake prediction methods. The nature of earthquake clusters and the subsequent declustering of earthquake catalogues play a crucial role in determining the magnitude-dependent earthquake return period and its spatial variation. Various methods have been developed by other researchers to address this issue. These range in complexity from rather simple statistical window methods to complex epidemic models. This study introduces the smart cluster method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal identification. An adaptive search algorithm for data point clusters is adopted, which uses the earthquake density in the spatio-temporal neighbourhood of each event to adjust the search properties. The identified clusters are subsequently analysed to determine directional anisotropy, focussing on a strong correlation along the rupture plane, and the search space is adjusted with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010/2011 Darfield-Christchurch events, an adaptive classification procedure is applied to disassemble subsequent ruptures which may have been grouped into an individual cluster using near-field searches, support vector machines and temporal splitting. The steering parameters of the search behaviour are linked to local earthquake properties like magnitude of completeness, earthquake density and Gutenberg-Richter parameters. The method is capable of identifying and classifying earthquake clusters in space and time. It is tested and validated using earthquake data from California and New Zealand.
As a result of the cluster identification process, each event in

  5. Stability analysis in K-means clustering.

    Science.gov (United States)

    Steinley, Douglas

    2008-11-01

    This paper develops a new procedure, called stability analysis, for K-means clustering. Instead of ignoring local optima and only considering the best solution found, this procedure takes advantage of additional information from a K-means cluster analysis. The information from the locally optimal solutions is collected in an object by object co-occurrence matrix. The co-occurrence matrix is clustered and subsequently reordered by a steepest ascent quadratic assignment procedure to aid visual interpretation of the multidimensional cluster structure. Subsequently, measures are developed to determine the overall structure of a data set, the number of clusters and the multidimensional relationships between the clusters.
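    The co-occurrence idea described above can be illustrated with a minimal sketch. It is not the author's procedure: it only shows how labels from several randomly initialised K-means runs could be pooled into an object-by-object co-occurrence matrix. The function names (`kmeans_labels`, `co_occurrence`), the number of runs and the toy data are all assumptions made for illustration, and the later reordering step (steepest ascent quadratic assignment) is not shown.

```python
import random


def kmeans_labels(points, k, rng, max_iters=100):
    """One run of Lloyd's algorithm from random starting centers.

    Returns one cluster label per point; the result is a local optimum
    that depends on the random start.
    """
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def co_occurrence(points, k, runs=20, seed=0):
    """Fraction of runs in which each pair of objects shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


# Hypothetical toy data: two well-separated groups of three points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
matrix = co_occurrence(toy, k=2)
```

    Entries near 1 mark pairs that are grouped together in almost every locally optimal solution; entries near 0 mark pairs that are almost never grouped together.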

  6. Tanzania: A Hierarchical Cluster Analysis Approach | Ngaruko ...

    African Journals Online (AJOL)

    Using survey data from Kibondo district, west Tanzania, we use hierarchical cluster analysis to classify borrower farmers according to their borrowing behaviour into four distinctive clusters. The appreciation of the existence of heterogeneous farmer clusters is vital in forging credit delivery policies that are not only ...

  7. Cluster analysis in phenotyping a Portuguese population.

    Science.gov (United States)

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J

    2015-09-03

    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. The aim was to identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with those of other larger-scale cluster analyses. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.

  8. Scalable clustering algorithms for continuous environmental flow cytometry.

    Science.gov (United States)

    Hyrkas, Jeremy; Clayton, Sophie; Ribalet, Francois; Halperin, Daniel; Armbrust, E Virginia; Howe, Bill

    2016-02-01

    Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers. We explore how available algorithms commonly used for medical applications perform at classifying such large-scale environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data. Source code available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  9. Analysis of Germination Capacity and Germinant Receptor (Sub)clusters of Genome-Sequenced Bacillus cereus Environmental Isolates and Model Strains.

    Science.gov (United States)

    Warda, Alicja K; Xiao, Yinghua; Boekhorst, Jos; Wells-Bennik, Marjon H J; Nierop Groot, Masja N; Abee, Tjakko

    2017-02-15

    Spore germination of 17 Bacillus cereus food isolates and reference strains was evaluated using flow cytometry analysis in combination with fluorescent staining at a single-spore level. This approach allowed for rapid collection of germination data under more than 20 conditions, including heat activation of spores, germination in complex media (brain heart infusion [BHI] and tryptone soy broth [TSB]), and exposure to saturating concentrations of single amino acids and the combination of alanine and inosine. Whole-genome sequence comparison revealed a total of 11 clusters of operons encoding germinant receptors (GRs): GerK, GerI, and GerL were present in all strains, whereas GerR, GerS, GerG, GerQ, GerX, GerF, GerW, and GerZ (sub)clusters showed a more diverse presence/absence in different strains. The spores of tested strains displayed high diversity with regard to their sensitivity and responsiveness to selected germinants and heat activation. The two laboratory strains, B. cereus ATCC 14579 and ATCC 10987, and 11 food isolates showed a good germination response under a range of conditions, whereas four other strains (B. cereus B4085, B4086, B4116, and B4153) belonging to phylogenetic group IIIA showed a very weak germination response even in BHI and TSB media. Germination responses could not be linked to specific (combinations of) GRs, but it was noted that the four group IIIA strains contained pseudogenes or variants of subunit C in their gerL cluster. Additionally, two of those strains (B4086 and B4153) carried pseudogenes in the gerK and gerR I (sub)clusters that possibly affected the functionality of these GRs. Germination of bacterial spores is a critical step before vegetative growth can resume. Food products may contain nutrient germinants that trigger germination and outgrowth of Bacillus species spores, possibly leading to food spoilage or foodborne illness. 
Prediction of spore germination behavior is, however, very challenging, especially for spores of

  10. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  11. [Cluster analysis and its application].

    Science.gov (United States)

    Půlpán, Zdenĕk

    2002-01-01

    The study exploits knowledge-oriented and context-based modifications of well-known (fuzzy) clustering algorithms. Fuzzy sets are also inherently suited to coping with linguistic domain knowledge. We aim to obtain, from rich and diverse data and knowledge, new information about the environment being explored.

  12. Foramgeographical affinities of the west and east coasts of India: An approach through cluster analysis and comparison of taxonomical, environmental and ecological parameters of Recent foraminiferal thanatotopes

    Digital Repository Service at National Institute of Oceanography (India)

    Katha, P.K.; Bhalla, S.N.; Nigam, R.

    realms as their assemblages form two separate 'clusters' with only 0.1 level of 'Degree of Association' amongst them. The pattern of clusters shows two faunal assemblages namely- 'Inner' and 'Outer East Marginal Assemblage of Arabian Sea' along the West...

  13. ASteCA: Automated Stellar Cluster Analysis

    Science.gov (United States)

    Perren, G. I.; Vázquez, R. A.; Piatti, A. E.

    2015-04-01

    We present the Automated Stellar Cluster Analysis package (ASteCA), a suite of tools designed to fully automate the standard tests applied to stellar clusters to determine their basic parameters. The set of functions included in the code makes use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with their uncertainties. To validate the code we applied it to a large set of over 400 synthetic MASSCLEAN clusters with varying degrees of field star contamination as well as a smaller set of 20 observed Milky Way open clusters (Berkeley 7, Bochum 11, Czernik 26, Czernik 30, Haffner 11, Haffner 19, NGC 133, NGC 2236, NGC 2264, NGC 2324, NGC 2421, NGC 2627, NGC 6231, NGC 6383, NGC 6705, Ruprecht 1, Tombaugh 1, Trumpler 1, Trumpler 5 and Trumpler 14) studied in the literature. The results show that ASteCA is able to recover cluster parameters with an acceptable precision even for those clusters affected by substantial field star contamination. ASteCA is written in Python and is made available as open source code which can be downloaded ready to be used from its official site.

  14. Cluster analysis of pharmacists' work attitudes.

    Science.gov (United States)

    Nakagomi, Keiichi; Hayashi, Yukikazu; Komiyama, Takako

    2017-12-01

    Few studies in Japan use clustering to examine the work attitudes of pharmacists. This study conducts an exploratory analysis to classify those attitudes based on previous studies to help staff pharmacists and their management to understand their mutually beneficial requirements. Survey data collected in previous studies from 1 228 community pharmacists and 419 hospital pharmacists were analyzed using Quantification Theory 3 and clustering. Among community pharmacists, two clusters, namely 30- to 34-year-old married males and married males aged over 35 years, reported the highest job satisfaction, intending to remain in their jobs for 5 years or more or until retirement. Conversely, one cluster of 35- to 39-year-old single females reported the lowest job satisfaction and intended to remain for less than 5 years or were undecided. Among hospital pharmacists, one cluster of 22- to 25-year-old single males reported the highest job satisfaction and intended to remain for more than 5 years. Conversely, one cluster of 30- to 34-year-old married males reported the lowest job satisfaction and had not determined how long they intended to keep working. This study used clustering to explore how pharmacists of different ages, marital statuses, and experience felt regarding their work. Job satisfaction and human relationships are significant in considering future work plans of practicing pharmacists. Pharmacy staff, supervisors, and managers of community or hospital pharmacies must recognize features of pharmacists' work attitudes in order to offer high-quality service to patients.

  15. Fuzzy clustering analysis of microarray data.

    Science.gov (United States)

    Han, Lixin; Zeng, Xiaoqin; Yan, Hong

    2008-10-01

    Fuzzy clustering is a useful tool for identifying relevant subsets of microarray data. This paper proposes a fuzzy clustering method for microarray data analysis. An advantage of the method is that it uses a combination of fuzzy c-means and principal component analysis to identify the groups of genes that show similar expression patterns. It allows a gene to belong to more than one gene expression pattern with different membership grades. The method is suitable for the analysis of large amounts of noisy microarray data.
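    To make the notion of membership grades concrete, here is a minimal sketch of the standard fuzzy c-means iteration only. It is not the method proposed in the paper: the principal component analysis step is omitted, and the function name `fuzzy_c_means`, the fuzzifier value m = 2, the iteration count and the toy data are assumptions chosen for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships).

    memberships[i][j] is the grade with which point i belongs to cluster j;
    each row sums to 1, so a point can belong to several clusters at once.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / w_sum
                for d in range(dim)
            ))
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)**(2 / (m - 1)).
        for i, p in enumerate(points):
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            u[i] = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ]
    return centers, u


# Hypothetical one-dimensional "expression levels" forming two groups.
data = [(0.0,), (0.2,), (0.1,), (5.0,), (5.1,), (4.9,)]
centers, memberships = fuzzy_c_means(data, c=2)
```

    Unlike K-means, every point receives a grade for every cluster, which is what lets a gene be associated with more than one expression pattern.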

  16. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. Environmental analysis support

    International Nuclear Information System (INIS)

    Miller, R.L.

    1994-01-01

    Activities in environmental analysis support included assistance to the Morgantown and Pittsburgh Energy Technology Centers (METC and PETC) in reviewing and preparing documents required by the National Environmental Policy Act (NEPA) for several projects selected for the Clean Coal Technology (CCT) Program. A key milestone was the completion for PETC of the final Environmental Impact Statement (EIS) for the Healy Clean Coal Project (HCCP) in Healy, Alaska. This work is notable because it is the first site-specific EIS completed for the CCT Program. Another important activity was the preparation for METC of a draft Environmental Assessment (EA) for the Externally Fired Combined Cycle (EFCC) Project in Warren, Pennsylvania. Also, the final EA was completed for the Gasification Product Improvement Facility (GPIF), a proposed project near Morgantown, West Virginia, which is part of METC's R&D Program. In addition, ORNL staff members published a Technical Memorandum entitled "Potential Effects of Clean Coal Technologies on Acid Precipitation, Greenhouse Gases, and Solid Waste Disposal", which documents the findings of three "white papers" prepared for DOE/FE

  18. Cluster Analysis of Properties of Temperament

    Directory of Open Access Journals (Sweden)

    A I Krupnov

    2014-12-01

    The paper presents a cluster analysis of various properties of temperament, based on the systematic structure of its main components. On the basis of the data obtained, a qualitative psychological characterization of the four types of temperament is given.

  19. Environmental analysis and evaluation

    International Nuclear Information System (INIS)

    Cooper, M.B.; Tracy, B.L.

    1990-01-01

    One of the requirements of an environmental radiation surveillance program is to identify and determine long-lived radioactive elements which may be released to the environment during the operation of a uranium (or thorium) mine and mill. Radioanalytical techniques which are suitable for quantitative and qualitative determination of long-lived radionuclides of the natural uranium and thorium series are reviewed. The general features of an analytical program are discussed in terms of sample preparation, radiochemical separation of radioactive elements, and radioactive measurement. There are situations in environmental analysis to which high-resolution γ-spectrometry can be applied and the advantages and limitations of this technique are considered. Quality assurance is an essential component of any monitoring program in order to ensure that reliable and precise data are available for the assessment of the radiological impact of the mining and milling operation

  20. Cluster analysis for determining distribution center location

    Science.gov (United States)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian

    2017-12-01

    Determining the location of distribution facilities is highly important for surviving the high level of competition in today's business world. Companies can operate multiple distribution centers to mitigate supply chain risk. New problems then arise, namely how many facilities should be provided and where. This study examines a fast-food restaurant brand located in Greater Jakarta. This brand is among the top 5 fast-food restaurant chains based on retail sales. There were three stages in this study: compiling spatial data, cluster analysis, and network analysis. Cluster analysis results are used to consider the location of the additional distribution center. Network analysis results show a more efficient process, reflected in a shorter distance in the distribution process.

  1. Fuzzy clustering analysis of osteosarcoma related genes.

    Science.gov (United States)

    Chen, Kai; Wu, Dajiang; Bai, Yushu; Zhu, Xiaodong; Chen, Ziqiang; Wang, Chuanfeng; Zhao, Yingchuan; Li, Ming

    2014-07-01

    Osteosarcoma is the most common malignant bone tumor, with a peak manifestation during the second and third decades of life. This study aimed to explore the influence of genetic factors on the mechanism of osteosarcoma by analyzing the interrelationship between osteosarcoma and its related genes, and thereby to provide potential genetic references for the prevention, diagnosis and treatment of osteosarcoma. We collected osteosarcoma-related gene sequences from GenBank at the National Center for Biotechnology Information (NCBI) and carried out pairwise local alignment analysis to measure the association among related sequences. A fuzzy clustering method was then used for clustering analysis so as to link unknown genes to known osteosarcoma-related genes in the same class. From the result of the fuzzy clustering analysis, we could classify the osteosarcoma-related genes into two groups and deduced that the genes clustered into one group had similar function. Based on this knowledge, we found more genes related to the pathogenesis of osteosarcoma, and these genes could exert a function similar to that of Runx2, a risk factor confirmed in osteosarcoma. This study may help to better understand the genetic mechanism and provide new molecular markers and therapies for osteosarcoma.

  2. Analysis of environmental sounds

    Science.gov (United States)

    Lee, Keansub

    Environmental sound archives - casual recordings of people's daily life - are easily collected by MP3 players or camcorders with low cost and high reliability, and shared on websites. There are two kinds of user-generated recordings we would like to be able to handle in this thesis: continuous long-duration personal audio and soundtracks of short consumer video clips. These environmental recordings contain a lot of useful information (semantic concepts) related to activity, location, occasion and content. As a consequence, the environment archives present many new opportunities for the automatic extraction of information that can be used in intelligent browsing systems. This thesis proposes systems for detecting these interesting concepts in a collection of these real-world recordings. The first system segments and labels personal audio archives - continuous recordings of an individual's everyday experiences - into 'episodes' (relatively consistent acoustic situations lasting a few minutes or more) using the Bayesian Information Criterion and spectral clustering. The second system identifies regions of speech or music in the kinds of energetic and highly variable noise present in this real-world sound. Motivated by psychoacoustic evidence that pitch is crucial in the perception and organization of sound, we develop a noise-robust pitch detection algorithm to locate speech or music-like regions. To avoid false alarms resulting from background noise with strong periodic components (such as air-conditioning), a new scheme is added to suppress these noises in the autocorrelogram domain. In addition, the third system automatically detects a large set of interesting semantic concepts, which we chose for being both informative and useful to users, as well as being technically feasible. These 25 concepts are associated with people's activities, locations, occasions, objects, scenes and sounds, and are based on a large collection of

  3. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    Science.gov (United States)

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations

  4. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations.

    Science.gov (United States)

    Corrigan, Neil; Bankart, Michael J G; Gray, Laura J; Smith, Karen L

    2014-05-24

    There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. 
Interim recommendations include avoidance of cluster merges where
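    The loss of power from merging clusters can be illustrated with a common textbook approximation, which is an assumption here and not necessarily the calculation used in the paper: a cluster of size m with intracluster correlation coefficient ICC contributes roughly m / (1 + (m - 1) * ICC) independent observations. The function name `effective_sample_size` and the example sizes are hypothetical.

```python
def effective_sample_size(cluster_sizes, icc):
    """Approximate number of independent observations in one trial arm.

    Each cluster of size m is deflated by its design effect
    1 + (m - 1) * icc; this is a standard approximation, assumed here
    purely for illustration.
    """
    return sum(m / (1.0 + (m - 1.0) * icc) for m in cluster_sizes)


# Hypothetical arm with six clusters of ten participants each.
before = effective_sample_size([10, 10, 10, 10, 10, 10], icc=0.05)

# The same participants after two of the clusters merge into one of twenty.
after = effective_sample_size([20, 10, 10, 10, 10], icc=0.05)

# Share of effective sample size lost to the merge.
relative_loss = (before - after) / before
```

    Under this approximation the number of participants is unchanged, but the merged cluster carries less independent information than the two original clusters did, which is consistent with the systematic reduction in power reported above.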

  5. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  6. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solution ...

  7. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    uovs

    Abstract. To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters:.

  8. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    OpenAIRE

    Corrigan, Neil; Bankart, Michael J G; Gray, Laura J; Smith, Karen L

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed ...

  9. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    Full Text Available The aim of the present article is to show the possibility of using cluster analysis methods to classify stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The sorting of finished-product stocks by cluster is described through a practical example. The resulting clusters are incorporated into the draft layout of the distribution warehouse.

  10. Advanced analysis of forest fire clustering

    Science.gov (United States)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean

    2017-04-01

    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index
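
    The grid-based estimate described in this abstract can be sketched in a few lines. The code below is an illustrative reading of the verbal definition, not the authors' implementation: it assumes points lie in the unit square, uses a regular grid of Q cells, and computes Q^(m-1) times the sum over cells of n_i(n_i-1)...(n_i-m+1), divided by N(N-1)...(N-m+1). The function names are hypothetical.

```python
from collections import Counter


def falling_factorial(n, m):
    """n * (n - 1) * ... * (n - m + 1); zero whenever 0 <= n < m."""
    result = 1
    for k in range(m):
        result *= n - k
    return result


def multipoint_morisita(points, cells_per_side, m=2):
    """m-point Morisita index for points in the unit square [0, 1) x [0, 1).

    Values near 1 indicate a pattern close to complete spatial randomness,
    values above 1 indicate clustering, values below 1 indicate regularity.
    """
    q = cells_per_side ** 2  # total number of grid cells
    last = cells_per_side - 1
    counts = Counter(
        (min(int(x * cells_per_side), last), min(int(y * cells_per_side), last))
        for x, y in points
    )
    n_total = len(points)
    numerator = sum(falling_factorial(n_i, m) for n_i in counts.values())
    denominator = falling_factorial(n_total, m)
    return q ** (m - 1) * numerator / denominator


# Ten points packed into a single cell of a 2 x 2 grid: maximal clustering.
clustered = [(0.1, 0.1)] * 10
print(multipoint_morisita(clustered, cells_per_side=2, m=2))
```

    Repeating the calculation for several values of `cells_per_side` gives the scaling behaviour that the abstract uses to characterize clustering at different spatial scales.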

  11. Cluster Analysis in Rapeseed (Brassica Napus L.)

    International Nuclear Information System (INIS)

    Mahasi, J.M

    2002-01-01

    With a widening edible oil deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc.) can be grown in Kenya, but oilseed rape is preferred because it is very high yielding (1.5-4.0 tons/ha) with an oil content of 42-46%. Other uses include fitting into various cropping systems as relay/inter crops, rotational crops, trap crops and fodder. It is soft seeded, hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities, if any, among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine of Canadian origin and eight of European origin) grown at four locations, namely Endebess, Njoro, Timau and Mau Narok, in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields, and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and, based on this analysis, only one major group exists within the material. In 1992, varieties 2, 3, 8 and 9 did not fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2, 3 and 9) were, however, not classified with the others. In 1994, varieties 10 and 6 did not fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than in 1992 due to favorable weather. 
It is evident that, genotypes from different geographical

  12. Chaotic map clustering algorithm for EEG analysis

    Science.gov (United States)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class without any training or supervision, thus providing a new, efficient methodology for the recognition of patterns affected by Huntington's disease.

  13. Clustering Analysis within Text Classification Techniques

    Directory of Open Access Journals (Sweden)

    Madalina ZURINI

    2011-01-01

    Full Text Available The paper presents a personal approach to the main applications of classification in the knowledge-based society, using methods and techniques that are widespread in the literature. Text classification is treated in chapter two, where the main techniques are described along with an integrated taxonomy. The transition is made through the concept of spatial representation. Drawing on elementary geometry and artificial intelligence analysis, spatial representation models are presented. Using a parallel approach, the spatial dimension is introduced into the classification process. The main clustering methods are described in an aggregated taxonomy. As an example, spam and ham words are clustered and spatially represented, and the concepts of spam, ham, common and linkage words are presented and explained in the xOy space representation.

  14. Tweets clustering using latent semantic analysis

    Science.gov (United States)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter allows users to broadcast a short message called a "tweet". In this study, we extract tweets related to MH370 over a certain period of time. We present an overview of our approach to tweet clustering, used to analyze users' responses to the MH370 tragedy. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, we found two types of tweets responding to the MH370 tragedy: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  15. Documentation to the workshop 'Cluster in the environmental protection economy'; Dokumentation zum Workshop ''Cluster in der Umweltschutzwirtschaft''

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2009-12-11

    At the workshop 'Cluster in the environmental protection economy', held at the Umweltbundesamt (Dessau-Rosslau, Federal Republic of Germany) on 27 November 2008, the following lectures were given: (a) What can clusters and cluster policy contribute to the promotion of the environmental protection economy? (Harald Legler); (b) Clusters in the environmental protection economy: targets and expectations (Dieter Rehfeld); (c) Demands on the management of clusters (Karin Hoerhan); (d) Demands on cluster policy in the environmental protection economy (Bernhard Hausberg); (e) Photovoltaics in Eastern Germany (Johann Wackenbauer); (f) Automotive industry in Bergisches Land (Thomas Lemken); (g) Competence centre environment Augsburg-Schwaben (Egon Beckord).

  16. Analysis of clusterization and networking processes in developing intermodal transportation

    Directory of Open Access Journals (Sweden)

    Sinkevičius Gintaras

    2016-06-01

    Full Text Available Analysis of the processes of clusterization and networking draws attention to the necessity of integrating railway transport into the intermodal or multimodal transport chain. One of the most widespread methods of combined transport is the interoperability of railway and road transport. The objective is to create an uninterrupted transport chain by combining several modes of transport. The aim is to save energy resources and to form a transport system that is effective, competitive, attractive to the client, safe and environmentally friendly.

  17. GC in Environmental Analysis

    Science.gov (United States)

    Gosink, Thomas A.

    1975-01-01

    Gas chromatography can be used to quantitate various gases, complex organic molecules, metals, anions, and pesticides in the lab or in the field. Important advances in gas chromatography and how they directly apply to environmental analyses plus suggestions where they will be of importance to environmental chemists are discussed. (BT)

  18. EM cluster analysis for categorical data

    Czech Academy of Sciences Publication Activity Database

    Grim, Jiří

    2006-01-01

    Vol. 44, No. 4109 (2006), pp. 640-648. ISSN 0302-9743. [Joint IAPR International Workshops SSPR 2006 and SPR 2006. Hong Kong, 17.08.2006-19.08.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572. EU Projects: European Commission (XE) 507752 - MUSCLE. Institutional research plan: CEZ:AV0Z10750506. Keywords: cluster analysis * categorical data * EM algorithm. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.402, year: 2005

  19. Environmental Management as a Strategic Capability: a Study on the Furniture Manufacturing Cluster of Southern Brazil

    Directory of Open Access Journals (Sweden)

    Janielen Pissolatto Deliberal

    2016-01-01

    Full Text Available The incorporation of company programs aimed at sustainability strategies contributes to a balance between economic growth and the use of natural resources. This issue is not exclusive to companies from developed markets. Companies from emerging markets also need to find a way to achieve sustainable practices and organizational performance at the same time. In this context, the aim of this study was to analyze whether environmental management can be considered a strategic capability, contributing positively to the performance of the manufacturing companies belonging to the Furniture Manufacturing Cluster of Southern Brazil (FMCSB). To achieve our objective, we performed a quantitative study through a survey. The sample comprised data from 162 companies. Based on univariate and multivariate analysis, the results suggest that environmental management can be considered a strategic capability for the FMCSB, since environmental practices are significantly related to organizational performance.

  20. Adaptive Fuzzy Consensus Clustering Framework for Clustering Analysis of Cancer Data.

    Science.gov (United States)

    Yu, Zhiwen; Chen, Hantao; You, Jane; Liu, Jiming; Wong, Hau-San; Han, Guoqiang; Li, Le

    2015-01-01

    Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While quite a number of research works perform tumor clustering, few of them consider how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and further improve its performance by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE work well on these data sets and outperform most of the state-of-the-art tumor clustering algorithms.

  1. Presence of CTX gene cluster in environmental non-O1/O139 Vibrio cholerae and its potential clinical significance

    Directory of Open Access Journals (Sweden)

    B Bakhshi

    2012-01-01

    Full Text Available Purpose: The aim of this study was to understand the epidemiological linkage of clinical and environmental isolates of Vibrio cholerae and to determine their genotypes and virulence gene content. Materials and Methods: A total of 60 V. cholerae strains obtained from clinical specimens (n = 40) and surface waters (n = 20) were subjected to genotyping using PFGE and determination of their virulence-associated gene clusters. Result: PCR analysis showed the presence of chromosomally located hly and RTX genetic elements in 100% and 90% of the environmental isolates, respectively. The phage-mediated genetic elements such as CTX, TLC and VPI were detected in 5% of the environmental isolates suggesting that the environmental isolates cannot acquire certain mobile gene clusters. A total of 4 and 18 pulsotypes were obtained among the clinical and environmental V. cholerae isolates, respectively. Non-pathogenic environmentally isolated V. cholerae constituted a distinct cluster with one single non-O1, non-O139 strain (EP6) carrying virulence genes similar to those of the epidemic strains. This may suggest the potential for conversion of a non-pathogenic to a pathogenic environmental strain. Conclusions: The emergence of a single environmental isolate in our study containing the pathogenicity genes amongst the diverse non-pathogenic environmental isolates needs to be further studied in the context of V. cholerae pathogenicity sero-conversion.

  2. Application of cluster analysis for data driven market segmentation ...

    African Journals Online (AJOL)

    This research work sets out to capture which standards of application of cluster analysis have emerged in the academic marketing literature, to compare their standards of applying the methodological knowledge about clustering procedures, and to delineate sudden changes in clustering habits. These goals are achieved by ...

  3. Cluster analysis of word frequency dynamics

    Science.gov (United States)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  4. Cluster analysis of word frequency dynamics

    International Nuclear Information System (INIS)

    Maslennikova, Yu S; Bochkarev, V V; Belashova, I A

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  5. Segmentation of Residential Gas Consumers Using Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Marta P. Fernandes

    2017-12-01

    Full Text Available The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.

  6. From virtual clustering analysis to self-consistent clustering analysis: a mathematical study

    Science.gov (United States)

    Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam

    2018-03-01

    In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), and provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates a substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows, for the first time, a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and is found to be of second order in numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of Saint-Venant's principle. In contrast, these terms were not obtained in the original SCA paper, and we find that they may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances accuracy by overcoming the modeling error, and reduces the numerical cost by avoiding the outer loop iteration needed to attain material property consistency in SCA, its efficiency is expected to be even higher than that of the recently proposed SCA algorithm.

  7. Environmental conditions analysis program

    International Nuclear Information System (INIS)

    Holten, J.

    1991-01-01

    The PC-based program discussed in this paper has the capability of determining the steady state temperatures of environmental zones (rooms). A program overview will be provided along with examples of formula use. Required input and output from the program will also be discussed. Specific application of plant monitored temperatures and utilization of this program will be offered. The presentation will show how the program can project individual room temperature profiles without continual temperature monitoring of equipment. A discussion will also be provided for the application of the program generated data. Evaluations of anticipated or planned plant modifications and the use of the subject program will also be covered

  8. Cluster-based exposure variation analysis

    Science.gov (United States)

    2013-01-01

    Background Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. Methods For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA and by a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied to the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The smallest number of principal components describing more than 90% of variability in each case was selected, and the projection of marginal distributions along the selected principal components was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. Results C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate

  9. An analysis of hospital brand mark clusters.

    Science.gov (United States)

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan

    2010-07-01

    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  10. Smartness and Italian Cities. A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Flavio Boscacci

    2014-05-01

    Full Text Available Smart cities have recently been recognized as the most pleasing and attractive places to live in; because of this, both scholars and policy-makers pay close attention to the topic. Specifically, urban “smartness” has been identified by many characteristics that can be grouped into six dimensions (Giffinger et al. 2007): smart Economy (competitiveness), smart People (social and human capital), smart Governance (participation), smart Mobility (both ICTs and transport), smart Environment (natural resources), and smart Living (quality of life). According to this analytical framework, the present paper investigates the relation between urban attractiveness and the “smart” characteristics in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, descriptive statistics were followed by a regression analysis (OLS), where the dependent variable measuring urban attractiveness was proxied by housing market prices. In addition, a Cluster Analysis (CA) was developed in order to find differences and commonalities among the province capitals. The OLS results indicate that living, people and economy are the key drivers for achieving better urban attractiveness. Environment, instead, continues to play a minor role. In addition, the CA groups the province capitals a

  11. UV TO FAR-IR CATALOG OF A GALAXY SAMPLE IN NEARBY CLUSTERS: SPECTRAL ENERGY DISTRIBUTIONS AND ENVIRONMENTAL TRENDS

    International Nuclear Information System (INIS)

    Hernández-Fernández, Jonathan D.; Iglesias-Páramo, J.; Vílchez, J. M.

    2012-01-01

    In this paper, we present a sample of cluster galaxies devoted to studying the environmental influence on star formation activity. The galaxies in this sample inhabit clusters showing a rich variety in their characteristics and have been observed by the SDSS-DR6 down to M_B ∼ −18, and by the Galaxy Evolution Explorer AIS throughout sky regions corresponding to several megaparsecs. We assign the broadband and emission-line fluxes from ultraviolet to far-infrared to each galaxy performing an accurate spectral energy distribution for spectral fitting analysis. The clusters follow the general X-ray luminosity versus velocity dispersion trend of L_X ∝ σ_c^4.4. The analysis of the distributions of galaxy density counting up to the 5th nearest neighbor, Σ_5, shows: (1) the virial regions and the cluster outskirts share a common range in the high density part of the distribution. This can be attributed to the presence of massive galaxy structures in the surroundings of virial regions. (2) The virial regions of massive clusters (σ_c > 550 km s^−1) present a Σ_5 distribution statistically distinguishable (∼96%) from the corresponding distribution of low-mass clusters (σ_c < 550 km s^−1). Both massive and low-mass clusters follow a similar density-radius trend, but the low-mass clusters avoid the high density extreme. We illustrate, with ABELL 1185, the environmental trends of galaxy populations. Maps of sky projected galaxy density show how low-luminosity star-forming galaxies appear distributed along more spread structures than their giant counterparts, whereas low-luminosity passive galaxies avoid the low-density environment. Giant passive and star-forming galaxies share rather similar sky regions, with passive galaxies exhibiting more concentrated distributions.

  12. The Psychology of Yoga Practitioners: A Cluster Analysis.

    Science.gov (United States)

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-11-01

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters. Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A had the most years of yoga experience, and members of Cluster B had more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.
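
    The between-cluster comparison of years of practice relies on the Kruskal-Wallis test. As a minimal sketch of how that statistic is formed, the code below computes H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1) from pooled midranks. The years-of-practice values are invented for illustration and are not the study's data; the tie correction and the chi-square p-value are deliberately left out.

```python
def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [v for group in groups for v in group]
    ranks = midranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical years of yoga practice for three clusters (illustration only).
cluster_a = [12, 15, 9, 20]
cluster_b = [6, 8, 10, 7]
cluster_c = [1, 3, 2, 4]
print(kruskal_wallis_h([cluster_a, cluster_b, cluster_c]))
```

    With three groups, H would be referred to a chi-square distribution with two degrees of freedom to obtain a p-value.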

  13. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    Science.gov (United States)

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  14. Using Cluster Analysis for Data Mining in Educational Technology Research

    Science.gov (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  15. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...

  16. Visual cluster analysis and pattern recognition methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  17. Year clustering analysis for modelling olive flowering phenology

    Science.gov (United States)

    Oteros, J.; García-Mozo, H.; Hervás-Martínez, C.; Galán, C.

    2013-07-01

    It is now widely accepted that weather conditions occurring several months prior to the onset of flowering have a major influence on various aspects of olive reproductive phenology, including flowering intensity. Given the variable characteristics of the Mediterranean climate, we analyse its influence on the registered variations in olive flowering intensity in southern Spain, and relate them to previous climatic parameters using a year-clustering approach, as a first step towards an olive flowering phenology model adapted to different year categories. Phenological data from Cordoba province (Southern Spain) for a 30-year period (1982-2011) were analysed. Meteorological and phenological data were first subjected to both hierarchical and "K-means" clustering analysis, which yielded four year-categories. For this classification purpose, three different models were tested: (1) discriminant analysis; (2) decision-tree analysis; and (3) neural network analysis. Comparison of the results showed that the neural-networks model was the most effective, classifying four different year categories with clearly distinct weather features. Flowering-intensity models were constructed for each year category using the partial least squares regression method. These category-specific models proved to be more effective than general models. They are better suited to the variability of the Mediterranean climate, due to the different response of plants to the same environmental stimuli depending on the previous weather conditions in any given year. The present detailed analysis of the influence of weather patterns of different years on olive phenology will help us to understand the short-term effects of climate change on olive crop in the Mediterranean area that is highly affected by it.

  18. Genetic analysis of loose cluster architecture in grapevine

    Directory of Open Access Journals (Sweden)

    Richter Robert

    2017-01-01

    Loose cluster architecture is a well-known trait supporting Botrytis resilience by permitting faster drying of bunches. Furthermore, a loose bunch enables better application of fungicides into the cluster. A population of 150 F1 plants of the superior breeding line GF.GA-47-42 ('Bacchus' x 'Seyval blanc') crossed with 'Villard blanc', segregating for compactness of the cluster, was used for QTL analysis. Many QTLs were identified reproducibly over two years; QTLs stable over three growing seasons were identified for rachis length, peduncle length, and pedicel length. In a second approach, 'Pinot noir' clones showing variation in cluster architecture were analyzed for differential gene expression. In gene expression experiments on loose versus compact clustered 'Pinot noir' clones grown in three different German viticultural areas, a candidate gene was expressed fivefold higher in loosely clustered clones between stages BBCH57 and BBCH71.

  19. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    Science.gov (United States)

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
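
    The idea of partial cluster membership can be illustrated with the standard fuzzy c-means membership update. This is a minimal sketch, not the regularized fuzzy k-means the abstract proposes: the function name, the Euclidean distance, and the fixed fuzzifier m = 2 are assumptions made for illustration, whereas the proposed method is described as choosing the degree of fuzziness automatically.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster (standard fuzzy c-means rule).

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the Euclidean
    distance from the point to center i and m > 1 is the fuzzifier.
    """
    dists = [
        sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
        for center in centers
    ]
    # A point lying exactly on one or more centers belongs only to those.
    on_center = [d == 0 for d in dists]
    if any(on_center):
        share = 1.0 / sum(on_center)
        return [share if hit else 0.0 for hit in on_center]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
```

    Memberships always sum to 1, and larger values of m spread them more evenly across clusters; hard classification is the limiting case as m approaches 1.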

  20. EM Clustering Analysis of Diabetes Patients Basic Diagnosis Index

    OpenAIRE

    Wu, Cai; Steinbauer, Jeffrey R.; Kuo, Grace M.

    2005-01-01

    Cluster analysis can group similar instances into the same group. Partitioning clustering assigns classes to samples without knowing the classes in advance. The most common algorithms are K-means and Expectation Maximization (EM). The EM clustering algorithm can find the number of distributions generating the data and build “mixture models”. It identifies groups that are either overlapping or of varying sizes and shapes. In this project, by using EM in Machine Learning Algorithm in JAVA (WEKA) syste...

  1. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  2. Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.

    Science.gov (United States)

    Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun

    2017-12-01

    Allergens tend to sensitize simultaneously. Etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. We aimed to investigate the allergen sensitization characteristics according to gender. Multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, 39 items were grouped into 8 clusters. Each cluster had characteristic features. Compared with the female group, the male group tended to be sensitized more frequently to all tested allergens, except for the fungus allergen cluster. The cluster and comparative analysis results demonstrate that allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize the female group more frequently than the male group.

  3. A critical cluster analysis of 44 indicators of author-level performance

    DEFF Research Database (Denmark)

    Wildgaard, Lorna Elizabeth

    2015-01-01

    Publication and citation data for 741 researchers across Astronomy, Environmental Science, Philosophy and Public Health was collected in Web of Science (WoS). Forty-four indicators of individual performance were computed using the data. A two-step cluster analysis using IBM SPSS version 22 was performed...

  4. Entropic Approach to Multiscale Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Antonio Insolia

    2012-05-01

    Recently, a novel method has been introduced to estimate the statistical significance of clustering in the direction distribution of objects. The method involves a multiscale procedure, based on the Kullback–Leibler divergence and the Gumbel statistics of extreme values, providing high discrimination power, even in the presence of strong background isotropic contamination. It is shown that the method is: (i) semi-analytical, drastically reducing computation time; (ii) very sensitive to small, medium and large scale clustering; (iii) not biased against the null hypothesis. Applications to the physics of ultra-high energy cosmic rays, as a cosmological probe, are presented and discussed.
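
    The Kullback–Leibler divergence at the core of this approach measures how far an observed distribution departs from a reference one. The following is a minimal sketch for binned counts, assuming a hypothetical setting in which arrival directions have already been counted in equal-area sky bins and the reference is the isotropic (uniform) expectation; it does not reproduce the multiscale procedure or the Gumbel-based significance estimate described in the abstract.

```python
import math


def normalise(counts):
    """Turn non-negative bin counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D_KL(p || q) in nats; bins with p_i = 0 contribute nothing.

    Assumes q_i > 0 wherever p_i > 0 (otherwise the divergence is infinite).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical counts in four equal-area bins versus an isotropic reference.
observed = normalise([40, 20, 20, 20])
isotropic = normalise([1, 1, 1, 1])
excess = kl_divergence(observed, isotropic)
```

    The divergence is zero only when the two distributions coincide, so larger values of `excess` indicate a stronger departure from isotropy at the chosen bin scale.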

  5. Uncertainty analysis of environmental models

    International Nuclear Information System (INIS)

    Monte, L.

    1990-01-01

    In the present paper, an evaluation of the output uncertainty of an environmental model for assessing the transfer of 137Cs and 131I in the human food chain is carried out on the basis of a statistical analysis of data reported in the literature. The uncertainty analysis offers the opportunity of obtaining some remarkable information about the uncertainty of models predicting the migration of non-radioactive substances in the environment, mainly in relation to dry and wet deposition.

  6. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
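
    Of the information recovery metrics named above, the adjusted Rand score is the simplest to compute by hand: it counts pairs of nodes on which two partitions agree and corrects for the agreement expected by chance. The sketch below is a plain-Python illustration of the usual pair-counting formula; the function name is illustrative and the code is not taken from the study (it needs Python 3.8+ for `math.comb` and at least two items).

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_true)
    # Contingency counts: items sharing both a true and a predicted label.
    joint = Counter(zip(labels_true, labels_pred))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_joint - expected) / (max_index - expected)
```

    A score of 1 means the partitions are identical up to relabelling, values near 0 mean agreement no better than chance, and negative values mean worse-than-chance agreement.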

  7. The Reliability of Inverse Scree Tests for Cluster Analysis.

    Science.gov (United States)

    Lathrop, Richard G.; Williams, Janice E.

    1987-01-01

    A Monte Carlo study, involving 6,000 "computer subjects" and three raters, explored the reliability of the inverse scree test for cluster analysis. Results indicate that the inverse scree may be a useful and reliable cluster analytic technique for determining the number of true groups. (TJH)

  8. Blaeu: Mapping and navigating large tables with cluster analysis

    NARCIS (Netherlands)

    T.H.J. Sellam (Thibault); C.P. Cijvat (Robin); R.A. Koopmanschap (Richard); M.L. Kersten (Martin)

    2016-01-01

    Blaeu is an interactive database exploration tool. Its aim is to guide casual users through large data tables, ultimately triggering insights and serendipity. To do so, it relies on a double cluster analysis mechanism. It clusters the data vertically: it detects themes, groups of

  9. SUPPLY CHAIN ANALYSIS AND PERFORMANCE ASSESSMENT OF SME FISHERIES CLUSTERS

    Directory of Open Access Journals (Sweden)

    Anton Agus Setyawan

    2017-12-01

    Studies of SMEs in Indonesia relate business networks to performance in these business organizations. In many cases, regional administrations in Indonesia develop SME business networks in the form of clusters. This study analyzes SME fisheries clusters using supply chain analysis. We also develop a performance assessment of the SME fisheries cluster using a multivariate model. The study involves 62 SMEs in Sragen, Central Java, Indonesia, all of which belong to the fisheries cluster in the area. Our findings show that the SME fisheries cluster has an inefficient supply chain. The cluster has problems in profit setting and delivery time, which harm its performance. We measure business performance using sales, profit rate and asset growth. We found that cost structure, manpower and physical production have positive effects on business performance.

  10. Merging Galaxy Clusters: Analysis of Simulated Analogs

    Science.gov (United States)

    Nguyen, Jayke; Wittman, David; Cornell, Hunter

    2018-01-01

    The nature of dark matter can be better constrained by observing merging galaxy clusters. However, uncertainty in the viewing angle leads to uncertainty in dynamical quantities such as 3-d velocities, 3-d separations, and time since pericenter. The classic timing argument links these quantities via equations of motion, but neglects effects of nonzero impact parameter (i.e. it assumes velocities are parallel to the separation vector), dynamical friction, substructure, and larger-scale environment. We present a new approach using n-body cosmological simulations that naturally incorporate these effects. By uniformly sampling viewing angles about simulated cluster analogs, we see projected merger parameters in the many possible configurations of a given cluster. We select comparable simulated analogs and evaluate the likelihood of particular merger parameters as a function of viewing angle. We present viewing angle constraints for a sample of observed mergers including the Bullet cluster and El Gordo, and show that the separation vectors are closer to the plane of the sky than previously reported.

  11. Genotypic stability and clustering analysis of confectionery ...

    African Journals Online (AJOL)

    Nine groundnut genotypes were evaluated in terminal moisture-stress areas of northeastern Ethiopia during 2005 and 2006 cropping seasons with the objective of analyzing genotypic stability and clustering of confectionery groundnut for seed and protein yield. The genotypes were evaluated on a plot size of 15 m2 at Kobo ...

  12. Timor-Leste : Country Environmental Analysis

    OpenAIRE

    World Bank

    2009-01-01

    The Country Environmental Analysis (CEA) for Timor-Leste identifies environmental priorities through a systematic review of environmental issues in natural resources management and environmental health in the context of the country's economic development and environmental institutions. Lack of data has been the main limitation in presenting a more rigorous analysis. Nevertheless, the repor...

  13. Environmental sampling for trace analysis

    International Nuclear Information System (INIS)

    Markert, B.

    1994-01-01

    Often too little attention is given to the sampling before and after actual instrumental measurement. This leads to errors, despite increasingly sensitive analytical systems. This is one of the first books to pay proper attention to representative sampling. It offers an overview of the most common techniques used today for taking environmental samples. The techniques are clearly presented, yield accurate and reproducible results and can be used to sample air, water, soil and sediments, and plants and animals. A comprehensive handbook, this volume provides an excellent starting point for researchers in the rapidly expanding field of environmental analysis. (orig.)

  14. Applications of a Novel Clustering Approach Using Non-Negative Matrix Factorization to Environmental Research in Public Health.

    Science.gov (United States)

    Fogel, Paul; Gaston-Mathé, Yann; Hawkins, Douglas; Fogel, Fajwel; Luta, George; Young, S Stanley

    2016-05-18

    Often data can be represented as a matrix, e.g., observations as rows and variables as columns, or as a doubly classified contingency table. Researchers may be interested in clustering the observations, the variables, or both. If the data is non-negative, then Non-negative Matrix Factorization (NMF) can be used to perform the clustering. By its nature, NMF-based clustering is focused on the large values. If the data is normalized by subtracting the row/column means, it becomes of mixed signs and the original NMF cannot be used. Our idea is to split and then concatenate the positive and negative parts of the matrix, after taking the absolute value of the negative elements. NMF applied to the concatenated data, which we call PosNegNMF, offers the advantages of the original NMF approach, while giving equal weight to large and small values. We use two public health datasets to illustrate the new method and compare it with alternative clustering methods, such as K-means and clustering methods based on the Singular Value Decomposition (SVD) or Principal Component Analysis (PCA). With the exception of situations where a reasonably accurate factorization can be achieved using the first SVD component, we recommend that the epidemiologists and environmental scientists use the new method to obtain clusters with improved quality and interpretability.
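
    The preprocessing step described above (centering, then splitting the mixed-sign matrix into non-negative positive and negative parts and concatenating them) can be sketched in a few lines. This is an illustration under assumptions rather than the authors' implementation: the function names are hypothetical, only column centering is shown, and the two parts are concatenated column-wise, a detail the abstract does not specify. The factorization itself is not shown.

```python
def center_columns(matrix):
    """Subtract each column's mean, producing a matrix of mixed signs."""
    n_rows = len(matrix)
    means = [sum(col) / n_rows for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def split_pos_neg(matrix):
    """Return [positive part | absolute value of negative part] per row.

    Every entry of the result is non-negative, so a standard NMF routine
    could be applied to it (assumed column-wise concatenation).
    """
    return [
        [max(x, 0.0) for x in row] + [max(-x, 0.0) for x in row]
        for row in matrix
    ]


# Hypothetical 2 x 2 data: observations as rows, variables as columns.
data = [[1.0, 2.0], [3.0, 6.0]]
non_negative = split_pos_neg(center_columns(data))
```

    Each original variable now appears twice, once for values above the mean and once for values below it, which is how small values come to carry the same weight as large ones.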

  15. Prognostic value of cluster analysis of severe asthma phenotypes.

    Science.gov (United States)

    Bourdin, Arnaud; Molinari, Nicolas; Vachier, Isabelle; Varrin, Muriel; Marin, Grégory; Gamez, Anne-Sophie; Paganin, Fabrice; Chanez, Pascal

    2014-11-01

    Cross-sectional severe asthma cluster analysis identified different phenotypes. We tested the hypothesis that these clusters will follow different courses. We aimed to identify which asthma outcomes are specific and coherently associated with these different phenotypes in a prospective longitudinal cohort. In a longitudinal cohort of 112 patients with severe asthma, the 5 Severe Asthma Research Program (SARP) clusters were identified by means of algorithm application. Because patients of the present cohort all had severe asthma compared with the SARP cohort, homemade clusters were identified and also tested. At the subsequent visit, we investigated several outcomes related to asthma control at 1 year (6-item Asthma Control Questionnaire [ACQ-6], lung function, and medication requirement) and then recorded the 3-year exacerbations rate and time to first exacerbation. The SARP algorithm discriminated the 5 clusters at entry for age, asthma duration, lung function, blood eosinophil measurement, ACQ-6 scores, and diabetes comorbidity. Four homemade clusters were mostly segregated by best ever achieved FEV1 values and discriminated the groups by a few clinical characteristics. Nonetheless, all these clusters shared similar asthma outcomes related to asthma control as follows. The ACQ-6 score did not change in any cluster. Exacerbation rate and time to first exacerbation were similar, as were treatment requirements. Severe asthma phenotypes identified by using a previously reported cluster analysis or newly homemade clusters do not behave differently concerning asthma control-related outcomes, which are used to assess the response to innovative therapies. This study demonstrates a potential limitation of the cluster analysis approach in the field of severe asthma.

  16. Analysis of Aspects of Innovation in a Brazilian Cluster

    Directory of Open Access Journals (Sweden)

    Adriana Valélia Saraceni

    2012-09-01

    Innovation through clustering has become very important given the increased significance of interaction in the concepts of innovation and the learning process. This study aims to identify what a case analysis of the innovation process in a cluster reveals about the learning process. The study is therefore developed in two stages. In the first stage, we used a preliminary case study to verify a cluster innovation analysis and its Innovation Index, and then explored a combined body of theory and practice. The second stage explores the learning process concept. Together, the two stages allowed us to build a theoretical model of learning process development in clusters. The main result of the model development is a mechanism for implementing improvements in clusters when case studies are applied.

  17. Describing the homeless mentally ill: cluster analysis results.

    Science.gov (United States)

    Mowbray, C T; Bybee, D; Cohen, E

    1993-02-01

    Presented descriptive data on a group of homeless, mentally ill individuals (N = 108) served by a two-site demonstration project, funded by NIMH. Comparing results with those from other studies of this population produced some differences and some similarities. Cluster analysis techniques were applied to the data, producing a 4-group solution. Data validating the cluster solution are presented. It is suggested that the cluster results provide a more meaningful and useful method of understanding the descriptive data. Results suggest that while the population of individuals served as homeless and mentally ill is quite heterogeneous, many have well-developed functioning skills--only one cluster, making up 35.2% of the sample, fits the stereotype of the aggressive, psychotic individual with skill deficits in many areas. Further discussion is presented concerning the implications of the cluster analysis results for demonstrating contextual effects and thus better interpreting research results from other studies and assisting in future services planning.

  18. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and Ant clustering algorithms for real document clustering.

  19. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    Science.gov (United States)

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  20. Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia

    Directory of Open Access Journals (Sweden)

    Nazarudin Safian

    2008-09-01

    Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step to developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or whether they were evenly distributed throughout the area. We also wanted to know at what spatial distances they were clustered. All confirmed typhoid cases that were reported to the Kota Bharu District Health Department from the year 2001 to June of 2005 were taken as the samples. From the home address of the cases, the location of the house was traced and a coordinate was taken using handheld GPS devices. Spatial statistical analysis was done to determine the distribution of typhoid cases, whether clustered, random or dispersed. The spatial statistical analysis was done using CrimeStat III software to determine whether typhoid cases occur in clusters, and later on to determine at what distances they clustered. Among the 736 cases involved in the study, there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005. There was no significant clustering in year 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable to describe spatial epidemiology for a specific area. (Med J Indones 2008; 17: 175-82) Keywords: typhoid, clustering, spatial epidemiology, GIS

  1. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

    Science.gov (United States)

    de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

    2006-01-01

    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…
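
    The tendency toward spherical, similarly sized clusters follows from what K-means optimizes: each point is assigned to the nearest center by squared Euclidean distance, and each center is then moved to the mean of its points. A minimal plain-Python sketch of this loop (Lloyd's algorithm) is given below; the function names and the caller-supplied starting centers are assumptions for illustration and are unrelated to the simulation design of the study.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centers, max_iter=100):
    """Lloyd's algorithm from given starting centers.

    Returns (labels, centers). A cluster that loses all its points keeps
    its previous center.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(coord) / len(members) for coord in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

    Because the assignment boundary between two centers is the perpendicular bisector of the segment joining them, elongated clusters or clusters of very unequal size tend to be cut in ways that do not match the true groups.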

  2. Understanding clusters of risk factors across different environmental and social contexts for the prediction of injuries among Canadian youth.

    Science.gov (United States)

    Russell, K; Davison, C; King, N; Pike, I; Pickett, W

    2016-05-01

    Among Canadian youth, injury is the most common reason for presentation to the emergency department. Youth who commonly engage in multiple risk-taking behaviours are at greater risk for injury, but it is unknown if this phenomenon is more pronounced in different contexts. We aimed to study relationships between risk-taking behaviours and injury, and variations in such relationships between different environmental and social contexts, among youth in Canada. Risk-taking behaviour and injury outcome data were collected from grade 9 to 10 students using the 2009-2010 (Cycle 6) of the Health Behaviour in School-Aged Children Survey (n=10,429). Principal components analysis was used to identify clusters of risk-taking behaviours. Within each identified cluster, the degree of risk-taking was categorized into quartiles from lowest to highest engagement in the behaviours. Risk ratios with 95% confidence intervals were calculated to determine the association between the risk of any injury and the degree of risk-taking behaviour specific to the cluster. Clusters were then examined across home, school, neighbourhood and sport contexts. Four clusters of risk-taking behaviour were identified which were labelled as "gateway substance use", "hard drugs and weapons", "overt risk-taking", and "physical activity". Each cluster was related to injury occurrence in a graded fashion. Clusters of risk behaviour were most strongly associated with injuries sustained in neighbourhood settings, and expectedly, increasing physical activity behaviours were associated with increased risk of sport injuries and injuries occurring at school. This study furthers understanding of clustered risk-taking phenomena that put youth at increasing levels of injury risk. Higher risks for injury and associated gradients were observed in less structured contexts such as neighbourhoods. In contrast, clustered physical activity behaviours were most related to school injury or sport injury and were more likely to

  3. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  4. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited by computational complexity, noise resistance and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  5. A Distributed Flocking Approach for Information Stream Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require a large amount of computational resources and a long time to produce accurate results. It is very difficult to cluster a dynamically changing text information stream on an individual computer. Our earlier research resulted in a dynamic reactive flock clustering algorithm that can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as a text information stream. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.

  6. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  7. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...

  8. Emergy-based comparative analysis on industrial clusters: economic and technological development zone of Shenyang area, China.

    Science.gov (United States)

    Liu, Zhe; Geng, Yong; Zhang, Pan; Dong, Huijuan; Liu, Zuoxi

    2014-09-01

    In China, local governments in many areas prioritize the development of heavy industrial clusters in pursuit of high gross domestic product (GDP) growth as a political achievement, which usually results in higher costs from ecological degradation and environmental pollution. Therefore, effective methods and a reasonable evaluation system are urgently needed to evaluate the overall efficiency of industrial clusters. The emergy method links economic and ecological systems together and can evaluate the contribution of ecological products and services as well as the load placed on environmental systems. This method has been successfully applied in many case studies of ecosystems but seldom to industrial clusters. This study applied emergy analysis to assess the efficiency of industrial clusters through a series of emergy-based indices as well as the proposed indicators. A case study of the Shenyang Economic Technological Development Area (SETDA) was investigated to show the emergy method's practical potential for evaluating industrial clusters to inform environmental policy making. The results of our study showed that the industrial cluster of electric equipment and electronic manufacturing produced the most economic value and had the highest efficiency of energy utilization among the four industrial clusters. However, the sustainability index of the industrial cluster of food and beverage processing was better than that of the other industrial clusters.

  9. Development of small scale cluster computer for numerical analysis

    Science.gov (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs Ubuntu 14.04 Linux with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test was done to make sure that the computers are able to pass the required information without any problem, using a simple MPI Hello program written in C. In addition, a performance test was done to show that the cluster's calculation performance is much better than that of a single-CPU computer. In this performance test, the same code was run in four configurations: a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors, the time required to solve the problem decreases. The time required for the calculation is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer using common hardware which is capable of higher computing power than a single-CPU computer, and this can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
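
    The scaling claim in this abstract (run time halves when the processor count doubles) can be checked with the usual speedup and parallel-efficiency ratios. The sketch below is only an illustration: the function name `speedup_table` and the timing values are hypothetical, the abstract reports no actual run times, and the "single node" run is assumed here to mean a one-processor baseline.

```python
def speedup_table(timings):
    """timings maps processor count -> wall time in seconds; key 1 is the baseline."""
    baseline = timings[1]
    rows = []
    for procs in sorted(timings):
        speedup = baseline / timings[procs]  # how many times faster than 1 processor
        efficiency = speedup / procs         # 1.0 means perfectly linear scaling
        rows.append((procs, speedup, efficiency))
    return rows


# Hypothetical timings chosen to match "time halves when processors double".
example_timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}
example_rows = speedup_table(example_timings)
```

    With these made-up numbers every row has an efficiency of 1.0; measured timings on a real two-node cluster would normally fall somewhat below that because of communication overhead between the machines.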

  10. Fuzzy clustering analysis to study geomagnetic coastal effects

    Directory of Open Access Journals (Sweden)

    M. Sridharan

    2005-06-01

    The utility of fuzzy set theory in cluster analysis and pattern recognition has been evolving since the mid 1960s, in conjunction with the emergence and evolution of computer technology. The classification of objects into categories is the subject of cluster analysis. The aim of this paper is to employ a fuzzy clustering technique to examine the interrelationship of geomagnetic coastal and other effects at Indian observatories. The data used in the present study come from the observatories at Alibag on the West Coast, Visakhapatnam and Pondicherry on the East Coast, and Hyderabad and Nagpur as central inland stations located far from either coast; all of these stations are free from the influence of the daytime equatorial electrojet. It has been found that the Alibag and Pondicherry observatories form a separate cluster showing anomalous variations in the vertical (Z) component. The H- and D-components form different clusters. The results are compared with the graphical method. The analytical technique and the results of the fuzzy clustering analysis are discussed here.
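
    What distinguishes fuzzy clustering from hard clustering is that each object receives a degree of membership in every cluster rather than a single label. The abstract does not say which fuzzy algorithm was used, so the sketch below assumes the widely used fuzzy c-means membership formula purely as an illustration; the function name `fcm_memberships`, the fuzzifier default `m=2.0` and the sample coordinates are all hypothetical.

```python
from math import dist


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (fuzzifier m > 1)."""
    distances = [dist(point, c) for c in centers]
    if 0.0 in distances:
        # The point sits exactly on a center: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)); the memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances) for d_i in distances]


# Hypothetical example: a point one unit from the first center, two from the second.
example_u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

    In this toy case the nearer center receives a membership of about 0.8 and the farther one about 0.2, which is the kind of graded assignment that lets a station be described as partly similar to more than one group.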

  11. Nonlinear dimensionality reduction of gene expression data for visualization and clustering analysis of cancer tissue samples.

    Science.gov (United States)

    Shi, Jinlong; Luo, Zhigang

    2010-08-01

    Gene expression data are the representation of nonlinear interactions among genes and environmental factors. Computational analysis of these data is expected to yield knowledge of gene functions and disease mechanisms. Clustering is a classical exploratory technique for discovering similar expression patterns and function modules. However, gene expression data are usually of high dimension with relatively few samples, which is the main difficulty in applying clustering algorithms. Principal component analysis (PCA) is usually used to reduce the data dimensions for further clustering analysis. However, PCA estimates the similarity between expression profiles based on the Euclidean distance, which cannot reveal the nonlinear connections between genes. This paper uses nonlinear dimensionality reduction (NDR) as a preprocessing strategy for feature selection and visualization, and then applies clustering algorithms to the reduced feature spaces. In order to estimate the effectiveness of NDR for capturing biologically relevant structures, a comparative analysis between NDR and PCA is applied to five real cancer expression datasets. Results show that NDR can perform better than PCA in visualization and clustering analysis of complex gene expression data. Copyright 2010 Elsevier Ltd. All rights reserved.

  12. Using ICD for structural analysis of clusters: a case study on NeAr clusters

    Science.gov (United States)

    Fasshauer, E.; Förstel, M.; Pallmann, S.; Pernpointner, M.; Hergenhahn, U.

    2014-10-01

    We present a method to utilize interatomic Coulombic decay (ICD) to retrieve information about the mean geometric structures of heteronuclear clusters. It is based on observation and modelling of competing ICD channels, which involve the same initial vacancy, but energetically different final states with vacancies in different components of the cluster. Using binary rare gas clusters of Ne and Ar as an example, we measure the relative intensity of ICD into (Ne⁺)₂ and Ne⁺Ar⁺ final states with spectroscopically well separated ICD peaks. We compare in detail the experimental ratios of the Ne-Ne and Ne-Ar ICD contributions and their positions and widths to values calculated for a diverse set of possible structures. We conclude that NeAr clusters exhibit a core-shell structure with an argon core surrounded by complete neon shells and, possibly, further an incomplete shell of neon atoms for the experimental conditions investigated. Our analysis allows one to differentiate between clusters of similar size and stoichiometric Ar content, but different internal structure. We find evidence for ICD of Ne 2s⁻¹, producing Ar⁺ vacancies in the second coordination shell of the initial site.

  13. ENVIRONMENTAL EFFECTS IN CLUSTERS: MODIFIED FAR-INFRARED-RADIO RELATIONS WITHIN VIRGO CLUSTER GALAXIES

    International Nuclear Information System (INIS)

    Murphy, E. J.; Kenney, J. D. P.; Helou, G.; Chung, A.; Howell, J. H.

    2009-01-01

    We present a study on the effects of the intracluster medium (ICM) on the interstellar medium (ISM) of 10 Virgo Cluster galaxies using Spitzer far-infrared (FIR) and Very Large Array radio continuum imaging. Relying on the FIR-radio correlation within normal galaxies, we use our infrared data to create model radio maps, which we compare to the observed radio images. For six of our sample galaxies, we find regions along their outer edges that are highly deficient in the radio compared with our models. We also detect FIR emission slightly beyond the observed radio disk along these outer edges. We believe these observations are the signatures of ICM ram pressure. For NGC 4522, we find the radio-deficit region to lie just exterior to a region of high radio polarization and flat radio spectral index, although the total 20 cm radio continuum in this region does not appear strongly enhanced. These characteristics seem consistent for other galaxies with radio polarization data in the literature. The strength of the radio deficit is inversely correlated with the time since peak pressure as inferred from stellar population studies and gas-stripping simulations, consistent with the strength of the radio deficit being a good indicator of the strength of the current ram pressure. We also find that galaxies having local radio deficits appear to have enhanced global radio fluxes. Our preferred physical picture is that the observed radio-deficit regions arise from the ICM wind sweeping away cosmic-ray (CR) electrons and the associated magnetic field, thereby creating synchrotron tails as observed for some of our galaxies. We propose that CR particles are also reaccelerated by ICM-driven shocklets behind the observed radio-deficit regions which, in turn, enhances the remaining radio disk brightness. The high radio polarization and lack of precisely coincident enhancement in the total synchrotron power for these regions suggest shearing, and possibly mild compression of the magnetic

  14. Cluster Analytical Method of Fault Risk Analysis in Systems

    Science.gov (United States)

    Michaľčonok, German; Horalová Kalinová, Michaela

    2016-12-01

    In providing safety functions, the design of the safety functions of control systems is an important part of a risk reduction strategy. In the specification of safety requirements, it is necessary to determine and document the individual characteristics and the desired performance level for each safety function. This article presents the results of a cluster analysis experiment. The results of the experiment show that the methods of cluster analysis provide a suitable tool for analyzing the reliability of safety systems. Given the increasing complexity of these systems, we can state that the application of these methods in the subject area is a good choice.

  15. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    Science.gov (United States)

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of higher education. Many universities have made progress in evaluating teaching quality. In this paper, we establish a students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  16. Proteome Profiling of Vitreoretinal Diseases by Cluster Analysis

    OpenAIRE

    Shitama, Tomomi; Hayashi, Hideyuki; Noge, Sumiyo; Uchio, Eiichi; Oshima, Kenji; Haniu, Hisao; Takemori, Nobuaki; Komori, Naoka; Matsumoto, Hiroyuki

    2008-01-01

    Vitreous samples collected in retinopathic surgeries have diverse properties, making proteomics analysis difficult. We report a cluster analysis to evade this difficulty. Vitreous and subretinal fluid samples were collected from 60 patients during surgical operation of non-proliferative diabetic retinopathy, proliferative diabetic retinopathy, proliferative vitreoretinopathy, and rhegmatogenous retinal detachment. For controls we collected vitreous fluid from patients of idiopathic macular ho...

  17. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens

    2009-07-01

    Background: The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods: We used four cluster analysis methods (single, average and complete linkage, as well as the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R², the cubic cluster criterion, the pseudo-F statistic and the pseudo-t² statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results: The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion: Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.

  18. Breast cancer clustering in Kanagawa, Japan: a geographic analysis.

    Science.gov (United States)

    Katayama, Kayoko; Yokoyama, Kazuhito; Yako-Suketomo, Hiroko; Okamoto, Naoyuki; Tango, Toshiro; Inaba, Yutaka

    2014-01-01

    The purpose of the present study was to determine geographic clustering of breast cancer incidence in Kanagawa Prefecture, using cancer registry data. The study also aimed at examining the association between socio-economic factors and any identified cluster. Incidence data were collected for women who were first diagnosed with breast cancer during the period from January to December 2006 in Kanagawa. The data consisted of 2,326 incidence cases extracted from the total of 34,323 Kanagawa Cancer Registration data issued in 2011. To adjust for differences in age distribution, the standardized mortality ratio (SMR) and the standardized incidence ratio (SIR) of breast cancer were calculated for each of 56 municipalities (e.g., city, special ward, town, and village) in Kanagawa by an indirect method using Kanagawa female population data. Spatial scan statistics were used to detect any area of elevated risk as a cluster for breast cancer deaths and/or incidence. The Student t-test was performed to examine differences in socio-economic variables, viz., persons per household, total fertility rate, age at first marriage for women, and marriage rate, between the cluster and other regions. There was a statistically significant cluster of breast cancer incidence (p=0.001) composed of 11 municipalities in the southeastern area of Kanagawa Prefecture, whose SIR was 35 percent higher than that of the remainder of Kanagawa Prefecture. In this cluster, the average age at first marriage for women was significantly higher than in the rest of Kanagawa (p=0.017). No statistically significant clusters of breast cancer deaths were detected (p=0.53). There was a statistically significant cluster of high breast cancer incidence in the southeastern area of Kanagawa Prefecture. The results suggest that the cluster was related to the tendency to marry later. This study methodology will be helpful in the analysis of geographical disparities in cancer deaths and incidence.
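
    The indirect standardization mentioned in this abstract compares the observed number of cases in an area with the number expected if reference age-specific rates applied to the area's own population. The sketch below shows that arithmetic under stated assumptions: the function name, the two age bands, the reference rates and the case count are all hypothetical and are not taken from the Kanagawa registry.

```python
def standardized_incidence_ratio(observed, local_population, reference_rates):
    """Indirect standardization: observed cases divided by expected cases.

    local_population maps age band -> persons in the area;
    reference_rates maps age band -> cases per person in the reference population.
    """
    expected = sum(
        local_population[band] * reference_rates[band] for band in local_population
    )
    return observed / expected, expected


# Hypothetical municipality with two age bands (illustrative numbers only).
example_population = {"40-59": 10000, "60+": 5000}
example_rates = {"40-59": 0.001, "60+": 0.002}
example_sir, example_expected = standardized_incidence_ratio(
    27, example_population, example_rates
)
```

    With these invented inputs about 20 cases are expected, so 27 observed cases give a ratio of about 1.35. The same function yields an SMR when deaths and reference mortality rates are supplied instead of incident cases and incidence rates.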

  19. Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis

    Directory of Open Access Journals (Sweden)

    Gabjo Kim

    2016-12-01

    This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors’ levels of sustainable research and development (R&D) and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP) clustering, which is effective when grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF) weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO) to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.
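
    The TF-IDF weighting used to label the clusters can be sketched in a few lines. This is an illustration only: the abstract does not state which TF-IDF variant the authors applied, so the plain form (term frequency times the natural log of documents over document frequency) is assumed, and the function names and the tokenized toy "abstracts" are hypothetical.

```python
from collections import Counter
from math import log


def tfidf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights


def top_terms(weight, k=3):
    """Terms ordered by descending weight, with ties broken alphabetically."""
    return sorted(weight, key=lambda t: (-weight[t], t))[:k]


# Hypothetical tokenized patent abstracts.
example_docs = [
    ["battery", "cell", "battery"],
    ["motor", "cell"],
    ["motor", "cell", "inverter"],
]
example_weights = tfidf(example_docs)
```

    A term such as "cell" that occurs in every document gets a weight of zero, while a term concentrated in one document rises to the top of that document's list, which is what makes the ranking usable as a cluster label.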

  20. clusters

    Indian Academy of Sciences (India)

    2017-09-27

    Sep 27, 2017 ... while CuCoNO, Co3NO, Cu3CoNO, Cu2Co3NO, Cu3Co3NO and Cu6CoNO clusters display stronger chemical stability. Magnetic and electronic properties are also discussed. The magnetic moment is affected by charge transfer and the spd hybridization. Keywords. CumConNO (m + n = 2–7) clusters; ...

  1. Ecosystem health pattern analysis of urban clusters based on emergy synthesis: Results and implication for management

    International Nuclear Information System (INIS)

    Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan

    2013-01-01

    The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis

  2. Traffic Accident, System Model and Cluster Analysis in GIS

    Directory of Open Access Journals (Sweden)

    Veronika Vlčková

    2015-07-01

    Traffic accidents are a frequent topic both in everyday journalism and among the professional public. This article directs attention to a less well-known context of accidents, with the help of constructive systems theory and its methods, cluster analysis and geoinformation engineering. A traffic accident reshapes space-time, and it can therefore be studied with the tools of geographic information system technology. Applying a systems approach enables the formulation of a system model, which is handled with the tools of geoinformation engineering and of multicriteria and cluster analysis.

  3. Application of microarray analysis on computer cluster and cloud platforms.

    Science.gov (United States)

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
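
    The point about easy parallelization rests on each permutation iteration being independent of the others. The sketch below illustrates that task layout with a two-group permutation test in which every iteration carries its own seed, so the result does not depend on how the iterations are distributed. It is not the authors' code: the function names and data are hypothetical, and a standard-library thread pool merely stands in for cluster nodes or cloud instances (threads in CPython do not provide real CPU speedup for this kind of work).

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def one_permutation(task):
    """One independent iteration: reshuffle the pooled values with its own seed."""
    seed, values, n_a = task
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    return abs(mean(shuffled[:n_a]) - mean(shuffled[n_a:]))


def permutation_p_value(group_a, group_b, n_perm=1000, workers=4):
    """Two-sided permutation test on the difference in group means."""
    values = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    # Each task is self-contained, so tasks can run anywhere and in any order.
    tasks = [(seed, values, len(group_a)) for seed in range(n_perm)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stats = list(pool.map(one_permutation, tasks))
    # The add-one correction keeps the estimated p-value strictly above zero.
    return (1 + sum(s >= observed for s in stats)) / (n_perm + 1)
```

    Because the seed travels with the task, running the same call with one worker or with four gives the same p-value, which makes it straightforward to compare a local run with a distributed one.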

  4. Visualizing data for environmental analysis

    International Nuclear Information System (INIS)

    Benson, J.

    1997-01-01

    The Environmental Restoration Project at Los Alamos National Laboratory (LANL) has over 11,000 sampling locations in a 44 square mile area. The sample analyses contain raw analytical chemistry values for over 2,300 analytes and compounds used to define and remediate contaminated areas at LANL. The data consist of 2.5 million records in an Oracle database. Maps are often used to visualize the data. Problems arise when a client specifies a particular kind of map without fully understanding the limitations of the data or the map. The ability of maps to convey information is dependent on many factors, though all maps are data dependent. The quantity, spatial distribution, and numerical range of the data can limit use with certain kinds of maps. To address these issues and educate the clients, several types of statistical maps (e.g., choropleth, isarithm, and graduated symbol such as bubble and spike) used for environmental analysis were chosen to show the advantages, disadvantages, and data limitations of each. By examining both the complexity of the analytical data and the limitations of the map type, it is possible to consider how reality has been transformed through the map, and if that transformation accurately conveys the information present.

  5. Identifying clinical course patterns in SMS data using cluster analysis.

    Science.gov (United States)

    Kent, Peter; Kongsted, Alice

    2012-07-02

    Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed by spline analysis. However, cluster analysis of SMS data in its original untransformed form may be simpler and offer other advantages. Therefore, the aim of this study was to determine whether cluster analysis could be used for identifying clinical course patterns distinct from the pattern of the whole group, by including all SMS time points in their original form. It was a 'proof of concept' study to explore the potential, clinical relevance, strengths and weakness of such an approach. This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on 'days of problematic low back pain' and on 'days of sick leave' were analysed using Two-Step (probabilistic) Cluster Analysis. Clinical course patterns were identified that were clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. This study showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there

  6. Applied Hierarchical Cluster Analysis with Average Linkage Algoritm

    Directory of Open Access Journals (Sweden)

    Cindy Cahyaning Astuti

    2017-11-01

    This research was conducted in Sidoarjo District using secondary data from the book "Kabupaten Sidoarjo Dalam Angka 2016". The authors chose 12 variables that represent sub-district characteristics in Sidoarjo, covering four sectors: geography, education, agriculture and industry. To assess how evenly geographical conditions, education, agriculture and industry are distributed across sub-districts, an analysis is required that classifies the sub-districts by these characteristics. Hierarchical cluster analysis is an analytical technique used to classify objects or cases into relatively homogeneous groups called clusters. The results are expected to provide information about the dominant and non-dominant sub-district characteristics in the four sectors, based on the clusters formed.
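    Average-linkage hierarchical clustering, as named in this record's title, can be sketched in a few lines. The code below is an illustrative assumption, not the author's procedure: it uses Euclidean distance, a brute-force search for the closest pair of clusters, and invented two-variable data standing in for standardized sub-district characteristics.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters):
    """Repeatedly merge the two clusters with the smallest average
    pairwise distance until n_clusters remain. Returns index lists."""
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # combinations() yields index pairs with a < b, so deleting b is safe
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical standardized values for five sub-districts
toy = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.0, 0.0)]
print(average_linkage(toy, 3))  # [[0, 1], [2, 3], [4]]
```

    The brute-force search is cubic or worse in the number of objects, which is acceptable for a few dozen sub-districts but not for large datasets.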

  7. Examining lower urinary tract symptom constellations using cluster analysis.

    Science.gov (United States)

    Coyne, Karin S; Matza, Louis S; Kopp, Zoe S; Thompson, Christine; Henry, David; Irwin, Debra E; Artibani, Walter; Herschorn, Sender; Milsom, Ian

    2008-05-01

    To gain a better understanding of how patients experience lower urinary tract symptoms (LUTS) and to determine whether particular symptoms cluster together, as LUTS seldom occur alone. A secondary analysis of a cross-sectional, population-based survey of adults in Sweden, Italy, Germany, UK and Canada was undertaken to examine the presence of LUTS groups. Of the 19,165 telephone surveys, 13,519 respondents reported at least one LUTS and were included in the analysis. All respondents were asked about the presence of 14 LUTS (International Prostate Symptom Score plus seven additional LUTS). K-means cluster analysis, a statistical method for sorting objects into groups so that similar objects are grouped together, was used to identify groups of people based on their symptoms. Men and women were analysed separately. A split-half random sample was selected from the dataset so that exploratory analyses could be conducted in one half and confirmed in the second. On model confirmation, the sample was analysed in its entirety. Included in this analysis were 5014 men (mean age 49.8 years; 95% white) and 8505 women (mean age 50.4 years; 96% white). Among both men and women, six distinct symptom cluster groups were identified and the symptom patterns of each cluster were examined. For both, the largest cluster consisted of respondents with minimal symptoms (i.e. reporting essentially one symptom), 56% of men and 57% of women. The remaining five clusters for men and women were labelled based on their predominant symptoms. For men, the clusters were nocturia of twice or more per night (12%); terminal dribble (11%); urgency (10%); multiple symptoms (9%); and postvoid incontinence (5%). For women, the clusters were nocturia of twice or more per night (12%); terminal dribble (10%); urgency (8%); stress incontinence (8%); and multiple symptoms (5%). The multiple-symptom groups had several and varied LUTS, were older, and had more comorbidities. Clusters of terminal dribble and male
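    The split-half design described in this abstract (explore in one random half, confirm in the other) can be sketched with a plain k-means routine. Everything below is an assumption made for illustration: the farthest-first initialisation, the function names and the toy 0/1 symptom indicators are not taken from the study.

```python
import random
from math import dist


def kmeans(rows, k, n_iter=50):
    """Plain Lloyd's algorithm with a deterministic farthest-first start."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(dist(r, c) for c in centers)))
    for _ in range(n_iter):
        # assignment step: nearest centre for every row
        labels = [min(range(k), key=lambda j: dist(r, centers[j])) for r in rows]
        # update step: mean of each cluster's members
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def split_half(rows, seed=0):
    """Shuffle a copy of the data and cut it into two random halves."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Toy data: three symptom indicators, two obvious response patterns
rows = [(0, 0, 0)] * 10 + [(1, 1, 1)] * 10
explore, confirm = split_half(rows, seed=1)
centers_explore, _ = kmeans(explore, 2)
centers_confirm, _ = kmeans(confirm, 2)
print(sorted(centers_explore), sorted(centers_confirm))
```

    If the two halves yield similar cluster centres, the solution is treated as confirmed and the full sample can then be clustered, mirroring the sequence reported above.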

  8. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    This study was carried out to assess the physicochemical quality of the river Varuna in Varanasi, India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of the relationship between physicochemical parameters. Hierarchical cluster analysis was also performed to determine the sources of pollution in the river Varuna. The results showed quite high values of DO, nitrate, BOD, COD and total alkalinity, above the BIS permissible limit. The correlation analysis identified pH, electrical conductivity, total alkalinity and nitrate as key water parameters that influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of the total of 10 sites, according to similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for obtaining better information about river water quality. International Journal of Environment Vol. 5 (1), 2016, pp. 32-44

  9. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    Science.gov (United States)

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  10. OCEAN THERMAL ENERGY CONVERSION (OTEC) PROGRAMMATIC ENVIRONMENTAL ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Sands, M. D.

    1980-01-01

    This programmatic environmental analysis is an initial assessment of OTEC technology considering development, demonstration and commercialization; it is concluded that the OTEC development program should continue because the development, demonstration, and commercialization on a single-plant deployment basis should not present significant environmental impacts. However, several areas within the OTEC program require further investigation in order to assess the potential for environmental impacts from OTEC operation, particularly in large-scale deployments and in defining alternatives to closed-cycle biofouling control: (1) Larger-scale deployments of OTEC clusters or parks require further investigations in order to assess optimal platform siting distances necessary to minimize adverse environmental impacts. (2) The deployment and operation of the preoperational platform (OTEC-1) and future demonstration platforms must be carefully monitored to refine environmental assessment predictions, and to provide design modifications which may mitigate or reduce environmental impacts for larger-scale operations. These platforms will provide a valuable opportunity to fully evaluate the intake and discharge configurations, biofouling control methods, and both short-term and long-term environmental effects associated with platform operations. (3) Successful development of OTEC technology to use the maximal resource capabilities and to minimize environmental effects will require a concerted environmental management program, encompassing many different disciplines and environmental specialties.

  11. Ocean Thermal Energy Conversion (OTEC) Programmatic Environmental Analysis--Appendices

    Energy Technology Data Exchange (ETDEWEB)

    Authors, Various

    1980-01-01

    The programmatic environmental analysis is an initial assessment of Ocean Thermal Energy Conversion (OTEC) technology considering development, demonstration and commercialization. It is concluded that the OTEC development program should continue because the development, demonstration, and commercialization on a single-plant deployment basis should not present significant environmental impacts. However, several areas within the OTEC program require further investigation in order to assess the potential for environmental impacts from OTEC operation, particularly in large-scale deployments and in defining alternatives to closed-cycle biofouling control: (1) Larger-scale deployments of OTEC clusters or parks require further investigations in order to assess optimal platform siting distances necessary to minimize adverse environmental impacts. (2) The deployment and operation of the preoperational platform (OTEC-1) and future demonstration platforms must be carefully monitored to refine environmental assessment predictions, and to provide design modifications which may mitigate or reduce environmental impacts for larger-scale operations. These platforms will provide a valuable opportunity to fully evaluate the intake and discharge configurations, biofouling control methods, and both short-term and long-term environmental effects associated with platform operations. (3) Successful development of OTEC technology to use the maximal resource capabilities and to minimize environmental effects will require a concerted environmental management program, encompassing many different disciplines and environmental specialties. This volume contains these appendices: Appendix A -- Deployment Scenario; Appendix B -- OTEC Regional Characterization; and Appendix C -- Impact and Related Calculations.

  12. Cluster analysis as a prediction tool for pregnancy outcomes.

    Science.gov (United States)

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering the specific physiological changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered crucial. The paper demonstrates the use of cluster analysis, a method based on an approach from intelligent data mining. Cluster analysis is a statistical method that makes it possible to group individuals based on sets of identifying variables. The method was chosen to determine whether pregnant women can be classified at early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. 222 pregnant women from two general obstetric offices were recruited. The main focus was on the characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.

  13. Language Learner Motivational Types: A Cluster Analysis Study

    Science.gov (United States)

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  14. Characterization of population exposure to organochlorines: A cluster analysis application

    NARCIS (Netherlands)

    R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)

    2013-01-01

    This study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticides residues

  15. Cluster analysis for validated climatology stations using precipitation in Mexico

    NARCIS (Netherlands)

    Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.

    2012-01-01

    Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40

  16. cluster

    Indian Academy of Sciences (India)

    has been investigated electrochemically in positive and negative microenvironments, both in solution and in film. Charge nature around the active centre ... in plants, bacteria and also in mammals. This cluster is also an important constituent of a ..... selection of non-cysteine amino acid in the active centre of Rieske proteins.

  17. K-means cluster analysis and seismicity partitioning for Pakistan

    Science.gov (United States)

    Rehman, Khaista; Burton, Paul W.; Weatherill, Graeme A.

    2014-07-01

    Pakistan and the western Himalaya is a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974 and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard; a crucial input in seismic design codes needed to begin to effectively mitigate the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study utilises a cluster analysis approach for the purposes of identifying spatial differences in seismicity, which can be utilised to form a basis for delineating seismogenic source regions. An effort is made to examine seismicity partitioning for Pakistan with respect to earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude homogenous earthquake catalogue has been compiled using various available earthquake data. 
The earthquake catalogue covers a time span from 1930 to 2007 and

  18. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
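    The random effects model described in this abstract can be written out as a sketch. The notation is assumed here for illustration and is not the authors' own: $x_{ij}$ is the expression of gene $j$ in observation $i$, $c(j)$ is the cluster to which gene $j$ is assigned, $u_{ik}$ is the random effect of cluster $k$ for observation $i$, $\varepsilon_{ij}$ is the independent error term, and $y_i$ is the outcome with coefficients $\beta_k$.

```latex
\begin{align}
x_{ij} &= u_{i,c(j)} + \varepsilon_{ij}, \\
y_i    &= \sum_{k=1}^{K} \beta_k \, u_{ik}.
\end{align}
```

    The abstract does not state distributional assumptions, a link function for the outcome, or the number of clusters $K$, so none are filled in here.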

  19. EFFICIENCY OF SMES IN ROMANIA POST CRISIS. A CLUSTERING ANALYSIS

    Directory of Open Access Journals (Sweden)

    Cristina SUCIU

    2014-06-01

    Small and medium-sized enterprises (SMEs) have made, even during the economic crisis, a major contribution to gross domestic product, to job creation, and to economic efficiency by stimulating competition through their speed of adaptation to conditions, their adoption of new strategies and their ability to adapt to market requirements. Although several hundred thousand companies in Romania were suspended or cancelled at the beginning of the economic crisis, a revival of SMEs has been observed since 2012. It could be said that the post-crisis period, thanks to measures in support of SMEs, marks the beginning of an economic boost for SMEs in Romania. Cluster analysis is a multivariate analysis technique that includes a number of algorithms for classifying objects into homogeneous groups. Analysing the efficiency of SMEs in Romania using cluster analysis is a new method of economic analysis that enables a mathematical analysis of the regional development of SMEs and of the increase in their competitiveness.

  20. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clerc, N.

    2012-01-01

    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study allows to test cosmological scenarios of structure formation with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed to the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Propositions are made for applying this method to future surveys as XMM-XXL and eRosita. (author) [fr

  1. Fuzzy cluster analysis of air quality in Beijing district

    Science.gov (United States)

    Liu, Hongkai

    2018-02-01

    This article applies the principle of fuzzy clustering analysis: using the transitive closure method, the main air pollutants in 17 districts of Beijing from 2014 to 2016 were classified. The results of the analysis reflect the changes in the main air pollutants in Beijing over nearly three years. This can provide a scientific basis and data support for atmospheric governance in the Beijing area.
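    The transitive closure method named in this abstract can be illustrated with a small sketch. It assumes the usual max-min composition of a reflexive, symmetric fuzzy similarity matrix followed by a lambda-cut; the function names and the three-district similarity values are invented for illustration and are not taken from the article.

```python
def maxmin_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing."""
    while True:
        r2 = maxmin_compose(r, r)
        if r2 == r:
            return r
        r = r2


def lambda_cut_clusters(closure, lam):
    """Group items whose closure similarity is at least lam."""
    seen, clusters = set(), []
    for i in range(len(closure)):
        if i in seen:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity of pollutant profiles for three districts
similarity = [[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.4],
              [0.3, 0.4, 1.0]]
closure = transitive_closure(similarity)
print(lambda_cut_clusters(closure, 0.5))  # [[0, 1], [2]]
print(lambda_cut_clusters(closure, 0.4))  # [[0, 1, 2]]
```

    Lowering the threshold merges more districts into one class, so scanning lambda from 1 down to 0 yields a hierarchy of classifications rather than a single partition.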

  2. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko

    2016-05-01

    Domain Generation Algorithms (DGAs) are a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate a DGA feed using reversed DGAs, third-party feeds, and smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  3. Visual Analysis and Processing of Clusters Structures in Multidimensional Datasets

    Science.gov (United States)

    Bondarev, A. E.

    2017-05-01

    The article is devoted to problems of visual analysis of cluster structures in multidimensional datasets. For visual analysis, the elastic maps approach [1,2] is applied. This approach is well suited to processing and visualizing multidimensional datasets. To analyze clusters in the original data volume, elastic maps are used to map the original data points onto embedded manifolds of lower dimensionality. By decreasing the elasticity parameters, one can design a map surface that approximates the multidimensional dataset in question much better. The points of the dataset are then projected onto the map. Unfolding the designed map onto a flat plane gives insight into the cluster structure of the multidimensional dataset. The elastic maps approach does not require any a priori information about the data and does not depend on the nature or origin of the data. Elastic maps are usually combined with PCA. Presented in the space of the first three principal components, elastic maps provide quite good results. The article describes the results of applying the elastic maps approach to visual analysis of clusters for different multidimensional datasets, including medical data.

  4. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
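    One way to read the "sub-space of only two vectors" idea is that the similarity of two documents is computed only over the terms those two documents actually contain, so no corpus-wide sparse matrix is needed. The sketch below follows that reading, which is an interpretation rather than the authors' stated algorithm; whitespace tokenisation and raw term counts are further simplifying assumptions.

```python
from collections import Counter
from math import sqrt


def cosine_coefficient(doc_a, doc_b):
    """Cosine similarity of two texts using only their own term counts."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    # only terms present in both documents contribute to the dot product
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


print(cosine_coefficient("gene expression cluster analysis",
                         "cluster analysis of gene networks"))
```

    A pairwise similarity of this kind could then serve as the input to an affinity-based clustering step such as the one the abstract describes.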

  5. Mobility in Europe: Recent Trends from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ioana Manafi

    2017-08-01

    During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but the trends in net migration flows during the years. Therewith, the results provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.

  6. The Productivity Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is using Data Envelopment Analysis of Output Oriented Banker Charnes Cooper model by taking net worth, fixed assets, employment as inputs and gross output as outputs. The non-zero represents the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets, employment and shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.

  7. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    Science.gov (United States)

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-05

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (Pgait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.

  9. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after (2008-2009) the Cluster Development Approach (CDA). The methodology is the collection of primary data from 100 ACI using a quantitative questionnaire, analyzed using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT) and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows no significant difference between the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows no significant difference between industrial units with respect to costs such as Production, Infrastructure, Technology and Marketing, and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting the CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This CDA model can be implemented in industries of under developed and developing countries for cost reduction and productivity

  10. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    Science.gov (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
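
    The contrast between hard and fuzzy assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: it assumes the standard fuzzy c-means membership formula, one-dimensional data, fixed cluster centers, and a fuzzifier `m` (an assumed name for the parameter that controls how much overlap is allowed). The centers and sample points are made up.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each center (memberships sum to 1)."""
    dists = [abs(x - c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def hard_assignment(x, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


centers = [0.0, 10.0]  # hypothetical cluster centers
for x in (1.0, 5.0, 9.0):
    memberships = [round(u, 3) for u in fuzzy_memberships(x, centers)]
    print(x, hard_assignment(x, centers), memberships)
```

    A point midway between the two centers is forced into one cluster by the hard rule but receives equal membership in both under the fuzzy rule. Raising `m` pushes all memberships towards equality, which is one way to read the "amount of cluster overlap allowed".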

  11. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  12. Environmental Gradient Analysis, Ordination, and Classification in Environmental Impact Assessments.

    Science.gov (United States)

    1987-09-01

    associations. The proposed methodology can also be applied to land-use management, maintenance and monitoring programs, and to environmental impact...analysis. Analytical results were then examined and the advantages and disadvantages of each statistical analysis method used were noted. Mode of Technology...and S. Wearden, Statistics for Research (John Wiley & Sons, New York, NY, 1983). 20a. Draper, N. R., and H. Smith, Applied Regression Analysis, 2nd

  13. The environmental history of group and cluster galaxies in a Λ cold dark matter universe

    Science.gov (United States)

    De Lucia, Gabriella; Weinmann, Simone; Poggianti, Bianca M.; Aragón-Salamanca, Alfonso; Zaritsky, Dennis

    2012-06-01

    We use publicly available galaxy merger trees, obtained applying semi-analytic techniques to a large high-resolution cosmological simulation, to study the environmental history of group and cluster galaxies. Our results highlight the existence of an intrinsic history bias which makes the nature versus nurture (as well as the mass versus environment) debate inherently ill posed. In particular, we show that (i) surviving massive satellites were accreted later than their less massive counterparts, from more massive haloes and (ii) the mixing of galaxy populations is incomplete during halo assembly, which creates a correlation between the time a galaxy becomes satellite and its present distance from the parent halo centre. The weakest trends are found for the most massive satellites, as a result of efficient dynamical friction and late formation times of massive haloes. A large fraction of the most massive group/cluster members are accreted on to the main progenitor of the final halo as central galaxies, while about half of the galaxies with low and intermediate stellar masses are accreted as satellites. Large fractions of group and cluster galaxies (in particular those of low stellar mass) have therefore been ‘pre-processed’ as satellites of groups with mass ~10^13 M⊙. To quantify the relevance of hierarchical structure growth on the observed environmental trends, we have considered observational estimates of the passive galaxy fractions and their variation as a function of halo mass and clustercentric distance. Comparisons with our theoretical predictions require relatively long times (~5-7 Gyr) for the suppression of star formation in group and cluster satellites. It is unclear how such a gentle mode of strangulation can be achieved by simply relaxing the assumption of instantaneous stripping of the hot gas reservoir associated with accreting galaxies, or if the difficulties encountered by recent galaxy formation models in reproducing the observed trends

  14. Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio.

    Science.gov (United States)

    Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh

    2012-01-01

    Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to
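
    The rate calculation and the idea of a Poisson test for excess cases can be sketched as follows. This is a simplified per-tract illustration, not the procedure used in the study, whose exact clustering method the abstract does not specify. The tract population and case count are hypothetical; only the baseline rate of 0.85 per 1,000 per year and the five-year study period come from the abstract.

```python
import math


def incidence_per_1000_per_year(cases, population, years):
    """Crude incidence rate expressed per 1,000 population per year."""
    return 1000.0 * cases / (population * years)


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


baseline_rate = 0.85     # OHCAs per 1,000 adults per year (from the abstract)
years = 5                # length of the study period (from the abstract)
tract_population = 2000  # hypothetical adult population of one census tract
tract_cases = 18         # hypothetical number of OHCAs observed in that tract

expected = baseline_rate * tract_population * years / 1000.0
rate = incidence_per_1000_per_year(tract_cases, tract_population, years)
p_value = poisson_upper_tail(tract_cases, expected)
print(f"expected={expected:.1f}, rate={rate:.2f} per 1,000/year, p={p_value:.4f}")
```

    A small tail probability flags a tract whose count is unlikely under the city-wide rate. A real spatial analysis would also account for neighbouring tracts and for multiple testing.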

  15. Fuzzy cluster analysis of high-field functional MRI data.

    Science.gov (United States)

    Windischberger, Christian; Barth, Markus; Lamm, Claus; Schroeder, Lee; Bauer, Herbert; Gur, Ruben C; Moser, Ewald

    2003-11-01

    Functional magnetic resonance imaging (fMRI) based on blood-oxygen level dependent (BOLD) contrast today is an established brain research method and quickly gains acceptance for complementary clinical diagnosis. However, neither the basic mechanisms like coupling between neuronal activation and haemodynamic response are known exactly, nor can the various artifacts be predicted or controlled. Thus, modeling functional signal changes is non-trivial and exploratory data analysis (EDA) may be rather useful. In particular, identification and separation of artifacts as well as quantification of expected, i.e. stimulus correlated, and novel information on brain activity is important for both, new insights in neuroscience and future developments in functional MRI of the human brain. After an introduction on fuzzy clustering and very high-field fMRI we present several examples where fuzzy cluster analysis (FCA) of fMRI time series helps to identify and locally separate various artifacts. We also present and discuss applications and limitations of fuzzy cluster analysis in very high-field functional MRI: differentiate temporal patterns in MRI using (a) a test object with static and dynamic parts, (b) artifacts due to gross head motion artifacts. Using a synthetic fMRI data set we quantitatively examine the influences of relevant FCA parameters on clustering results in terms of receiver-operator characteristics (ROC) and compare them with a commonly used model-based correlation analysis (CA) approach. The application of FCA in analyzing in vivo fMRI data is shown for (a) a motor paradigm, (b) data from multi-echo imaging, and (c) a fMRI study using mental rotation of three-dimensional cubes. We found that differentiation of true "neural" from false "vascular" activation is possible based on echo time dependence and specific activation levels, as well as based on their signal time-course. 
    Exploratory data analysis methods in general and fuzzy cluster analysis in particular may

  16. Environmental analysis in small and medium enterprises

    International Nuclear Information System (INIS)

    Luciani, R.; Andriola, L.; Di Franco, N.

    2001-01-01

    Environmental analysis is considered one of the primary goals of an enterprise's environmental management. Nevertheless, the complexity of environmental problems and their regulations prevents small enterprises from pursuing an environmental policy (EMAS, ISO 14001). One of the most suitable evaluation instruments for establishing a baseline reference could be a questionnaire designed according to the needs of industrial enterprises. It serves as a first step towards subsequent environmental management [it

  17. Performance Based Clustering for Benchmarking of Container Ports: an Application of Dea and Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Jie Wu

    2010-12-01

    The operational performance of container ports has received increasing attention in both academic and practitioner circles, and the performance evaluation and process improvement of container ports have been the focus of several studies. In this paper, Data Envelopment Analysis (DEA), an effective tool for relative efficiency assessment, is used to measure the performance of, and to benchmark, 77 world container ports in 2007. The approaches used in the current study consider four inputs (Capacity of Cargo Handling Machines, Number of Berths, Terminal Area and Storage Capacity) and a single output (Container Throughput). The efficiency scores are analyzed, and a unique ordering of the ports based on average cross efficiency is provided. Cluster analysis is also used to select more appropriate targets for poorly performing ports to use as benchmarks.

  18. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    Science.gov (United States)

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to the advancement in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383

  19. Fractal Segmentation and Clustering Analysis for Seismic Time Slices

    Science.gov (United States)

    Ronquillo, G.; Oleschko, K.; Korvin, G.; Arizabalo, R. D.

    2002-05-01

    Fractal analysis has become part of the standard approach for quantifying texture on gray-tone or colored images. In this research we introduce a multi-stage fractal procedure to segment, classify and measure the clustering patterns on seismic time slices from a 3-D seismic survey. Five fractal classifiers (c1)-(c5) were designed to yield standardized, unbiased and precise measures of the clustering of seismic signals. The classifiers were tested on seismic time slices from the AKAL field, Cantarell Oil Complex, Mexico. The generalized lacunarity (c1), fractal signature (c2), heterogeneity (c3), rugosity of boundaries (c4) and continuity resp. tortuosity (c5) of the clusters are shown to be efficient measures of the time-space variability of seismic signals. The Local Fractal Analysis (LFA) of time slices has proved to be a powerful edge detection filter to detect and enhance linear features, like faults or buried meandering rivers. The local fractal dimensions of the time slices were also compared with the self-affinity dimensions of the corresponding parts of porosity-logs. It is speculated that the spectral dimension of the negative-amplitude parts of the time-slice yields a measure of connectivity between the formation's high-porosity zones, and correlates with overall permeability.

  20. Cluster analysis for DNA methylation profiles having a detection threshold

    Directory of Open Access Journals (Sweden)

    Siegmund Kimberly D

    2006-07-01

    Background: DNA methylation, a molecular feature used to investigate tumor heterogeneity, can be measured on many genomic regions using the MethyLight technology. Due to the combination of the underlying biology of DNA methylation and the MethyLight technology, the measurements, while being generated on a continuous scale, have a large number of 0 values. This suggests that conventional clustering methodology may not perform well on this data. Results: We compare performance of existing methodology (such as k-means) with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. Conclusion: The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties.

  1. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-11-01

    One of the key performance indicators of an organization's quality management system is customer satisfaction. The process of monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  2. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-10-01

    One of the key performance indicators of an organization's quality management system is customer satisfaction. The process of monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  3. Market analysis of Serbia's raspberry sector and cluster development initiatives

    Directory of Open Access Journals (Sweden)

    Paraušić Vesna

    2016-01-01

    The authors analyze the competitive strengths and weaknesses of raspberry producers in Serbia and propose key prerequisites on whose fulfilment the development of a successful cluster initiative in the Serbian raspberry sector will depend. The research results indicate that Serbian raspberry growers can develop a successful cluster and keep their leading position in the global raspberry market only if many conditions are met, such as: (a) a better organized marketing channel through the vertical and horizontal integration of all actors in this sector; (b) strengthening specialized cooperatives for raspberry production and associations of raspberry growers, and in the future setting up producer organizations and associations; (c) inclusion of producers of other berries and producers of processed berries; (d) introducing innovations, scientific knowledge, and research and development in the production, processing, packing, logistics and export of raspberries, etc. The analysis is based on a case study of the Šumadija and Western Serbia region, which is a major raspberry-producing region in Serbia.

  4. Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis

    Science.gov (United States)

    Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song

    2018-01-01

    To resolve the problems of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, the Harris corner detection algorithm is used to extract the feature points of two images. Secondly, the Normalized Cross Correlation (NCC) function is used to perform approximate matching of the feature points, giving the initial feature pairs. Then, according to the parallax constraint condition, the initial feature pairs are preprocessed by the K-means clustering algorithm, which removes feature point pairs with obvious errors from the approximate matching process. Finally, the Random Sample Consensus (RANSAC) algorithm is adopted to optimize the feature points and obtain the final feature point matching result, realizing fast and accurate image registration. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of image matching while ensuring the real-time performance of the algorithm.
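
    The approximate matching step relies on Normalized Cross Correlation, which can be sketched for two small grayscale patches as follows. This is a generic zero-mean NCC in plain Python, offered as an illustration of the measure rather than the authors' implementation. The patch values are made up, and returning 0.0 for a flat patch is an assumption of this sketch.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross correlation of two equal-sized 2-D patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        # A flat patch has no texture; correlation is undefined, so report no match.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]           # hypothetical patch
brighter = [[2 * v + 15 for v in row] for row in patch]      # gain and offset change
inverted = [[255 - v for v in row] for row in patch]         # contrast reversed

print(ncc(patch, brighter), ncc(patch, inverted))
```

    Because the patch means are subtracted and the result is normalized, the score is unaffected by linear brightness and contrast changes, which is why NCC suits a first rough pairing of corner points before outliers are removed.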

  5. Environmental Effects on Galaxy Evolution. II. Quantifying the Tidal Features in NIR Images of the Cluster Abell 85

    Science.gov (United States)

    Venkatapathy, Y.; Bravo-Alfaro, H.; Mayya, Y. D.; Lobo, C.; Durret, F.; Gamez, V.; Valerdi, M.; Granados-Contreras, A. P.; Navarro-Poupard, F.

    2017-12-01

    This work is part of a series of papers devoted to investigating the evolution of cluster galaxies during their infall. In the present article, we image in NIR a selected sample of galaxies throughout the massive cluster Abell 85 (z = 0.055). We obtain (JHK′) photometry for 68 objects, reaching ~1 mag arcsec^-2 deeper than 2MASS. We use these images to unveil asymmetries in the outskirts of a sample of bright galaxies and develop a new asymmetry index, α_An, which allows us to quantify the degree of disruption by the relative area occupied by the tidal features on the plane of the sky. We measure the asymmetries for a subsample of 41 large-area objects, finding clear asymmetries in 10 galaxies; most of these are in groups and pairs projected at different clustercentric distances, and some of them are located beyond R_500. Combining information on the H I gas content of blue galaxies and the distribution of substructures across Abell 85 with the present NIR asymmetry analysis, we obtain a very powerful tool to confirm that tidal mechanisms are indeed present and are currently affecting a fraction of galaxies in Abell 85. However, when comparing our deep NIR images with UV blue images of two very disrupted (jellyfish) galaxies in this cluster, we discard the presence of tidal interactions down to our detection limit. Our results suggest that ram-pressure stripping is at the origin of such spectacular disruptions. We conclude that across a complex cluster like Abell 85, environmental mechanisms, both gravitational and hydrodynamical, are playing an active role in driving galaxy evolution.

  6. Analysis of a cluster of cases of Wegener granulomatosis.

    Science.gov (United States)

    Albert, Daniel A; Albert, Alexis N; Vernace, Melchiore; Sebastian, Jodi K; Hsia, Elizabeth C

    2005-08-01

    Wegener granulomatosis is a chronic inflammatory autoimmune disease of unknown etiology. The sporadic occurrence, lack of familial or genetic associations, and rising incidence suggest possible exposure to environmental agents as causative for this disease. The objective of this study was to examine possible environmental triggers of Wegener granulomatosis. While conducting an environmental survey of potential precipitants of Wegener granulomatosis on a cohort of patients seen at Doylestown Hospital and at the University of Pennsylvania, we identified a cluster of cases in the Dublin, Pennsylvania, region. Through hospital records and patient contacts, we located 7 cases diagnosed in a 3-year period within a 10-mile radius of an Environmental Protection Agency (EPA) Superfund toxic waste site. The radius of inclusion represents a population of approximately 50,000 individuals. Assuming complete ascertainment of cases--which is unlikely given the methods used to acquire patients--the prevalence is 2- to 4-fold greater than the expected rate of 3 per 100,000. We identified toxins at or above "action level" within the demarcated geographic region using published data from the EPA. Furthermore, we queried patients regarding their particular chemical exposures. These patients with Wegener granulomatosis were possibly exposed to high levels of trichloroethylene (TCE), vinyl chloride, methyl tertiary-butyl ether (MTBE), dichloroethene (DCE), and chromic acid from several industrial waste sites within the area. Additionally, these patients reported a total of greater than 30 possible exposures, including the aforesaid chemical contaminants. Three of 5 patients whose water source is known had well water that exposed them to industrial runoff and necessitated EPA intervention. This data, along with other epidemiologic studies, suggest possible toxic exposures as potentially correctable risk factors for Wegener granulomatosis. We encourage clinicians to seek data that

  7. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.

    2006-09-01

    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  8. Environmental filtering of eudicot lineages underlies phylogenetic clustering in tropical South American flooded forests.

    Science.gov (United States)

    Aldana, Ana M; Carlucci, Marcos B; Fine, Paul V A; Stevenson, Pablo R

    2017-02-01

    The phylogenetic community assembly approach has been used to elucidate the role of ecological and historical processes in shaping tropical tree communities. Recent studies have shown that stressful environments, such as seasonally dry, white-sand and flooded forests tend to be phylogenetically clustered, arguing for niche conservatism as the main driver for this pattern. Very few studies have attempted to identify the lineages that contribute to such assembly patterns. We aimed to improve our understanding of the assembly of flooded forest tree communities in Northern South America by asking the following questions: are seasonally flooded forests phylogenetically clustered? If so, which angiosperm lineages are over-represented in seasonally flooded forests? To assess our hypotheses, we investigated seasonally flooded and terra firme forests from the Magdalena, Orinoco and Amazon Basins, in Colombia. Our results show that, regardless of the river basin in which they are located, seasonally flooded forests of Northern South America tend to be phylogenetically clustered, which means that the more abundant taxa in these forests are more closely related to each other than expected by chance. Based on our alpha and beta phylodiversity analyses we interpret that eudicots are more likely to adapt to extreme environments such as seasonally flooded forests, which indicates the importance of environmental filtering in the assembly of the Neotropical flora.

  9. Environmental Management Strategy: Four Forces Analysis

    Science.gov (United States)

    Doyle, Martin W.; Von Windheim, Jesko

    2015-01-01

    We develop an analytical approach for more systematically analyzing environmental management problems in order to develop strategic plans. This approach can be deployed by agencies, non-profit organizations, corporations, or other organizations and institutions tasked with improving environmental quality. The analysis relies on assessing the underlying natural processes followed by articulation of the relevant societal forces causing environmental change: (1) science and technology, (2) governance, (3) markets and the economy, and (4) public behavior. The four forces analysis is then used to strategize which types of actions might be most effective at influencing environmental quality. Such strategy has been under-used and under-valued in environmental management outside of the corporate sector, and we suggest that this four forces analysis is a useful analytic to begin developing such strategy.

  10. 23 CFR 710.305 - Environmental analysis.

    Science.gov (United States)

    2010-04-01

    ... FEDERAL HIGHWAY ADMINISTRATION, DEPARTMENT OF TRANSPORTATION RIGHT-OF-WAY AND ENVIRONMENT RIGHT-OF-WAY AND REAL ESTATE Project Development § 710.305 Environmental analysis. The National Environmental Policy Act... agreement for acquisition of right-of-way. Where applicable, a State also must complete Clean Air Act (42 U...

  11. [The hierarchical clustering analysis of hyperspectral image based on probabilistic latent semantic analysis].

    Science.gov (United States)

    Yi, Wen-Bin; Shen, Li; Qi, Yin-Feng; Tang, Hong

    2011-09-01

    This paper introduces Probabilistic Latent Semantic Analysis (PLSA) to image clustering and proposes an effective clustering algorithm for hyperspectral images that uses the semantic information from PLSA. Firstly, the ISODATA algorithm is used to obtain an initial clustering of the hyperspectral image, and the clusters of this initial result are treated as the visual words of PLSA. Secondly, an object-oriented image segmentation algorithm is used to partition the hyperspectral image, and segments with relatively pure pixels are regarded as the documents in PLSA. Thirdly, a variety of identification methods that can estimate the best number of cluster centers are combined to determine the number of latent semantic topics. Then the conditional distributions of visual words in topics and the mixtures of topics in different documents are estimated using PLSA. Finally, the conditional probabilities of the latent semantic topics are distinguished using a statistical pattern recognition method, the topic type for each visual word in each document is assigned, and the clustering result of the hyperspectral image is thereby obtained. Experimental results show that the clusters produced by the proposed algorithm are better than those of K-MEANS and ISODATA in terms of object-oriented property, and that the clustering result is closer to the real spatial distribution of the surface.

  12. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that offers powerful analytical capacity for gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that lets samples gather into sets according to the progression of disease from mild to severe. We evaluate the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.

  13. Analysis of Learning Development With Sugeno Fuzzy Logic And Clustering

    Directory of Open Access Journals (Sweden)

    Maulana Erwin Saputra

    2017-06-01

    This paper attempts to analyze the factors that affect student achievement, which of course vary from school to school. Students are central to the goals of a successful educational organization, and their emotions and behavior influence their learning performance. Fuzzy logic can be used in various fields, as can clustering for grouping, and both are applied here to the analysis of learning development. The analysis is performed on students based on the symptoms they exhibit. Fuzzy logic is a logic of uncertainty whose advantage is its capacity for linguistic reasoning, so its design does not require complicated mathematical equations. The clustering method used is K-Means, in which the data are partitioned into k groups (k = 1, 2, 3, ...) in order to find the optimal number of performance groups. In this research, questionnaire data entered into MATLAB produce values that are used to generate graphs, making it easier for the school to assess student performance in the learning process against certain criteria. The results obtained from the system support the decision-making required by the school.

  14. Visualizing dynamical neural assemblies with a fuzzy synchronization clustering analysis.

    Science.gov (United States)

    Zhou, Shu; Wu, Yan; Dos Santos, Claudia C

    2009-12-01

    Phase synchrony has been proposed as a possible communication mechanism between cerebral regions. The participation index method (PIM) may be used to investigate integrating structures within an oscillatory network, based on the eigenvalue decomposition of matrix of bivariate synchronization indices. However, eigenvector orthogonality between clusters may result in categorization difficulties for hub oscillators and pseudoclustering phenomenon. Here, we propose a method of fuzzy synchronization clustering analysis (FSCA) to avoid the constraint of orthogonality by combining the fuzzy c-means algorithm with the phase-locking value. Following mathematical derivation, we cross-validated the FSCA and the PIM using the same multichannel phase time series of event-related EEG from a subject performing a working memory task. Both clustering methods produced consistent findings for the qualitatively salient configuration of the original network-illustrated here by a visualization technique. In contrast to PIM, use of common virtual oscillatory centroids enabled the FSCA to reveal multiple dynamical neural assemblies as well as the unitary phase information within each assembly.
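
    The phase-locking value (PLV) that FSCA combines with fuzzy c-means can be illustrated with a minimal sketch. It uses the standard textbook definition of the PLV (the magnitude of the time-averaged unit phasor of the phase difference); the abstract does not give the authors' exact formulation, so this is an assumption, as are the function names and the toy oscillator phases.

```python
import cmath
import math


def phase_locking_value(phase_a, phase_b):
    """Return |mean(exp(i * (phase_a - phase_b)))| over all time samples."""
    if len(phase_a) != len(phase_b) or len(phase_a) == 0:
        raise ValueError("phase series must be non-empty and equally long")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)


def plv_matrix(phases):
    """Symmetric matrix of pairwise PLVs for a list of phase series."""
    n = len(phases)
    return [[phase_locking_value(phases[i], phases[j]) for j in range(n)]
            for i in range(n)]


# Toy phases in radians, 1000 samples spaced 0.01 s apart; all values are made up.
times = [k * 0.01 for k in range(1000)]
osc_a = [2 * math.pi * 10 * t for t in times]   # 10 Hz oscillator
osc_b = [p + 0.7 for p in osc_a]                # same rhythm, constant lag
osc_c = [2 * math.pi * 13 * t for t in times]   # 13 Hz, drifts against osc_a

sync = plv_matrix([osc_a, osc_b, osc_c])
print(round(sync[0][1], 3), round(sync[0][2], 3))
```

    In this toy case the constant lag between the first two oscillators gives a PLV close to 1, while the frequency mismatch with the third averages out to a value close to 0. A matrix of this kind is the sort of bivariate synchronization input on which a clustering step could operate.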

  15. Using cluster analysis to examine dietary patterns: nutrient intakes, gender, and weight status differ across food pattern clusters.

    Science.gov (United States)

    Wirfält, A K; Jeffery, R W

    1997-03-01

    This study explored the usefulness of cluster analysis in identifying food choice patterns of three groups of adults in relation to their energy intake. Food frequency data were converted to percentage of total energy from 38 food groups and entered into a cluster analysis procedure. Subjects in the emerging food group patterns were compared in terms of weight status, demographics, and the nutritional composition of their usual diet. Data were collected as part of three studies in two US metropolitan areas using identical protocols. Participants were university employees (103 women and 99 men) who volunteered for a reliability study of health behavior questionnaires and moderately obese volunteers (223 women and 101 men) to two weight-loss studies who were recruited by newspaper advertisements. Subjects were clustered according to food energy sources using the FASTCLUS procedure in the Statistical Analysis System. One-way analysis of variance and chi-square analysis were then performed to compare the weight status, nutrient intakes, and demographics of the food patterns. Six food pattern clusters were identified. Subjects in the two clusters associated with high consumption of pastry and meat had significantly higher fat intakes (P = .0001). Subjects in two other clusters, those associated with high intake of skim milk and a broad distribution of energy sources, had significantly higher micronutrient levels (P = .0001). Body mass index and the distribution of gender were also significantly different across clusters. The success of cluster analysis in identifying dietary exposure categories with unique demographic and nutritional correlates suggests that the approach may be useful in epidemiologic studies that examine conditions such as obesity, and in the design of nutrition interventions.
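
    The two processing steps described above (conversion of intakes to percentage of total energy, then clustering of the resulting profiles) can be sketched as follows. This is only an illustration: a plain k-means loop stands in for the SAS FASTCLUS procedure, three hypothetical food groups replace the study's 38, and the intake figures are invented.

```python
import random


def percent_energy(intake_kcal):
    """Convert per-food-group energy (kcal) to percentage of total energy."""
    total = sum(intake_kcal)
    return [100.0 * x / total for x in intake_kcal]


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means with squared Euclidean distance (stand-in for FASTCLUS)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every subject.
        labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # Update step: centroid = mean profile of its members.
        updated = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical kcal per subject from [pastry, skim milk, other]; invented numbers.
subjects_kcal = [
    [900, 300, 300], [1000, 250, 250], [800, 300, 400],   # pastry-heavy
    [200, 900, 400], [250, 1000, 250], [300, 800, 400],   # milk-heavy
]
profiles = [percent_energy(row) for row in subjects_kcal]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

    Expressing intake as a share of total energy removes differences in overall consumption, so the clusters reflect the composition of the diet rather than its size.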

  16. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

    Science.gov (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and development of Grid middleware, it's become a realistic and attractive methodology to connect cluster machines on wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA) based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performances and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  17. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing...

  18. Minimum Information Loss Cluster Analysis for Cathegorical Data

    Czech Academy of Sciences Publication Activity Database

    Grim, Jiří; Hora, Jan

    2007-01-01

    Roč. 2007, Č. 4571 (2007), s. 233-247 ISSN 0302-9743. [International Conference on Machine Learning and Data Mining MLDM 2007 /5./. Leipzig, 18.07.2007-20.07.2007] R&D Projects: GA MŠk 1M0572; GA ČR GA102/07/1594 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Cluster Analysis * Cathegorical Data * EM algorithm Subject RIV: BD - Theory of Information Impact factor: 0.402, year: 2005

  19. A cluster analysis on road traffic accidents using genetic algorithms

    Science.gov (United States)

    Saharan, Sabariah; Baragona, Roberto

    2017-04-01

    The analysis of road traffic accidents is increasingly important because of the cost of accidents and public road safety. The availability of large data sets makes it viable to study the factors that affect the frequency and severity of accidents. However, the data are often highly unbalanced and overlapping. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000 to 2009, with a total of 26440 accidents. The data are binary, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because we are in the presence of a large unbalanced data set, and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and independent component analysis is performed to validate the results.

  20. Individual differences in reading skill and language lateralisation: a cluster analysis.

    Science.gov (United States)

    Chiarello, Christine; Welcome, Suzanne E; Leonard, Christiana M

    2012-01-01

    Individual differences in reading and cerebral lateralisation were investigated in 200 college students who completed reading assessments and divided visual field word recognition tasks, and received a structural MRI scan. Prior studies on this data set indicated that little variance in brain-behaviour correlations could be attributed to the effects of sex and handedness variables (Chiarello, Welcome, Halderman, & Leonard, 2009; Chiarello, Welcome, Halderman, Towler, et al., 2009; Welcome et al., 2009). Here a more bottom-up approach to behavioural classification (cluster analysis) was used to explore individual differences that need not depend on a priori decisions about relevant subgroups. The cluster solution identified four subgroups of college age readers with differing reading skill and visual field lateralisation profiles. These findings generalised to measures that were not included in the cluster analysis. Poorer reading skill was associated with somewhat reduced VF asymmetry, while average readers demonstrated exaggerated RVF/left hemisphere advantages. Skilled readers had either reduced asymmetries, or asymmetries that varied by task. The clusters did not differ by sex or handedness, suggesting that there are identifiable sources of variance among individuals that are not captured by these standard participant variables. All clusters had typical leftward asymmetry of the planum temporale. However, the size of areas in the posterior corpus callosum distinguished the two subgroups with high reading skill. A total of 17 participants, identified as multivariate outliers, had unusual behavioural profiles and differed from the remainder of the sample in not having significant leftward asymmetry of the planum temporale. A less buffered type of neurodevelopment that is more open to the effects of random genetic and environmental influences may characterise such individuals.

  1. Comparing Distributions of Environmental Outcomes for Regulatory Environmental Justice Analysis

    Directory of Open Access Journals (Sweden)

    Glenn Sheriff

    2011-05-01

    Economists have long been interested in measuring distributional impacts of policy interventions. As environmental justice (EJ) emerged as an ethical issue in the 1970s, the academic literature has provided statistical analyses of the incidence and causes of various environmental outcomes as they relate to race, income, and other demographic variables. In the context of regulatory impacts, however, there is a lack of consensus regarding what information is relevant for EJ analysis, and how best to present it. This paper helps frame the discussion by suggesting a set of questions fundamental to regulatory EJ analysis, reviewing past approaches to quantifying distributional equity, and discussing the potential for adapting existing tools to the regulatory context.

  2. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Zeng, J.; Li, G.; Sun, J.

    2013-01-01

    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties differed significantly among the twenty-six corn varieties. Principal component analysis related the quality of corn flour to five principal components, among which starch pasting properties made an important contribution, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses are feasible in the study of corn variety properties. (author)

  3. Environmental exergy analysis of wastewater treatment plants

    Energy Technology Data Exchange (ETDEWEB)

    Mora Bejarano, C.H.; Oliveira Junior, S. de [Universidade de Sao Paulo (USP), SP (Brazil). Dept. de Engenharia Mecanica]. E-mail: carlos.bejarano@poli.usp.br; silvio.oliveira@poli.usp.br

    2006-12-15

    This work evaluates the environmental impact of Wastewater Treatment Plants (WTP) based on data generated by exergy analysis, calculating and applying environmental impact indexes for two WTP located in the Metropolitan Area of Sao Paulo. The environmental impact of the wastewater treatment plants was assessed by evaluating two exergy-based environmental impact indexes: the environmental exergy efficiency and the total pollution rate (Rpol,t). The environmental exergy efficiency is defined as the ratio of the exergy of the useful effect of the WTP to the total exergy consumed from human and natural resources, including all the exergy inputs. This ratio indicates the theoretical potential for future improvements of the process. Besides the environmental exergy efficiency, the total pollution rate is also used; based on the definition given by Makarytchev (1997), it is the ratio of the destroyed exergy associated with the process wastes to the exergy of the useful effect of the process. The analysis of the results shows that this method can be used to quantify and also optimise the environmental performance of Wastewater Treatment Plants. (author)
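
    The two indexes defined verbally in this abstract can be written compactly as below. Apart from the total pollution rate, which the abstract itself labels Rpol,t, the symbols are notation assumed here for illustration: $\eta_{env}$ for the environmental exergy efficiency, $B_{useful}$ for the exergy of the useful effect, $B_{in,k}$ for the individual exergy inputs from human and natural resources, and $B_{dest,waste}$ for the destroyed exergy associated with the process wastes.

```latex
\eta_{env} = \frac{B_{useful}}{\sum_{k} B_{in,k}},
\qquad
R_{pol,t} = \frac{B_{dest,waste}}{B_{useful}}
```

    Under these definitions a better-performing plant would show a higher $\eta_{env}$ and a lower $R_{pol,t}$.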

  4. Investigations of environmental conditions during cluster indicate probable vectors of unknown exogenous agent(s) of multiple sclerosis.

    Science.gov (United States)

    McStreet, G H; Elkunk, R B; Latiwonk, Q I

    1992-01-01

    During the tail-end of an active cluster several environmental investigations indicated that wild birds were very probably the vectors of the unknown exogenous agent of MS. Canine distemper and genetic-autoimmune theories were very definitely eliminated because of the unusual pattern of the cluster. Studies of several avian pathogens unveiled Marek's (MDV) and/or IBD (Gumboro) as the most likely candidates for the exogenous agent of MS.

  5. Data analysis and interpretation for environmental surveillance

    International Nuclear Information System (INIS)

    1992-06-01

    The Data Analysis and Interpretation for Environmental Surveillance Conference was held in Lexington, Kentucky, February 5-7, 1990. The conference was sponsored by what is now the Office of Environmental Compliance and Documentation, Oak Ridge National Laboratory. Participants included technical professionals from all Martin Marietta Energy Systems facilities, Westinghouse Materials Company of Ohio, Pacific Northwest Laboratory, and several technical support contractors. Presentations at the conference ranged the full spectrum of issues that affect the analysis and interpretation of environmental data. Topics included tracking systems for samples and schedules associated with ongoing programs; coalescing data from a variety of sources and pedigrees into integrated data bases; methods for evaluating the quality of environmental data through empirical estimates of parameters such as charge balance, pH, and specific conductance; statistical applications to the interpretation of environmental information; and uses of environmental information in risk and dose assessments. Hearing about and discussing this wide variety of topics provided an opportunity to capture the subtlety of each discipline and to appreciate the continuity that is required among the disciplines in order to perform high-quality environmental information analysis.

  6. Shape Analysis of HII Regions - I. Statistical Clustering

    Science.gov (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  7. Time series clustering analysis of health-promoting behavior

    Science.gov (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and has been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.
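
    The combination named in this abstract, an autocorrelation-based representation followed by fuzzy c-means, can be sketched in a few lines. The abstract does not specify the study's actual representation scheme or parameters, so everything below is an assumption made for illustration: the use of the first two autocorrelation lags as features, the fuzzifier m = 2, the deterministic farthest-point initialization, and the invented daily-usage series.

```python
def autocorrelation_features(series, max_lag):
    """Represent a series by its sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return [0.0] * max_lag
    return [
        sum((series[t] - mean) * (series[t + lag] - mean)
            for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def _sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def _memberships(points, centers, m):
    """u_i proportional to (squared distance)^(-1/(m-1)); each row sums to 1."""
    rows = []
    for p in points:
        inv = [max(_sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, c, m=2.0, iters=100):
    """Fuzzy c-means with farthest-point initialization (an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < c:
        centers.append(list(max(
            points, key=lambda p: min(_sq_dist(p, ctr) for ctr in centers))))
    dims = range(len(points[0]))
    for _ in range(iters):
        u = _memberships(points, centers, m)
        # Each center is the mean of all points weighted by membership^m.
        centers = [
            [sum(u[j][i] ** m * points[j][d] for j in range(len(points)))
             / sum(u[j][i] ** m for j in range(len(points)))
             for d in dims]
            for i in range(c)
        ]
    return _memberships(points, centers, m), centers


# Invented daily-usage indicators (1 = used the service that day).
usage = [
    [1, 0] * 8,                                          # strict alternation
    [1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1],    # near alternation
    [1, 1, 1, 1, 0, 0, 0, 0] * 2,                        # multi-day runs
    [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],    # irregular runs
]
features = [autocorrelation_features(s, max_lag=2) for s in usage]
memberships, centers = fuzzy_c_means(features, c=2)
hard_labels = [row.index(max(row)) for row in memberships]
print(hard_labels)
```

    Unlike k-means, each series receives a graded membership in every cluster, so an elder whose behavior sits between two patterns is not forced into a single group; the hard labels printed here are only the largest membership per series.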

  8. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    Science.gov (United States)

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. The Sm/RNP cluster had a significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. The dsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10- and 20-year survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  9. [Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].

    Science.gov (United States)

    Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés

    2018-03-06

    To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following four clusters stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document. Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.

  10. Coxiella burnetii transcriptional analysis reveals serendipity clusters of regulation in intracellular bacteria.

    Directory of Open Access Journals (Sweden)

    Quentin Leroy

    Coxiella burnetii, the causative agent of the zoonotic disease Q fever, is mainly transmitted to humans through an aerosol route. A spore-like form allows C. burnetii to resist different environmental conditions. Because of this, analysis of the survival strategies used by this bacterium to adapt to new environmental conditions is critical for our understanding of C. burnetii pathogenicity. Here, we report the early transcriptional response of C. burnetii under temperature stresses. Our data show that C. burnetii exhibited minor changes in gene regulation under short exposure to heat or cold shock. While small differences were observed, C. burnetii seemed to respond similarly to cold and heat shock. The expression profiles obtained using microarrays produced in-house were confirmed by quantitative RT-PCR. Under temperature stresses, 190 genes were differentially expressed in at least one condition, with a fold change of up to 4. Globally, the differentially expressed genes in C. burnetii were associated with bacterial division, (p)ppGpp synthesis, wall and membrane biogenesis and, especially, lipopolysaccharide and peptidoglycan synthesis. These findings could be associated with growth arrest and witnessed transformation of the bacteria to a spore-like form. Unexpectedly, clusters of neighboring genes were differentially expressed. These clusters do not belong to operons or genetic networks; they have no evident associated functions and are not under the control of the same promoters. We also found undescribed but comparable clusters of regulation in previously reported transcriptomic analyses of intracellular bacteria, including Rickettsia sp. and Listeria monocytogenes. The transcriptomic patterns of C. burnetii observed under temperature stresses permit the recognition of unpredicted clusters of regulation for which the trigger mechanism remains unidentified but which may be the result of a new mechanism of epigenetic regulation.

  11. Sensitization trajectories in childhood revealed by using a cluster analysis

    DEFF Research Database (Denmark)

    Schoos, Ann-Marie M.; Chawes, Bo L.; Melen, Erik

    2017-01-01

    BACKGROUND: Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more...... biologically and clinically relevant. OBJECTIVE: We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. METHODS: We investigated 398 children from the at-risk Copenhagen...... Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent...

  12. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. The aim was to identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used to identify the clusters. We identified four clusters: Cluster 1 (n=14), late-onset, non-atopic asthma with impaired lung function; Cluster 2 (n=13), late-onset, atopic asthma; Cluster 3 (n=6), late-onset, aspirin sensitivity, eosinophilic asthma; and Cluster 4 (n=7), early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. The variables with the greatest power for differentiation in our study were age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, hypersensitivity to nonsteroidal anti-inflammatory drugs, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for disease phenotyping and for a personalized approach to the treatment of patients.

  13. Analysis procedure for americium in environmental samples

    International Nuclear Information System (INIS)

    Holloway, R.W.; Hayes, D.W.

    1982-01-01

    Several methods for the analysis of 241Am in environmental samples were evaluated and a preferred method was selected. This method was modified and used to determine the 241Am content in sediments, biota, and water. The advantages and limitations of the method are discussed. The method is also suitable for 244Cm analysis.

  14. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi

    2013-09-01

    One of the standard methods for determining the size distribution of wood chips is the oscillating screen method (EN 15149-1:2010). Recent literature has demonstrated that image analysis can return highly accurate measurements of the dimensions defined for each individual particle, and could support a new method that uses geometrical shape to determine chip size more accurately. A sample of wood chips (8 litres) was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm); the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors) and size descriptors (area, perimeter, Feret diameters, eccentricity) was applied to observe the chip distribution. The UPGMA algorithm was applied to Euclidean distances. The resulting dendrogram shows a group separation in accordance with the original three sieving fractions. A comparison was made between the traditional sieving and clustering results. This preliminary result shows that the image analysis-based method has high potential for characterizing wood chip size distribution and could be investigated further. Moreover, this method could be implemented in an online detection machine for chip size characterization. The results are expected to improve with supervised multivariate methods that utilize known class memberships. The main objective of future work will be to shift the analysis from a 2-dimensional method to a 3-dimensional acquisition process.
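    The UPGMA step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names `upgma` and `euclidean` and the two-column descriptor vectors are assumptions made for illustration, and real descriptors (Fourier coefficients, Feret diameters and so on) would normally be standardized first. UPGMA merges, at each step, the two clusters with the smallest mean pairwise distance between their members.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(points, n_clusters):
    """Average-linkage (UPGMA) agglomerative clustering.

    Returns the clusters as sorted lists of point indices.
    Assumes 1 <= n_clusters <= len(points).
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # UPGMA distance: mean of all pairwise member distances.
            total = sum(
                euclidean(points[p], points[q])
                for p in clusters[i]
                for q in clusters[j]
            )
            d = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical (area, perimeter) descriptors for six chips.
chips = [(1.0, 4.0), (1.2, 4.1), (5.0, 12.0), (5.3, 12.4), (20.0, 40.0), (21.0, 41.0)]
groups = upgma(chips, 3)
```

    This naive version recomputes every linkage distance at each merge, which is fine for a handful of particles but slow for large samples; a library routine such as SciPy's average-linkage clustering would be the usual choice in practice.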

  15. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto

    2012-01-01

    High Energy Physics (HEP) analyses are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use case for cloud computing. In this work we take advantage of the Cloud Infrastructure at IFCA to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analyses and offers transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  16. CLUSTER ANALYSIS OF NATURAL DISASTER LOSSES IN POLISH AGRICULTURE

    Directory of Open Access Journals (Sweden)

    Grzegorz STRUPCZEWSKI

    2015-04-01

    Agricultural production risk is of a special nature due to the great number of hazards, the relative weakness of production entities on the market and a level of uncertainty that is higher than in industrial production. Natural disasters occur very frequently and, combined with the low percentage of insured farmers, cause damage on a scale that forces the state to organise current financial aid (for instance in the form of preferential natural disaster loans). This aid is usually not sufficient. On the other hand, regional diversity of the risk level does not positively affect the development of insurance. From the perspective of insurance companies and policymakers it becomes highly important to investigate the spatial structure of losses in agriculture caused by natural disasters. The purpose of the research is to classify the 16 Polish voivodeships into clusters in order to show differences between them according to the level of damage to agricultural farms caused by natural disasters. On the basis of the cluster analysis it was demonstrated that 11 voivodeships form quite a homogeneous group in terms of the size of damage in agriculture (the value of damage to crops and the acreage of destroyed crops are the two most important factors determining cluster membership); however, the loss profile in each of the other five voivodeships is highly individual and requires separate handling in the actuarial sense. It was also shown that a high absolute value of agricultural losses in a given voivodeship does not necessarily mean high vulnerability of its agricultural farms to natural risks.

  17. Isotope dilution analysis of environmental samples

    International Nuclear Information System (INIS)

    Tolgyessy, J.; Lesny, J.; Korenova, Z.; Klas, J.; Klehr, E.H.

    1986-01-01

    Isotope dilution analysis has been used for the determination of several trace elements - especially metals - in a variety of environmental samples, including aerosols, water, soils, biological materials and geological materials. Variations of the basic concept include classical IDA, substoichiometric IDA, and more recently, sub-superequivalence IDA. Each variation has its advantages and limitations. A periodic chart has been used to identify those elements which have been measured in environmental samples using one or more of these methods. (author)

  18. The Analysis of a Simple k-Means Clustering Algorithm

    National Research Council Canada - National Science Library

    Kanungo, T; Mount, D. M; Netanyahu, N. S; Piatko, C; Silverman, R; Wu, A. Y

    2000-01-01

    .... A popular heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm...
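    For readers unfamiliar with Lloyd's algorithm, the plain iteration can be sketched as follows. This is a minimal illustration only and is not the filtering algorithm of the paper, which the abstract presents as a more efficient implementation of the same iteration; the function name `lloyd_kmeans` and the sample data are assumptions. Each pass assigns every point to its nearest center and then moves each center to the mean of its assigned points, stopping when the centers no longer change.

```python
def lloyd_kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration for k-means.

    points  : list of equal-length numeric tuples
    centers : initial centers (same dimensionality as points)
    Returns (centers, labels); labels[i] is the index of the center
    that points[i] was assigned to in the last assignment step.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centers[k])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[k])
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, labels


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centers, final_labels = lloyd_kmeans(data, [(0.0, 0.0), (10.0, 0.0)])
```

    The result depends on the initial centers, and each pass costs time proportional to the number of points times the number of centers, which is the cost that more efficient implementations aim to reduce.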

  19. Eating or meeting? Cluster analysis reveals intricacies of white shark (Carcharodon carcharias) migration and offshore behavior.

    Directory of Open Access Journals (Sweden)

    Salvador J Jorgensen

    Elucidating how mobile ocean predators utilize the pelagic environment is vital to understanding the dynamics of oceanic species and ecosystems. Pop-up archival transmitting (PAT) tags have emerged as an important tool to describe animal migrations in oceanic environments where direct observation is not feasible. Available PAT tag data, however, are for the most part limited to geographic position, swimming depth and environmental temperature, making effective behavioral observation challenging. However, novel analysis approaches have the potential to extend the interpretive power of these limited observations. Here we developed an approach based on clustering analysis of PAT daily time-at-depth histogram records to distinguish behavioral modes in white sharks (Carcharodon carcharias). We found four dominant and distinctive behavioral clusters matching previously described behavioral patterns, including two distinctive offshore diving modes. Once these were validated, we mapped behavior mode occurrence in space and time. Our results demonstrate previously unrecognized spatial, temporal and sex-based structure in the diving behavior of white sharks in the northeastern Pacific, including behavioral and migratory patterns resembling those of species with lek mating systems. We discuss our findings, in combination with available life history and environmental data, and propose specific testable hypotheses to distinguish between mating and foraging in northeastern Pacific white sharks that can provide a framework for future work. Our methodology can be applied to similar datasets from other species to further define behaviors during unobservable phases.

  20. Cluster analysis of BI-RADS descriptions of biopsy-proven breast lesions

    Science.gov (United States)

    Markey, Mia K.; Lo, Joseph Y.; Tourassi, Georgia D.; Floyd, Carey E., Jr.

    2002-05-01

    The purpose of this study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database. Identification of subgroups within the database could help elucidate clinical trends and facilitate future model building. Agglomerative hierarchical clustering and k-means clustering were used to identify clusters in a large, heterogeneous computer-aided diagnosis database based on mammographic findings (BI-RADS) and patient age. The clusters were examined in terms of their feature distributions. The clusters showed logical separation of distinct clinical subtypes such as architectural distortions, masses, and calcifications. Moreover, the common subtypes of masses and calcifications were stratified into clusters based on age groupings. The percent of the cases that were malignant was notably different among the clusters. Cluster analysis can provide a powerful tool in discerning the subgroups present in a large, heterogeneous computer-aided diagnosis database.

  1. CHOOSING A HEALTH INSTITUTION WITH MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS IN A POPULATION BASED STUDY

    Directory of Open Access Journals (Sweden)

    ASLI SUNER

    2013-06-01

    Multiple correspondence analysis is a method that makes it easier to interpret categorical variables given in contingency tables, showing the similarities, associations and divergences among these variables graphically in a lower-dimensional space. Clustering methods help to classify grouped data according to their similarities and to obtain useful summaries from them. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting the choice of health institution, such as age, disease group and health insurance, are examined, and the results of the two methods are compared.

  2. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    Science.gov (United States)

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred technique among four partitional cluster packages, namely the following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends, showing high counts during either the day-time or the night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred from the cluster characteristics and their correlation with concurrently measured meteorological, gas phase, and particle phase measurements.

  3. MMPI profiles of males accused of severe crimes: a cluster analysis

    NARCIS (Netherlands)

    Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.

    2009-01-01

    In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious

  4. Environmentally based Cost-Benefit Analysis

    International Nuclear Information System (INIS)

    Magnell, M.

    1993-11-01

    The fundamentals of the basic elements of a new comprehensive economic assessment, MILA, developed in Sweden with inspiration from the Total Cost Assessment-model are presented. The core of the MILA approach is an expanded cost and benefit inventory. But MILA also includes a complementary addition of an internal waste stream analysis, a tool for evaluation of environmental conflicts in monetary terms, an extended time horizon and direct allocation of costs and revenues to products and processes. However, MILA does not ensure profitability for environmentally sound projects. Essentially, MILA is an approach of refining investment and profitability analysis of a project, investment or product. 109 refs., 38 figs

  5. Independent component analysis to detect clustered microcalcification breast cancers.

    Science.gov (United States)

    Gallardo-Caballero, R; García-Orellana, C J; García-Manso, A; González-Velasco, H M; Macías-Macías, M

    2012-01-01

    The presence of clustered microcalcifications is one of the earliest signs in breast cancer detection. Although there exist many studies broaching this problem, most of them are nonreproducible due to the use of proprietary image datasets. We use a known subset of the currently largest publicly available mammography database, the Digital Database for Screening Mammography (DDSM), to develop a computer-aided detection system that outperforms the current reproducible studies on the same mammogram set. This proposal is mainly based on the use of extracted image features obtained by independent component analysis, but we also study the inclusion of the patient's age as a nonimage feature which requires no human expertise. Our system achieves an average of 2.55 false positives per image at a sensitivity of 81.8% and 4.45 at a sensitivity of 91.8% in diagnosing the BCRP_CALC_1 subset of DDSM.

  6. Higgs Pair Production: Choosing Benchmarks With Cluster Analysis

    CERN Document Server

    Carvalho, Alexandra; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia

    2016-01-01

    New physics theories often depend on a large number of free parameters. The precise values of those parameters in some cases drastically affect the resulting phenomenology of fundamental physics processes, while in others finite variations can leave it basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics of different models; a clustering algorithm using that metric may then allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmark points are then guaranteed to be sensitive to a large area of the parameter space. In this doc...

  7. Entropy-rate clustering: cluster analysis via maximizing a submodular function subject to a matroid constraint.

    Science.gov (United States)

    Liu, Ming-Yu; Tuzel, Oncel; Ramalingam, Srikumar; Chellappa, Rama

    2014-01-01

    We propose a new objective function for clustering. This objective function consists of two components: the entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with similar sizes and penalizes larger clusters that aggressively group samples. We present a novel graph construction for the graph associated with the data and show that this construction induces a matroid--a combinatorial structure that generalizes the concept of linear independence in vector spaces. The clustering result is given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting the submodular and monotonic properties of the objective function, we develop an efficient greedy algorithm. Furthermore, we prove an approximation bound of (1/2) for the optimality of the greedy solution. We validate the proposed algorithm on various benchmarks and show its competitive performances with respect to popular clustering algorithms. We further apply it for the task of superpixel segmentation. Experiments on the Berkeley segmentation data set reveal its superior performances over the state-of-the-art superpixel segmentation algorithms in all the standard evaluation metrics.

  8. The relationship between supplier networks and industrial clusters: an analysis based on the cluster mapping method

    Directory of Open Access Journals (Sweden)

    Ichiro IWASAKI

    2010-06-01

    Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.

  9. Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

    Science.gov (United States)

    Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

    2017-05-01

    The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.

  10. Performance analysis of clustering techniques over microarray data: A case study

    Science.gov (United States)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in statistical data analysis, and cluster analysis plays a vital role in dealing with large-scale data. Many clustering techniques exist, each with a different approach, but it is difficult to predict which one suits a particular dataset. To deal with this problem, a grading approach is applied across many clustering techniques to identify a stable technique. Because the grading depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted on five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test.

  11. Depth data research of GIS based on clustering analysis algorithm

    Science.gov (United States)

    Xiong, Yan; Xu, Wenli

    2018-03-01

    The data of GIS have spatial distribution. Geographic data has both spatial characteristics and attribute characteristics, and also changes with time. Therefore, the amount of data is very large. Nowadays, many industries and departments in the society are using GIS. However, without proper data analysis and mining scheme, GIS will not exert its maximum effectiveness and will waste a lot of data. In this paper, we use the geographic information demand of a national security department as the experimental object, combining the characteristics of GIS data, taking into account the characteristics of time, space, attributes and so on, and using cluster analysis algorithm. We further study the mining scheme for depth data, and get the algorithm model. This algorithm can automatically classify sample data, and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information from the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.

  12. Assessment of Heavy Metal Pollution in Macrophytes, Water and Sediment of a Tropical Wetland System Using Hierarchical Cluster Analysis Technique

    OpenAIRE

    N. Kumar J.I.; M. Das; R. Mukherji; R.N. Kumar

    2011-01-01

    Heavy metal pollution in aquatic ecosystems is becoming a global phenomenon because these metals are indestructible and most of them have toxic effects on living organisms. Most of the fresh water bodies all over the world are getting contaminated thus declining their suitability. Therefore, monitoring and assessment of such freshwater systems has become an environmental concern. This study aims to elucidate the useful role of the cluster analysis to assess the relationship and interdependenc...

  13. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4, determined with cluster validation) produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.

  14. Analysis of the dynamical cluster approximation for the Hubbard model

    OpenAIRE

    Aryanpour, K.; Hettler, M. H.; Jarrell, M.

    2002-01-01

    We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...

  15. Spectral clustering for water body spectral types analysis

    Science.gov (United States)

    Huang, Leping; Li, Shijin; Wang, Lingli; Chen, Deqing

    2017-11-01

    To study the spectral types of water bodies across the whole country, the key issue in reservoir research is to obtain and analyze information on the water bodies in reservoirs quantitatively and accurately. A new type of weight matrix is constructed by jointly using the spectral and spatial features of spectra from GF-1 remote sensing images. An improved spectral clustering algorithm based on this weight matrix is then proposed to cluster representative reservoirs in China. According to an internal clustering validity index, the Davies-Bouldin (DB) index, the best number of clusters is 7. Compared with two other clustering algorithms, spectral clustering based only on spectral features and K-means based on spectral and spatial features, simulation results demonstrate that the proposed spectral clustering algorithm based on spectral and spatial features has a higher clustering accuracy and better reflects the spatial clustering characteristics of representative reservoirs in various provinces in China: similar spectral properties and adjacent geographical locations.
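    The Davies-Bouldin index used above to choose the number of clusters can be computed as in the following minimal sketch. It uses the standard textbook definition (mean distance to the centroid as within-cluster scatter, Euclidean distance between centroids as separation), which may differ in detail from the authors' implementation; the function name `davies_bouldin` and the toy data are assumptions. Lower values indicate compact, well-separated clusters, so the candidate number of clusters with the smallest index is preferred.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a labelled set of points (lower is better).

    Assumes at least two clusters with distinct centroids.
    Requires Python 3.8+ for math.dist.
    """
    ids = sorted(set(labels))
    centroids, scatter = {}, {}
    for k in ids:
        members = [p for p, lab in zip(points, labels) if lab == k]
        c = tuple(sum(col) / len(members) for col in zip(*members))
        centroids[k] = c
        # Within-cluster scatter: mean distance of members to the centroid.
        scatter[k] = sum(math.dist(p, c) for p in members) / len(members)
    total = 0.0
    for i in ids:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in ids
            if j != i
        )
    return total / len(ids)


pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
db_good = davies_bouldin(pts, [0, 0, 1, 1])   # left pair vs right pair
db_poor = davies_bouldin(pts, [0, 1, 0, 1])   # bottom row vs top row
```

    In this toy example the left/right split scores far lower than the bottom/top split, matching the intuition that the two tight pairs are the natural clusters.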

  16. X-Ray Morphological Analysis of the Planck ESZ Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Andrade-Santos, Felipe; Randall, Scott; Kraft, Ralph [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ettori, Stefano [INAF, Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127 Bologna (Italy); Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DRF—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France)

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  17. Fuzzy and hard clustering analysis for thyroid disease.

    Science.gov (United States)

    Azar, Ahmad Taher; El-Said, Shaimaa Ahmed; Hassanien, Aboul Ella

    2013-07-01

    Thyroid hormones produced by the thyroid gland help regulate the body's metabolism. A variety of methods have been proposed in the literature for thyroid disease classification. As far as we know, clustering techniques have not so far been applied to thyroid disease data sets. This paper proposes a comparison between hard and fuzzy clustering algorithms on a thyroid disease data set in order to find the optimal number of clusters. Different scalar validity measures are used in comparing the performances of the proposed clustering systems. To demonstrate the performance of each algorithm, the feature values that represent thyroid disease are used as input for the system. Several runs are carried out and recorded with a different number of clusters being specified for each run (between 2 and 11), so as to establish the optimum number of clusters. To find the optimal number of clusters, the so-called elbow criterion is applied. The experimental results revealed that for all algorithms, the elbow was located at c=3. The clustering results for all algorithms are then visualized by the Sammon mapping method to find a low-dimensional (normally 2D or 3D) representation of a set of points distributed in a high dimensional pattern space. At the end of this study, some recommendations are formulated to improve the determination of the actual number of clusters present in the data set. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
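
    The elbow criterion mentioned above can be made concrete in several ways. The sketch below shows one common heuristic, assuming the clustering cost (for example the within-cluster sum of squares) has already been computed for evenly spaced cluster counts: it picks the count with the largest second difference. The function name `elbow` and the cost values are hypothetical and are not results from the study.

```python
def elbow(costs):
    """Locate the elbow of a cost curve.

    costs maps a cluster count c to a clustering cost that decreases as c
    grows. The elbow is taken to be the interior c with the largest second
    difference, i.e. the sharpest bend of the curve. Cluster counts are
    assumed to be evenly spaced.
    """
    cs = sorted(costs)
    if len(cs) < 3:
        raise ValueError("need at least three cluster counts")
    best_c, best_bend = None, float("-inf")
    for prev, cur, nxt in zip(cs, cs[1:], cs[2:]):
        bend = costs[prev] - 2 * costs[cur] + costs[nxt]
        if bend > best_bend:
            best_c, best_bend = cur, bend
    return best_c


# Made-up cost values for illustration only.
example_costs = {2: 100.0, 3: 70.0, 4: 30.0, 5: 27.0, 6: 25.0}
print(elbow(example_costs))
```

    The second-difference rule is only one possible reading of the criterion; visual inspection of the curve or a distance-to-chord rule may pick a different count on noisy curves.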

  18. Environmental impact analysis for typical office facades

    Energy Technology Data Exchange (ETDEWEB)

    Kolokotroni, M.; Robinson-Gayle, S.; Tanno, S.; Cripps, A.

    2004-02-01

    The design of a building facade influences internal thermal and lighting conditions and use associated with the provision of these conditions. Key decisions about the building are usually taken during the concept design stage of a building, while decisions about method of providing the environmental conditions are often made later in the design. This dilemma is addressed by the development of a concept design tool that allows the team to investigate the effect of facade design on the resulting internal environmental conditions, energy use and environmental impact. The concept design tool was developed performing detailed thermal, lighting and environmental modelling for a number of office building facade designs and a range of parameters that affect directly the environmental performance of an office building. The results are presented in a user-friendly interface as a minimum number of inputs. Key parameter outputs (such as temperature, lighting, heating/cooling energy demand, embodied energy and eco-points) can then be viewed, more detailed analysis can also be created for specified facade designs. A parametric analysis of the summary result outputs for selected facade parameters indicates that natural ventilation and cooling can reduce the environmental impact of offices by up to 16%, although the energy demand could increase significantly. Improving the construction standard and reducing the internal heat loads can reduce the environmental impact by up to 2 this tool at early design stages will benefit the design team through an improved understanding of the dynamics between facade design and building services and assist with a more integrated approach. (author)

  19. Sensitization trajectories in childhood revealed by using a cluster analysis.

    Science.gov (United States)

    Schoos, Ann-Marie M; Chawes, Bo L; Melén, Erik; Bergström, Anna; Kull, Inger; Wickman, Magnus; Bønnelykke, Klaus; Bisgaard, Hans; Rasmussen, Morten A

    2017-12-01

    Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more biologically and clinically relevant. We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. We investigated 398 children from the at-risk Copenhagen Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent patterns explicitly characterizing temporal development of sensitization while clustering allergens and children. Subsequently, these patterns were investigated in relation to asthma, rhinitis, and eczema. Verification was sought in an independent unselected birth cohort (BAMSE) constituting 3051 children with specific IgE against the same allergens at 4 and 8 years of age. The nonnegative sparse parallel factor analysis indicated a complex latent structure involving 7 age- and allergen-specific patterns in the COPSAC2000 birth cohort data: (1) dog/cat/horse, (2) timothy grass/birch, (3) molds, (4) house dust mites, (5) peanut/wheat flour/mugwort, (6) peanut/soybean, and (7) egg/milk/wheat flour. Asthma was solely associated with pattern 1 (odds ratio [OR], 3.3; 95% CI, 1.5-7.2), rhinitis with patterns 1 to 4 and 6 (OR, 2.2-4.3), and eczema with patterns 1 to 3 and 5 to 7 (OR, 1.6-2.5). All 7 patterns were verified in the independent BAMSE cohort (R² > 0.89). This study suggests the presence of specific sensitization patterns in early childhood differentially associated with development of

  20. Space-time analysis of testicular cancer clusters using residential histories: a case-control study in Denmark.

    Directory of Open Access Journals (Sweden)

    Chantel D Sloan

    Though the etiology is largely unknown, testicular cancer incidence has seen recent significant increases in northern Europe and throughout many Western regions. The most common cancer in males under age 40, age period cohort models have posited exposures in the in utero environment or in early childhood as possible causes of increased risk of testicular cancer. Some of these factors may be tied to geography through being associated with behavioral, cultural, sociodemographic or built environment characteristics. If so, this could result in detectable geographic clusters of cases that could lead to hypotheses regarding environmental targets for intervention. Given a latency period between exposure to an environmental carcinogen and testicular cancer diagnosis, mobility histories are beneficial for spatial cluster analyses. Nearest-neighbor based Q-statistics allow for the incorporation of changes in residency in spatial disease cluster detection. Using these methods, a space-time cluster analysis was conducted on a population-wide case-control population selected from the Danish Cancer Registry with mobility histories since 1971 extracted from the Danish Civil Registration System. Cases (N=3297) were diagnosed between 1991 and 2003, and two sets of controls (N=3297 for each set) matched on sex and date of birth were included in the study. We also examined spatial patterns in maternal residential history for those cases and controls born in 1971 or later (N= 589 case-control pairs). Several small clusters were detected when aligning individuals by year prior to diagnosis, age at diagnosis and calendar year of diagnosis. However, the largest of these clusters contained only 2 statistically significant individuals at their center, and were not replicated in SaTScan spatial-only analyses which are less susceptible to multiple testing bias. We found little evidence of local clusters in residential histories of testicular cancer cases in this Danish

  1. Space-time analysis of testicular cancer clusters using residential histories: a case-control study in Denmark.

    Science.gov (United States)

    Sloan, Chantel D; Nordsborg, Rikke B; Jacquez, Geoffrey M; Raaschou-Nielsen, Ole; Meliker, Jaymie R

    2015-01-01

    Though the etiology is largely unknown, testicular cancer incidence has seen recent significant increases in northern Europe and throughout many Western regions. The most common cancer in males under age 40, age period cohort models have posited exposures in the in utero environment or in early childhood as possible causes of increased risk of testicular cancer. Some of these factors may be tied to geography through being associated with behavioral, cultural, sociodemographic or built environment characteristics. If so, this could result in detectable geographic clusters of cases that could lead to hypotheses regarding environmental targets for intervention. Given a latency period between exposure to an environmental carcinogen and testicular cancer diagnosis, mobility histories are beneficial for spatial cluster analyses. Nearest-neighbor based Q-statistics allow for the incorporation of changes in residency in spatial disease cluster detection. Using these methods, a space-time cluster analysis was conducted on a population-wide case-control population selected from the Danish Cancer Registry with mobility histories since 1971 extracted from the Danish Civil Registration System. Cases (N=3297) were diagnosed between 1991 and 2003, and two sets of controls (N=3297 for each set) matched on sex and date of birth were included in the study. We also examined spatial patterns in maternal residential history for those cases and controls born in 1971 or later (N= 589 case-control pairs). Several small clusters were detected when aligning individuals by year prior to diagnosis, age at diagnosis and calendar year of diagnosis. However, the largest of these clusters contained only 2 statistically significant individuals at their center, and were not replicated in SaTScan spatial-only analyses which are less susceptible to multiple testing bias. We found little evidence of local clusters in residential histories of testicular cancer cases in this Danish population.

  2. Frailty phenotypes in the elderly based on cluster analysis

    DEFF Research Database (Denmark)

    Dato, Serena; Montesanto, Alberto; Lagani, Vincenzo

    2012-01-01

    Frailty is a physiological state characterized by the deregulation of multiple physiologic systems of an aging organism determining the loss of homeostatic capacity, which exposes the elderly to disability, diseases, and finally death. An operative definition of frailty, useful... genetic background on the frailty status is still questioned. We investigated the applicability of a cluster analysis approach based on specific geriatric parameters, previously set up and validated in a southern Italian population, to two large longitudinal Danish samples. In both cohorts, we identified... groups of subjects homogeneous for their frailty status and characterized by different survival patterns. A subsequent survival analysis availing of Accelerated Failure Time models allowed us to formulate an operative index able to correlate classification variables with survival probability. From...

  3. Adapting Spectral Co-clustering to Documents and Terms Using Latent Semantic Analysis

    Science.gov (United States)

    Park, Laurence A. F.; Leckie, Christopher A.; Ramamohanarao, Kotagiri; Bezdek, James C.

    Spectral co-clustering is a generic method of computing co-clusters of relational data, such as sets of documents and their terms. Latent semantic analysis is a method of document and term smoothing that can assist in the information retrieval process. In this article we examine the process behind spectral clustering for documents and terms, and compare it to Latent Semantic Analysis. We show that both spectral co-clustering and LSA follow the same process, using different normalisation schemes and metrics. By combining the properties of the two co-clustering methods, we obtain an improved co-clustering method for document-term relational data that provides an increase in the cluster quality of 33.0%.

  4. Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty.

    Science.gov (United States)

    Pan, Wei; Shen, Xiaotong; Liu, Binghui

    2013-07-01

    Clustering analysis is widely used in many fields. Traditionally clustering is regarded as unsupervised learning for its lack of a class label or a quantitative response variable, which in contrast is present in supervised learning such as classification and regression. Here we formulate clustering as penalized regression with grouping pursuit. In addition to the novel use of a non-convex group penalty and its associated unique operating characteristics in the proposed clustering method, a main advantage of this formulation is its allowing borrowing some well established results in classification and regression, such as model selection criteria to select the number of clusters, a difficult problem in clustering analysis. In particular, we propose using the generalized cross-validation (GCV) based on generalized degrees of freedom (GDF) to select the number of clusters. We use a few simple numerical examples to compare our proposed method with some existing approaches, demonstrating our method's promising performance.

  5. Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

    Science.gov (United States)

    Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

    2017-10-01

    Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  6. An effective fuzzy kernel clustering analysis approach for gene expression data.

    Science.gov (United States)

    Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

    2015-01-01

    Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms.

  7. ANALYSIS OF DEVELOPING BATIK INDUSTRY CLUSTER IN BAKARAN VILLAGE CENTRAL JAVA PROVINCE

    Directory of Open Access Journals (Sweden)

    Hermanto Hermanto

    2017-06-01

    SMEs grow in clusters within a certain geographical area. Entrepreneurs grow and thrive through the business cluster. Central Java Province has many business clusters that improve the regional economy, one of which is the batik industry cluster. Pati Regency is one of the regencies/cities in Central Java with the lowest turnover. The batik industry cluster in Pati is developing quite well, as can be seen from the increasing number of batik businesses incorporated in the cluster. This research examines the strategy for developing the batik industry cluster in Pati Regency. The purpose of this research is to determine the proper strategy for developing the batik industry cluster in Pati. The research method is quantitative. The analysis tool of this research is Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis. The result of the SWOT analysis shows that the proper strategy for developing the batik industry cluster in Pati is: optimizing the management of the batik business cluster in Bakaran Village; having the local government provide information on business capital loan facilities; utilizing labor from Bakaran Village while improving its quality through training; and marketing Bakaran batik to broader markets while maintaining the quality of the batik. The advice arising from this research is directed at the parties who have a role in batik industry cluster development in Bakaran Village, Pati Regency, such as the local government.

  8. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results

    Directory of Open Access Journals (Sweden)

    Medvedovic Mario

    2011-01-01

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  10. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  11. Participant intimacy: A cluster analysis of the intranuclear cascade

    International Nuclear Information System (INIS)

    Cugnon, J.; Knoll, J.; Randrup, J.

    1981-01-01

    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of clusters consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model (rows-on-rows). (orig.)

  12. Participant intimacy: A cluster analysis of the intranuclear cascade

    Science.gov (United States)

    Cugnon, J.; Knoll, J.; Randrup, J.

    1981-05-01

    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of "clusters" consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model ("rows-on-rows").

  13. An evaluation of centrality measures used in cluster analysis

    Science.gov (United States)

    Engström, Christopher; Silvestrov, Sergei

    2014-12-01

    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large, as they often are in, for example, bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, centrality can be used to assign centroids rather than some other method such as picking them at random. Some work has been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as those mentioned above and others such as PageRank and related measures.
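
    To illustrate the idea of choosing initial centroids by centrality rather than at random, the sketch below uses degree centrality on a neighbourhood graph in which two points are linked when they lie within a given radius. This is one possible scheme under stated assumptions, not the procedure evaluated in the article; the names `degree_centrality` and `seed_centroids`, the radius and the toy points are all hypothetical.

```python
import math


def degree_centrality(points, radius):
    """Number of other points lying within `radius` of each point."""
    return [sum(1 for j, q in enumerate(points)
                if j != i and math.dist(p, q) <= radius)
            for i, p in enumerate(points)]


def seed_centroids(points, k, radius):
    """Pick up to k seeds in order of decreasing degree centrality.

    A candidate is skipped when it lies within `radius` of a seed that was
    already chosen, so that the seeds do not all come from one dense region.
    Ties in centrality are broken by the lower index.
    """
    degree = degree_centrality(points, radius)
    order = sorted(range(len(points)), key=lambda i: (-degree[i], i))
    seeds = []
    for i in order:
        if all(math.dist(points[i], points[s]) > radius for s in seeds):
            seeds.append(i)
        if len(seeds) == k:
            break
    return [points[i] for i in seeds]


# Hypothetical data: two dense groups and one isolated point.
data = [(0, 0), (1, 0), (0, 1), (0.4, 0.4),
        (10, 10), (11, 10), (10, 11), (10.4, 10.4),
        (5, 5)]
print(seed_centroids(data, k=2, radius=1.5))
```

    The seeds returned here could then be handed to K-means as its starting centroids; betweenness, eigenvector centrality or PageRank could replace the degree count without changing the selection loop.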

  14. Cluster analysis of HZE particle tracks as applied to space radiobiology problems

    International Nuclear Information System (INIS)

    Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.

    2013-01-01

    A cluster analysis is performed of ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-means-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and the particle's linear energy transfer.

  15. Application of cluster analysis and unsupervised learning to multivariate tissue characterization

    International Nuclear Information System (INIS)

    Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.

    1987-01-01

    This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data

  16. Environmental Resources Analysis System, A Prototype DSS

    Science.gov (United States)

    Flug, M.; Campbell, S.G.; Bizier, P.; DeBarry, P.

    2003-01-01

    Since the 1960s, an increase in the public's environmental ethics, federal species preservation, water quality protection, and interest in free flowing rivers has evolved into the current concern for stewardship and conservation of natural resources. This heightened environmental awareness creates an appetite for data, models, information management, and systematic analysis of multiple scientific disciplines. A good example of this information and analysis need resides in the Green and Yampa Rivers, tributary to the Upper Colorado River. These rivers are home to endangered native fish species including the pikeminnow and razorback sucker. Two dams, Fontenelle and Flaming Gorge, impound the Green River headwaters. The respective reservoirs store water supplies as well as generate hydropower. Conversely, the Yampa River is considered unregulated and encompasses most of Dinosaur National Monument. Recreation is highly regarded on both rivers including fishing, whitewater rafting, and aesthetic values. Vast areas of irrigated agriculture, forestry, and mineral extraction also surround these rivers. To address this information need, we developed a prototype Environmental Resources Analysis System (ERAS) spreadsheet-based decision support system (DSS). ERAS provides access to historic data sets, scientific information, statistical analysis, model outputs, and comparative methods all in a familiar and user-friendly format. This research project demonstrates a simplified decision support system for use by a diverse mix of resource managers, special interest groups, and individuals concerned about the sustainability of the Green and Yampa River ecosystem.

  17. Cluster analysis in soft X-ray spectromicroscopy: finding the patterns in complex specimens

    International Nuclear Information System (INIS)

    Lerotic, M.; Jacobsen, C.

    2004-01-01

    Soft x-ray spectromicroscopy provides spectral data on the chemical speciation of light elements at sub-100 nanometer spatial resolution. When all chemical species in a specimen are known and separately characterized, existing approaches can be used to measure the concentration of each component at each pixel. In other situations such as in biology or environmental science, this approach may not be possible. A method to find natural groupings of data without prior knowledge of the spectra of all components will be presented. Principal component analysis is used to orthogonalize spectromicroscopy data, and discard much of the noise present in the data set. Then cluster analysis is used to find a hierarchical classification of pixels with similar spectra, to extract representative, cluster-averaged spectra with good signal-to-noise ratio, and to obtain gradations of concentration of these representative spectra at each pixel. The method is illustrated with a simulated data set of organic compounds, and a mixture of lutetium in hematite used to understand colloidal transport properties of radionuclides. We gratefully acknowledge funding from the National Institutes of Health under contract R01 EB00479-01A1, and from the National Science Foundation under contracts OCE-0221029 and CHE-0221934.

  18. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  19. Multidimensional cluster stability analysis from a Brazilian Bradyrhizobium sp. RFLP/PCR data set

    Science.gov (United States)

    Milagre, S. T.; Maciel, C. D.; Shinoda, A. A.; Hungria, M.; Almeida, J. R. B.

    2009-05-01

    The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of phenotypic and genotypic properties. This paper presents an application of a method aimed at identifying possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is performed by introducing perturbations through sub-sampling techniques. The method proved effective at grouping the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as were sub-clusters within each detected cluster.
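
    The combination of average-linkage clustering and sub-sampling perturbation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the 80 % sub-sample size, the number of rounds and the Jaccard score on co-clustered pairs are all choices made for this example, and the toy points stand in for RFLP-PCR profiles.

```python
import random
from itertools import combinations
from math import dist  # Python 3.8+


def average_linkage(points, k):
    """Merge the two clusters with the smallest mean pairwise distance until k remain."""
    clusters = [[i] for i in range(len(points))]

    def mean_distance(pair):
        a, b = pair
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=mean_distance)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


def co_membership(clusters):
    """Set of index pairs that end up in the same cluster."""
    return {frozenset(p) for c in clusters for p in combinations(c, 2)}


def subsample_stability(points, k, n_rounds=20, fraction=0.8, seed=0):
    """Mean Jaccard agreement between full-data and sub-sample co-memberships."""
    rng = random.Random(seed)
    full = co_membership(average_linkage(points, k))
    size = max(k, int(fraction * len(points)))
    scores = []
    for _ in range(n_rounds):
        keep = sorted(rng.sample(range(len(points)), size))
        kept = set(keep)
        sub = average_linkage([points[i] for i in keep], k)
        # Map positions in the sub-sample back to the original indices.
        sub_pairs = co_membership([[keep[i] for i in c] for c in sub])
        full_pairs = {p for p in full if p <= kept}
        union = sub_pairs | full_pairs
        scores.append(len(sub_pairs & full_pairs) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated groups of five points.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
group_b = [(100, 100), (100, 101), (101, 100), (101, 101), (100.5, 100.5)]
toy = group_a + group_b
print(subsample_stability(toy, k=2))
```

    A score close to 1 means that removing part of the data rarely changes which items are grouped together; lower scores flag clusters that depend on particular samples.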

  20. Performance Analysis of Cluster Formation in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Edgar Romo Montiel

    2017-12-01

    Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.

  1. Performance Analysis of Cluster Formation in Wireless Sensor Networks.

    Science.gov (United States)

    Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo

    2017-12-13

    Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
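
    To see why a fixed transmission probability can waste energy when the number of contending nodes varies, consider the following Python sketch. It assumes a simple slotted contention model that is not taken from the paper: each of n nodes transmits independently with probability p in a slot, and the slot succeeds only if exactly one node transmits. Under that assumption the success probability is n·p·(1 − p)^(n − 1), which is maximised at p = 1/n. The function names are hypothetical.

```python
def success_probability(n_nodes, p):
    """Probability that exactly one of n_nodes transmits in a slot (assumed model)."""
    return n_nodes * p * (1.0 - p) ** (n_nodes - 1)


def optimal_probability(n_nodes):
    """Maximiser of success_probability under the assumed model: p = 1/n."""
    return 1.0 / n_nodes


# Compare a fixed p = 0.1 with the per-n optimum as the contending population changes.
for n in (5, 10, 40):
    fixed = success_probability(n, 0.1)
    tuned = success_probability(n, optimal_probability(n))
    print(f"n={n:2d}  fixed p=0.1: {fixed:.3f}  p=1/n: {tuned:.3f}")
```

    The fixed value matches the optimum only when n happens to be 10; for other populations more slots are lost to idle time or collisions, which is the kind of overhead an adaptive strategy tries to avoid.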

  2. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    Science.gov (United States)

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  3. Cluster Computing For Real Time Seismic Array Analysis.

    Science.gov (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed of a dense distribution of seismic sensors that allows the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source to be measured. In recent years arrays have been widely used in different fields of seismological research. In particular, they are applied in the investigation of seismic sources on volcanoes, where they can be successfully used for studying volcanic microtremor and long-period events, which are critical for obtaining information on the evolution of volcanic systems. For this reason arrays could be usefully employed for volcano monitoring; however, the huge amount of data produced by this type of instrument and the quite time-consuming processing techniques have limited their potential for this application. In order to favor a direct application of array techniques to continuous volcano monitoring, we designed and built a small PC cluster able to compute in near real time the kinematic properties of the wavefield (slowness or wavenumber vector) produced by local seismic sources. The cluster is composed of 8 dual-processor Intel Pentium-III PCs working at 550 MHz and has 4 gigabytes of RAM. It runs the Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source implementation of the Message Passing Interface (MPI). The developed software system includes modules devoted to receiving data over the internet and graphical applications for the continuous display of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999, when two dense seismic arrays were deployed on the northeast and southeast flanks of the volcano. A real-time continuous acquisition system has been simulated by

  4. Phenotypic clustering: a novel method for microglial morphology analysis.

    Science.gov (United States)

    Verdonk, Franck; Roux, Pascal; Flamant, Patricia; Fiette, Laurence; Bozza, Fernando A; Simard, Sébastien; Lemaire, Marc; Plaud, Benoit; Shorte, Spencer L; Sharshar, Tarek; Chrétien, Fabrice; Danckaert, Anne

    2016-06-17

    Microglial cells are tissue-resident macrophages of the central nervous system. They are extremely dynamic, sensitive to their microenvironment and present a characteristic complex and heterogeneous morphology and distribution within the brain tissue. Many experimental clues highlight a strong link between their morphology and their function in response to aggression. However, due to their complex "dendritic-like" aspect that constitutes the major pool of murine microglial cells and their dense network, precise and powerful morphological studies are not easy to realize and complicate correlation with molecular or clinical parameters. Using the knock-in mouse model CX3CR1(GFP/+), we developed a 3D automated confocal tissue imaging system coupled with morphological modelling of many thousands of microglial cells revealing precise and quantitative assessment of major cell features: cell density, cell body area, cytoplasm area and number of primary, secondary and tertiary processes. We determined two morphological criteria that are the complexity index (CI) and the covered environment area (CEA) allowing an innovative approach lying in (i) an accurate and objective study of morphological changes in healthy or pathological condition, (ii) an in situ mapping of the microglial distribution in different neuroanatomical regions and (iii) a study of the clustering of numerous cells, allowing us to discriminate different sub-populations. Our results on more than 20,000 cells by condition confirm at baseline a regional heterogeneity of the microglial distribution and phenotype that persists after induction of neuroinflammation by systemic injection of lipopolysaccharide (LPS). Using clustering analysis, we highlight that, at resting state, microglial cells are distributed in four microglial sub-populations defined by their CI and CEA with a regional pattern and a specific behaviour after challenge. Our results counteract the classical view of a homogenous regional resting

  5. Global classification of human facial healthy skin using PLS discriminant analysis and clustering analysis.

    Science.gov (United States)

    Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J

    2001-04-01

    Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin using mathematical models. This classification is based on clinical and biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Squares (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. Thus, the 36 hypothetical combinations of the theoretical skin-type classification were reduced to a more robust proposal of six classes. Our data suggest that the association of PLS discriminant analysis and clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.

  6. Clustering analysis of malware behavior using Self Organizing Map

    DEFF Research Database (Denmark)

    Pirscoveanu, Radu-Stefan; Stevanovic, Matija; Pedersen, Jens Myrup

    2016-01-01

    For the time being, malware behavioral classification is performed by means of Anti-Virus (AV) generated labels. The paper investigates the inconsistencies associated with current practices by evaluating the identified differences between current vendors. In this paper we rely on Self Organizing Map, an unsupervised machine learning algorithm, for generating clusters that capture the similarities between malware behavior. A data set of approximately 270,000 samples was used to generate the behavioral profile of malicious types in order to compare the outcome of the proposed clustering ... accurate results based on the clusters created by competitive and cooperative algorithms like Self Organizing Map that better describe the behavioral profile of malware....

  7. Trace-element analysis in environmental sciences

    International Nuclear Information System (INIS)

    Valkovic, V.; Moschini, G.

    1988-01-01

    The use of charged-particle accelerators in trace-element analysis in the field of environmental sciences is described in this article. Nuclear reactions, charged-particle-induced X-ray emission as well as other nuclear and atomic processes can be used individually, or combined, in developing adequate analytical systems. In addition to concentration levels, concentration profiles can be measured, resulting in unique information. Some examples of experiments performed are described together with suggestions for future measurements.

  8. Nuclear techniques for analysis of environmental samples

    International Nuclear Information System (INIS)

    1986-12-01

    The main purposes of this meeting were to establish the state-of-the-art in the field, to identify new research and development that is required to provide an adequate framework for analysis of environmental samples and to assess needs and possibilities for international cooperation in problem areas. This technical report was prepared on the subject based on the contributions made by the participants. A separate abstract was prepared for each of the 9 papers

  9. Application and research of fuzzy clustering analysis algorithm under “micro-lecture” English teaching mode

    Directory of Open Access Journals (Sweden)

    Shi Ying

    2016-01-01

    The fuzzy clustering algorithm classifies data or indicators by degree of similarity, on the principle that individuals of the same type are more similar to one another than to individuals of other types. It establishes clear category boundaries, can form relationship clusters of any shape during the solving process, and accepts the research indicators in random order, so that the significance of the indicators can be analyzed accurately. The evaluation value of the clustering analysis is obtained by establishing a fuzzy factor set based on membership analysis, and the evaluation result can be analyzed with reference to the evaluation indicators of the fuzzy clustering analysis. The “micro-lecture” English teaching mode can be assessed, and the analysis indicators rationally established, on the basis of the fuzzy clustering analysis algorithm, which shows good applicability.

  10. Visual cluster analysis and pattern recognition template and methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    1999-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  11. Improving hierarchical clustering of genotypic data via principal component analysis

    NARCIS (Netherlands)

    Odong, T.L.; Heerwaarden, van J.; Hintum, van T.J.L.; Eeuwijk, van F.A.; Jansen, J.

    2013-01-01

    Understanding the genetic structure of germplasm collections is a prerequisite for effective and efficient use of crop genetic resources in genebanks. Currently, hierarchical clustering techniques are most popular for describing genetic structure in germplasm collections. Traditionally performed

  12. Visual cluster analysis and pattern recognition template and methods

    Energy Technology Data Exchange (ETDEWEB)

    Osbourn, G.C.; Martinez, R.F.

    1993-12-31

    This invention is comprised of a method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  13. Cluster decay analysis and related structure effects of fissionable ...

    Indian Academy of Sciences (India)

    2015-08-01

    Aug 1, 2015 ... Keywords: collective clusterization; deformations and orientations; fission; heavy and superheavy nuclei. ... Author affiliations: Manoj K Sharma (1), Gurvinder Kaur (1). (1) School of Physics and Materials Science, Thapar University, Patiala 147 004, India ...

  14. Coherent Energy and Environmental System Analysis

    DEFF Research Database (Denmark)

    Hvelplund, Frede; Mathiesen, Brian Vad; Østergaard, Poul Alberg

    This report presents a summary of results of the strategic research project “Coherent Energy and Environmental System Analysis” (CEESA), which was conducted in the period 2007-2011 and funded by the Danish Strategic Research Council together with the participating parties. The project was interdis... ... energy and environmental analysis tools as well as analyses of the design and implementation of future renewable energy systems. For practical reasons, the work has been carried out as an interaction between five work packages, and a number of reports, papers and tools have been reported separately from ... of the different project parts in a coherent way by presenting tools and methodologies as well as analyses of the design and implementation of renewable energy systems – including both energy and environmental aspects. The authors listed in the report represent those who have contributed directly as well...

  15. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

    Science.gov (United States)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

    2018-04-01

    The gas cluster ion beam (GCIB) technique was used to smooth the surface of silicon carbide crystals. The effects of processing with cluster ions of two inert gases, argon and xenon, were quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters have not yet been reported. Scanning probe microscopy and high-resolution transmission electron microscopy were used to analyse the surface roughness and the quality of the surface crystal layer. Gas cluster ion beam processing smooths the surface relief down to an average roughness of about 1 nm for both elements. Xenon was shown to be the more effective working gas: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High-resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters, respectively.

  16. Approaches to natural resource inventory and analysis on the Oak Ridge Environmental Research Park

    Energy Technology Data Exchange (ETDEWEB)

    Kitchings, J. T.; Mann, L. K.; Joslin, D. J.; Bunnell, R. C.

    1977-01-01

    The principal effort of the Department of Energy's Environmental Research Park program on the Oak Ridge Reservation is directed at identification and preservation of a diverse assortment of natural communities representative of the Appalachian region of East Tennessee. Designation of natural areas provides a degree of protection for unique plant and animal species. Concomitantly, establishment of research reference areas provides sites which will be used to evaluate changes brought about in similar natural communities as a result of activities related to energy-producing technologies. Agglomerative cluster analysis of 184 continuous forest inventory (CFI) plots on the Reservation was used initially to define forest types objectively. The types identified by cluster analysis thus formed a basis for determining what forest elements were present and which were representative of the Appalachian region. Subsequently, cluster analysis was similarly used within these research areas to define the overstory, understory, and shrub structure of the particular forest community.

  17. Cluster analysis of fruit and vegetable-related perceptions: an alternative approach of consumer segmentation.

    Science.gov (United States)

    Simunaniemi, A-M; Nydahl, M; Andersson, A

    2013-02-01

    Audience segmentation optimises health communication aimed to promote healthy dietary habits, such as fruit and vegetable (F&V) consumption. The present study aimed to segment respondents into clusters based on F&V-related perceptions, and to describe these clusters with respect to F&V consumption and sex. The cross-sectional study was conducted using a semi-structured questionnaire. The respondents were randomly selected among Swedish adults (n = 1304; response rate 51%; 56% women). A two-step cluster analysis was conducted followed by a binary logistic regression with cluster membership as a dependent variable. The clusters were compared using t-tests and chi-squared tests. P vegetables (both sexes) and fruit (women only), whereas men in the Indifferent cluster (n = 715) consumed more juice. Indifferent cluster reported more F&V consumption preventing factors, such as storage and preparation difficulties and low satisfaction with F&V selection and price. Not liking or not having a habit of F&V consumption, laziness, forgetting and a lack of time were mentioned as main barriers to F&V consumption. The Indifferent cluster reports more practical and life-style related difficulties. The Positive cluster consumes more vegetables, perceives fewer F&V-related difficulties, and looks for more dietary information. The findings confirm that cluster analysis is an appropriate way of identifying consumer subgroups for targeted health and nutrition communication. © 2012 The Authors. Journal of Human Nutrition and Dietetics © 2012 The British Dietetic Association Ltd.

  18. Global myeloma research clusters, output, and citations: a bibliometric mapping and clustering analysis.

    Directory of Open Access Journals (Sweden)

    Jens Peter Andersen

    International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases. We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005-2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90%) of publications was from the US and EU over the study period. The output and impact in terms of citations identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013. Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care.

  19. Topic modeling for cluster analysis of large biological and medical datasets.

    Science.gov (United States)

    Zhao, Weizhong; Zou, Wen; Chen, James J

    2014-01-01

    The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracy and effectiveness of traditional clustering methods diminish for large and hyper-dimensional datasets. Topic modeling is an active research field in machine learning and has mainly been used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or for overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods (highest probable topic assignment, feature selection and feature extraction) are proposed and tested on the cluster analysis of three large datasets: a Salmonella pulsed-field gel electrophoresis (PFGE) dataset, a lung cancer dataset, and a breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting
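
    The first of the three proposed methods, highest probable topic assignment, can be read as giving each sample the cluster label of its most probable topic. The Python sketch below illustrates that reading only; it assumes a sample-by-topic probability matrix has already been produced by some fitted topic model, and the function name and the numbers are hypothetical rather than taken from the study.

```python
def assign_by_highest_topic(sample_topic_probs):
    """Label each sample with the index of its most probable topic.

    Ties are resolved in favour of the lowest topic index.
    """
    return [
        max(range(len(row)), key=row.__getitem__)
        for row in sample_topic_probs
    ]


# Hypothetical output of a fitted topic model: 4 samples x 3 topics.
sample_topic_probs = [
    [0.70, 0.20, 0.10],
    [0.15, 0.80, 0.05],
    [0.10, 0.30, 0.60],
    [0.55, 0.40, 0.05],
]
cluster_labels = assign_by_highest_topic(sample_topic_probs)
print(cluster_labels)  # [0, 1, 2, 0]
```

    With this approach the number of clusters is fixed by the number of topics, so choosing the topic count also chooses the granularity of the clustering.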

  20. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    Science.gov (United States)

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  1. Method for exploratory cluster analysis and visualisation of single-trial ERP ensembles.

    Science.gov (United States)

    Williams, N J; Nasuto, S J; Saddy, J D

    2015-07-30

    The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). After validating the pipeline on simulated data, we tested it on data from two experiments: a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership. Our analysis operates on denoised single trials, the number of clusters is determined in a principled manner and the results are presented through an intuitive visualisation. Given the cluster structure in some experimental conditions, we suggest application of cluster analysis as a preliminary step before ensemble averaging. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Environmental systems analysis of wastewater management

    International Nuclear Information System (INIS)

    Kaerrman, Erik

    2000-01-01

    The history of wastewater management tells us that efforts have been made at solving only one problem at a time: sanitation during the first half of the 20th century, followed by eutrophication of lakes and seas and, for the past ten years, recycling of nutrients. After the 'Brundtland Report' of 1987, a reversal of the debate occurred in which water management was discussed in a more holistic manner than before. The concept of sustainable development became widely accepted and was put into practice. This thesis suggests a framework for evaluating the sustainability of wastewater systems, which contains the use of criteria and system analytical evaluation methods matching each criterion. The main categories of criteria are identified as: Health and Hygiene, Social and Cultural, Environmental, Economic, and Functional and Technical. The usability of different concepts of Environmental Systems Analysis for evaluating environmental criteria of wastewater systems is also investigated. These studies show that a substance-flow model combined with evaluation methods from Life Cycle Assessment (LCA), sometimes complemented with Exergy Analysis or Analysis of Primary Energy, is a beneficial approach for evaluating environmental impacts and the usage of resources. The substance-flow model ORWARE (ORganic WAste REsearch) combined with LCA was used to compare four system structures for the management of household wastewater and solid organic waste, namely Conventional System, Irrigation of Energy Forests, Liquid Composting and Urine Separation. This study shows a potential for further development of the three alternative systems. The comparative study also included some development of system analytical methods. This thesis shows how the contribution from oxidation of ammonia should be included in the eutrophication impact category. Furthermore, a method is given for prioritization of the most relevant impacts from wastewater management by using normalisation of these impacts in

  3. The young star cluster population of M51 with LEGUS - II. Testing environmental dependencies

    Science.gov (United States)

    Messa, Matteo; Adamo, A.; Calzetti, D.; Reina-Campos, M.; Colombo, D.; Schinnerer, E.; Chandar, R.; Dale, D. A.; Gouliermis, D. A.; Grasha, K.; Grebel, E. K.; Elmegreen, B. G.; Fumagalli, M.; Johnson, K. E.; Kruijssen, J. M. D.; Östlin, G.; Shabani, F.; Smith, L. J.; Whitmore, B. C.

    2018-03-01

    It has recently been established that the properties of young star clusters (YSCs) can vary as a function of the galactic environment in which they are found. We use the cluster catalogue produced by the Legacy Extragalactic UV Survey (LEGUS) collaboration to investigate cluster properties in the spiral galaxy M51. We analyse the cluster population as a function of galactocentric distance and in arm and inter-arm regions. The cluster mass function exhibits a similar shape at all radial bins, described by a power law with a slope close to -2 and an exponential truncation around 10^5 M⊙. While the mass functions of the YSCs in the spiral arm and inter-arm regions have similar truncation masses, the inter-arm region mass function has a significantly steeper slope than the one in the arm region; a trend that is also observed in the giant molecular cloud mass function and predicted by simulations. The age distribution of clusters is dependent on the region considered, and is consistent with rapid disruption only in dense regions, while little disruption is observed at large galactocentric distances and in the inter-arm region. The fraction of stars forming in clusters does not show radial variations, despite the drop in the H2 surface density measured as a function of galactocentric distance. We suggest that the higher disruption rate observed in the inner part of the galaxy is likely at the origin of the observed flat cluster formation efficiency radial profile.

  4. XMM-Newton view of X-ray overdensities from nearby galaxy clusters : the environmental dependencies

    NARCIS (Netherlands)

    Caglar, T.; Hudaverdi, M.

    2017-01-01

    In this work, we studied ten nearby (z≤0.038) galaxy clusters to understand possible interactions between hot plasma and member galaxies. A multi-band source detection was applied to detect point-like structures within the intra-cluster medium. We examined spectral properties of a total of 391 X-ray

  5. Methodology of comparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    The article investigates the possibilities of applying multidimensional statistical analysis to the study of industrial production by comparing its growth rates and structure with those of other developed and developing countries. The purpose of the article is to determine the optimal set of statistical methods, and the results of their application to industrial production data, that best support analysis of the outcome. The data include such indicators as output, gross value added, the number of employed and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industries of the countries of the Customs Union, the United States, Japan and Europe in 2005-2015. The research tools range from the simplest methods of transformation and graphical and tabular visualization of data to methods of statistical analysis. In particular, the principal components method, discriminant analysis, hierarchical methods of cluster analysis, Ward's method and k-means were applied using a specialized software package (SPSS). Applying the method of principal components to the initial data makes it possible to reduce the initial space of industrial production data substantially and effectively. For example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing), high-tech industries and consumer goods (medium-technology sectors). At the same time, comparison of the results of applying cluster analysis to the initial data and to data obtained by the principal components method established that clustering industrial production data on the basis of the new factors significantly improves the results of clustering. As a result of analyzing the parameters of

  6. Analysis of DOE international environmental management activities

    Energy Technology Data Exchange (ETDEWEB)

    Ragaini, R.C.

    1995-09-01

    The Department of Energy's (DOE) Strategic Plan (April 1994) states that DOE's long-term vision includes world leadership in environmental restoration and waste management activities. The activities of the DOE Office of Environmental Management (EM) can play a key role in DOE's goals of maintaining U.S. global competitiveness and ensuring the continuation of a world class science and technology community. DOE's interest in attaining these goals stems partly from its participation in organizations like the Trade Policy Coordinating Committee (TPCC), with its National Environmental Export Promotion Strategy, which seeks to strengthen U.S. competitiveness and the building of public-private partnerships as part of U.S. industrial policy. The International Interactions Field Office task will build a communication network which will facilitate the efficient and effective communication between DOE Headquarters, Field Offices, and contractors. Under this network, Headquarters will provide the Field Offices with information on the Administration's policies and activities (such as the DOE Strategic Plan), interagency activities, as well as relevant information from other field offices. Lawrence Livermore National Laboratory (LLNL) will, in turn, provide Headquarters with information on various international activities which, when appropriate, will be included in reports to groups like the TPCC and the EM Focus Areas. This task provides for the collection, review, and analysis of information on the more significant international environmental restoration and waste management initiatives and activities which have been used or are being considered at LLNL. Information gathering will focus on efforts and accomplishments in meeting the challenges of providing timely and cost effective cleanup of its environmentally damaged sites and facilities, especially through international technical exchanges and/or the implementation of foreign-development technologies.

  7. Coherent Energy and Environmental System Analysis

    DEFF Research Database (Denmark)

    Hvelplund, Frede Kloster; Mathiesen, Brian Vad; Østergaard, Poul Alberg

    This report presents a summary of results of the strategic research project “Coherent Energy and Environmental System Analysis” (CEESA) which was conducted in the period 2007-2011 and funded by the Danish Strategic Research Council together with the participating parties. The project...... energy and environmental analysis tools as well as analyses of the design and implementation of future renewable energy systems. For practical reasons, the work has been carried out as an interaction between five work packages, and a number of reports, papers and tools have been reported separately from...... each part of the project. A list of the separate work package reports is given at the end of this foreword while a complete list of all papers and reports can be found at the end of the report as well as at the following website: www.ceesa.dk. This report provides a summary of the results...

  8. Periorbital melasma: Hierarchical cluster analysis of clinical features in Asian patients.

    Science.gov (United States)

    Jung, Y S; Bae, J M; Kim, B J; Kang, J-S; Cho, S B

    2017-11-01

    Studies have shown melasma lesions to be distributed across the face in centrofacial, malar, and mandibular patterns. Meanwhile, however, melasma lesions of the periorbital area have yet to be thoroughly described. We analyzed normal and ultraviolet light-exposed photographs of patients with melasma. The periorbital melasma lesions were measured according to anatomical reference points and a hierarchical cluster analysis was performed. The periorbital melasma lesions showed clinical features of fine and homogenous melasma pigmentation, involving both the upper and lower eyelids that extended to other anatomical sites with a darker and coarser appearance. The hierarchical cluster analysis indicated that patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. Significant differences between cluster 1 and cluster 2 were found in lateral distance and inferolateral distance, but not in medial distance and superior distance. Comparing the two clusters, patients in cluster 2 were found to be significantly older and more commonly accompanied by melasma lesions of the temple and medial cheek. Our hierarchical cluster analysis of periorbital melasma lesions demonstrated that Asian patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions.

    Science.gov (United States)

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey

    2017-09-01

    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and onetime hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians
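    The scree-plot step described in this abstract can be illustrated with a minimal sketch. It assumes plain k-means on two variables per child (asthma-related ED visits and hospitalizations); the abstract does not name the clustering algorithm, and the data points, function names and farthest-point initialization below are hypothetical choices made only for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point that is farthest from its nearest existing center.
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if empty).
        centers = [
            tuple(sum(v) / len(g) for v in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def within_ss(points, centers):
    """Within-cluster sum of squares: the quantity plotted on a scree plot."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Hypothetical (ED visits, hospitalizations) pairs, not taken from the study.
points = [(1, 0), (2, 0), (1, 0), (8, 0), (9, 1), (8, 1), (3, 4), (4, 5), (3, 5)]
scree = [within_ss(points, kmeans(points, k)) for k in range(1, 6)]
for k, value in enumerate(scree, start=1):
    print(k, round(value, 2))
```

    In this toy data the heterogeneity drops sharply up to three clusters and then flattens, which is the kind of "elbow" a scree plot is read for.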

  10. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ

    2011-06-01

    Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with the K-means algorithm was applied in order to investigate whether the assignment of compounds was proper or not. The Euclidean distance and maximization of the initial distance using a cross-validation with a v-fold of 10 were applied. Results: All five variables included in the analysis proved to have a statistically significant contribution to the identification of clusters. Three clusters were identified, each of them containing carboquinone derivatives belonging to both the training and the test sets. The observed activity of carboquinone derivatives proved to be normally distributed in every cluster. The presence of training and test sets in all clusters identified using generalized cluster analysis with the K-means algorithm and the distribution of observed activity within clusters support a proper assignment of compounds in training and test sets. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method for assessing the random assignment of carboquinone derivatives in training and test sets.
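    The check reported in the Results (every cluster contains compounds from both the training and the test set) can be sketched as follows. This is only an illustration under stated assumptions: cluster labels are taken as already computed, and the label values and function names are hypothetical, not from the article.

```python
from collections import Counter


def split_mix_by_cluster(cluster_ids, split_labels):
    """Count training and test members inside each cluster."""
    mix = {}
    for cluster, split in zip(cluster_ids, split_labels):
        mix.setdefault(cluster, Counter())[split] += 1
    return mix


def assignment_looks_mixed(mix, splits=("train", "test")):
    """True when every cluster holds at least one member of each split."""
    return all(all(counts[s] > 0 for s in splits) for counts in mix.values())


# Hypothetical cluster labels and split labels for nine compounds.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
split_labels = ["train", "train", "test", "train", "test",
                "train", "test", "train", "train"]
mix = split_mix_by_cluster(cluster_ids, split_labels)
mixed = assignment_looks_mixed(mix)
print(mix, mixed)
```

    A cluster made up of training compounds only (or test compounds only) would make the check fail, hinting that the random split did not cover that region of descriptor space.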

  11. Analysis of Factors and Development Potential of Economic Clusters by Economic Activities in Mari El Republic

    Directory of Open Access Journals (Sweden)

    Viktor Aleksandrovich Golovin

    2017-12-01

    This article analyzes the factors that drive the development of economic clusters in the Mari El Republic (Russia). The analysis reveals the potential for further development of those clusters. The author considers the shift-share method one of the major methods for identifying the factors that determine the expansion of economic clusters. The author proposes a modification of the shift-share method that uses relative performance indicators to evaluate the intensity and quality of clustering processes in the region. The article presents the results of empirical research on the economy of the Mari El Republic by the shift-share method (2005–2015) in the context of economic activities as classified by the Federal State Statistics Service. After the analysis of three basic indicators, the leading and lagging economic activities were identified for the 10-year period. Special attention is paid to analyzing the clustering potential of the Mari El Republic in the context of economic activities based on the Clustering Potential Index. This analysis shows promising economic activities and industries that may form clusters. The author discusses the compliance and possible conflicts of the two methods used in the study. Further research in this field can focus on system analysis and on identifying specific companies and production chains that form the basis of clustering
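    For orientation, the classical shift-share decomposition can be sketched as below. This is the textbook form, not the author's modified version with relative performance indicators, which the abstract does not specify; the sector names and employment figures are hypothetical.

```python
def shift_share(region_start, region_end, nation_start, nation_end):
    """Split each sector's regional change into three classical components."""
    # Overall national growth rate across all sectors.
    total_growth = sum(nation_end.values()) / sum(nation_start.values()) - 1.0
    result = {}
    for sector, base in region_start.items():
        sector_growth = nation_end[sector] / nation_start[sector] - 1.0
        regional_growth = region_end[sector] / base - 1.0
        result[sector] = {
            # Change expected if the sector grew at the national overall rate.
            "national_share": base * total_growth,
            # Extra change because the sector grows faster or slower nationally.
            "industry_mix": base * (sector_growth - total_growth),
            # Change attributable to the region outperforming the sector.
            "regional_shift": base * (regional_growth - sector_growth),
        }
    return result


# Hypothetical employment figures (thousands), not from the article.
region_start = {"manufacturing": 100.0, "agriculture": 50.0}
region_end = {"manufacturing": 120.0, "agriculture": 45.0}
nation_start = {"manufacturing": 1000.0, "agriculture": 500.0}
nation_end = {"manufacturing": 1100.0, "agriculture": 450.0}

parts = shift_share(region_start, region_end, nation_start, nation_end)
for sector, components in parts.items():
    print(sector, {name: round(v, 2) for name, v in components.items()})
```

    The three components of a sector always add up to its actual regional change, so a positive regional shift marks an activity that outperforms its national counterpart.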

  12. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J

    2010-03-01

    Background: Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods: We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results: We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: (1) post-bronchodilator FEV1 percent predicted, (2) percent bronchodilator responsiveness, and quantitative CT measurements of (3) apical emphysema and (4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: (1) emphysema predominant; (2) bronchodilator responsive, with higher FEV1; (3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and (4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema predominant) was associated with TGFB1 SNP rs1800470. Conclusions: Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  13. Cluster Analysis of Customer Reviews Extracted from Web Pages

    Directory of Open Access Journals (Sweden)

    S. Shivashankar

    2010-01-01

    As e-commerce gains popularity day by day, the web has become an excellent source from which market researchers gather customer reviews and opinions. The number of customer reviews that a product receives is growing at a very fast rate (it could be in the hundreds or thousands). Customer reviews posted on websites vary greatly in quality. A potential customer has to read all the reviews, irrespective of their quality, to decide whether to purchase the product or not. In this paper, we attempt to assess a review based on its quality, to help the customer make a proper buying decision. The quality of a customer review is assessed as most significant, more significant, significant or insignificant. A novel and effective web mining technique is proposed for assessing a customer review of a particular product based on feature clustering techniques, namely the k-means method and the fuzzy c-means method. This is performed in three steps: (1) identify review regions and extract reviews from them, (2) extract and cluster the features of reviews by a clustering technique and then assign weights to the features belonging to each of the clusters (groups), and (3) assess the review by considering the feature weights and group belongingness. The k-means and fuzzy c-means clustering techniques are implemented and tested on customer reviews extracted from web pages. The performance of these techniques is analyzed.

  14. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles

  15. Post-construction environmental and social impact analysis (ESIA ...

    African Journals Online (AJOL)

    harcourt – Eungu highway in Southeastern Nigeria. The impact study focused on the Environmental and Social Impact Analysis (ESIA) because of the social and environmental impacts of road projects. The basic tool used in this analysis is ...

  16. The diamond model analysis of ICT cluster in Thailand

    Directory of Open Access Journals (Sweden)

    Danuvasin Charoen, Ph.D.

    2013-07-01

    Information and Communication Technology (ICT) has become an integral part of national competitiveness. Thailand was ranked 38th (out of 134 countries) in the global competitiveness report conducted by the World Economic Forum. It was also ranked well below the world average on all of the factors related to technology, despite the fact that information technology and telecommunications had been a major factor driving the competitiveness of the country. The main purpose of this study is to investigate the various issues related to the ICT cluster in Thailand. The diamond model was used to analyze the ICT cluster in Thailand. The results from this study can be used to guide policy to enhance the competitiveness of the ICT cluster.

  17. Analysis of protein profiles using fuzzy clustering methods

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Ukendt, Sujatha; Rai, Lavanya

    clustering methods for their classification followed by various validation measures. The clustering algorithms used for the study were K-means, K-medoid, Fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The results presented in this study conclude that the protein profiles of tissue...... samples recorded by using the HPLC-LIF system and the data analyzed by clustering algorithms quite successfully classifies them as belonging from normal and malignant conditions....

  18. Functional clustering algorithm for the analysis of dynamic network data

    Science.gov (United States)

    Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.

    2009-05-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  19. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    Science.gov (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  20. Hierarchical cluster analysis of ignitable liquids based on the total ion spectrum.

    Science.gov (United States)

    Waddell, Erin E; Frisch-Daiello, Jessica L; Williams, Mary R; Sigman, Michael E

    2014-09-01

    Gas chromatography-mass spectrometry (GC-MS) data of ignitable liquids in the Ignitable Liquids Reference Collection (ILRC) database were processed to obtain 445 total ion spectra (TIS), that is, average mass spectra across the chromatographic profile. Hierarchical cluster analysis, an unsupervised learning technique, was applied to find features useful for classification of ignitable liquids. A combination of the correlation distance and average linkage was utilized for grouping ignitable liquids with similar chemical composition. This study evaluated whether hierarchical cluster analysis of the TIS would cluster together ignitable liquids of the same ASTM class assignment, as designated in the ILRC database. The ignitable liquids clustered based on their chemical composition, and the ignitable liquids within each cluster were predominantly from one ASTM E1618-11 class. These results reinforce use of the TIS as a tool to aid in forensic fire debris analysis. © 2014 American Academy of Forensic Sciences.
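    The combination of correlation distance and average linkage mentioned in this abstract can be sketched in a few lines. The example below is a minimal pure-Python illustration on hypothetical toy spectra; it is not the authors' implementation, and the function names and intensity values are assumptions made for the sketch.

```python
from itertools import combinations
from math import sqrt


def correlation_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length spectra."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)


def average_linkage(spectra, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(spectra))]
    # Pairwise distances between individual spectra, keyed by (low, high) index.
    dist = {
        (i, j): correlation_distance(spectra[i], spectra[j])
        for i, j in combinations(range(len(spectra)), 2)
    }

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(sorted(merged))
    return sorted(clusters)


# Hypothetical toy "total ion spectra": rows are samples, columns are m/z bins.
toy_spectra = [
    [10.0, 1.0, 0.5, 0.2],
    [9.0, 1.2, 0.4, 0.3],
    [0.3, 0.5, 8.0, 9.5],
    [0.2, 0.4, 7.5, 10.0],
]
groups = average_linkage(toy_spectra, n_clusters=2)
print(groups)
```

    Correlation distance compares the shape of two spectra rather than their absolute intensity, which is one plausible reason to pair it with averaged, profile-wide spectra such as the TIS.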

  1. Cluster analysis in kinetic modelling of the brain: A noninvasive alternative to arterial sampling

    DEFF Research Database (Denmark)

    Liptrot, Matthew George; Adams, K.H.; Martiny, L.

    2004-01-01

    extracted from the PET data set. Hierarchical K-means cluster analysis was performed on the PET time series to extract a cerebral vasculature ROI. The number of clusters was varied from K = 1 to 10 for the second of the two-stage method. Determination of the correct number of clusters was performed...... blood sampling, the Simplified Reference Tissue Model (SRTM) and Logan analysis with cerebellar TAC as an input. There was a good agreement (P K-means-clustered input function and those from the arterial blood samples. This work......) extracted directly from dynamic positron emission tomography (PET) scans by cluster analysis. Five healthy subjects were injected with the 5HT2A- receptor ligand [18F]-altanserin and blood samples were subsequently taken from the radial artery and cubital vein. Eight regions-of-interest (ROI) TACs were...

  2. Analysis of protein profiles using fuzzy clustering methods

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Ukendt, Sujatha; Rai, Lavanya

    clustering methods for their classification followed by various validation measures. The clustering algorithms used for the study were K-means, K-medoid, Fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The results presented in this study conclude that the protein profiles of tissue...

  3. Analysis of the Advantages of Creating Border Clusters

    Directory of Open Access Journals (Sweden)

    Liudmila Rosca-Sadurschi

    2015-08-01

    In a changing environment marked by rapid globalization, the competitiveness of a country or region depends increasingly on effective innovation. The main challenge for research and innovation is to facilitate the networking of companies and research laboratories. These networks can take the form of a highly integrated cross-border economic group, but may also consist of actions to facilitate linkages between businesses and between laboratories, or of cross-border clusters. The creation of these clusters requires several conditions to be met but brings significant benefits to all stakeholders.

  4. Dynamic analysis of clustered building structures using substructures methods

    International Nuclear Information System (INIS)

    Leimbach, K.R.; Krutzik, N.J.

    1989-01-01

    The dynamic substructure approach to the building cluster on a common base mat starts with the generation of Ritz-vectors for each building on a rigid foundation. The base mat plus the foundation soil is subjected to kinematic constraint modes, for example constant, linear, quadratic or cubic constraints. These constraint modes are also imposed on the buildings. By enforcing kinematic compatibility of the complete structural system on the basis of the constraint modes a reduced Ritz model of the complete cluster is obtained. This reduced model can now be analyzed by modal time history or response spectrum methods

  5. Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics

    Science.gov (United States)

    Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.

    2007-01-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…

  6. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  7. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  8. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    Science.gov (United States)

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on an e-learning system. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  9. Distinct Phenotypes of Smokers with Fixed Airflow Limitation Identified by Cluster Analysis of Severe Asthma.

    Science.gov (United States)

    Konno, Satoshi; Taniguchi, Natsuko; Makita, Hironi; Nakamaru, Yuji; Shimizu, Kaoruko; Shijubo, Noriharu; Fuke, Satoshi; Takeyabu, Kimihiro; Oguri, Mitsuru; Kimura, Hirokazu; Maeda, Yukiko; Suzuki, Masaru; Nagai, Katsura; Ito, Yoichi M; Wenzel, Sally E; Nishimura, Masaharu

    2018-01-01

    Smoking may have multifactorial effects on asthma phenotypes, particularly in severe asthma. Cluster analysis has been applied to explore novel phenotypes, which are not based on any a priori hypotheses. To explore novel severe asthma phenotypes by cluster analysis when including smoking patients with asthma. We recruited a total of 127 subjects with severe asthma, including 59 current or ex-smokers, from our university hospital and its 29 affiliated hospitals/pulmonary clinics. Clinical variables obtained during a 2-day hospital stay were used for cluster analysis. After clustering using clinical variables, the sputum levels of 14 molecules were measured to biologically characterize the clinical clusters. Five clinical clusters, including two characterized by low forced expiratory volume in 1 second/forced vital capacity, were identified. When characteristics of smoking subjects in these two clusters were compared, there were marked differences between the two groups: one had high levels of circulating eosinophils, high immunoglobulin E levels, and a high sinus score, and the other was characterized by low levels of the same parameters. Sputum analysis revealed intriguing differences of cytokine/chemokine pattern in these two groups. The other three clusters were similar to those previously reported: young onset/atopic, nonsmoker/less eosinophilic, and female/obese. Key clinical variables were confirmed to be stable and consistent 3 years later. This study reveals two distinct phenotypes with potentially different biological pathways contributing to fixed airflow limitation in cigarette smokers with severe asthma.

  10. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.

    Science.gov (United States)

    Conley, Samantha

    2017-12-01

    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  11. Learning from environmental data: Methods for analysis of forest nutrition time series

    Energy Technology Data Exchange (ETDEWEB)

    Sulkava, M. (Helsinki Univ. of Technology, Espoo (Finland). Computer and Information Science)

    2008-07-01

    Data analysis methods play an important role in increasing our knowledge of the environment as the amount of data measured from the environment increases. This thesis fits under the scope of environmental informatics and environmental statistics. These are fields in which data analysis methods are developed and applied to the analysis of environmental data. The environmental data studied in this thesis are time series of nutrient concentration measurements of pine and spruce needles. In addition, there are data on laboratory quality and related environmental factors, such as the weather and atmospheric depositions. The most important methods used for the analysis of the data are based on the self-organizing map and linear regression models. First, a new clustering algorithm for the self-organizing map is proposed. It is found to provide better results than two other methods for clustering the self-organizing map. The algorithm is used to divide the nutrient concentration data into clusters, and the result is evaluated by environmental scientists. Based on the clustering, the temporal development of the forest nutrition is modeled and the effect of nitrogen and sulfur deposition on the foliar mineral composition is assessed. Second, regression models are used for studying how much environmental factors and properties of the needles affect the changes in the nutrient concentrations of the needles between their first and second year of existence. The aim is to build understandable models with good prediction capabilities. Sparse regression models are found to outperform more traditional regression models in this task. Third, fusion of laboratory quality data from different sources is performed to estimate the precisions of the analytical methods. Weighted regression models are used to quantify how much the precision of observations can affect the time needed to detect a trend in environmental time series. The results of power analysis show that improving the

  12. FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data.

    Science.gov (United States)

    Fu, Limin; Medico, Enzo

    2007-01-04

    Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. To this aim, existing clustering approaches, mainly developed in computer science, have been adapted to microarray data analysis. However, previous studies revealed that microarray datasets have very diverse structures, some of which may not be correctly captured by current clustering methods. We therefore approached the problem from a new starting point, and developed a clustering algorithm designed to capture dataset-specific structures at the beginning of the process. The clustering algorithm is named Fuzzy clustering by Local Approximation of MEmbership (FLAME). Distinctive elements of FLAME are: (i) definition of the neighborhood of each object (gene or sample) and identification of objects with "archetypal" features named Cluster Supporting Objects, around which to construct the clusters; (ii) assignment to each object of a fuzzy membership vector approximated from the memberships of its neighboring objects, by an iterative converging process in which membership spreads from the Cluster Supporting Objects through their neighbors. Comparative analysis with K-means, hierarchical, fuzzy C-means and fuzzy self-organizing maps (SOM) showed that data partitions generated by FLAME are not superimposable to those of other methods and, although different types of datasets are better partitioned by different algorithms, FLAME displays the best overall performance. FLAME is implemented, together with all the above-mentioned algorithms, in a C++ software with graphical interface for Linux and Windows, capable of handling very large datasets, named Gene Expression Data Analysis Studio (GEDAS), freely available under GNU General Public License. The FLAME algorithm has intrinsic advantages, such as the ability to capture non-linear relationships and non-globular clusters, the automated definition of the number of clusters, and the

  13. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  14. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability.

    Science.gov (United States)

    Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R

    2016-11-01

    The aim was to empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment, and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P ... Insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. © 2016 Associated Professional Sleep Societies, LLC.

  15. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li

    2017-08-01

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance. Data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is more complex than traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with a cumulative contribution rate above 95% are extracted by PCA, and the number of centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our

  16. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis.

    Science.gov (United States)

    Li, Huanhuan; Liu, Jingxian; Liu, Ryan Wen; Xiong, Naixue; Wu, Kefeng; Kim, Tai-Hoon

    2017-08-04

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance. Data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is more complex than traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with a cumulative contribution rate above 95% are extracted by PCA, and the number of centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with
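
    The first two steps described in this abstract (DTW distances between trajectories, collected into a distance matrix) can be illustrated with a minimal sketch. It assumes each trajectory is a list of (x, y) positions and uses the textbook DTW recurrence with Euclidean point distance; the function names are hypothetical, and the authors' PCA step and center-selection algorithm are not reproduced here.

```python
from math import hypot


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two (x, y) sequences.

    Illustrative only: the point-to-point cost is assumed to be Euclidean.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances (zero diagonal)."""
    k = len(trajectories)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return dist
```

    Because DTW allows one point to be matched to several, two trajectories that follow the same path at different sampling rates receive a small distance, which is why it suits trajectories of unequal length.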

  17. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    Science.gov (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  18. Arguments for a Cluster Analysis of Nasal Consonant Sequences of ...

    African Journals Online (AJOL)

    Bantu language scholars have, among other things, debated whether nasal and consonant sequences (NC sequences) in various Bantu languages should be considered as clusters or single segments (prenasalised stops). This paper examines these sequences as they occur in Sukwa nouns. Sukwa is a ...

  19. Comparing clustering and pre-processing in taxonomy analysis

    NARCIS (Netherlands)

    Bonder, M.J.; Abeln, S.; Zaura, E.; Brandt, B.W.

    2012-01-01

    Motivation: Massively parallel sequencing allows for rapid sequencing of large numbers of sequences in just a single run. Thus, 16S ribosomal RNA (rRNA) amplicon sequencing of complex microbial communities has become possible. The sequenced 16S rRNA fragments (reads) are clustered into operational

  20. Environmental analysis for pipeline gas demonstration plants

    Energy Technology Data Exchange (ETDEWEB)

    Stinton, L.H.

    1978-09-01

    The Department of Energy (DOE) has implemented programs for encouraging the development and commercialization of coal-related technologies, which include coal gasification demonstration-scale activities. In support of commercialization activities the Environmental Analysis for Pipeline Gas Demonstration Plants has been prepared as a reference document to be used in evaluating potential environmental and socioeconomic effects from construction and operation of site- and process-specific projects. Effluents and associated impacts are identified for six coal gasification processes at three contrasting settings. In general, impacts from construction of a high-Btu gas demonstration plant are similar to those caused by the construction of any chemical plant of similar size. The operation of a high-Btu gas demonstration plant, however, has several unique aspects that differentiate it from other chemical plants. Offsite development (surface mining) and disposal of large quantities of waste solids constitute important sources of potential impact. In addition, air emissions require monitoring for trace metals, polycyclic aromatic hydrocarbons, phenols, and other emissions. Potential biological impacts from long-term exposure to these emissions are unknown, and additional research and data analysis may be necessary to determine such effects. Possible effects of pollutants on vegetation and human populations are discussed. The occurrence of chemical contaminants in liquid effluents and the bioaccumulation of these contaminants in aquatic organisms may lead to adverse ecological impact. Socioeconomic impacts are similar to those from a chemical plant of equivalent size and are summarized and contrasted for the three surrogate sites.

  1. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    Science.gov (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate the limitations for accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  2. Transient identification by clustering based on Integrated Deterministic and Probabilistic Safety Analysis outcomes

    International Nuclear Information System (INIS)

    Di Maio, Francesco; Vagnoli, Matteo; Zio, Enrico

    2016-01-01

    Highlights: • We develop an Integrated Deterministic and Probabilistic Safety Analysis (IDPSA). • We present a transient identification approach for retrieving IDPSA scenarios information. • We post-process the IDPSA scenarios for clustering Prime Implicants and Near Misses. • The approach is useful for an on-line cluster assignment of an unknown developing scenario. • We apply the approach to the accidental scenarios of a dynamic Steam Generator of a NPP. - Abstract: In this work, we present a transient identification approach that utilizes clustering for retrieving scenarios information from an Integrated Deterministic and Probabilistic Safety Analysis (IDPSA). The approach requires: (i) creation of a database of scenarios by IDPSA; (ii) scenario post-processing for clustering Prime Implicants (PIs), i.e., minimum combinations of failure events that are capable of leading the system into a fault state, and Near Misses, i.e., combinations of failure events that lead the system to a quasi-fault state; (iii) on-line cluster assignment of an unknown developing scenario. In the step (ii), we adopt a visual interactive method and risk-based clustering to identify PIs and Near Misses, respectively; in the on-line step (iii), to assign a scenario to a cluster we consider the sequence of events in the scenario and evaluate the Hamming similarity to the sequences of the previously clustered scenarios. The feasibility of the analysis is shown with respect to the accidental scenarios of a dynamic Steam Generator (SG) of a NPP.
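
    The on-line step (iii) can be illustrated with a small sketch. It assumes each scenario is encoded as a sequence of event labels and that a developing scenario is compared with the matching prefix of each stored scenario, then assigned to the cluster of the most similar one. That nearest-match rule, the prefix comparison, and all names below are assumptions made for illustration, not the authors' implementation.

```python
def hamming_similarity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def assign_cluster(unknown, clustered):
    """Return (label, similarity) of the best-matching stored scenario.

    clustered maps a cluster label to a list of event sequences.
    Assumption: only the common prefix of the two sequences is compared,
    since the unknown scenario is still developing.
    """
    best_label, best_score = None, -1.0
    for label, sequences in clustered.items():
        for ref in sequences:
            n = min(len(unknown), len(ref))
            score = hamming_similarity(unknown[:n], ref[:n])
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score
```

    For example, with hypothetical clusters {"PI": [["valve_stuck", "pump_fail", "ok"]], "near_miss": [["ok", "pump_fail", "ok"]]}, the partial scenario ["ok", "pump_fail"] matches the near-miss sequence at both observed positions and is assigned to that cluster.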

  3. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova

    2014-01-01

    Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  4. Instrumental neutron activation analysis in environmental research

    International Nuclear Information System (INIS)

    Bruin, M. de.

    1985-01-01

    The main characteristics of instrumental neutron activation analysis (INAA) relevant to environmental research and monitoring are reviewed and discussed: sensitivity suitable for the detection of many toxic elements, low risk of contamination or element loss, lack of matrix effects, lack of light-element interference except for 24Na, capability for multi-element determination, and comparatively low cost. A detailed description of the IRI analysis system for routine INAA is given. The system is based on the single comparator method of standardization, to take full advantage of the multi-element capability without the preparation and use of trace element standards. Zinc was used as the mono-element standard; the element concentrations are calculated on the basis of the 65Zn and 69mZn activities. The irradiations were carried out in a thermal neutron flux of 1 × 10^13 n/(cm^2·s). The gamma spectra are converted into element concentrations using a set of dedicated software performing the following functions: spectrum analysis and interpretation, comparison and combination of the intermediate results from different decay times, generation of the final report, and bookkeeping of the results obtained. The main applications of the INAA system mentioned are identification of sources of heavy metal air pollution using air filters or biological indicators such as mosses, lichens, toenails, bird feathers, mollusks and water plants, and study of the uptake and translocation of heavy elements in plants. Special attention was paid to mathematical techniques for a reliable interpretation of the element concentration patterns observed in sets of lichen samples. Future developments of INAA in environmental science are briefly mentioned.

  5. Functional Analysis of the Fusarielin Biosynthetic Gene Cluster

    Directory of Open Access Journals (Sweden)

    Aida Droce

    2016-12-01

    Fusarielins are polyketides with a decalin core produced by various species of Aspergillus and Fusarium. Although the responsible gene cluster has been identified, the biosynthetic pathway remains to be elucidated. In the present study, members of the gene cluster were deleted individually in a Fusarium graminearum strain overexpressing the local transcription factor. The results suggest that a trans-acting enoyl reductase (FSL5) assists the polyketide synthase FSL1 in biosynthesis of a polyketide product, which is released by hydrolysis by a trans-acting thioesterase (FSL2). Deletion of the epimerase (FSL3) resulted in accumulation of an unstable compound, which could be the released product. A novel compound, named prefusarielin, accumulated in the deletion mutant of the cytochrome P450 monooxygenase FSL4. Unlike the known fusarielins from Fusarium, this compound does not contain oxygenized decalin rings, suggesting that FSL4 is responsible for the oxygenation.

  6. Nannoplankton from the Bombay-Saurashtra continental shelf of India: An appraisal using cluster analysis

    Digital Repository Service at National Institute of Oceanography (India)

    Guptha, M.V.S.; Nigam, R.

    Nannoplankton data from 28 stations on the northwestern continental shelf of India were subjected to Q-mode cluster analysis. Two biotopes, A and B, were identified. Although Gephyrocapsa oceanica was by far the most abundant species in both...

  7. Statistical Techniques Applied to Aerial Radiometric Surveys (STAARS): cluster analysis. National Uranium Resource Evaluation

    International Nuclear Information System (INIS)

    Pirkle, F.L.; Stablein, N.K.; Howell, J.A.; Wecksung, G.W.; Duran, B.S.

    1982-11-01

    One objective of the aerial radiometric surveys flown as part of the US Department of Energy's National Uranium Resource Evaluation (NURE) program was to ascertain the regional distribution of near-surface radioelement abundances. Some method for identifying groups of observations with similar radioelement values was therefore required. It is shown in this report that cluster analysis can identify such groups even when no a priori knowledge of the geology of an area exists. A method of convergent k-means cluster analysis coupled with a hierarchical cluster analysis is used to classify 6991 observations (three radiometric variables at each observation location) from the Precambrian rocks of the Copper Mountain, Wyoming, area. Another method, one that combines a principal components analysis with a convergent k-means analysis, is applied to the same data. These two methods are compared with a convergent k-means analysis that utilizes available geologic knowledge. All three methods identify four clusters. Three of the clusters represent background values for the Precambrian rocks of the area, and one represents outliers (anomalously high 214Bi). A segmentation of the data corresponding to geologic reality as discovered by other methods has been achieved based solely on analysis of aerial radiometric data. The techniques employed are composites of classical clustering methods designed to handle the special problems presented by large data sets. 20 figures, 7 tables
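
    The core idea of grouping observations with similar values on three radiometric variables can be sketched with a plain Lloyd-style k-means that iterates until the assignments stop changing. This is a generic illustration under stated assumptions: the report's "convergent k-means" variant, its coupling with hierarchical clustering, and its data are not reproduced, and the function name and starting centroids are hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd-style k-means on tuples of equal dimension.

    points    : list of observation tuples (e.g. three radiometric variables)
    centroids : list of starting centroids (assumed to be supplied by the caller)
    Returns (labels, centroids) once assignments no longer change.
    """
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # assign each observation to its nearest centroid (squared Euclidean distance)
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])),
            )
            for pt in points
        ]
        if new_labels == labels:
            break  # converged: no observation changed cluster
        labels = new_labels
        # move each centroid to the mean of its members
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids
```

    In practice the variables would usually be standardized first so that no single channel dominates the distance; that preprocessing choice is left to the caller here.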

  8. Hierarchical clustering analysis of blood plasma lipidomics profiles from mono- and dizygotic twin families

    NARCIS (Netherlands)

    Draisma, H.H.M.; Reijmers, Th.H.; Meulman, J.J.; van der Greef, J.; Hankemeier, Th.; Boomsma, D.I.

    2013-01-01

    Twin and family studies are typically used to elucidate the relative contribution of genetic and environmental variation to phenotypic variation. Here, we apply a quantitative genetic method based on hierarchical clustering, to blood plasma lipidomics data obtained in a healthy cohort consisting of

  9. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways

    Science.gov (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques

    2008-01-01

    The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, clique discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, making it possible to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources. PMID:18524799

  10. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways......, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches....

  11. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing...... inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches....

  12. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich

    2013-08-01

    We used complex mathematical modeling, multivariate statistical analysis and fuzzy sets to analyze the financial and economic state of crop production in Russian regions. We developed a system of indicators characterizing the state of the agricultural sector in a region, based on the results of correlation, factor and cluster analysis and on statistics from the Federal State Statistics Service. We performed cluster analyses to divide the regions of Russia into five groups according to selected factors. Qualitative and quantitative characteristics of each cluster were obtained.

  13. Subtypes of autism by cluster analysis based on structural MRI data.

    Science.gov (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas

    2005-05-01

    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty four autistic patients (mean age 9.4+/-5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences: Cluster 1: showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2: showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3: showed the largest sizes of the caput of the nucleus caudatus (NC), the smallest sizes of the HPC and facial dysmorphic features were always present. Cluster 4: showed the smallest sizes of the genu and splenium of the CC, as well as the amygdala, and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy, the highest pregnancy order, abnormal psychomotor development during the first year of life was always present and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  14. Identifying Subgroups of Tinnitus Using Novel Resting State fMRI Biomarkers and Cluster Analysis

    Science.gov (United States)

    2017-10-13

    applied to the resting-state data to identify tinnitus subgroups within the patient population and pair them with specific behavioral ...and behavioral data  Specific Aim 2: Determine tinnitus subgroups using automated cluster analysis of resting state data and associate the subgroups...data analysis and clustering method previously developed to apply to current tinnitus data set o Percentage of completion at end of Year 2 (24 months

  15. Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis

    Science.gov (United States)

    2015-01-01

    discussion of its application to the network of network scientists. Each partitioning step in this spectral scheme either bipartitions or tripartitions a...University of California Los Angeles Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis A dissertation...00-00-2015 to 00-00-2015 4. TITLE AND SUBTITLE Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis 5a

  16. FLOCK cluster analysis of mast cell event clustering by high-sensitivity flow cytometry predicts systemic mastocytosis.

    Science.gov (United States)

    Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty

    2015-11-01

    In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.
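    The diagnostic figures quoted in this abstract can be rechecked from the two counts it reports (56 of 75 SM cases and 17 of 124 non-SM cases with FLOCK-identified populations). The sketch below is only an arithmetic check; the function name is an illustrative choice, and the 2x2 table is reconstructed from those counts rather than taken from the paper. The recomputed values come out at roughly 74.7%, 86.3%, 76.7% and 84.9%, in line with the reported 75%, 86%, 76% and 85% up to rounding.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts taken from the abstract: 56 of 75 SM cases and 17 of 124 non-SM cases
# had FLOCK-identified mast cell populations.
sm_total, sm_positive = 75, 56
non_sm_total, non_sm_positive = 124, 17

metrics = diagnostic_metrics(
    tp=sm_positive,
    fn=sm_total - sm_positive,
    fp=non_sm_positive,
    tn=non_sm_total - non_sm_positive,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```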

  17. Environmental effects on stellar populations of star clusters and dwarf galaxies

    Science.gov (United States)

    Pasetto, Stefano; Cropper, Mark; Fujita, Yutaka; Chiosi, Cesare; Grebel, Eva K.

    2017-03-01

    We investigate the competitive role of the different dissipative phenomena acting on the onset of star formation of gravitationally bound systems in an external environment. Ram pressure, Kelvin-Helmholtz and Rayleigh-Taylor instabilities, and tidal forces are accounted for separately in an analytical framework and compared in their role in influencing the star forming regions. We present an analytical criterion to elucidate the dependence of star formation in a spherical stellar system on its surrounding environment. We consider the different signatures of these phenomena in synthetically realized colour-magnitude diagrams (CMDs) of the orbiting system thus investigating the detectability limits of these different effects for future observational projects and their relevance. The developed theoretical framework has direct applications to the cases of massive star clusters, dwarf galaxies in galaxy clusters and dwarf galaxies orbiting our Milky Way system, as well as any primordial gas-rich cluster of stars orbiting within its host galaxy.

  18. Undergraduate ALFALFA Team: Analysis of Spatially-Resolved Star-Formation in Nearby Galaxy Groups and Clusters

    Science.gov (United States)

    Finn, Rose; Collova, Natasha; Spicer, Sandy; Whalen, Kelly; Koopmann, Rebecca A.; Durbala, Adriana; Haynes, Martha P.; Undergraduate ALFALFA Team

    2017-01-01

    As part of the Undergraduate ALFALFA Team, we are conducting a survey of the gas and star-formation properties of galaxies in 36 groups and clusters in the local universe. The galaxies in our sample span a large range of galactic environments, from the centers of galaxy groups and clusters to the surrounding infall regions. One goal of the project is to map the spatial distribution of star-formation; the relative extent of the star-forming and stellar disks provides important information about the internal and external processes that deplete gas and thus drive galaxy evolution. We obtained wide-field H-alpha observations with the WIYN 0.9m telescope at Kitt Peak National Observatory for galaxies in the vicinity of the MKW11 and NRGb004 galaxy groups and the Abell 1367 cluster. We present a preliminary analysis of the relative size of the star-forming and stellar disks as a function of galaxy morphology and local galaxy density, and we calculate gas depletion times using star-formation rates and HI gas mass. We will combine these results with those from other UAT members to determine if and how environmentally-driven gas depletion varies with the mass and X-ray properties of the host group or cluster. This work has been supported by NSF grants AST-0847430, AST-1211005 and AST-1637339.
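    A gas depletion time of the kind mentioned in this abstract is commonly estimated as the HI gas mass divided by the star-formation rate. The sketch below assumes that simple ratio; the function name, units, and the example galaxy are hypothetical and are not taken from the survey.

```python
def gas_depletion_time_gyr(hi_mass_msun, sfr_msun_per_yr):
    """Depletion time in Gyr: HI gas mass divided by the star-formation rate."""
    if sfr_msun_per_yr <= 0:
        raise ValueError("star-formation rate must be positive")
    return hi_mass_msun / sfr_msun_per_yr / 1e9

# Hypothetical galaxy: 2e9 solar masses of HI forming 0.5 solar masses per year.
example_time = gas_depletion_time_gyr(2e9, 0.5)
print(example_time)  # 4.0 (Gyr)
```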

  19. 10 CFR 503.13 - Environmental impact analysis.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Environmental impact analysis. 503.13 Section 503.13... Exemptions § 503.13 Environmental impact analysis. In order to enable OFE to comply with NEPA, a petitioner..., and land resources; (3) Direct and indirect environmental impacts of the proposed action including...

  20. Integrating health and environmental impact analysis.

    Science.gov (United States)

    Reis, S; Morris, G; Fleming, L E; Beck, S; Taylor, T; White, M; Depledge, M H; Steinle, S; Sabel, C E; Cowie, H; Hurley, F; Dick, J McP; Smith, R I; Austen, M

    2015-10-01

    Scientific investigations have progressively refined our understanding of the influence of the environment on human health, and the many adverse impacts that human activities exert on the environment, from the local to the planetary level. Nonetheless, throughout the modern public health era, health has been pursued as though our lives and lifestyles are disconnected from ecosystems and their component organisms. The inadequacy of the societal and public health response to obesity, health inequities, and especially global environmental and climate change now calls for an ecological approach which addresses human activity in all its social, economic and cultural complexity. The new approach must be integral to, and interactive with, the natural environment. We see the continuing failure to truly integrate human health and environmental impact analysis as deeply damaging, and we propose a new conceptual model, the ecosystems-enriched Drivers, Pressures, State, Exposure, Effects, Actions or 'eDPSEEA' model, to address this shortcoming. The model recognizes convergence between the concept of ecosystems services which provides a human health and well-being slant to the value of ecosystems while equally emphasizing the health of the environment, and the growing calls for 'ecological public health' as a response to global environmental concerns now suffusing the discourse in public health. More revolution than evolution, ecological public health will demand new perspectives regarding the interconnections among society, the economy, the environment and our health and well-being. Success must be built on collaborations between the disparate scientific communities of the environmental sciences and public health as well as interactions with social scientists, economists and the legal profession. It will require outreach to political and other stakeholders including a currently largely disengaged general public.
The need for an effective and robust science-policy interface has

  1. PRINCIPAL COMPONENT ANALYSIS AND CLUSTER ANALYSIS IN MULTIVARIATE ASSESSMENT OF WATER QUALITY

    Directory of Open Access Journals (Sweden)

    Elzbieta Radzka

    2017-03-01

    Full Text Available This paper deals with the use of multivariate methods in drinking water analysis. During a five-year project, from 2008 to 2012, selected chemical parameters in 11 water supply networks of the Siedlce County were studied. Throughout that period drinking water was of satisfactory quality, with only iron and manganese ions exceeding the limits (21 times and 12 times, respectively). In accordance with the results of cluster analysis, all water networks were put into three groups of different water quality. A high concentration of chlorides, sulphates, and manganese and a low concentration of copper and sodium was found in the water of Group 1 supply networks. The water in Group 2 had a high concentration of copper and sodium, and a low concentration of iron and sulphates. The water from Group 3 had a low concentration of chlorides and manganese, but a high concentration of fluorides. Using principal component analysis and cluster analysis, multivariate correlation between the studied parameters was determined, helping to put water supply networks into groups according to similar water quality.

  2. Advancing the diagnostic analysis of environmental problems

    Directory of Open Access Journals (Sweden)

    Michael Cox

    2011-09-01

    Full Text Available Social-ecological systems exhibit patterns across multiple levels along spatial, temporal, and functional scales. The outcomes that are produced in these systems result from complex, non-additive interactions between different types of social and biophysical components, some of which are common to many systems, and some of which are relatively unique to a particular system. These properties, along with the mostly non-experimental nature of the analysis, make it difficult to construct theories regarding the sustainability of social-ecological systems. This paper builds on previous work that has initiated a diagnostic approach to the analysis of these systems. The process of diagnosis involves asking a series of questions of a system at increasing levels of specificity based on the answers to previous questions. The answer to each question further unpacks the complexity of a system, allowing an analyst to explore patterns of interactions that produce outcomes. An important feature of this approach is the use of multilevel analysis. This paper explores this concept and introduces another – multilevel causation – to further develop the diagnostic approach. It demonstrates that these concepts can be used to analyze a diversity of environmental problems.

  3. Full mode and attribution mode in environmental analysis

    NARCIS (Netherlands)

    Udo de Haes, Helias A.; Heijungs, Reinout; Huppes, Gjalt; Van Der Voet, Ester; Hettelingh, Jean Paul

    2000-01-01

    Several tools exist for the analysis of the environmental impacts of chains or networks of processes. These relatively simple tools include materials flow accounting (MFA), substance flow analysis (SFA), life-cycle assessment (LCA), energy analysis, and environmentally extended input-output analysis

  4. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  5. Statistical cluster analysis of the British Thoracic Society Severe refractory Asthma Registry: clinical outcomes and phenotype stability.

    Directory of Open Access Journals (Sweden)

    Chris Newby

    Full Text Available Severe refractory asthma is a heterogeneous disease. We sought to determine statistical clusters from the British Thoracic Society Severe refractory Asthma Registry and to examine cluster-specific outcomes and stability. Factor analysis and statistical cluster modelling were undertaken to determine the number of clusters and their membership (N = 349). Cluster-specific outcomes were assessed after a median follow-up of 3 years. A classifier was programmed to determine cluster stability and was validated in an independent cohort of new patients recruited to the registry (n = 245). Five clusters were identified. Cluster 1 (34%) were atopic with early onset disease, cluster 2 (21%) were obese with late onset disease, cluster 3 (15%) had the least severe disease, cluster 4 (15%) were eosinophilic with late onset disease and cluster 5 (15%) had significant fixed airflow obstruction. At follow-up, the proportion of subjects treated with oral corticosteroids increased in all groups with an increase in body mass index. Exacerbation frequency decreased significantly in clusters 1, 2 and 4 and was associated with a significant fall in the peripheral blood eosinophil count in clusters 2 and 4. Stability of cluster membership at follow-up was 52% for the whole group with stability being best in cluster 2 (71%) and worst in cluster 4 (25%). In an independent validation cohort, the classifier identified the same 5 clusters with similar patient distribution and characteristics. Statistical cluster analysis can identify distinct phenotypes with specific outcomes. Cluster membership can be determined using a classifier, but when treatment is optimised, cluster stability is poor.

  6. Suicide in the oldest old: an observational study and cluster analysis.

    Science.gov (United States)

    Sinyor, Mark; Tan, Lynnette Pei Lin; Schaffer, Ayal; Gallagher, Damien; Shulman, Kenneth

    2016-01-01

    The older population are at a high risk for suicide. This study sought to learn more about the characteristics of suicide in the oldest-old and to use a cluster analysis to determine if oldest-old suicide victims assort into clinically meaningful subgroups. Data were collected from a coroner's chart review of suicide victims in Toronto from 1998 to 2011. We compared two age groups (65-79 year olds, n = 335, and 80+ year olds, n = 191) and then conducted a hierarchical agglomerative cluster analysis using Ward's method to identify distinct clusters in the 80+ group. The younger and older age groups differed according to marital status, living circumstances and pattern of stressors. The cluster analysis identified three distinct clusters in the 80+ group. Cluster 1 was the largest (n = 124) and included people who were either married or widowed who had significantly more depression and somewhat more medical health stressors. In contrast, cluster 2 (n = 50) comprised people who were almost all single and living alone with significantly less identified depression and slightly fewer medical health stressors. All members of cluster 3 (n = 17) lived in a retirement residence or nursing home, and this group had the highest rates of depression, dementia, other mental illness and past suicide attempts. This is the first study to use the cluster analysis technique to identify meaningful subgroups among suicide victims in the oldest-old. The results reveal different patterns of suicide in the older population that may be relevant for clinical care. Copyright © 2015 John Wiley & Sons, Ltd.
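    Hierarchical agglomerative clustering with Ward's method, as named in this abstract, repeatedly merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares. The following is a naive sketch of that idea on hypothetical two-dimensional points; it is not the study's data or software, and the function names are illustrative.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering: merge the cheapest pair until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

# Hypothetical points forming three well-separated groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (21, 0), (20, 1)]
for cluster in ward_clusters(points, 3):
    print(sorted(cluster))
```

    The pairwise search makes this cubic in the number of points, which is acceptable for a sample of a few hundred records but not for large datasets.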

  7. Clustered Xenopus keratin genes: A genomic, transcriptomic, and proteomic analysis.

    Science.gov (United States)

    Suzuki, Ken-Ichi T; Suzuki, Miyuki; Shigeta, Mitsuki; Fortriede, Joshua D; Takahashi, Shuji; Mawaribuchi, Shuuji; Yamamoto, Takashi; Taira, Masanori; Fukui, Akimasa

    2017-06-15

    Keratin genes belong to the intermediate filament superfamily and their expression is altered following morphological and physiological changes in vertebrate epithelial cells. Keratin genes are divided into two groups, type I and II, and are clustered on vertebrate genomes, including those of Xenopus species. Various keratin genes have been identified and characterized by their unique expression patterns throughout ontogeny in Xenopus laevis; however, compilation of previously reported and newly identified keratin genes in two Xenopus species is required for our further understanding of keratin gene evolution, not only in amphibians but also in all terrestrial vertebrates. In this study, 120 putative type I and II keratin genes in total were identified based on the genome data from two Xenopus species. We revealed that most of these genes are highly clustered on two homeologous chromosomes, XLA9_10 and XLA2 in X. laevis, and XTR10 and XTR2 in X. tropicalis, which are orthologous to those of human, showing conserved synteny among tetrapods. RNA-Seq data from various embryonic stages and adult tissues highlighted the unique expression profiles of orthologous and homeologous keratin genes in developmental stage- and tissue-specific manners. Moreover, we identified dozens of epidermal keratin proteins from the whole embryo, larval skin, tail, and adult skin using shotgun proteomics. In light of our results, we discuss the radiation, diversification, and unique expression of the clustered keratin genes, which are closely related to epidermal development and terrestrial adaptation during amphibian evolution, including Xenopus speciation. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Clustering-based analysis for residential district heating data

    DEFF Research Database (Denmark)

    Gianniou, Panagiota; Liu, Xiufeng; Heller, Alfred

    2018-01-01

    The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze residential heating consumption data and evaluate information included in national building databases. The proposed method uses the K-means algorithm to segment consumption groups based on consumption intensity and representative patterns and ranks the groups according to daily consumption. This paper also......
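    The K-means segmentation and ranking described in this abstract can be sketched as follows. This is a minimal illustration assuming plain Lloyd's algorithm on short daily profiles; the household readings, the starting centroids, and the function name are hypothetical and do not come from the paper.

```python
def kmeans(profiles, centers, iterations=20):
    """Plain Lloyd's algorithm; `centers` are the caller-supplied starting centroids."""
    centers = [list(c) for c in centers]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(len(centers)),
                      key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centers[k])))
                  for p in profiles]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical daily heating profiles (four readings per day, in kWh).
profiles = [
    [1, 1, 2, 1], [1, 2, 2, 1],      # low consumers
    [3, 4, 5, 3], [4, 4, 5, 4],      # medium consumers
    [8, 9, 10, 8], [9, 9, 11, 9],    # high consumers
]
labels, centers = kmeans(profiles, centers=[profiles[0], profiles[2], profiles[4]])

# Rank the groups by the total daily consumption of their centroid, highest first.
ranking = sorted(range(len(centers)), key=lambda k: sum(centers[k]), reverse=True)
print(labels, ranking)
```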

  9. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Full Text Available Communication base stations generate massive amounts of data every day, and these base station logs are valuable for mining business circles. This paper uses data mining technology and a hierarchical clustering algorithm to group base stations into business circles based on the data the stations record. By extracting features from the data of the different business circles and comparing the characteristics of each business circle category, operators can choose a suitable area for commercial marketing.

  10. A cluster analysis of Basic Personality Inventory (BPI) adolescent profiles.

    Science.gov (United States)

    Bonynge, E R

    1994-03-01

    Basic Personality Inventory profiles of 95 male and 118 female adolescent admissions to a crisis intervention unit were subjected to a cluster analytic procedure. For both males and females, four subgroups were identified: Mental Health Maladjustment, Interpersonal Maladjustment, High-risk Rebellion, and Adjustment. Subgroups differed significantly on alternative markers of psychopathology (SCL-90-R and Diagnoses). Subgroups identified were consistent with groupings identified previously. The subgroups also corresponded with broad-band syndromes that are conventional within the literature on adolescent psychopathology. Subgroup characteristics and implications for adolescent assessment are discussed.

  11. Student academic performance analysis using fuzzy C-means clustering

    Science.gov (United States)

    Rosadi, R.; Akamal; Sudrajat, R.; Kharismawan, B.; Hambali, Y. A.

    2017-01-01

    Grade Point Average (GPA) is commonly used as an indicator of academic performance. Academic performance evaluation is a basic way to assess the progression of student performance. When evaluating students' academic performance, there are occasions where the student data are grouped, especially when the amount of data is large, so that the pattern of relationships within and among groups can be revealed. Data can be grouped using a clustering method, one of which is the Fuzzy C-Means algorithm. This algorithm is then applied to a set of student data from the Faculty of Mathematics and Natural Sciences, Padjadjaran University.
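    Fuzzy C-Means differs from hard clustering in that each record receives a degree of membership in every cluster. The sketch below uses the standard update rules with fuzzifier m = 2 on hypothetical one-dimensional GPA values; the data, starting centers, and function name are assumptions for illustration and are not drawn from the study.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Fuzzy C-Means on 1-D data; returns (centers, memberships[point][cluster]).

    The returned memberships are the ones used for the final center update.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [u / sum(row) for u in row]
            else:
                power = 2.0 / (m - 1.0)
                row = [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
            memberships.append(row)
        # Center update: membership-weighted mean with weights u^m.
        centers = [
            sum((row[i] ** m) * x for row, x in zip(memberships, values))
            / sum(row[i] ** m for row in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships

# Hypothetical GPA values with a lower and a higher group.
gpas = [2.0, 2.1, 2.3, 3.5, 3.6, 3.8]
fcm_centers, fcm_memberships = fuzzy_c_means(gpas, centers=[1.0, 4.0])
for gpa, row in zip(gpas, fcm_memberships):
    print(gpa, [round(u, 3) for u in row])
```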

  12. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
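    Two of the transformations compared in this abstract, the z-score standardization (ZST) and the centered log-ratio transformation (CLT), have simple closed forms. The sketch below shows both under their usual definitions; the function names and the three-part composition are hypothetical, and the study's own preprocessing may differ in detail.

```python
import math

def clr(composition):
    """Centered log-ratio: log of each part minus the mean log over all parts."""
    if any(part <= 0 for part in composition):
        raise ValueError("CLR needs strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def zscore(column):
    """Z-score standardization of one variable (population standard deviation)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

# Hypothetical three-part composition for one sample (row-wise transform).
sample = [0.2, 0.3, 0.5]
print([round(v, 4) for v in clr(sample)])

# Hypothetical concentrations of one element across samples (column-wise transform).
print([round(v, 4) for v in zscore([1.0, 2.0, 3.0])])
```

    The CLR output sums to zero and is unchanged when the whole composition is rescaled, which is why it is applied per sample, whereas the z-score is applied per variable.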

  13. Design and analysis of clinical trials with clustering effects due to treatment.

    Science.gov (United States)

    Roberts, Chris; Roberts, Stephen A

    2005-01-01

    Where patients receive therapy as a group, there are good theoretical reasons to believe that variation in the outcome will be smaller for patients treated in the same group than for patients treated in different groups. Similarly, where different therapists treat different groups of patients, outcome for patients treated by the same therapist may differ less than outcome for patients treated by different therapists. Clinical trials evaluating such therapies need to consider this potential lack of independence. As with cluster-randomized trials, this has implications for the precision of treatment effects estimates and statistical power. There are nevertheless differences between clustering due to the organization of treatment and that due to randomization. In cluster-randomized trials the distribution of cluster sizes in each treatment arm should be similar as a consequence of randomization unless there is differential loss to follow-up. With clustering due to therapy group or therapist, cluster size may differ systematically between treatment arms, due to size of therapy groups or differing health professional caseload. Intra-cluster correlation may also differ between treatment arms. The implications of differential cluster size and intracluster correlation for design and analysis will be illustrated by data from two trials, the first comparing nurse practitioner care with general practitioner care, and the second comparing a group therapy with individual treatment as usual. The special case where a group therapy or therapist is compared with an unclustered treatment is examined in detail using a simulation study. The implications of differential clustering effects for sample size and power are addressed. It is argued that the design and analysis of this type of trial should take account of possible heterogeneity in cluster size and intracluster correlation.
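    The effect of intra-cluster correlation on precision is often summarized by the design effect, 1 + (m - 1) * rho for clusters of equal size m and intracluster correlation rho. The sketch below applies that textbook formula separately to each arm to show how clustering in only one arm reduces its effective sample size; the trial numbers are hypothetical, and this is not the analysis method proposed in the paper.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_patients, cluster_size, icc):
    """Number of independent patients that would give the same precision."""
    return n_patients / design_effect(cluster_size, icc)

# Hypothetical trial: group therapy arm (groups of 8, ICC 0.05) versus
# individually treated controls, where each patient is a cluster of size 1.
therapy = effective_sample_size(120, cluster_size=8, icc=0.05)
control = effective_sample_size(120, cluster_size=1, icc=0.05)
print(round(therapy, 1), round(control, 1))
```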

  14. Detection of secondary structure elements in proteins by hydrophobic cluster analysis.

    Science.gov (United States)

    Woodcock, S; Mornon, J P; Henrissat, B

    1992-10-01

    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

  15. Point Cluster Analysis Using a 3D Voronoi Diagram with Applications in Point Cloud Segmentation

    Directory of Open Access Journals (Sweden)

    Shen Ying

    2015-08-01

    Full Text Available Three-dimensional (3D) point analysis and visualization is one of the most effective methods of point cluster detection and segmentation in geospatial datasets. However, serious scattering and clotting characteristics interfere with the visual detection of 3D point clusters. To overcome this problem, this study proposes the use of 3D Voronoi diagrams to analyze and visualize 3D points instead of the original data item. The proposed algorithm computes the cluster of 3D points by applying a set of 3D Voronoi cells to describe and quantify 3D points. The decompositions of point cloud of 3D models are guided by the 3D Voronoi cell parameters. The parameter values are mapped from the Voronoi cells to 3D points to show the spatial pattern and relationships; thus, a 3D point cluster pattern can be highlighted and easily recognized. To capture different cluster patterns, continuous progressive clusters and segmentations are tested. The 3D spatial relationship is shown to facilitate cluster detection. Furthermore, the generated segmentations of real 3D data cases are exploited to demonstrate the feasibility of our approach in detecting different spatial clusters for continuous point cloud segmentation.

  16. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Chao Zhang

    2017-09-01

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the starting point of this design and the correctness of the theoretical analysis, and show how parameters affect system performance. Moreover, the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  17. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.

    Science.gov (United States)

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-09-27

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the starting point of this design and the correctness of the theoretical analysis, and show how parameters affect system performance. Moreover, the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.
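
    The relay-battery model mentioned in this abstract can be illustrated with a toy finite Markov chain. This is only a minimal sketch under stated assumptions, not the authors' model: the battery capacity `levels`, the per-slot harvest probability `p_harvest` and the relay-request probability `p_relay` are hypothetical, and the quantity computed is the stationary probability of an empty relay battery, not the paper's closed-form outage probability.

```python
def battery_transition_matrix(levels, p_harvest, p_relay):
    """Toy chain over battery states 0..levels (all parameters hypothetical).

    In each slot the relay either harvests one energy unit (prob. p_harvest)
    or, otherwise, is asked to forward with prob. p_relay and spends one unit.
    """
    n = levels + 1
    P = [[0.0] * n for _ in range(n)]
    for s in range(n):
        up = min(s + 1, levels)      # battery cannot exceed its capacity
        down = max(s - 1, 0)         # battery cannot go below empty
        P[s][up] += p_harvest
        P[s][down] += (1 - p_harvest) * p_relay
        P[s][s] += (1 - p_harvest) * (1 - p_relay)
    return P


def stationary_distribution(P, iterations=5000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


P = battery_transition_matrix(levels=4, p_harvest=0.3, p_relay=0.5)
pi = stationary_distribution(P)
p_empty = pi[0]  # long-run fraction of slots in which the relay has no energy
```

    In this toy chain a larger `p_harvest` (loosely corresponding to a higher HAP transmit power) shifts probability mass away from the empty state, which is the qualitative effect the abstract reports.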

  18. Using the Cluster Analysis and the Principal Component Analysis in Evaluating the Quality of a Destination

    Directory of Open Access Journals (Sweden)

    Ida Vajčnerová

    2016-01-01

    The objective of the paper is to explore possibilities of evaluating the quality of a tourist destination by means of principal component analysis (PCA) and cluster analysis. In the paper both types of analysis are compared on the basis of the results they provide. The aim is to identify the advantages and limits of both methods and to provide methodological suggestions for their further use in tourism research. The analysis is based on primary data from a survey of customers' satisfaction with the key quality factors of a destination. The output of the two statistical methods is the creation of groups or clusters of quality factors that are similar in terms of respondents' evaluations, in order to facilitate the evaluation of the quality of tourist destinations. The results show that both tested methods can be used. The paper was prepared within a wider research project aimed at developing a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.

  19. Crowd Analysis by Using Optical Flow and Density Based Clustering

    DEFF Research Database (Denmark)

    Santoro, Francesco; Pedro, Sergio; Tan, Zheng-Hua

    2010-01-01

    In this paper, we present a system to detect and track crowds in a video sequence captured by a camera. In a first step, we compute optical flows by means of pyramidal Lucas-Kanade feature tracking. Afterwards, a density based clustering is used to group similar vectors. In the last step, a crowd tracker is applied in every frame, allowing us to detect and track the crowds. Our system gives the output as a graphic overlay, i.e., it adds arrows and colors to the original frame sequence, in order to identify crowds and their movements. For the evaluation, we check when our system detects certain...
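
    The grouping step in this abstract can be sketched with a small DBSCAN-style routine. The abstract only says "density based clustering", so the choice of DBSCAN, the feature layout `(x, y, dx, dy)`, the plain Euclidean distance over position and flow together, and the values of `eps` and `min_pts` are all assumptions made for illustration.

```python
import math


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id   # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels


# Hypothetical flow vectors: (x, y, dx, dy) for tracked features in one frame.
flows = [
    (10, 10, 1.0, 0.0), (11, 10, 1.1, 0.0), (10, 11, 0.9, 0.1), (11, 11, 1.0, 0.1),
    (50, 50, -1.0, 0.0), (51, 50, -1.1, 0.0), (50, 51, -0.9, -0.1), (51, 51, -1.0, 0.1),
    (90, 10, 0.0, 3.0),
]
labels = dbscan(flows, eps=2.0, min_pts=3)
# labels == [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

    Two groups of nearby, similarly moving features become two candidate crowds, and the isolated vector is treated as noise. In practice position and flow would probably need separate scaling before being combined in one distance.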

  20. Fuzzy subtractive clustering based prediction model for brand association analysis

    Directory of Open Access Journals (Sweden)

    Widodo Imam Djati

    2018-01-01

    The brand is one of the crucial elements that determine the success of a product. When choosing a product, consumers always consider product attributes (such as features, shape, and color); however, they also consider the brand. A brand guides someone to associate a product with specific attributes and qualities. This study was designed to identify the product attributes and predict brand performance with those attributes. A survey was run to obtain the attributes affecting the brand. Fuzzy subtractive clustering (FSC) was used to classify and predict product brand association based on aspects of the product under investigation. The result indicates that five attributes, namely shape, ease, image, quality and price, can be used to classify and predict the brand. The training step gives the best FSC model with a radius (ra) of 0.1. It develops 70 clusters/rules with a training MSE of 9.7093e-016. On 14 test data points, the model predicts the brand very well (close to the target), with an MSE of 0.6005 and an accuracy rate of 71%.
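
    The role of the radius ra can be made concrete with a short sketch of subtractive clustering in its commonly described form: each point receives a potential based on how many neighbours lie within roughly ra, the highest-potential point becomes a centre, and potential near that centre is then subtracted. The squash factor `squash`, the stopping rule `stop_ratio` and the sample data are assumptions for illustration; the abstract does not state the study's settings beyond ra = 0.1.

```python
import math


def subtractive_clustering(points, ra, squash=1.5, stop_ratio=0.15):
    """Return cluster centres chosen from `points` (assumed scaled to [0, 1]).

    ra         -- neighbourhood radius that defines the potential
    squash     -- hypothetical factor: suppression radius is squash * ra
    stop_ratio -- hypothetical rule: stop once the best remaining potential
                  falls below stop_ratio times the first peak
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Potential of each point: high when many points lie close to it.
    potential = [sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Subtract potential around the new centre so nearby points are not reselected.
        potential = [
            pot - peak * math.exp(-beta * sqdist(p, points[best]))
            for p, pot in zip(points, potential)
        ]
    return centers


# Hypothetical normalised survey scores on two attributes.
data = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12),
        (0.90, 0.90), (0.88, 0.90), (0.90, 0.88)]
centers = subtractive_clustering(data, ra=0.5)
```

    A smaller ra makes the potential more local, so more centres (and therefore more fuzzy rules) are produced, which is consistent with the large number of clusters reported for ra = 0.1.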

  1. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    Science.gov (United States)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  2. Cluster cosmological analysis with X ray instrumental observables: introduction and testing of AsPIX method

    International Nuclear Information System (INIS)

    Valotti, Andrea

    2016-01-01

    Cosmology is one of the fundamental pillars of astrophysics, as such it contains many unsolved puzzles. To investigate some of those puzzles, we analyze X-ray surveys of galaxy clusters. These surveys are possible thanks to the bremsstrahlung emission of the intra-cluster medium. The simultaneous fit of cluster counts as a function of mass and distance provides an independent measure of cosmological parameters such as Ω_m, σ_8, and the dark energy equation of state w_0. A novel approach to cosmological analysis using galaxy cluster data, called top-down, was developed in N. Clerc et al. (2012). This top-down approach is based purely on instrumental observables that are considered in a two-dimensional X-ray color-magnitude diagram. The method self-consistently includes selection effects and scaling relationships. It also provides a means of bypassing the computation of individual cluster masses. My work presents an extension of the top-down method by introducing the apparent size of the cluster, creating a three-dimensional X-ray cluster diagram. The size of a cluster is sensitive to both the cluster mass and its angular diameter, so it must also be included in the assessment of selection effects. The performance of this new method is investigated using a Fisher analysis. In parallel, I have studied the effects of the intrinsic scatter in the cluster size scaling relation on the sample selection as well as on the obtained cosmological parameters. To validate the method, I estimate uncertainties of cosmological parameters with MCMC method Amoeba minimization routine and using two simulated XMM surveys that have an increasing level of complexity. The first simulated survey is a set of toy catalogues of 100 and 10000 deg^2, whereas the second is a 1000 deg^2 catalogue that was generated using an Aardvark semi-analytical N-body simulation. This comparison corroborates the conclusions of the Fisher analysis.
In conclusion, I find that a cluster diagram that accounts for

  3. Comparative Genomic Analysis of Clinical and Environmental Vibrio Vulnificus Isolates Revealed Biotype 3 Evolutionary Relationships

    Directory of Open Access Journals (Sweden)

    Yael Kotton

    2015-01-01

    In 1996 a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen erupted as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus based on 16 draft and complete genomes consisted of 3068 genes, representing between 59% and 78% of the whole genome of 16 strains. The accessory genome varied in size from 781 kbp to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and

  4. Application of Cluster Analysis in Assessment of Dietary Habits of Secondary School Students

    Directory of Open Access Journals (Sweden)

    Zalewska Magdalena

    2014-12-01

    Maintenance of proper health and prevention of diseases of civilization are now significant public health problems. Nutrition is an important factor in the development of youth, as well as the current and future state of health. The aim of the study was to show the benefits of the application of cluster analysis to assess the dietary habits of high school students. The survey was carried out on 1,631 eighteen-year-old students in seven randomly selected secondary schools in Bialystok using a self-prepared anonymous questionnaire. An evaluation of the time of day meals were eaten and the number of meals consumed was made for the surveyed students. The cluster analysis allowed distinguishing characteristic structures of dietary habits in the observed population. Four clusters were identified, which were characterized by relative internal homogeneity and substantial variation in terms of the number of meals during the day and the time of their consumption. The most important characteristics of cluster 1 were cumulated food ration in 2 or 3 meals and long intervals between meals. Cluster 2 was characterized by eating the recommended number of 4 or 5 meals a day. In the 3rd cluster, students ate 3 meals a day with large intervals between them, and in the 4th they had four meals a day while maintaining proper intervals between them. In all clusters dietary mistakes occurred, but most of them were related to clusters 1 and 3. Cluster analysis allowed for the identification of major flaws in nutrition, which may include irregular eating and skipping meals, and indicated possible connections between eating patterns and disturbances of body weight in the examined population.

  5. Nuclear spectrometry for environmental analysis and mapping

    International Nuclear Information System (INIS)

    Simon, Aliz

    2013-01-01

    visits, and provision of equipment. This talk gives an overview of the IAEA Physics Section activities with a special emphasis on the following activities: 1) Improving the analytical performance of PIXE and other IBA techniques; 2) Networking for environmental analysis; 3) Radioisotope environmental mapping; 4) Future perspectives for new IBA methods, especially Heavy Ion PIXE combined with MeV SIMS. (author)

  6. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  7. Application of Geostatistical Methods and Machine Learning for spatio-temporal Earthquake Cluster Analysis

    Science.gov (United States)

    Schaefer, A. M.; Daniell, J. E.; Wenzel, F.

    2014-12-01

    Earthquake clustering tends to be an increasingly important part of general earthquake research especially in terms of seismic hazard assessment and earthquake forecasting and prediction approaches. The distinct identification and definition of foreshocks, aftershocks, mainshocks and secondary mainshocks is taken into account using a point based spatio-temporal clustering algorithm originating from the field of classic machine learning. This can be further applied for declustering purposes to separate background seismicity from triggered seismicity. The results are interpreted and processed to assemble 3D-(x,y,t) earthquake clustering maps which are based on smoothed seismicity records in space and time. In addition, multi-dimensional Gaussian functions are used to capture clustering parameters for spatial distribution and dominant orientations. Clusters are further processed using methodologies originating from geostatistics, which have been mostly applied and developed in mining projects during the last decades. A 2.5D variogram analysis is applied to identify spatio-temporal homogeneity in terms of earthquake density and energy output. The results are mitigated using Kriging to provide an accurate mapping solution for clustering features. As a case study, seismic data of New Zealand and the United States is used, covering events since the 1950s, from which an earthquake cluster catalogue is assembled for most of the major events, including a detailed analysis of the Landers and Christchurch sequences.

  8. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    Science.gov (United States)

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  9. MMPI-2: cluster analysis of personality profiles in perinatal depression—preliminary evidence.

    Science.gov (United States)

    Meuti, Valentina; Marini, Isabella; Grillo, Alessandra; Lauriola, Marco; Leone, Carlo; Giacchetti, Nicoletta; Aceti, Franca

    2014-01-01

    To assess personality characteristics of women who develop perinatal depression. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. The analysis identified three clusters of personality profile: two "clinical" clusters (1 and 3) and an "apparently common" one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a "psychasthenic" depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop "dysphoric" depression; the second cluster (46.5%) shows a normal profile with a "defensive" attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  10. Clustering and summarising association rules mined from phenotype, genotype and environmental data concerning age-related hearing impairment.

    Science.gov (United States)

    Iltanen, Kati; Kiviharju, Sami; Ao, Lida; Juhola, Martti; Pyykkö, Ilmari

    2013-01-01

    In this study, we examine the applicability of association rules for analysing high-dimensional data concerning age-related hearing impairment (ARHI). The ARHI data of the study contain hundreds of variables concerning phenotype, genotype and environmental factors. The number of association rules produced from the data is too large for manual exploration in raw form and, furthermore, the rules overlap. Thus, the focus of our study is to develop an approach to cluster association rules into subsets and to summarise and represent the found rule subsets for easier exploration of rules. The results show that it is possible to efficiently extract rules representing interesting environmental factor-gene or gene-gene interactions. Finding suitable parameters for the association rule mining and the possibility to post-process the mined rules are essential. The developed approach facilitates rule exploration by grouping rules with items concerning the same phenomenon into the same subset and by revealing overlapping rules.

  11. Environmental Technology (Laboratory Analysis and Environmental Sampling) Curriculum Development Project. Final Report.

    Science.gov (United States)

    Hinojosa, Oscar V.; Guillen, Alfonso

    A project assessed the need and developed a curriculum for environmental technology (laboratory analysis and environmental sampling) in the emerging high technology centered around environmental safety and health in Texas. Initial data were collected through interviews by telephone and in person and through onsite visits. Additional data was…

  12. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.

    Science.gov (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan

    2017-11-10

    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  13. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    Science.gov (United States)

    2014-01-01

    Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological

  14. Fuzzy ensemble clustering based on random projections for DNA microarray data analysis.

    Science.gov (United States)

    Avogadri, Roberto; Valentini, Giorgio

    2009-01-01

    Two major problems related to the unsupervised analysis of gene expression data are represented by the accuracy and reliability of the discovered clusters, and by the biological fact that the boundaries between classes of patients or classes of functionally related genes are sometimes not clearly defined. The main goal of this work consists in the exploration of new strategies and in the development of new clustering methods to improve the accuracy and robustness of clustering results, taking into account the uncertainty underlying the assignment of examples to clusters in the context of gene expression data analysis. We propose a fuzzy ensemble clustering approach both to improve the accuracy of clustering results and to take into account the inherent fuzziness of biological and bio-medical gene expression data. We applied random projections that obey the Johnson-Lindenstrauss lemma to obtain several instances of lower dimensional gene expression data from the original high-dimensional ones, approximately preserving the information and the metric structure of the original data. Then we adopt a double fuzzy approach to obtain a consensus ensemble clustering, by first applying a fuzzy k-means algorithm to the different instances of the projected low-dimensional data and then by using a fuzzy t-norm to combine the multiple clusterings. Several variants of the fuzzy ensemble clustering algorithms are proposed, according to different techniques to combine the base clusterings and to obtain the final consensus clustering. We applied our proposed fuzzy ensemble methods to the gene expression analysis of leukemia, lymphoma, adenocarcinoma and melanoma patients, and we compared the results with other state of the art ensemble methods. Results show that in some cases, taking into account the natural fuzziness of the data, we can improve the discovery of classes of patients defined at bio-molecular level. The reduction of the dimension of the data, achieved through random
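
    The dimensionality-reduction step described in this abstract can be sketched with a Gaussian random projection, one standard construction that satisfies the Johnson-Lindenstrauss lemma with high probability. Which projection family the authors used is not stated in the abstract, so the Gaussian matrix, the `target_dim` value and the synthetic data below are assumptions; the fuzzy k-means and t-norm combination steps are not shown.

```python
import math
import random


def random_projection(data, target_dim, seed=0):
    """Project rows of `data` to target_dim dimensions.

    Uses a matrix of i.i.d. N(0, 1) entries scaled by 1/sqrt(target_dim), so
    squared pairwise distances are preserved in expectation.
    """
    rng = random.Random(seed)
    source_dim = len(data[0])
    scale = 1.0 / math.sqrt(target_dim)
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
         for _ in range(source_dim)]
    return [
        [sum(x[i] * R[i][k] for i in range(source_dim)) for k in range(target_dim)]
        for x in data
    ]


# Hypothetical "expression profiles": 5 samples with 1000 features each.
rng = random.Random(1)
profiles = [[rng.random() for _ in range(1000)] for _ in range(5)]
projected = random_projection(profiles, target_dim=200)

# Ratio of projected to original distance for one pair; expected to be near 1.
ratio = math.dist(projected[0], projected[1]) / math.dist(profiles[0], profiles[1])
```

    Calling `random_projection` with different seeds yields several different low-dimensional views of the same data, which is the kind of diversity an ensemble of base clusterings relies on.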

  15. Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.

    Science.gov (United States)

    Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J

    2017-12-01

    Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the
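
    The randomization idea behind the test of spatial randomness can be sketched as follows. This is a simplified illustration, not the Cluster application's procedure: it assumes a rectangular frame, places centroids uniformly at random without regard to cluster size or orientation, and reports an empirical two-sided Monte Carlo p-value in place of the t test described in the abstract. All names and data are hypothetical.

```python
import itertools
import math
import random


def mean_pairwise_distance(centroids):
    """Mean Euclidean distance over all pairs of centroids."""
    pairs = list(itertools.combinations(centroids, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def randomization_test(centroids, width, height, simulations=1000, seed=0):
    """Compare the observed mean inter-cluster distance with random maps."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(centroids)
    simulated = [
        mean_pairwise_distance(
            [(rng.uniform(0, width), rng.uniform(0, height)) for _ in centroids]
        )
        for _ in range(simulations)
    ]
    expected = sum(simulated) / len(simulated)
    # Count simulated maps at least as far from the simulated mean as the observed map.
    extreme = sum(1 for s in simulated if abs(s - expected) >= abs(observed - expected))
    p_value = (extreme + 1) / (simulations + 1)
    return observed, expected, p_value


# Hypothetical lesion centroids crowded into one corner of a 100 x 100 frame.
lesions = [(1, 1), (2, 1), (1, 2), (2, 2), (3, 3), (2, 3)]
observed, expected, p_value = randomization_test(lesions, width=100, height=100)
```

    A small p-value with `observed` well below `expected` points to aggregation; an observed value well above the expectation would instead suggest a more regular, over-dispersed pattern.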

  16. Neutronic analysis of the KSTAR tokamak using Beowulf cluster

    International Nuclear Information System (INIS)

    Park, Jeong Hwan; Cho, Nam Zin; Kim, Jinchoon

    2000-01-01

    High-beta, beam-heated deuterium plasmas in KSTAR (Korea Superconducting Tokamak Advanced Research) will produce a peak neutron yield of 3.5 × 10^16 per second. Two equally probable D-D fusion reactions occur in deuterium plasma, one producing 2.45 MeV neutrons, and the other producing tritons which are confined in the plasma and undergo D-T reactions producing 14.1 MeV neutrons which are about 3 percent of the 2.45 MeV neutrons. The biological dose, nuclear heating of the cryogenically cooled magnets, and neutron activation of the surrounding materials have been investigated and their results are used for designing the KSTAR tokamak and the facility. In this work, the Beowulf cluster, Galaxy, is used for intensive Monte-Carlo simulations and it is shown to be a cost effective parallel machine. (author)

  17. Prognostically distinct clinical patterns of systemic lupus erythematosus identified by cluster analysis.

    Science.gov (United States)

    To, C H; Mok, C C; Tang, S S K; Ying, S K Y; Wong, R W S; Lau, C S

    2009-12-01

    The objective of this study was to evaluate the patterns of clinical manifestations and their mortality in a large cohort of Chinese patients with systemic lupus erythematosus. The cumulative clinical manifestations of a large group of Chinese systemic lupus erythematosus patients who fulfilled at least four American College of Rheumatology criteria for systemic lupus erythematosus were studied. Patients were divided into distinct groups by using K-means cluster analysis. Clinical features, prevalence of proliferative lupus nephritis (World Health Organization class III, IV), autoantibody profile, and treatment data were compared and the standardized mortality ratios were calculated for each cluster of patients. There were 1082 patients included in the study (mean age at systemic lupus erythematosus diagnosis 30.5 years; mean systemic lupus erythematosus duration 10.3 years). Three distinct groups of patients were identified. Cluster 1 (n = 347) was characterized predominantly by mucocutaneous manifestations (malar rash, discoid rash, photosensitivity, oral ulcer) and arthritis but had the lowest prevalence of serositis, hematologic manifestations (hemolytic anemia, leukopenia, and thrombocytopenia), and proliferative lupus nephritis. Patients in cluster 2 (n = 409) had mainly renal and hematological manifestations but had the lowest prevalence of mucocutaneous manifestations. Pulmonary and gastrointestinal manifestations were significantly more frequent in cluster 2 than the other clusters. Cluster 3 patients (n = 326) had the most heterogeneous features. Besides a high prevalence of mucocutaneous manifestations, serositis, and hematologic manifestations, this cluster also had the highest prevalence of renal involvement and proliferative lupus nephritis among the three clusters. Patients in cluster 2 had a much higher standardized mortality ratio [standardized mortality ratio 7.23 (6.7-7.7), p [...] lupus erythematosus could be clustered into prognostically distinct patterns of

  18. Analysis of O(2) adsorption on binary-alloy clusters of gold: energetics and correlations.

    Science.gov (United States)

    Joshi, Ajay M; Delgass, W Nicholas; Thomson, Kendall T

    2006-11-23

    We report a B3LYP density-functional theory (DFT) analysis of O(2) adsorption on 27 Au(n)M(m) (m, n = 0-3 and m + n = 2 or 3; M = Cu, Ag, Pd, Pt, and Na) clusters. The LANL2DZ pseudopotential and corresponding double-zeta basis set were used for heavy atoms, while a 6-311+G(3df) basis set was used for Na and O. We applied basis-set superposition error (BSSE) corrections to the electronic adsorption energies at 0 K (deltaE(ads)) and also calculated adsorption thermodynamics at standard conditions (298.15 K and 1 atm), i.e., the internal energy of adsorption (deltaU(ads)) and the Gibbs free energy of adsorption (deltaG(ads)). Natural Bond Orbital (NBO) analysis showed that all the clusters donated electron density to adsorbed O(2), and we successfully predicted intuitive linear correlations between the NBO charge on adsorbed O(2), O-O bond length, and O-O stretching frequency. Although there was no clear trend in the O(2) binding energy (BE = -deltaE(ads)) on pure and alloy dimers, we found the following interesting trend for trimers: BE (MAu(2)) clusters. The clusters having strongly electropositive Na atoms (e.g., Na(3) and Na(2)Au) donated almost one full electron to adsorbed O(2), and the BE is maximum on these clusters. Although O(2) dissociation is likely in such cases, we have restricted this study to trends in the adsorption of molecular O(2) only. We also found an approximate linear correlation between the charge transfer and BE versus energy difference between the bare-cluster HOMO and O(2) LUMOs, which we speculate to be a fundamental descriptor of the reactivity of small clusters toward O(2). Part of the scatter in these correlations is attributed to the differences in the O(2) binding orientations on different clusters (geometric effect). A relatively higher bare-cluster HOMO energy eases the charge transfer to adsorbed O(2) and enhances the reactivity toward O(2). The Frontier Orbital Picture (FOP) is not always useful in predicting the most favorable O(2) binding

  19. Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study.

    Science.gov (United States)

    Meghani, Salimah H; Knafl, George J

    2017-02-10

    To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of the cluster membership. This was a 3-month prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out of pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. The analyses found 4 unique clusters. Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For a subset, the main underlying concern was type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%) and type of analgesic side effects (cluster 4, 21%), respectively. About one in four made trade-offs based on multiple concerns simultaneously including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis, to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only the belief, i.e., pain medications can mask changes in health or keep you from knowing what is going on in your body was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)]. Most patients appear to be driven by a single salient concern

  20. A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

    OpenAIRE

    Li, Guohui; Zhang, Songling; Yang, Hong

    2017-01-01

    Aiming at the irregularity of nonlinear signal and its predicting difficulty, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain the finite number of intrinsic mode functions (IMFs) and residuals. Secondly, the fuzzy c-means is used to cluster the decomposed components, and then the deep belief network (DBN) is used to predict it. Finally, the reconstructed ...

  1. Information search behaviour among new car buyers: A two-step cluster analysis

    Directory of Open Access Journals (Sweden)

    S.M. Satish

    2010-03-01

    A two-step cluster analysis of new car buyers in India was performed to identify taxonomies of search behaviour using personality and situational variables, apart from sources of information. Four distinct groups were found: broad moderate searchers, intense heavy searchers, low broad searchers, and low searchers. Dealers can identify the members of each segment by measuring the variables used for clustering, and can then design appropriate communication strategies.

  2. Applying clustering to statistical analysis of student reasoning about two-dimensional kinematics

    Directory of Open Access Journals (Sweden)

    R. Padraic Springuel

    2007-12-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and written elements. The primary goal of this paper is to describe the methodology itself; we include a brief overview of relevant results.

  3. Galaxy Cluster Pressure Profiles as Determined by Sunyaev Zel’dovich Effect Observations with MUSTANG and Bolocam. II. Joint Analysis of 14 Clusters

    Science.gov (United States)

    Romero, Charles E.; Mason, Brian S.; Sayers, Jack; Mroczkowski, Tony; Sarazin, Craig; Donahue, Megan; Baldi, Alessandro; Clarke, Tracy E.; Young, Alexander H.; Sievers, Jonathan; Dicker, Simon R.; Reese, Erik D.; Czakon, Nicole; Devlin, Mark; Korngut, Phillip M.; Golwala, Sunil

    2017-04-01

    We present pressure profiles of galaxy clusters determined from high-resolution Sunyaev-Zel’dovich (SZ) effect observations of 14 clusters, which span the redshift range of 0.25 [...] MUSTANG and Bolocam data. In this analysis, we adopt the generalized NFW parameterization of pressure profiles to produce our models. Our constraints on ensemble-average pressure profile parameters, in this study $\gamma$, $C_{500}$, and $P_0$, are consistent with those in previous studies, but for individual clusters we find discrepancies with the X-ray derived pressure profiles from the ACCEPT2 database. We investigate potential sources of these discrepancies, especially cluster geometry, electron temperature of the intracluster medium, and substructure. We find that the ensemble mean profile for all clusters in our sample is described by the parameters $[\gamma, C_{500}, P_0] = [0.3_{-0.1}^{+0.1}, 1.3_{-0.1}^{+0.1}, 8.6_{-2.4}^{+2.4}]$, cool core clusters are described by $[\gamma, C_{500}, P_0] = [0.6_{-0.1}^{+0.1}, 0.9_{-0.1}^{+0.1}, 3.6_{-1.5}^{+1.5}]$, and disturbed clusters are described by $[\gamma, C_{500}, P_0] = [0.0_{-0.0}^{+0.1}, 1.5_{-0.2}^{+0.1}, 13.8_{-1.6}^{+1.6}]$. Of the 14 clusters, 4 have clear substructure in our SZ observations, while an additional 2 clusters exhibit potential substructure.
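
    For orientation, the generalized NFW pressure profile is commonly written in the form below. The abstract names only the parameters γ, C_500, and P_0; the outer slopes α and β, the scale pressure P_500, and the radius R_500 are assumptions taken from the usual convention and are not stated in this record.

```latex
\[
\frac{P(x)}{P_{500}}
  = \frac{P_0}{(C_{500}\,x)^{\gamma}\,\bigl[1 + (C_{500}\,x)^{\alpha}\bigr]^{(\beta-\gamma)/\alpha}},
\qquad x = \frac{r}{R_{500}}
\]
```

    In this form γ sets the inner logarithmic slope, C_500 the concentration, and P_0 the normalization, which is consistent with the three parameters reported above.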

  4. Bioclim Deliverable D1: environmental change analysis

    International Nuclear Information System (INIS)

    2001-01-01

    The BIOCLIM project on modelling sequential Biosphere systems under Climate change for radioactive waste disposal is part of the EURATOM fifth European framework programme. The project was launched in October 2000 for a three-year period. The project aims at providing a scientific basis and practical methodology for assessing the possible long term impacts on the safety of radioactive waste repositories in deep formations due to climate and environmental change. The project brings together a number of representatives from both European radioactive waste management organisations which have national responsibilities for the safe disposal of radioactive waste, either as disposers or regulators, and several highly experienced climate research teams. In particular, BIOCLIM aims to address the important objective of how to represent the development of future biosphere systems by addressing how to model long-term climate change, the relevant environmental consequences of such changes, and the implementation of a sequential approach to such changes. The results from the development of this sophisticated approach will be of great benefit for improving long term radiological impact calculations and the information presented in a safety case. Simulations will be conducted to represent the time series of long-term climate in three European areas within which disposal sites may be established (i.e. Central/Southern Spain, Northeast of France and Central Britain). Two complementary strategies will provide representations of future climate predictions together with associated vegetation patterns using either an analysis of distinct climate states or a continuous climate simulation over at least one glacial-interglacial cycle and possibly for other selected periods over the next 1,000,000 years. These results will be used to derive the characteristics of possible future human environments (i.e. biosphere systems) through which radionuclides, emerging from the repository, may

  5. Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.

    Science.gov (United States)

    Gimmler, Anna; Stoeck, Thorsten

    2015-08-01

    Environmental high-throughput sequencing (envHTS) is a very powerful tool, which in protistan ecology is predominantly used for the exploration of diversity and its geographic and local patterns. We here used a pyrosequenced V4-SSU rDNA data set from a solar saltern pond as a test case to exploit such massive protistan amplicon data sets beyond this descriptive purpose. To this end, we combined a Swarm-based blastn network including 11 579 ciliate V4 amplicons to identify divergent amplicon clusters with targeted polymerase chain reaction (PCR) primer design for full-length small subunit of the ribosomal DNA retrieval and probe design for fluorescence in situ hybridization (FISH). This powerful strategy makes it possible to benefit from envHTS data sets to (i) reveal the phylogenetic position of the taxon behind divergent amplicons; (ii) improve phylogenetic resolution and evolutionary history of specific taxon groups; (iii) solidly assess an amplicon's (species') degree of similarity to its closest described relative; (iv) visualize the morphotype behind a divergent amplicon cluster; (v) rapidly FISH screen many environmental samples for geographic/habitat distribution and abundances of the respective organism and (vi) monitor the success of enrichment strategies in live samples for cultivation and isolation of the respective organisms. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  6. SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.

    Science.gov (United States)

    Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A

    2018-01-01

    Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.

  7. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2009-10-01

    Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

  8. Application of fuzzy c-means clustering in data analysis of metabolomics.

    Science.gov (United States)

    Li, Xiang; Lu, Xin; Tian, Jing; Gao, Peng; Kong, Hongwei; Xu, Guowang

    2009-06-01

    Fuzzy c-means (FCM) clustering is an unsupervised method derived from fuzzy logic that is suitable for solving multiclass and ambiguous clustering problems. In this study, FCM clustering is applied to cluster metabolomics data. FCM is performed directly on the data matrix to generate a membership matrix which represents the degree of association the samples have with each cluster. The method is parametrized with the number of clusters (C) and the fuzziness coefficient (m), which denotes the degree of fuzziness in the algorithm. Both have been optimized by combining FCM with partial least-squares (PLS) using the membership matrix as the Y matrix in the PLS model. The quality parameters R²Y and Q² of the PLS model have been used to monitor and optimize C and m. Data of metabolic profiles from three gene types of Escherichia coli were used to demonstrate the method above. Different multivariable analysis methods have been compared. Principal component analysis failed to model the metabolite data, while partial least-squares discriminant analysis yielded results with overfitting. On the basis of the optimized parameters, the FCM was able to reveal main phenotype changes and individual characters of three gene types of E. coli. Coupled with PLS, FCM provides a powerful research tool for metabolomics with improved visualization, accurate classification, and outlier estimation.
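
    The role of the two parameters named above, the number of clusters C and the fuzziness coefficient m, can be seen in a bare-bones FCM loop. The sketch below is a generic textbook version in plain Python, not the authors' implementation; the function name, the random initialization, and the stopping rule are assumptions, and the PLS-based tuning of C and m is not reproduced.

```python
import random


def fuzzy_c_means(data, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Return (centers, membership matrix) for a list of numeric rows.

    c is the number of clusters and m > 1 the fuzziness coefficient.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Start from random memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Cluster centers are means weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * data[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Memberships are proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for x in data:
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(x, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u
```

    Each row of the returned membership matrix sums to 1; values of m closer to 1 give near-crisp assignments, while larger m spreads membership more evenly across clusters.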

  9. Selected industrial and environmental applications of neutron activation analysis

    International Nuclear Information System (INIS)

    Kucera, J.

    1999-01-01

    A review of the applications of Instrumental Neutron Activation Analysis (INAA) in the industrial and environmental fields is given. Detection limits for different applications are also given. (author)

  10. 32 CFR 651.10 - Actions requiring environmental analysis.

    Science.gov (United States)

    2010-07-01

    ...) ENVIRONMENTAL QUALITY ENVIRONMENTAL ANALYSIS OF ARMY ACTIONS (AR 200-2) National Environmental Policy Act and..., renewal, or amendment), in accordance with AR 95-50. (f) Materiel development, operation and support... engineering, laser testing, and electromagnetic pulse generation. (i) Leases, easements, permits, licenses, or...

  11. 15 CFR 971.204 - Environmental and use conflict analysis.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 3 2010-01-01 2010-01-01 false Environmental and use conflict... Applications Contents § 971.204 Environmental and use conflict analysis. (a) Environmental information... parameters listed in NOAA's Technical Guidance Document pertaining to the upper and lower water column should...

  12. Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis.

    Science.gov (United States)

    Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai

    2015-08-01

    This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction. © 2014 Wiley Publishing Asia Pty Ltd.

  13. Mapping informative clusters in a hierarchical [corrected] framework of FMRI multivariate analysis.

    Directory of Open Access Journals (Sweden)

    Rui Xu

    Pattern recognition methods have become increasingly popular in fMRI data analysis, which are powerful in discriminating between multi-voxel patterns of brain activities associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical framework of multivariate approach that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped highly for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical framework of multivariate approach is suitable for both pattern classification and brain mapping in fMRI studies.

  14. Cluster analysis for the probability of DSB site induced by electron tracks

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Y. [Biological Research, Education and Instrumentation Center, Sapporo Medical University, Sapporo 060-8556 (Japan); Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Sasaki, K. [Faculty of Health Sciences, Hokkaido University of Science, Sapporo 006-8585 (Japan); Matsuya, Y. [Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Date, H., E-mail: date@hs.hokudai.ac.jp [Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)

    2015-05-01

    To clarify the influence of bio-cells exposed to ionizing radiations, the densely populated pattern of the ionization in the cell nucleus is of importance because it governs the extent of DNA damage which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort out the events into the groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that this size of clustering events has a high possibility to cause lesions in DNA within the chromatin fiber site.
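
    The DBSCAN rule described above, a cluster grows from any event that has at least a minimum number of neighbors within a given radius, can be sketched in a few lines. This is a generic plain-Python version for illustration only; the parameter names eps and min_pts and the brute-force neighbor search are assumptions, and the aggregation index (AI) used in the study is not reproduced.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point (a coordinate tuple) with a cluster index or NOISE.

    eps is the neighborhood radius and min_pts the minimum number of
    events (the point itself included) required for a core point.
    """
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nb_j)
    return labels
```

    For example, with made-up coordinates in nanometres, two tight groups of events and one isolated event would be returned as two clusters plus one noise point.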

  15. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh

    2015-06-01

    This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using wavelet transform and are used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, the maximum value of the objective function, 11.58, occurs with two centroid clusters. The minimum objective function value, 8.06, occurs with five centroid clusters. Because the objective function is lowest when the number of clusters is five, this is taken as the best clustering for the dataset.
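
    How the objective function value depends on the number of clusters can be illustrated with a small Lloyd-style K-means in plain Python. This is a generic sketch, not the authors' procedure: the function name, the random initialization, and the use of the within-cluster sum of squared distances as the objective are assumptions (the abstract describes the objective only as an average distance from the centroid).

```python
import random


def kmeans(data, k, n_iter=100, seed=0):
    """Return (centers, labels, objective) for a list of coordinate tuples.

    The objective is the within-cluster sum of squared distances.
    """
    rng = random.Random(seed)

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def assign(centers):
        return [min(range(k), key=lambda j: sq_dist(x, centers[j])) for x in data]

    centers = [tuple(p) for p in rng.sample(data, k)]
    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(centers)
    objective = sum(sq_dist(x, centers[lab]) for x, lab in zip(data, labels))
    return centers, labels, objective
```

    Running this for k = 2 to 5 on the three fatigue features would give one objective value per k. Note that this kind of objective generally decreases as k grows, so in practice the lowest value alone is often supplemented by an elbow or silhouette check.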

  16. ENTREPRENEURIAL ACTIVITY IN ROMANIA – A TIME SERIES CLUSTERING ANALYSIS AT THE NUTS3 LEVEL

    Directory of Open Access Journals (Sweden)

    Sipos-Gug Sebastian

    2013-07-01

    Entrepreneurship is an active field of research that has seen a major increase in interest and publication levels in recent years (Landström et al., 2012). Within this field there has recently been increasing interest in understanding why some regions seem to have significantly higher entrepreneurial activity than others. In line with this research, we investigate the differences in entrepreneurial activity among the Romanian counties (NUTS 3 regions). While the classical research paradigm in this field is to conduct a temporally stationary analysis, we chose a time series clustering analysis to better understand the dynamics of entrepreneurial activity between counties. Our analysis showed that if we use the total number of new privately owned companies founded each year in the last decade (2002-2012), we can distinguish between 5 clusters: one with high total entrepreneurial activity (18 counties), one with above average activity (8 counties), two clusters with average and slightly below average activity (total of 18 counties), and one cluster with low and declining activity (2 counties). If we are interested in the entrepreneurial activity rate, that is, the number of new privately owned companies founded each year adjusted by the population of the respective county, we obtain 4 clusters: one with a very high entrepreneurial rate (1 county), one with average rate (10 counties), and two clusters with below average entrepreneurial rate (total of 31 counties). In conclusion, our research shows that Romania is far from being a homogeneous geographical area with respect to entrepreneurial activity. Depending on the measure of interest, it can be divided into 5 or 4 clusters of counties, which behave differently as a function of time. Further research should focus on explaining these regional differences, on studying the high-performance clusters, and on trying to improve the low-performing ones.

  17. Provenance Study of the Terracotta Army of Qin Shihuang’s Mausoleum by Fuzzy Cluster Analysis

    OpenAIRE

    Li, Rongwu; Li, Guoxia

    2015-01-01

    20 samples and 44 samples of terracotta warriors and horses from the 1st and 3rd pits of Qin Shihuang’s Mausoleum, 20 samples of clay near Qin’s Mausoleum, and 2 samples of Yaozhou porcelain bodies are obtained to determine the contents of 32 elements in each of them by neutron activation analysis (NAA). The NAA data are further analyzed using fuzzy cluster analysis to obtain the fuzzy cluster trend diagram. The analysis shows that the origins of the raw material of the terracotta warriors an...

  18. Identifying patterns in treatment response profiles in acute bipolar mania: a cluster analysis approach

    Directory of Open Access Journals (Sweden)

    Houston John P

    2008-07-01

    Background: Patients with acute mania respond differentially to treatment and, in many cases, fail to obtain or sustain symptom remission. The objective of this exploratory analysis was to characterize response in bipolar disorder by identifying groups of patients with similar manic symptom response profiles. Methods: Patients (n = 222) were selected from a randomized, double-blind study of treatment with olanzapine or divalproex in bipolar I disorder, manic or mixed episode, with or without psychotic features. Hierarchical clustering based on Ward's distance was used to identify groups of patients based on Young-Mania Rating Scale (YMRS) total scores at each of 5 assessments over 7 weeks. Logistic regression was used to identify baseline predictors for clusters of interest. Results: Four distinct clusters of patients were identified: Cluster 1 (n = 64): patients did not maintain a response (YMRS total scores ≤ 12); Cluster 2 (n = 92): patients responded rapidly (within less than a week) and response was maintained; Cluster 3 (n = 36): patients responded rapidly but relapsed soon afterwards (YMRS ≥ 15); Cluster 4 (n = 30): patients responded slowly (≥ 2 weeks) and response was maintained. Predictive models using baseline variables found YMRS Item 10 (Appearance) and psychosis to be significant predictors for Clusters 1 and 4 vs. Clusters 2 and 3, but none of the baseline characteristics allowed discriminating between Clusters 1 vs. 4. Experiencing a mixed episode at baseline predicted membership in Clusters 2 and 3 vs. Clusters 1 and 4. Treatment with divalproex, larger number of previous manic episodes, lack of disruptive-aggressive behavior, and more prominent depressive symptoms at baseline were predictors for Cluster 3 vs. 2. Conclusion: Distinct treatment response profiles can be predicted by clinical features at baseline. The presence of these features as potential risk factors for relapse in patients who have responded to treatment
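
    Ward-based hierarchical clustering of score trajectories, as mentioned in the Methods above, can be sketched as a simple agglomerative loop. This is a minimal plain-Python illustration and not the study's analysis: the function name is hypothetical, each row is assumed to hold one patient's total scores at successive assessments, and the merge cost used is the standard Ward increase in within-cluster sum of squares.

```python
def ward_clusters(rows, n_clusters):
    """Agglomerate rows (equal-length numeric sequences) with Ward's criterion.

    Returns a list of clusters, each a list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def centroid(members):
        return [sum(rows[i][d] for i in members) / len(members)
                for d in range(len(rows[0]))]

    def merge_cost(a, b):
        # Ward: increase in within-cluster sum of squares if a and b merge.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose merge is cheapest.
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

    With a handful of made-up trajectories (sustained response, no response, response followed by relapse), asking for three clusters groups the rows by trajectory shape rather than by baseline score alone.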

  19. A Lexical Analysis of Environmental Sound Categories

    Science.gov (United States)

    Houix, Olivier; Lemaitre, Guillaume; Misdariis, Nicolas; Susini, Patrick; Urdapilleta, Isabel

    2012-01-01

    In this article we report on listener categorization of meaningful environmental sounds. A starting point for this study was the phenomenological taxonomy proposed by Gaver (1993b). In the first experimental study, 15 participants classified 60 environmental sounds and indicated the properties shared by the sounds in each class. In a second…

  20. Analysis of the defect clusters in congruent lithium tantalate

    Science.gov (United States)

    Vyalikh, Anastasia; Zschornak, Matthias; Köhler, Thomas; Nentwich, Melanie; Weigel, Tina; Hanzig, Juliane; Zaripov, Ruslan; Vavilova, Evgenia; Gemming, Sibylle; Brendler, Erica; Meyer, Dirk C.

    2018-01-01

    A wide range of technological applications of lithium tantalate (LT) is closely related to the defect chemistry. In literature, several intrinsic defect models have been proposed. Here, using a combinational approach based on DFT and solid-state NMR, we demonstrate that distribution of electric field gradients (EFGs) can be employed as a fingerprint of a specific defect configuration. Analyzing the distribution of 7Li EFGs, the FT-IR and electron spin resonance (ESR) spectra, and the 7Li spin-lattice relaxation behavior, we have found that the congruent LT samples provided by two manufacturers show rather different defect concentrations and distributions although both were grown by the Czochralski method. After thermal treatment hydrogen out-diffusion and homogeneous distribution of other defects have been observed by ESR, NMR, and FT-IR. The defect structure in one of two congruent LT crystals after annealing has been identified and proved by defect formation energy considerations, whereas the more complex defect configuration, including the presence of extrinsic defects, has been suggested for the other LT sample. The approach of searching the EFG fingerprints from DFT calculations in NMR spectra can be applied for identifying the defect clusters in other complex oxides.

  1. Differentiating Procrastinators from Each Other: A Cluster Analysis.

    Science.gov (United States)

    Rozental, Alexander; Forsell, Erik; Svensson, Andreas; Forsström, David; Andersson, Gerhard; Carlbring, Per

    2015-01-01

    Procrastination refers to the tendency to postpone the initiation and completion of a given course of action. Approximately one-fifth of the adult population and half of the student population perceive themselves as being severe and chronic procrastinators. Albeit not a psychiatric diagnosis, procrastination has been shown to be associated with increased stress and anxiety, exacerbation of illness, and poorer performance in school and work. However, despite being severely debilitating, little is known about the population of procrastinators in terms of possible subgroups, and previous research has mainly investigated procrastination among university students. The current study examined data from a screening process recruiting participants to a randomized controlled trial of Internet-based cognitive behavior therapy for procrastination (Rozental et al., in press). In total, 710 treatment-seeking individuals completed self-report measures of procrastination, depression, anxiety, and quality of life. The results suggest that there might exist five separate subgroups, or clusters, of procrastinators: "Mild procrastinators" (24.93%), "Average procrastinators" (27.89%), "Well-adjusted procrastinators" (13.94%), "Severe procrastinators" (21.69%), and "Primarily depressed" (11.55%). Hence, there seem to be marked differences among procrastinators in terms of levels of severity, as well as a possible subgroup for which procrastinatory problems are primarily related to depression. Tailoring the treatment interventions to the specific procrastination profile of the individual could thus become important, as well as screening for comorbid psychiatric diagnoses in order to target difficulties associated with, for instance, depression.

  2. Event-by-Event Cluster Analysis of Final States from Heavy Ion Collisions

    International Nuclear Information System (INIS)

    Fialkowski, K.; Wit, R.

    1999-01-01

    We present an event-by-event analysis of the cluster structure of final multihadron states resulting from heavy ion collisions. A comparison of experimental data with the states obtained from Monte Carlo generators is shown. The analysis of the first available experimental events suggests that the method is suitable for selecting some different types of events. (author)

  3. Identification of Counterfeit Alcoholic Beverages Using Cluster Analysis in Principal-Component Space

    Science.gov (United States)

    Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.

    2017-07-01

    A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.

  4. Cluster Analysis of Flow Cytometric List Mode Data on a Personal Computer

    NARCIS (Netherlands)

    Bakker Schut, Tom C.; Bakker schut, T.C.; de Grooth, B.G.; Greve, Jan

    1993-01-01

    A cluster analysis algorithm, dedicated to analysis of flow cytometric data is described. The algorithm is written in Pascal and implemented on an MS-DOS personal computer. It uses k-means, initialized with a large number of seed points, followed by a modified nearest neighbor technique to reduce

  5. On the blind use of statistical tools in the analysis of globular cluster stars

    Science.gov (United States)

    D'Antona, Francesca; Caloi, Vittoria; Tailo, Marco

    2018-04-01

    As with most data analysis methods, the Bayesian method must be handled with care. We show that its application to determine stellar evolution parameters within globular clusters can lead to paradoxical results if used without the necessary precautions. This is a cautionary tale on the use of statistical tools for big data analysis.

  6. Assessment of life cycle environmental benefits of an industrial symbiosis cluster in China.

    Science.gov (United States)

    Yu, Fei; Han, Feng; Cui, Zhaojie

    2015-04-01

    Reusing industrial waste may have impressive potential environmental benefits, especially in terms of the total life cycle, and life cycle assessment (LCA) has been proved to be an effective method to evaluate industrial symbiosis (IS). Circular economy and IS have been developed for decades and have been successful in China. However, very few studies have used LCA to assess the environmental benefits of IS in China. In the current article, LCA was used to evaluate the environmental benefits and costs of IS, compared with a no-IS scenario, for four environmental impact categories. The results showed that the 11 symbiosis performances avoided environmental burdens in all four categories, namely, 41.6 thousand TJ of primary energy, 4.47 million t CO2e of greenhouse gases, 19.7 thousand t SO2e of acidification, and 81.1 t PO4(3-)e of eutrophication. Among these IS performances, the comprehensive utilization of red mud produced the most visible benefit. The results also show that energy conservation was the distinctive feature of IS in China.

  7. Fault detection of flywheel system based on clustering and principal component analysis

    Directory of Open Access Journals (Sweden)

    Wang Rixin

    2015-12-01

    Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.

  8. Techniques and instruments used for real-time analysis of atmospheric nanoscale molecular clusters: A review

    Directory of Open Access Journals (Sweden)

    Xue Li

    2015-11-01

    The extremely high concentrations of PM2.5 (particulate matter with an aerodynamic diameter ≤ 2.5 μm) during severe and persistent haze events in China have been closely related to the formation of secondary aerosols (SA). New particle formation (NPF) is the critical initial step of SA formation. New particles are commonly formed from gas-phase precursors (e.g., SO2, volatile organic compounds) via nucleation and initial growth, in which molecular clusters with a mobility diameter smaller than 3 nm (hereafter referred to as nanoscale molecular clusters) are involved throughout the whole process. Recently, significant breakthroughs have been made in NPF studies, which are mostly attributed to technical developments in the real-time analysis of size-resolved number concentration and chemical composition of nanoscale molecular clusters. Regarding the detection of size-resolved number concentrations of nanoscale molecular clusters, both methods and instruments are well established; practical applications in laboratory-scale experiments and field measurements have also been successfully demonstrated. In contrast, real-time analysis of the chemical composition of nanoscale molecular clusters still faces great challenges caused by the complex organic compositions of the clusters, and improvement of present analytical strategies is urgently required. A better understanding of NPF will benefit not only atmospheric modeling and climate predictions but also the source control of SA.

  9. Approximate fuzzy C-means (AFCM) cluster analysis of medical magnetic resonance image (MRI) data

    International Nuclear Information System (INIS)

    DelaPaz, R.L.; Chang, P.J.; Bernstein, R.; Dave, J.V.

    1987-01-01

    The authors describe the application of an approximate fuzzy C-means (AFCM) clustering algorithm as a data dimension reduction approach to medical magnetic resonance images (MRI). Image data consisted of one T1-weighted, two T2-weighted, and one T2*-weighted (magnetic susceptibility) image for each cranial study and a matrix of 10 images generated from 10 combinations of TE and TR for each body lymphoma study. All images were obtained with a 1.5 Tesla imaging system (GE Signa). Analyses were performed on over 100 MR image sets with a variety of pathologies. The cluster analysis was operated in an unsupervised mode and computational overhead was minimized by utilizing a table look-up approach without adversely affecting accuracy. Image data were first segmented into 2 coarse clusters, each of which was then subdivided into 16 fine clusters. The final tissue classifications were presented as color-coded anatomically-mapped images and as two and three dimensional displays of cluster center data in selected feature space (minimum spanning tree). Fuzzy cluster analysis appears to be a clinically useful dimension reduction technique which results in improved diagnostic specificity of medical magnetic resonance images

  10. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    Science.gov (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
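
    The combination of linkage rules and distance measures described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `minkowski` and `agglomerate` and the rainfall values are hypothetical, only single, complete and average linkage are covered (Ward's linkage is omitted), and the naive search is suitable for small station counts only.

```python
from itertools import combinations

def minkowski(a, b, p=2.0):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def agglomerate(points, n_clusters, linkage="average", p=2.0):
    """Naive agglomerative clustering; returns a list of index lists."""
    reducers = {
        "single": min,                            # closest pair of members
        "complete": max,                          # farthest pair of members
        "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
    }
    reduce_fn = reducers[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ds = [minkowski(points[i], points[j], p)
                  for i in clusters[a] for j in clusters[b]]
            d = reduce_fn(ds)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]                           # b > a, so index a is stable
    return [sorted(c) for c in clusters]

# Hypothetical seasonal rainfall totals (mm) for five stations; not real data.
rain = [(310.0, 1500.0), (320.0, 1550.0), (900.0, 2900.0),
        (880.0, 3000.0), (1500.0, 4200.0)]
groups = agglomerate(rain, n_clusters=3, linkage="single", p=1.0)
```

    Running the same call with other `linkage` and `p` values and comparing the resulting groups mirrors, on a toy scale, the comparison of dendrograms across methods that the abstract reports.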

  11. Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

    Directory of Open Access Journals (Sweden)

    Ma Jinhui

    2013-01-01

    Background: The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE)) and cluster-specific (i.e. random-effects logistic regression (RELR)) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses. Methods: In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI) and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability. Results: GEE performs well on all four measures, provided the downward bias of the standard error (when the number of clusters per arm is small) is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with variance inflation factor (VIF 50. RELR performs well only when a small amount of data was missing, and complete case analysis was applied. Conclusion: GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.
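
    The variance inflation factor mentioned in this abstract is not defined there. Assuming it is the usual design effect for equal-sized clusters, VIF = 1 + (m - 1) * ICC, with m the number of subjects per cluster and ICC the intra-cluster correlation coefficient, a short worked sketch follows. The function names and input values are hypothetical and are not taken from the study.

```python
def variance_inflation_factor(cluster_size, icc):
    """Design effect for equal cluster sizes: 1 + (m - 1) * ICC (assumed form)."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent subjects carrying the same information."""
    total = n_clusters * cluster_size
    return total / variance_inflation_factor(cluster_size, icc)

# Hypothetical trial arm: 20 clusters of 50 subjects, ICC = 0.05.
vif = variance_inflation_factor(cluster_size=50, icc=0.05)   # 1 + 49 * 0.05 = 3.45
ess = effective_sample_size(n_clusters=20, cluster_size=50, icc=0.05)  # 1000 / 3.45
```

    Under this assumed definition, larger clusters or a larger ICC both raise the VIF and shrink the effective sample size.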

  12. Dynamic Characteristics Analysis and Stabilization of PV-Based Multiple Microgrid Clusters

    DEFF Research Database (Denmark)

    Zhao, Zhuoli; Yang, Ping; Wang, Yuewu

    2018-01-01

    As the penetration of PV generation increases, there is a growing operational demand on PV systems to participate in microgrid frequency regulation. It is expected that future distribution systems will consist of multiple microgrid clusters. However, interconnecting PV microgrids may lead to system interactions and instability. To date, no research work has been done to analyze the dynamic behavior and enhance the stability of microgrid clusters considering the dynamics of the PV primary sources and dc links. To fill this gap, this paper presents comprehensive modeling, analysis, and stabilization of PV-based multiple microgrid clusters. A detailed small-signal model for PV-based microgrid clusters considering local adaptive dynamic droop control mechanism of the voltage-source PV system is developed. The complete dynamic model is then used to assess and compare the dynamic characteristics of the single......

  13. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Sen Zhang

    2015-01-01

    One heuristic evolutionary algorithm recently proposed is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on the Powell local optimization method, which we call PGWO. The PGWO algorithm significantly improves on the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique. Hence, the PGWO could be applied in solving clustering problems. In this study, first the PGWO algorithm is tested on seven benchmark functions. Second, the PGWO algorithm is used for data clustering on nine data sets. Compared to other state-of-the-art evolutionary algorithms, the results of benchmark and data clustering demonstrate the superior performance of the PGWO algorithm.

  14. See Change: Cosmology Analysis Update for the Supernova Cosmology Project High-z Cluster Supernova Survey

    Science.gov (United States)

    Hayden, Brian; Aldering, Gregory; Amanullah, Rahman; Barbary, Kyle; Bohringer, Hans; Boone, Kyle Robert; Brodwin, Mark; Cunha, Carlos; Currie, Miles; Deustua, Susana; Dixon, Samantha; Eisenhardt, Peter; Fassbender, Rene; Fruchter, Andrew; Gladders, Michael; Gonzalez, Anthony; Goobar, Ariel; Hildebrandt, Hendrik; Hilton, Matt; Hoekstra, Henk; Hook, Isobel; Huang, Xiaosheng; Huterer, Dragan; Jee, Myungkook James; Kim, Alex; Kowalski, Marek; Lidman, Chris; Linder, Eric; Luther, Kyle; Meyers, Joshua; Muzzin, Adam; Nordin, Jakob; Pain, Reynald; Perlmutter, Saul; Richard, Johan; Rosati, Piero; Rozo, Eduardo; Rubin, David; Ruiz-Lapuente, Pilar; Rykoff, Eli; Santos, Joana; Myers Saunders, Clare; Sofiatti, Caroline; Spadafora, Anthony L.; Stanford, Spencer; Stern, Daniel; Suzuki, Nao; Webb, Tracy; Wechsler, Risa; Williams, Steven; Willis, Jon; Wilson, Gillian; Yen, Mike

    2018-01-01

    The Supernova Cosmology Project has finished executing a large (174 orbits, cycles 22-23) Hubble Space Telescope program, which has measured ~30 type Ia Supernovae above z~1 in the highest-redshift, most massive galaxy clusters known to date. We present the status of the ongoing blinded cosmology analysis, demonstrating substantial improvement to the uncertainty on the Dark Energy density above z~1. Our extensive HST and ground-based campaign has already produced unique results; we have confirmed several of the highest redshift cluster members known to date, confirmed the redshift of one of the most massive galaxy clusters expected across the entire sky, and characterized one of the most extreme starburst environments yet known in a z~1.7 cluster. We have also discovered a lensed SN Ia at z=2.22 magnified by a factor of ~2.8, which is the highest spectroscopic redshift SN Ia currently known.

  15. A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.

    Science.gov (United States)

    Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia

    2017-10-15

    In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age  = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.

  16. Analysis of space payload operation modes based on divide-and-conquer clustering

    Directory of Open Access Journals (Sweden)

    Si Feng

    2016-01-01

    With the development of space electronic technology, space payload operation modes are becoming more and more complex, and manual interpretation is error-prone because of the heavy workload. A space payload's operation modes are generally reflected in its telemetry data. Based on an analysis of the characteristics of payload telemetry data, an automatic method for analysing payload operation modes using divide-and-conquer clustering is proposed. The clustering method combines division and incremental clustering. The principle of the method is introduced and the method is validated using actual payload telemetry data. An improved method is then proposed to address the problems encountered. Experimental results show that the divide-and-conquer clustering method is computationally simple and classifies accurately when applied to payload operation modes. The method can also be extended to other areas of payload data processing.

  17. Environmental effects on stellar populations of dwarf galaxies and star clusters

    Science.gov (United States)

    Pasetto, Stefano; Cropper, Mark; Fujita, Yutaka; Chiosi, Cesare; Grebel, Eva K.

    2015-08-01

    We investigate the competitive role of the different dissipative phenomena acting on the onset of star formation history of a gravitationally bound system in an external environment. Ram pressure, Kelvin-Helmholtz instability, Rayleigh-Taylor instability, and tidal forces are accounted for separately in an analytical framework and compared in their role in influencing the star forming regions. We present an analytical criterion to elucidate the dependence of star formation in a spherical stellar system on its surrounding environment, useful in observational applications as well as theoretical interpretations of numerical results. We consider the different signatures of these phenomena in synthetically realized colour-magnitude diagrams (CMDs) of the orbiting system, thus investigating the detectability limits of these different effects for future observational projects and their relevance. The theoretical framework developed has direct applications to the cases of dwarf galaxies in galaxy clusters and dwarf galaxies orbiting our Milky Way system, as well as any primordial gas-rich cluster of stars orbiting within its host galaxy.

  18. Environmental effects on star formation in dwarf galaxies and star clusters

    Science.gov (United States)

    Pasetto, Stefano; Cropper, Mark; Fujita, Yutaka; Chiosi, Cesare; Grebel, Eva K.

    2015-08-01

    We investigate the competitive role of the different dissipative phenomena acting on the onset of star formation history of a gravitationally bound system in an external environment. Ram pressure, Kelvin-Helmholtz instability, Rayleigh-Taylor instability, and tidal forces are accounted for separately in an analytical framework and compared in their role in influencing the star forming regions. The two-fluids instability at the interface between a stellar system and its surrounding hotter and less dense environment is related to the star formation processes through a set of differential equations. We present an analytical criterion to elucidate the dependence of star formation in a spherical stellar system on its surrounding environment, useful in theoretical interpretations of numerical results as well as observational applications. We show how spherical coordinates naturally clarify the interpretation of the two-fluids instability in a geometry that directly applies to the astrophysical case. Finally, we consider the different signatures of these phenomena in synthetically realized colour-magnitude diagrams of the orbiting system, thus investigating the detectability limits of these different effects for future observational projects and their relevance. The theoretical framework developed has direct applications to the cases of dwarf galaxies in galaxy clusters and dwarf galaxies orbiting our Milky Way system, as well as any primordial gas-rich cluster of stars orbiting within its host galaxy.

  19. Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis

    Science.gov (United States)

    Yen, Chi-Fu; Sivasankar, Sanjeevi

    2018-03-01

    Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM-tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, particularly when the number of unbinding events is limited and the events are not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
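
    The abstract does not name the "established models" used in the fit. As an illustration only, the sketch below assumes the Bell-Evans relation, F = (kBT/xβ) ln(r xβ / (koff kBT)), which is linear in ln(r) and is strictly a statement about the most probable rupture force. It shows how per-group forces and loading rates could be turned into koff and xβ once the events have been grouped. All names and numbers are hypothetical; forces are in pN, distances in nm, loading rates in pN/s.

```python
import math

KBT = 4.114  # thermal energy in pN*nm near 298 K (assumed temperature)

def bell_evans_force(loading_rate, k_off, x_beta, kbt=KBT):
    """Rupture force (pN) predicted by the Bell-Evans relation."""
    return (kbt / x_beta) * math.log(loading_rate * x_beta / (k_off * kbt))

def fit_bell_evans(loading_rates, forces, kbt=KBT):
    """Least-squares line of force vs ln(loading rate) -> (k_off, x_beta)."""
    xs = [math.log(r) for r in loading_rates]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(forces) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, forces))
    slope = sxy / sxx                 # equals kBT / x_beta
    intercept = my - slope * mx       # equals slope * ln(x_beta / (k_off * kBT))
    x_beta = kbt / slope
    k_off = math.exp(-intercept / slope) / slope
    return k_off, x_beta

# Noise-free synthetic groups generated from known parameters (hypothetical).
rates = [1e2, 1e3, 1e4, 1e5]
forces = [bell_evans_force(r, k_off=0.5, x_beta=0.4) for r in rates]
k_off_hat, x_beta_hat = fit_bell_evans(rates, forces)
```

    With noisy data, errors in the group forces and loading rates propagate into the slope and intercept, which is why the quality of the grouping step matters for the fitted parameters.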

  20. NMR metabolic analysis of samples using fuzzy K-means clustering.

    Science.gov (United States)

    Cuperlović-Culf, Miroslava; Belacel, Nabil; Culf, Adrian S; Chute, Ian C; Ouellette, Rodney J; Burton, Ian W; Karakach, Tobias K; Walter, John A

    2009-12-01

    The global analysis of metabolites can be used to define the phenotypes of cells, tissues or organisms. Classifying groups of samples based on their metabolic profile is one of the main topics of metabolomics research. Crisp clustering methods assign each feature to one cluster, thereby omitting information about the multiplicity of sample subtypes. Here, we present the application of the fuzzy K-means clustering method to the classification of samples based on metabolomics 1D ¹H NMR fingerprints. The sample classification was performed on NMR spectra of cancer cell line extracts and of urine samples of type 2 diabetes patients and animal models. The cell line dataset included NMR spectra of lipophilic cell extracts for two normal and three cancer cell lines, with the cancer cell lines including two invasive cancers and one non-invasive cancer. The second dataset included previously published NMR spectra of urine samples of human type 2 diabetics and healthy controls, mouse wild type and diabetes model, and rat obese and lean phenotypes. The fuzzy K-means clustering method allowed more accurate sample classification in both datasets relative to the other tested methods, including principal component analysis (PCA), hierarchical clustering (HCL) and K-means clustering. In the cell line samples, fuzzy clustering provided a clear separation of individual cell lines, groups of cancer and normal cell lines, as well as non-invasive and invasive tumour cell lines. In the diabetes dataset, clear separation of healthy controls and diabetics in all three models was possible only by using the fuzzy clustering method.
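
    The contrast between crisp and fuzzy assignment drawn in this abstract can be sketched with the standard fuzzy c-means update rules, in which every sample receives a membership in every cluster. This is an illustration under assumptions, not the authors' implementation: the function name, the fuzzifier m = 2, the starting centers and the toy points are all hypothetical.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, n_iter=100, tol=1e-9):
    """Fuzzy c-means; returns (centers, memberships[point][cluster])."""
    centers = [list(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point sits exactly on a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[dim] for w, p in zip(weights, points)) / total
                for dim in range(len(points[0]))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:   # centers stopped moving
            break
    return centers, memberships

# Two tight toy groups plus one in-between sample (hypothetical data).
samples = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9), (5, 5)]
centers, u = fuzzy_c_means(samples, centers=[(0, 0), (10, 10)])
```

    In this toy case the last sample ends up with intermediate membership in both clusters, which is the kind of information a crisp method would discard.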

  1. Cluster analysis of near-infrared reflectance spectra of asteroid Itokawa

    Science.gov (United States)

    Inasawa, Tomoki; Kitazato, Kohei; Hirata, Naru; Demura, Hirohide

    2017-10-01

    The data from the analysis of samples returned by the Hayabusa spacecraft have provided conclusive evidence regarding the mineral composition and space weathering of the near-Earth S-type asteroid Itokawa. To apply this information to the Hayabusa remote sensing data and so reveal the formation history of Itokawa, we made a more precise near-infrared spectral map of Itokawa than the previous ones from the Hayabusa NIRS data and performed a cluster analysis of it. The NIRS instrument acquired more than 80,000 spatially resolved reflectance spectra, covering 0.75 to 2.20 microns, from the surface of Itokawa. We used PCA and k-means clustering for the cluster analysis and found that at least three different types of surface areas appear to exist on Itokawa.
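
    The k-means step named in this abstract can be sketched as follows. The PCA step is omitted: the input is assumed to already be low-dimensional principal-component scores. The function name, the farthest-point seeding and the toy scores are assumptions made for illustration; the abstract does not describe how the clustering was initialised.

```python
import math

def k_means(points, k, n_iter=100):
    """Lloyd's algorithm with farthest-point seeding; returns (centers, labels)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical 2-D principal-component scores of spectra; not real NIRS data.
scores = [(-2.0, 0.1), (-2.2, -0.1), (-1.9, 0.0),
          (0.1, 1.9), (0.0, 2.1), (-0.1, 2.0),
          (2.0, -0.1), (2.1, 0.1), (1.9, 0.0)]
centers, labels = k_means(scores, k=3)
```

    Deterministic seeding is used here only so that the toy example gives the same grouping on every run; random restarts are a common alternative.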

  2. Preliminary Cluster Analysis For Several Representatives Of Genus Kerivoula (Chiroptera: Vespertilionidae) in Borneo

    Science.gov (United States)

    Hasan, Noor Haliza; Abdullah, M. T.

    2008-01-01

    The aim of the study is to use cluster analysis on morphometric parameters within the genus Kerivoula to produce a dendrogram and to determine the suitability of this method to describe the relationship among species within this genus. A total of 15 adult male individuals from genus Kerivoula taken from sampling trips around Borneo and specimens kept at the zoological museum of Universiti Malaysia Sarawak were examined. A total of 27 characters using dental, skull and external body measurements were recorded. Clustering analysis illustrated the grouping and morphometric relationships between the species of this genus. It has clearly separated each species from each other despite the overlapping of measurements of some species within the genus. Cluster analysis provides an alternative approach to make a preliminary identification of a species.

  3. Use of multiple cluster analysis methods to explore the validity of a community outcomes concept map.

    Science.gov (United States)

    Orsi, Rebecca

    2017-02-01

    Concept mapping is now a commonly-used technique for articulating and evaluating programmatic outcomes. However, research regarding validity of knowledge and outcomes produced with concept mapping is sparse. The current study describes quantitative validity analyses using a concept mapping dataset. We sought to increase the validity of concept mapping evaluation results by running multiple cluster analysis methods and then using several metrics to choose from among solutions. We present four different clustering methods based on analyses using the R statistical software package: partitioning around medoids (PAM), fuzzy analysis (FANNY), agglomerative nesting (AGNES) and divisive analysis (DIANA). We then used the Dunn and Davies-Bouldin indices to assist in choosing a valid cluster solution for a concept mapping outcomes evaluation. We conclude that the validity of the outcomes map is high, based on the analyses described. Finally, we discuss areas for further concept mapping methods research. Copyright © 2016 Elsevier Ltd. All rights reserved.
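    One of the two validity indices mentioned in this abstract, the Dunn index, can be illustrated with a short sketch. The study itself used R; the Python function below (`dunn_index`, a name chosen here) is only an assumed, simplified variant that uses the smallest distance between points in different clusters and the largest within-cluster diameter.

```python
import numpy as np

def dunn_index(X, labels):
    """Smallest between-cluster distance divided by largest cluster diameter.

    Higher values indicate compact, well-separated clusters. The result is
    undefined (division by zero) if every cluster has a single member.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    max_diameter = d[same].max()      # largest within-cluster distance
    min_separation = d[~same].min()   # smallest between-cluster distance
    return min_separation / max_diameter

points = [(0, 0), (0, 1), (5, 0), (5, 1)]
good = dunn_index(points, [0, 0, 1, 1])  # groups the nearby pairs together
bad = dunn_index(points, [0, 1, 0, 1])   # splits the nearby pairs apart
```

    Comparing the index across candidate solutions, as the study does across PAM, FANNY, AGNES and DIANA, favours the partition with the larger value.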

  4. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    Science.gov (United States)

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.

  5. Environmental analysis of biomass-ethanol facilities

    Energy Technology Data Exchange (ETDEWEB)

    Corbus, D.; Putsche, V.

    1995-12-01

    This report analyzes the environmental regulatory requirements for several process configurations of a biomass-to-ethanol facility. It also evaluates the impact of two feedstocks (municipal solid waste [MSW] and agricultural residues) and three facility sizes (1000, 2000, and 3000 dry tons per day [dtpd]) on the environmental requirements. The basic biomass ethanol process has five major steps: (1) Milling, (2) Pretreatment, (3) Cofermentation, (4) Enzyme production, (5) Product recovery. Each step could have environmental impacts and thus be subject to regulation. Facilities that process 2000 dtpd of MSW or agricultural residues would produce 69 and 79 million gallons of ethanol, respectively.

  6. Cluster analysis for predicting basketball player talent using Self Organizing Maps (SOM) artificial neural networks

    Directory of Open Access Journals (Sweden)

    Gregorius Satia Budhi

    2008-01-01

    The world of basketball has grown rapidly over time, as shown by the many competitions and games held all over the world. As a result, there are many basketball players with different playing characteristics, and coaches and scouts are expected to find great players in order to build a solid team. This application helps a coach or scout with analysis and decision making. It uses the Self Organizing Maps (SOM) algorithm for cluster analysis. Real NBA player data are used for the competitive learning (training) process, and real player data from Indonesian or Petra Christian University basketball players are used for testing. The NBA player data are first cleaned and then transformed into a form that can be processed by the SOM algorithm, after which the data are clustered with SOM. The resulting clusters are displayed in a form that is easy to view and analyze, and the result can be saved to a text file. From the output of the application, namely the clusters of NBA players, the user can see the statistics of each cluster. With these cluster statistics, a coach or scout can predict the statistics and position of a test player who falls in the same cluster. This information can support the coach or scout in making a decision.

  7. Cluster analysis and relative relocation of mining-induced seismicity using HAMNET data

    Science.gov (United States)

    Wehling-Benatelli, S.; Becker, D.; Bischoff, M.; Friederich, W.; Meier, T.

    2012-04-01

    Longwall mining activity in the Ruhr coal mining district leads to mining-induced seismicity. For detailed studies, the seismicity of the single longwall panel S 109 beneath Hamm-Herringen in the eastern Ruhr area was monitored between June 2006 and July 2007. More than 7000 seismic events with magnitudes -1.7 ≤ ML ≤ 2.0 were located in this period. 70% of the events occur in the vicinity of the moving longwall face. Moreover, the seismicity pattern shows spatial clustering of events at distances up to 500 m from the panel, which is related to remnant pillars of old workings and tectonic features. Two sources with a common location and rock failure mechanism are expected to show identical waveforms. Hence, similar waveforms suggest similarity of source properties. Waveform similarity can be quantified by cross-correlation. Similarity matrices have been established and form the basis of the cluster analysis presented here. We compare two approaches for cluster definition: a single-linkage approach and extracting clusters by visual inspection of the sorted similarity matrices. In visual inspection, clusters are found as areas of high inter-event similarity in the depicted matrix. In contrast, the single-linkage approach assigns an event to a cluster if its similarity to at least one other member exceeds the threshold v_sl = 0.9. This method is more restrictive and, in general, leads to clusters with fewer members than visual inspection. Both methods yield clusters that show the same properties. The largest clusters are formed by low-magnitude events (around ML ≈ -0.6) directly at the longwall face at the mining level. Other clusters include events with magnitudes as large as ML,max = 1.8. Their locations tend to lie above or below the mining level in load-bearing sandstone layers. Events accompanying the mining show face-parallel, near-vertical fault planes, whereas more distant clusters have typical solutions of remnant pillar failure with a medium dip angle. Relative relocation of the events
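    The single-linkage rule described in this abstract (an event joins a cluster if its similarity to at least one member exceeds the threshold) amounts to finding connected components of a thresholded similarity graph. The sketch below assumes a precomputed, symmetric cross-correlation matrix; the function name and the toy matrix are illustrative and not taken from the study.

```python
def single_linkage_clusters(similarity, threshold):
    """Group items whose similarity to at least one member exceeds threshold."""
    n = len(similarity)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] > threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy waveform-similarity matrix for four events (values are made up).
sim = [
    [1.00, 0.95, 0.50, 0.10],
    [0.95, 1.00, 0.92, 0.10],
    [0.50, 0.92, 1.00, 0.10],
    [0.10, 0.10, 0.10, 1.00],
]
clusters = single_linkage_clusters(sim, threshold=0.9)
```

    Events 0 and 2 end up in the same cluster even though their direct similarity is only 0.50, because both are linked to event 1. This chaining behaviour is characteristic of single linkage.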

  8. Using Factor Analysis to Generate Clusters of Agile Practices

    OpenAIRE

    Abbas, Noura; Gravell, Andy; Wills, Gary

    2010-01-01

    In this paper, factor analysis is applied on a set of data that was collected to study the effectiveness of 58 different agile practices. The analysis extracted 15 factors; each was associated with a list of practices. These factors with the associated practices can be used as a guide for agile process improvement. Correlations between the extracted factors were calculated, and the significant correlation findings suggested that people who applied iterative and incremental development and qua...

  9. Environmental Impact Analysis Process Chemical Release Experiment

    National Research Council Canada - National Science Library

    1999-01-01

    The U.S. Air Force proposes to conduct an experiment to identify the potential environmental consequences of an inadvertent release of hydrazine rocket propellant in space, during orbital or suborbital operations...

  10. Applied research of environmental monitoring using instrumental neutron activation analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chung, Young Sam; Moon, Jong Hwa; Chung, Young Ju

    1997-08-01

    This technical report is written as a guide book for applied research on environmental monitoring using instrumental neutron activation analysis. Its contents are as follows: sampling and sample preparation of airborne particulate matter, analytical methodologies, data evaluation and interpretation, and basic statistical methods of data analysis applied in environmental pollution studies. (author). 23 refs., 7 tabs., 9 figs.

  11. Sustainable Process Design under uncertainty analysis: targeting environmental indicators

    DEFF Research Database (Denmark)

    L. Gargalo, Carina; Gani, Rafiqul

    2015-01-01

    This study focuses on uncertainty analysis of environmental indicators used to support sustainable process design efforts. To this end, the Life Cycle Assessment methodology is extended with a comprehensive uncertainty analysis to propagate the uncertainties in input LCA data to the environmental...

  12. Regional environmental analysis and management: New techniques for current problems

    Science.gov (United States)

    Honea, R. B.; Paludan, C. T. N.

    1974-01-01

    Advances in data acquisition and processing procedures for regional environmental analysis are discussed. Automated and semi-automated techniques employing Earth Resources Technology Satellite data and conventional data sources are presented. Experiences are summarized. The ERTS computer compatible tapes provide a very complete and flexible record of earth resources data and represent a viable medium to enhance regional environmental analysis research.

  13. California geothermal resource development environmental implications for ERCDC Environmental Analysis Office. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, J.A.

    1977-02-01

    The results of an analysis of the environmental implications for ERCDC Environmental Analysis Office (EAO) in relation to the development of California's geothermal resources are reported. While focusing primarily on environmental implications, particularly the natural, social, and economic elements, the report includes some ERCDC-wide policy and program considerations. The primary thrusts of the work have been in the development of an understanding of the interagency and intergovernmental environmental data and data-management roles and responsibilities and in the formulation of recommendations related thereto. Five appendices are included, one of which is a tax credit agreement between a power company and Skagit County, Washington. (JGB)

  14. Selected environmental applications of neutron activation analysis

    International Nuclear Information System (INIS)

    Kucera, J.

    2001-01-01

    NAA is very useful for the determination of trace and minor elements in many environmental applications. While instrumental NAA (INAA) has a number of valid applications in this field, radiochemical NAA (RNAA) prior to, or post irradiation provides some significant advantages. One of the major focus points for environmental applications of NAA is to assess the magnitude of various pollutants. This paper discusses doing this via two methods, namely air monitoring and biological monitoring. (author)

  15. Study on Adaptive Parameter Determination of Cluster Analysis in Urban Management Cases

    Science.gov (United States)

    Fu, J. Y.; Jing, C. F.; Du, M. Y.; Fu, Y. L.; Dai, P. P.

    2017-09-01

    Fine-grained management of cities is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used to evaluate the deployment of urban public facilities, support policy decisions, and provide technical support for fine-grained city management. To address the problem that the density-based DBSCAN algorithm cannot determine its parameters adaptively, this paper proposes an optimized method for adaptive parameter determination based on spatial analysis. First, Ripley's K function is analysed for the data set to determine the global parameter Eps adaptively, by setting the maximum aggregation scale as the range of data clustering. Second, a K-D tree is used to calculate the highest-frequency K value of the point objects within the range Eps, and this value is set as the clustering density, which determines the global parameter MinPts adaptively. The R language was then used to optimize this process and to accomplish precise clustering of typical urban management cases. Experimental results for typical urban management cases in the XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the spatial and statistical characteristics of data with obvious clustering features, and has better applicability and high quality. The results of the study are helpful not only for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.
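    The neighbour-counting step in this abstract can be read as: count, for every point, how many other points lie within Eps, and take the most frequent count as the density parameter. The sketch below follows that reading only; it uses a brute-force search instead of the K-D tree named in the abstract, excludes the point itself from its own count (an assumption), and the function names are hypothetical.

```python
from collections import Counter
import math

def neighbor_counts(points, eps):
    """For each point, count the other points within distance eps."""
    counts = []
    for i, p in enumerate(points):
        c = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= eps
        )
        counts.append(c)
    return counts

def most_frequent_count(points, eps):
    """Return the neighbour count that occurs most often (candidate MinPts)."""
    counts = neighbor_counts(points, eps)
    # On ties, Counter returns the count that was encountered first.
    return Counter(counts).most_common(1)[0][0]

# Four points on a unit square plus one isolated point.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
counts = neighbor_counts(pts, eps=1.0)
min_pts = most_frequent_count(pts, eps=1.0)
```

    The brute-force search costs quadratic time, which is why a spatial index such as a K-D tree is the usual choice for larger data sets.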

  16. STUDY ON ADAPTIVE PARAMETER DETERMINATION OF CLUSTER ANALYSIS IN URBAN MANAGEMENT CASES

    Directory of Open Access Journals (Sweden)

    J. Y. Fu

    2017-09-01

    Fine-grained management of cities is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used to evaluate the deployment of urban public facilities, support policy decisions, and provide technical support for fine-grained city management. To address the problem that the density-based DBSCAN algorithm cannot determine its parameters adaptively, this paper proposes an optimized method for adaptive parameter determination based on spatial analysis. First, Ripley's K function is analysed for the data set to determine the global parameter Eps adaptively, by setting the maximum aggregation scale as the range of data clustering. Second, a K-D tree is used to calculate the highest-frequency K value of the point objects within the range Eps, and this value is set as the clustering density, which determines the global parameter MinPts adaptively. The R language was then used to optimize this process and to accomplish precise clustering of typical urban management cases. Experimental results for typical urban management cases in the XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the spatial and statistical characteristics of data with obvious clustering features, and has better applicability and high quality. The results of the study are helpful not only for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.

  17. An application of GA to normal and malignant tissues cluster analysis

    Science.gov (United States)

    Li, Xiang; Zhang, Guangjun; Yuan, Yan; Li, Qingbo; Wu, Jinguang

    2008-10-01

    This paper investigates an application of the genetic algorithm (GA) that clusters the spectra of malignant tissue and of normal tissue into separate groups. Cluster analysis is a typical combinatorial optimization problem. The results of traditional algorithms depend closely on whether the parameters are set correctly. In addition, a physical understanding of the sample spectra, which is not yet clearly established, is usually needed to obtain a better result. The high dimensionality of the spectral data adds further difficulty to the analysis. It is therefore almost impossible to set every parameter properly. Furthermore, since the variables and objective functions are always discrete, there are a large number of local extrema. Conventional methods have no good strategy for dealing with these inferior solutions, so the final clustering result is greatly influenced by the initial cluster centers and the order in which the samples are input. The genetic algorithm is based on the theory of natural selection and evolution. GA does not require an understanding of the physical meaning of the data, and it runs with considerably high efficiency. In the experiment, the sum of the inter-cluster distances is taken as the objective function. After smoothing, standard normal variate (SNV) processing, and outlier detection on the sample spectra, principal component analysis (PCA) is performed. Then selection, mutation and crossover are carried out on chromosomes whose ith bit value indicates which class sample i belongs to. Once the GA clustering is finished, tissue samples can be easily discriminated based on the characteristic absorbance peaks of protein, fat, nucleic acid and water. Three kinds of clustering algorithms are compared in this paper, and the results show that GA obtains a better result than the conventional methods.

  18. Crouch gait patterns defined using k-means cluster analysis are related to underlying clinical pathology.

    Science.gov (United States)

    Rozumalski, Adam; Schwartz, Michael H

    2009-08-01

    In this study a gait classification method was developed and applied to subjects with cerebral palsy who walk with excessive knee flexion at initial contact. Sagittal plane gait data, simplified using the gait features method, were used as input to a k-means cluster analysis to determine homogeneous groups. Several clinical domains were explored to determine whether the clusters are related to underlying pathology. These domains included age, joint range-of-motion, strength, selective motor control, and spasticity. Principal component analysis was used to determine one overall score for each of the multi-joint domains (strength, selective motor control, and spasticity). The current study shows that there are five clusters among children with excessive knee flexion at initial contact. These clusters were labeled, in order of increasing gait pathology: (1) mild crouch with mild equinus, (2) moderate crouch, (3) moderate crouch with anterior pelvic tilt, (4) moderate crouch with equinus, and (5) severe crouch. Further analysis showed that age, range-of-motion, strength, selective motor control, and spasticity were significantly different between the clusters (p<0.001). The general tendency was for the clinical domains to worsen as gait pathology increased. This new classification tool can be used to define homogeneous groups of subjects in crouch gait, which can help guide treatment decisions and outcomes assessment.

  19. Factor-cluster analysis and enrichment study of Mangrove sediments - An example from Mengkabong, Sabah

    International Nuclear Information System (INIS)

    Praveena, S.M.; Ahmed, A.; Radojevic, M.; Mohd Harun Abdullah; Aris, A.Z.

    2007-01-01

    This paper examines the tidal effects in the sediment of the Mengkabong mangrove forest, Sabah. Generally, all the studied parameters showed higher values at high tide than at low tide. Factor-cluster analyses were adopted to allow the identification of controlling factors at high and low tides. Factor analysis extracted six controlling factors at high tide and seven controlling factors at low tide. Cluster analysis extracted two distinct clusters at high and low tides. The study showed that factor-cluster analysis is a useful tool for singling out the controlling factors at high and low tides. This will provide a basis for describing the tidal effects in the mangrove sediment. The salinity and electrical conductivity clusters, as well as the component loadings at high and low tide, explained the tidal process, in which a high contribution of seawater to the mangrove sediments controls the sediment chemistry. The geoaccumulation index (Igeo) values suggest that the mangrove sediments have background concentrations of Al, Cu, Fe and Zn and are unpolluted with respect to Pb. (author)
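    The geoaccumulation index mentioned at the end of this abstract is commonly defined (following Müller) as the base-2 logarithm of the measured concentration divided by 1.5 times the geochemical background, with values at or below zero read as unpolluted. The sketch below assumes that standard definition; the abstract does not state which background values were used, so the numbers here are purely illustrative.

```python
import math

def geoaccumulation_index(concentration, background):
    """Igeo = log2(C / (1.5 * B)); both arguments in the same units."""
    # The factor 1.5 allows for natural variation in the background value.
    return math.log2(concentration / (1.5 * background))

# Illustrative numbers only (same units for both arguments).
at_threshold = geoaccumulation_index(30.0, 20.0)   # C equals 1.5 * B
one_class_up = geoaccumulation_index(60.0, 20.0)   # C equals 3 * B
at_background = geoaccumulation_index(20.0, 20.0)  # C equals B
```

    A sediment whose concentration equals the background value therefore has a negative index, which is consistent with the "background" and "unpolluted" readings reported in the abstract.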

  20. Identification of discriminatory variables in proteomics data analysis by clustering of variables.

    Science.gov (United States)

    Karimi, Sadegh; Hemmateenejad, Bahram

    2013-03-12

    This article presents a data analysis method for biomarker discovery in proteomics data analysis. In factor analysis-based discriminant models, the latent variables (LVs) are calculated from the response data measured at all employed instrument channels. Since some channels are irrelevant and their responses do not carry useful information, the extracted LVs contain mixed information from both useful and irrelevant channels. In this work, clustering of variables (CLoVA) based on unsupervised pattern recognition is suggested as an efficient method to identify the most informative spectral region, which is then used to construct a more predictive multivariate classification model. In the suggested method, the instrument channels (m/z values) are grouped into different clusters via a self-organizing map. Subsequently, the spectral data of each cluster are separately used as the input variables of classification methods such as partial least squares-discriminant analysis (PLS-DA) and extended canonical variate analysis (ECVA). The proposed method is evaluated by the analysis of two experimental data sets (ovarian and prostate cancer data sets). It is found that the proposed method is able to distinguish cancerous from healthy samples with much higher sensitivity and selectivity than conventional PLS-DA and ECVA methods. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. A Factor Analysis Approach for Clustering Patient Reported Outcomes.

    Science.gov (United States)

    Oh, Jung Hun; Thor, Maria; Olsson, Caroline; Skokic, Viktor; Jörnsten, Rebecka; Alsadius, David; Pettersson, Niclas; Steineck, Gunnar; Deasy, Joseph O

    2016-10-17

    In the field of radiation oncology, the use of extensive patient reported outcomes (PROs) is increasingly common to measure adverse side effects after radiotherapy in cancer patients. Factor analysis (FA) has the potential to identify an optimal number of latent factors (i.e., symptom groups). However, the ultimate goal of treatment response modeling is to understand the relationship between treatment variables such as radiation dose and symptom groups resulting from FA. Hence, it is crucial to identify clinically more relevant symptom groups and improved response variables from those symptom groups for a quantitative analysis. The goal of this study is to design a computational method for finding clinically relevant symptom groups from PROs and to test associations between symptom groups and radiation dose. We propose a novel approach where exploratory factor analysis (EFA) is followed by confirmatory factor analysis (CFA) to determine the relevant number of symptom groups. We also propose to use a combination of symptoms in an identified symptom group as a new response variable in linear regression analysis to investigate the relationship between the symptom group and dose-volume variables. We analyzed patient-reported gastrointestinal symptom profiles from 3 datasets in prostate cancer patients treated with radiotherapy. The final structural model of each dataset was validated using the other two datasets and compared to four other existing FA methods. Our systematic EFA-CFA approach provided clinically more relevant solutions than other methods, resulting in new clinically relevant outcome variables that enabled a quantitative analysis. As a result, statistically significant correlations were found between some dose-volume variables of relevant anatomic structures and symptom groups identified by FA. Our proposed method can aid in the process of understanding PROs and provide a basis for improving our understanding of radiation-induced side effects.

  2. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

    Science.gov (United States)

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

    2018-05-01

    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Clustering analysis of water distribution systems: identifying critical components and community impacts.

    Science.gov (United States)

    Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D

    2014-01-01

    Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. Thereby, it is usually difficult to identify the key features of the properties of the system, and subsequently all the critical components within the system for a given purpose of design or control. One way is, however, to more explicitly visualize the network structure and interactions between components by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of the most critical components, and provides similar criticality prioritizations of them in a more computationally efficient time.
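    The criterion stated in this abstract, clusters with stronger internal than external connections, can be checked for a given partition by counting pipes inside each cluster and pipes crossing cluster boundaries. The sketch below is only such a check under assumed inputs (an unweighted edge list and a node-to-cluster mapping); it is not the decomposition algorithm of the paper, and all names are hypothetical.

```python
def internal_external_edges(edges, membership):
    """Count edges inside each cluster and edges leaving each cluster."""
    internal = {}
    external = {}
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
        else:
            # A boundary edge counts as external for both clusters it touches.
            external[cu] = external.get(cu, 0) + 1
            external[cv] = external.get(cv, 0) + 1
    return internal, external

# Toy network: two triangles of junctions joined by a single pipe (c, d).
pipes = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
internal, external = internal_external_edges(pipes, cluster_of)
```

    Here each cluster has three internal pipes and one external pipe, so the partition satisfies the stated criterion. The single connecting pipe is also the kind of link a criticality analysis would flag first.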

  4. Clustering analysis of Salmonella enterica serovar Typhi isolates in Korea by PFGE, ribotyping, and phage typing.

    Science.gov (United States)

    Kim, Shukho; Kim, Sung-Hun; Park, Jeong-Hyun; Lee, Kyung-Shin; Park, Mi-Sun; Lee, Bok Kwon

    2009-01-01

    Salmonella enterica serovar Typhi is a Gram-negative bacterium causing the acute febrile disease typhoid fever. In Korea from 2004 to 2006, a total of 51 Salmonella Typhi isolates were identified in stool and blood from healthy carriers and patients with or without overseas travel histories. In this study, antibiogram, pulsed-field gel electrophoresis (PFGE), and automated ribotyping were performed as molecular epidemiological methods with phage typing as a classical subtyping tool of the isolates. Only two isolates were multidrug resistant and 82.3% of the isolates were susceptible to 16 antimicrobial agents tested. When the dendrogram was created based on the PFGE results, the subtypes could be clustered into five groups by 80% similarity criterion. The PFGE patterns of 31 isolates (60.8%) belonged to Cluster 3, the predominant cluster in the study. Three overseas travel-associated cases were differentiated into Cluster 4 of which three isolates were nalidixic acid or multidrug resistant. Major phage type and ribotype were A and PvuII-436-8-S-6, respectively. This study also showed the prevalence of PFGE Cluster 3 in Korea by clustering analysis and the link between some typhoid cases and travel to Cambodia, India, or Indonesia.

  5. Analysis of risk factors for cluster behavior of dental implant failures.

    Science.gov (United States)

    Chrcanovic, Bruno Ramos; Kisch, Jenö; Albrektsson, Tomas; Wennerberg, Ann

    2017-08-01

    Some studies have indicated that implant failures are commonly concentrated in a few patients. The aim of this study was to identify and analyze cluster behavior of dental implant failures among the subjects of a retrospective study. The study included only patients who received at least three implants. Patients presenting at least three implant failures were classified as showing cluster behavior. Univariate and multivariate logistic regression models and generalized estimating equations analysis were used to evaluate the effect of explanatory variables on the cluster behavior. There were 1406 patients with three or more implants (8337 implants, 592 failures). Sixty-seven (4.77%) patients presented cluster behavior, accounting for 56.8% of all implant failures. The intake of antidepressants and bruxism were identified as potential negative factors exerting a statistically significant influence on cluster behavior at the patient level. The negative factors at the implant level were turned implants, short implants, poor bone quality, age of the patient, the intake of medicaments to reduce gastric acid production, smoking, and bruxism. A cluster pattern among patients with implant failure is highly probable. Factors of interest as predictors of implant failures could be a number of systemic and local factors, although a direct causal relationship cannot be ascertained. © 2017 Wiley Periodicals, Inc.

  6. A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.

    Science.gov (United States)

    Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G

    2018-01-01

    Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Application of a time-space clustering methodology to the assessment of acute environmental effects on respiratory illnesses

    Energy Technology Data Exchange (ETDEWEB)

    Goldstein, I F; Cuzick, J

    1978-06-01

    A new methodology is proposed for the identification of environmental events of health significance. Health indices measured on a daily basis at various locations in a single geographical area are collected over time. First, the daily variations are examined to determine whether they reflect purely random variations or whether there are days on which there are extreme variations not plausibly explicable as random events. After such days are identified, the question of whether they occur only at a single location within the larger geographical area at one time, or whether they occur simultaneously at more than one location is investigated. Tests of statistical significance for both temporal and spatial clustering are proposed. The methodology is applied to daily hospital emergency room visits for various respiratory complaints to several New York City hospitals situated in two geographically separated districts which, however, have populations of similar socio-economic and ethnic composition.

  8. Chilled boneless beef international trade: a cluster analysis

    Directory of Open Access Journals (Sweden)

    Paulo Rodrigo Ramos Xavier Pereira

    2013-03-01

    The objective of this study was to measure and classify the international beef trade. For this, data related to the international chilled boneless beef (CBB) trade, the largest and most important market, were analyzed. Producing countries were classified into groups according to their trade relations, and the main factors that influenced one country to prefer to import CBB from a specific exporting country were analyzed. The results revealed four markets related to client demands with regard to the sanitation and traceability of beef products. Furthermore, extrinsic characteristics of the product are discussed, such as a productive system that aims to minimize environmental impacts and to value animal welfare and respect for social demands. The markets that pay the highest prices require sanitary quality of suppliers, demanding traceable and process-certified products. Brazil does not access these markets because it does not meet these requirements. To change this scenario it is necessary to eradicate FMD across the Brazilian territory, acquiring a status of a zone with minimal BSE risk, aligning the intrinsic value of the CBB with expectations of consumers and implementing a traceability program that is both feasible and acceptable for clients.

  9. Cluster Analysis on Longitudinal Data of Patients with Adult-Onset Asthma.

    Science.gov (United States)

    Ilmarinen, Pinja; Tuomisto, Leena E; Niemelä, Onni; Tommola, Minna; Haanpää, Jussi; Kankaanranta, Hannu

    Previous cluster analyses on asthma are based on cross-sectional data. To identify phenotypes of adult-onset asthma by using data from baseline (diagnostic) and 12-year follow-up visits. The Seinäjoki Adult Asthma Study is a 12-year follow-up study of patients with new-onset adult asthma. K-means cluster analysis was performed by using variables from baseline and follow-up visits on 171 patients to identify phenotypes. Five clusters were identified. Patients in cluster 1 (n = 38) were predominantly nonatopic males with moderate smoking history at baseline. At follow-up, 40% of these patients had developed persistent obstruction but the number of patients with uncontrolled asthma (5%) and rhinitis (10%) was the lowest. Cluster 2 (n = 19) was characterized by older men with heavy smoking history, poor lung function, and persistent obstruction at baseline. At follow-up, these patients were mostly uncontrolled (84%) despite daily use of inhaled corticosteroid (ICS) with add-on therapy. Cluster 3 (n = 50) consisted mostly of nonsmoking females with good lung function at diagnosis/follow-up and well-controlled/partially controlled asthma at follow-up. Cluster 4 (n = 25) had obese and symptomatic patients at baseline/follow-up. At follow-up, these patients had several comorbidities (40% psychiatric disease) and were treated daily with ICS and add-on therapy. Patients in cluster 5 (n = 39) were mostly atopic and had the earliest onset of asthma, the highest blood eosinophils, and FEV1 reversibility at diagnosis. At follow-up, these patients used the lowest ICS dose but 56% were well controlled. Results can be used to predict outcomes of patients with adult-onset asthma and to aid in development of personalized therapy (NCT02733016 at ClinicalTrials.gov). Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Analysis of brood sex ratios: implications of offspring clustering

    Czech Academy of Sciences Publication Activity Database

    Krackow, S.; Tkadlec, Emil

    Roc. 50, č. 4 (2001), s. 293-301 ISSN 0340-5443 R&D Projects: GA ČR GA524/01/1316 Institutional research plan: CEZ:AV0Z6093917 Keywords : generalized linear mixed models * random coefficients * multilevel analysis Subject RIV: EG - Zoology Impact factor: 2.353, year: 2001

  11. Field of Study Choice: Using Conjoint Analysis and Clustering

    Science.gov (United States)

    Shtudiner, Ze'ev; Zwilling, Moti; Kantor, Jeffrey

    2017-01-01

    Purpose: The purpose of this paper is to measure student's preferences regarding various attributes that affect their decision process while choosing a higher education area of study. Design/ Methodology/Approach: The paper exhibits two different models which shed light on the perceived value of each examined area of study: conjoint analysis and…

  12. Multivariate cluster analysis of some major and trace elements ...

    African Journals Online (AJOL)

    UFUOMA

    This study comprises soils formed on Paleoproterozoic Birimian Basement rocks (poorly graded silty sand, gravely sand and silty clays) from the unsaturated zone of the Densu River Basin, taken from a five meter depth. Elemental analysis of the soils samples were carried out by Energy Dispersive X-ray. Fluorescence ...

  13. Analysis of the clustering of inertial particles in turbulent flows

    Science.gov (United States)

    Esmaily-Moghadam, Mahdi; Mani, Ali

    2016-12-01

    An asymptotic solution is derived for the motion of inertial particles exposed to Stokes drag in an unsteady random flow. This solution provides an estimate for the sum of Lyapunov exponents as a function of the Stokes number and Lagrangian strain- and rotation-rate autocovariance functions. The sum of exponents in a Lagrangian framework is the rate of contraction of clouds of particles, and in an Eulerian framework, it is the concentration-weighted divergence of the particle velocity field. Previous literature offers an estimate of the divergence of the particle velocity field, which is applicable only in the limit of small Stokes numbers [Robinson, Comm. Pure Appl. Math. 9, 69 (1956), 10.1002/cpa.3160090105 and Maxey, J. Fluid Mech. 174, 441 (1987), 10.1017/S0022112087000193] (R-M). In addition to reproducing R-M at this limit, our analysis provides a first-order correction to R-M at larger Stokes numbers. Our analysis is validated by a directly computed rate of contraction of clouds of particles from simulations of particles in homogeneous isotropic turbulence over a broad range of Stokes numbers. Our analysis and R-M predictions agree well with the direct computations at the limit of small Stokes numbers. At large Stokes numbers, in contrast to R-M, our model predictions remain bounded. In spite of an improvement over R-M, our analysis fails to predict the expansion of high Stokes clouds observed in the direct computations. Consistent with the general trend of particle segregation versus Stokes number, our analysis shows a maximum rate of contraction at an intermediate Stokes number of O(1) and minimal rates of contraction at small and large Stokes numbers.

  14. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Kautsar, Satria A.; Suarez Duran, Hernando G.; Blin, Kai

    2017-01-01

    Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied on 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental...

  15. Mental State Talk Structure in Children’s Narratives: A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Giuliana Pinto

    2017-01-01

    This study analysed children’s Theory of Mind (ToM) as assessed by mental state talk in oral narratives. We hypothesized that the children’s mental state talk in narratives has an underlying structure, with specific terms organized in clusters. Ninety-eight children attending the last year of kindergarten were asked to tell a story twice, at the beginning and at the end of the school year. Mental state talk was analysed by identifying terms and expressions referring to perceptual, physiological, emotional, willingness, cognitive, moral, and sociorelational states. The cluster analysis showed that children’s mental state talk is organized in two main clusters: perceptual states and affective states. Results from the study confirm the feasibility of narratives as an outlet to inquire into mental state talk and offer a more fine-grained analysis of mental state talk structure.

  16. Moving toward endotypes in atopic dermatitis: Identification of patient clusters based on serum biomarker analysis.

    Science.gov (United States)

    Thijs, Judith L; Strickland, Ian; Bruijnzeel-Koomen, Carla A F M; Nierkens, Stefan; Giovannone, Barbara; Csomor, Eszter; Sellman, Bret R; Mustelin, Tomas; Sleeman, Matthew A; de Bruin-Weller, Marjolein S; Herath, Athula; Drylewicz, Julia; May, Richard D; Hijnen, DirkJan

    2017-09-01

    Atopic dermatitis (AD) is a complex, chronic, inflammatory skin disease with a diverse clinical presentation. However, it is unclear whether this diversity exists at a biological level. We sought to test the hypothesis that AD is heterogeneous at the biological level of individual inflammatory mediators. Sera from 193 adult patients with moderate-to-severe AD (six area, six sign atopic dermatitis [SASSAD] score: geometric mean, 22.3 [95% CI, 21.3-23.3] and 39.1 [95% CI, 37.5-40.9], respectively) and 30 healthy control subjects without AD were analyzed for 147 serum mediators, total IgE levels, and 130 allergen-specific IgE levels. Population heterogeneity was assessed by using principal component analysis, followed by unsupervised k-means cluster analysis of the principal components. Patients with AD showed pronounced evidence of inflammation compared with healthy control subjects. Principal component analysis of data on sera from patients with AD revealed the presence of 4 potential clusters. Fifty-seven principal components described approximately 90% of the variance. Unsupervised k-means cluster analysis of the 57 largest principal components delivered 4 distinct clusters of patients with AD. Cluster 1 had high SASSAD scores and body surface areas with the highest levels of pulmonary and activation-regulated chemokine, tissue inhibitor of metalloproteinases 1, and soluble CD14. Cluster 2 had low SASSAD scores with the lowest levels of IFN-α, tissue inhibitor of metalloproteinases 1, and vascular endothelial growth factor. Cluster 3 had high SASSAD scores with the lowest levels of IFN-β, IL-1, and epithelial cytokines. Cluster 4 had low SASSAD scores but the highest levels of the inflammatory markers IL-1, IL-4, IL-13, and thymic stromal lymphopoietin. AD is a heterogeneous disease both clinically and biologically. Four distinct clusters of patients with AD have been identified that could represent endotypes with unique biological mechanisms. Elucidation of

  17. Analyzing Developing Country Market Integration with Incomplete Price Data Using Cluster Analysis

    OpenAIRE

    Ansah, Isaac Gershon; Gardebroek, Cornelis; Ihle, Rico; Jaleta, Moti

    2014-01-01

    Recent global food price developments have spurred renewed interest in analyzing integration of local markets to global markets. A popular approach to quantify market integration is cointegration analysis. However, local market price data often has missing values, outliers, or short and incomplete series, making cointegration analysis impossible. Instead of imputing missing data, this paper proposes cluster analysis as an alternative methodological approach for analyzing market integration. I...

  18. Identifying influential individuals on intensive care units: using cluster analysis to explore culture.

    Science.gov (United States)

    Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson

    2017-07-01

    The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.

  19. Evaluation of Portland cement from X-ray diffraction associated with cluster analysis

    International Nuclear Information System (INIS)

    Gobbo, Luciano de Andrade; Montanheiro, Tarcisio Jose; Montanheiro, Filipe; Sant'Agostino, Lilia Mascarenhas

    2013-01-01

    The Brazilian cement industry produced 64 million tons of cement in 2012, with noteworthy contribution of CP-II (slag), CP-III (blast furnace) and CP-IV (pozzolanic) cements. The industrial pole comprises about 80 factories that utilize raw materials of different origins and chemical compositions that require enhanced analytical technologies to optimize production in order to gain space in the growing consumer market in Brazil. This paper assesses the sensitivity of mineralogical analysis by X-ray diffraction associated with cluster analysis to distinguish different kinds of cements with different additions. This technique can be applied, for example, in the prospection of different types of limestone (calcitic, dolomitic and siliceous) as well as in the qualification of different clinkers. The cluster analysis does not require any specific knowledge of the mineralogical composition of the diffractograms to be clustered; rather, it is based on their similarity. The materials tested for addition have different origins: fly ashes from different power stations from South Brazil and slag from different steel plants in the Southeast. Cement with different additions of limestone and white Portland cement were also used. The Rietveld method of qualitative and quantitative analysis was used for measuring the results generated by the cluster analysis technique. (author)

  20. EFFECTIVENESS OF ENVIRONMENTAL TREATIES: TREND ANALYSIS OF TREATY-BASED ENVIRONMENTAL INDICATORS

    OpenAIRE

    CHENAZ B. SEELARBOKUS

    2005-01-01

    The literature on environmental regime effectiveness has shown a predilection for behaviour modification studies, whereby effectiveness is associated with a change in the behaviour of relevant actors. There has not been a systematic endeavour to link the implementation of international environmental agreements (IEAs) with improvement in environmental conditions. This article shifts away from the paradigm of behavioural analysis and focuses instead on linking IEA effectiveness with positive en...

  1. Environmental Sustainability Analysis of Biodiesel Production

    DEFF Research Database (Denmark)

    Herrmann, Ivan Tengbjerg; Hauschild, Michael Michael Zwicky; Birkved, Morten

    like these require a life cycle perspective on the biofuel - from the cradle (production of the agricultural feedstock) to the grave (use as fuel). An environmental life cycle assessment is performed on biodiesel to compare different production schemes including chemical and enzymatic esterification with the use of methanol or ethanol. The life cycle assessment includes all processes needed for the production, distribution and use of the biodiesel (the product system), and it includes all relevant environmental impacts from the product system, ranging from global impacts like climate change and loss of non-renewable resources over regional impacts like acidification, eutrophication and photochemical ozone to more local impacts like ecotoxicity and physical impacts like land use, to allow judging on the overall environmental sustainability of the biodiesel and to support identification of the main...

  2. Applying of hierarchical clustering to analysis of protein patterns in the human cancer-associated liver.

    Directory of Open Access Journals (Sweden)

    Natalia A Petushkova

    There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers to identify diseases and to predict outcomes using the training dataset with established diagnosis for each sample. When the training dataset is not available the task can be to mine for presence of meaningful groups (clusters) of samples and to explore underlying data structure (unsupervised learning). We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected upon surgical treatment of hepatic metastases in colorectal cancer. Unsupervised hierarchical clustering of 2DE gel images (n = 18) revealed a pair of clusters, containing 11 and 7 samples. Previously we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that samples were clearly divided into two well-separated groups by cluster analysis. It turned out that the groups by enzyme activity almost perfectly match the groups identified from proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 to distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species. Our results highlight the importance of hierarchical cluster analysis of proteomic data, and showed concordance between results of biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into differing clusters may provide insights into possible molecular mechanisms of drug metabolism and creates a rationale for personalized treatment.

  3. Extending the input–output energy balance methodology in agriculture through cluster analysis

    International Nuclear Information System (INIS)

    Bojacá, Carlos Ricardo; Casilimas, Héctor Albeiro; Gil, Rodrigo; Schrevens, Eddie

    2012-01-01

    The input–output balance methodology has been applied to characterize the energy balance of agricultural systems. This study proposes to extend this methodology with the inclusion of multivariate analysis to reveal particular patterns in the energy use of a system. The objective was to demonstrate the usefulness of multivariate exploratory techniques to analyze the variability found in a farming system and to establish efficiency categories that can be used to improve the energy balance of the system. To this purpose an input–output analysis was applied to the major greenhouse tomato production area in Colombia. Individual energy profiles were built and the k-means clustering method was applied to the production factors. On average, the production system in the study zone consumes 141.8 GJ ha⁻¹ to produce 96.4 GJ ha⁻¹, resulting in an energy efficiency of 0.68. With the k-means clustering analysis, three clusters of farmers were identified with energy efficiencies of 0.54, 0.67 and 0.78. The most energy efficient cluster grouped 56.3% of the farmers. It is possible to optimize the production system by improving the management practices of those with the lowest energy use efficiencies. Multivariate analysis techniques proved to be a complementary pathway to improve the energy efficiency of a system. -- Highlights: ► An input–output energy balance was estimated for greenhouse tomatoes in Colombia. ► We used the k-means clustering method to classify growers based on their energy use. ► Three clusters of growers were found with energy efficiencies of 0.54, 0.67 and 0.78. ► Overall system optimization is possible by improving the energy use of the less efficient.
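
    The efficiency figure quoted above is simply output energy divided by input energy (96.4 / 141.8 ≈ 0.68). The following is a minimal sketch of that ratio together with a one-dimensional k-means pass over per-farm efficiencies. The farm values and starting centres are hypothetical, chosen only for illustration; the study itself clustered on the production factors, not on the efficiency ratio, so this is not the authors' procedure.

```python
def energy_efficiency(output_gj_ha, input_gj_ha):
    """Energy efficiency as output energy divided by input energy."""
    return output_gj_ha / input_gj_ha


def kmeans_1d(values, centers):
    """Plain Lloyd iteration on scalar values; stops when centres stop moving."""
    centers = list(centers)
    while True:
        # Assignment step: each value goes to its nearest centre.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: each centre moves to the mean of its group.
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            return centers, groups
        centers = new_centers


# Average figures reported in the abstract.
average = energy_efficiency(96.4, 141.8)

# Hypothetical per-farm efficiencies and starting centres (assumptions).
farms = [0.52, 0.55, 0.56, 0.66, 0.68, 0.77, 0.78, 0.80]
centers, groups = kmeans_1d(farms, centers=[0.5, 0.65, 0.8])
```

    With more than one production factor per farm, the same two steps apply with a vector distance in place of the absolute difference.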

  4. Performance Analysis of a Cluster-Based MAC Protocol for Wireless Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Kartsakli Elli

    2010-01-01

    An analytical model to evaluate the non-saturated performance of the Distributed Queuing Medium Access Control Protocol for Ad Hoc Networks (DQMANs) in single-hop networks is presented in this paper. DQMAN is comprised of a spontaneous, temporary, and dynamic clustering mechanism integrated with a near-optimum distributed queuing Medium Access Control (MAC) protocol. Clustering is executed in a distributed manner using a mechanism inspired by the Distributed Coordination Function (DCF) of the IEEE 802.11. Once a station seizes the channel, it becomes the temporary clusterhead of a spontaneous cluster and it coordinates the peer-to-peer communications between the clustermembers. Within each cluster, a near-optimum distributed queuing MAC protocol is executed. The theoretical performance analysis of DQMAN in single-hop networks under non-saturation conditions is presented in this paper. The approach integrates the analysis of the clustering mechanism into the MAC layer model. To the best of the authors' knowledge, this approach is novel in the literature. In addition, the performance of an ad hoc network using DQMAN is compared to that obtained when using the DCF of the IEEE 802.11, as a benchmark reference.

  5. Gene microarray data analysis using parallel point-symmetry-based clustering.

    Science.gov (United States)

    Sarkar, Anasua; Maulik, Ujjwal

    2015-01-01

    Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed time-efficient scalable approach for point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also satisfies linear speedup in timing without sacrificing the quality of clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority, in both timing and validity. The statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of clustering solutions.

  6. Profitability and efficiency of Italian utilities: cluster analysis of financial statement ratios

    International Nuclear Information System (INIS)

    Linares, E.

    2008-01-01

    The last ten years have witnessed conspicuous changes in European and Italian regulation of public utility services and in the strategies of the major players in these fields. In response to these changes Italian utilities have made a variety of choices regarding size, presence in more or less capital-intensive stages of different value chains, and diversification. These choices have been implemented both through internal growth and by means of mergers and acquisitions. In this context it is interesting to try to establish whether there is a nexus between these choices and the performance of Italian utilities in terms of profitability and efficiency. Therefore statistical multivariate analysis techniques (cluster analysis and factor analysis) have been applied to several ratios obtained from the 2005 financial statement of 34 utilities. First, a hierarchical cluster analysis method has been applied to financial statement data in order to identify homogeneous groups based on several indicators of the incidence of costs (external costs, personnel costs, depreciation and amortization), profitability (return on sales, return on assets, return on equity) and efficiency (in the utilization of personnel, of total assets, of property, plant and equipment). Five clusters have been found. Then the clusters have been characterized in terms of the aforementioned indicators, the presence in different stages of the energy value chains (electricity and gas) and other descriptive variables (such as turnover, number of employees, assets, percentage of property, plant and equipment on total assets, sales revenues from electricity, gas, water supply and sanitation, waste collection and treatment and other services). In a second round cluster analysis has been preceded by factor analysis, in order to find a smaller set of variables. 
    This procedure has revealed three not directly observable factors that can be interpreted as follows: i) efficiency in ordinary and financial management

  7. Independent Component Analysis to Detect Clustered Microcalcification Breast Cancers

    Directory of Open Access Journals (Sweden)

    R. Gallardo-Caballero

    2012-01-01

    current reproducible studies on the same mammogram set. This proposal is mainly based on the use of extracted image features obtained by independent component analysis, but we also study the inclusion of the patient’s age as a nonimage feature which requires no human expertise. Our system achieves an average of 2.55 false positives per image at a sensitivity of 81.8% and 4.45 at a sensitivity of 91.8% in diagnosing the BCRP_CALC_1 subset of DDSM.

  8. Implementation of hybrid clustering based on partitioning around medoids algorithm and divisive analysis on human Papillomavirus DNA

    Science.gov (United States)

    Arimbi, Mentari Dian; Bustamam, Alhadi; Lestari, Dian

    2017-03-01

    Data clustering can be executed through partition or hierarchical methods for many types of data, including DNA sequences. Both clustering methods can be combined by running a partition algorithm in the first level and a hierarchical algorithm in the second level, an approach called hybrid clustering. In the partition phase, popular methods such as PAM, K-means, or Fuzzy c-means could be applied. In this study we selected partitioning around medoids (PAM) for our partition stage. Following the partition algorithm, in the hierarchical stage we applied the divisive analysis algorithm (DIANA) in order to obtain more specific cluster and sub-cluster structures. The number of main clusters is determined using the Davies Bouldin Index (DBI) value: we choose as optimal the number of clusters that minimizes the DBI value. In this work, we conduct the clustering on 1252 HPV DNA sequences from GenBank. The characteristic extraction is performed first, followed by normalization and genetic distance calculation using Euclidean distance. In our implementation, we used the hybrid PAM and DIANA using the R open source programming tool. In our results, we obtained 3 main clusters with an average DBI value of 0.979, using PAM in the first stage. After executing DIANA in the second stage, we obtained 4 sub clusters for Cluster-1, 9 sub clusters for Cluster-2 and 2 sub clusters for Cluster-3, with DBI values of 0.972, 0.771, and 0.768 for each main cluster respectively. Since the second stage produces lower DBI values than the first stage, we conclude that this hybrid approach can improve the accuracy of our clustering results.
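
    The selection rule described above (prefer the partition with the lowest Davies Bouldin Index) can be sketched as follows. This is a minimal illustration in Python using the standard definition of the index with centroids and Euclidean distance; the study itself worked in R with PAM medoids, and the sample points and function names here are hypothetical.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Lower is better: tight clusters (small scatter) that are far apart
    (large centroid distance) give a small value.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter S_i: mean distance from each member to its cluster centroid.
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    worst_ratios = []
    for i in range(len(clusters)):
        # For cluster i, take its worst (largest) similarity to any other cluster.
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(len(clusters)) if j != i
        ))
    return sum(worst_ratios) / len(worst_ratios)


# Two hypothetical partitions of the same four points (illustrative only).
good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
poor = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]

scores = {"good": davies_bouldin(good), "poor": davies_bouldin(poor)}
best = min(scores, key=scores.get)
```

    In the hybrid setting, the same function would be evaluated once per candidate number of main clusters, and again inside each main cluster for the sub-cluster stage.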

  9. Approaching messy problems: strategies for environmental analysis

    Science.gov (United States)

    L. M. Reid; R. R. Ziemer; T. E. Lisle

    1996-01-01

    Environmental problems are never neatly defined. Instead, each is a tangle of interacting processes whose manifestation and interpretation are warped by the vagaries of time, weather, expectation, and economics. Each problem involves livelihoods, values, and numerous specialized disciplines. Nevertheless, federal agencies in the Pacific Northwest have been given the...

  10. Procedures for Environmental Impact Analysis and Planning.

    Science.gov (United States)

    1982-10-01

    Novak, E., and R. Riggins, Computer-Aided Environmental Impact Analysis: Mission Change, O&M, and Training: User Manual, Technical Report E-85...there is mixed sentiment has a better chance of establishing a positive relationship with the audience. (8) Avoid using technical jargon or words that

  11. Analysis of Corporate Environmental Management: Methodological Aspects

    DEFF Research Database (Denmark)

    Madsen, Henning; Ulhøi, John Parm

    2001-01-01

    Human activities cannot avoid influencing conditions in the natural environment one way or another. This also includes common activities in the business sector. But during the past few decades, environmental disasters in Seveso and Bhopal, and the Exxon Valdez oil spill in Alaska have...

  12. Community succession analysis and environmental biological ...

    African Journals Online (AJOL)

    Yomi

    2011-02-14

    Feb 14, 2011 ... on abandoned hilly lands and implications for vegetation restoration strategy in Shanxi, China. X.Z Liu1, F. Zhang1*, H.B. Shao2,3* and J.T. Zhang4. 1Institute of Loess Plateau, Shanxi University, Taiyuan 030006, China. 2The CAS / Shandong Provincial Key Laboratory of Coastal Environmental Processes ...

  13. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the
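
    The combination of z-score normalisation and Ward linkage named in this abstract can be sketched as follows. This is not the authors' analysis package: the function names `zscore` and `ward_clusters` and the toy rows are assumptions, and the naive merge loop is only practical for small inputs, unlike the large data sets the paper reports handling.

```python
import math


def zscore(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def ward_clusters(rows, k):
    """Naive agglomerative clustering with Ward's merge cost, stopped at k clusters.

    Each cluster is stored as (member indices, centroid). At every step the pair
    whose merge gives the smallest increase in within-cluster sum of squares,
    |A||B| / (|A| + |B|) * ||c_A - c_B||^2, is merged.
    """
    clusters = [([i], list(r)) for i, r in enumerate(rows)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        n = len(ma) + len(mb)
        merged_centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, merged_centroid))
    return [sorted(members) for members, _ in clusters]


# Toy rows (made up): two features on very different scales.
toy_rows = [[1.0, 100.0], [1.2, 110.0], [5.0, 900.0], [5.1, 950.0]]
groups = ward_clusters(zscore(toy_rows), 2)
print(sorted(groups))
```

    Normalising first keeps the second, larger-valued column from dominating the squared distances that drive the Ward cost.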

  14. Front Crawl Sprint Performance: A Cluster Analysis of Biomechanics, Energetics, Coordinative, and Anthropometric Determinants in Young Swimmers.

    Science.gov (United States)

    Figueiredo, Pedro; Silva, Ana; Sampaio, António; Vilas-Boas, João Paulo; Fernandes, Ricardo J

    2016-07-01

    The aim of this study was to evaluate the determinants of front crawl sprint performance of young swimmers using a cluster analysis. A total of 103 swimmers, aged 11 to 13 years, performed 25-m front crawl swimming at 50-m pace, recorded by two underwater cameras. The analysis of the swimmers included biomechanical, energetic, coordinative, and anthropometric characteristics. Organizing the subjects into meaningful clusters produced three groups (1.52 ± 0.16, 1.47 ± 0.17 and 1.40 ± 0.15 m/s, for Clusters 1, 2 and 3, respectively) with differences in velocity between Clusters 1 and 2 compared with Cluster 3 (p = .003). Anthropometric variables were the strongest determinants of the cluster solution. Stroke length and stroke index were also considered relevant. In addition, differences between Cluster 1 and the others were also found for critical velocity, stroke rate and intracycle velocity variation (p energetics (swimming efficiency) are determinant domains to young swimmers sprint performance.

  15. Study on distinguishing of Chinese ancient porcelains by neutron activation and fuzzy cluster analysis

    International Nuclear Information System (INIS)

    Wang An

    1992-01-01

    By means of neutron activation, the contents of trace elements in some samples of Chinese ancient porcelains from different places of production were determined. The data were analysed by fuzzy cluster analysis. On the basis of this work, a method for distinguishing and determining Chinese ancient porcelain was suggested.

  16. Posterior AD-Type Pathology: Cognitive Subtypes Emerging from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Antonella Cappa

    2014-01-01

    Background. “Posterior shift” of the neuropathological changes of Alzheimer's disease (AD) produces a syndrome (posterior cortical atrophy, PCA) dominated by high-level visual deficits. Objective. To explore in patients with AD-type pathology whether a data-driven analysis (cluster analysis) based on neuropsychological findings resulted in the emergence of different subgroups of patients; in particular to find out whether it was possible to identify patients with visuospatial deficits consistent with the hypothesis that PCA is a “dorsal stream” syndrome or, rather, whether there were subgroups of patients with different types of impairment within the high-level visual domain. Methods. 23 PCA and 16 DAT patients were studied. By a principal component analysis performed on a wide range of neuropsychological tasks, 15 variables were obtained that loaded onto five main factors (memory, language, perceptual, visuospatial, and calculation), which entered a hierarchical cluster analysis. Results. Four clusters of cognitive impairment emerged: visuospatial/perceptual, memory, perceptual/calculation, and language. Only in the first cluster did a visuospatial deficit clearly emerge. Conclusions. AD pathology produces not only variants dominated by memory (DAT) and, to a lesser extent, visuospatial deficit (PCA), but also other distinct syndromic subtypes with disorders in visual perception and language which reflect a different vulnerability of specific functional networks.

  17. Cardiovascular reactivity patterns and pathways to hypertension : a multivariate cluster analysis

    NARCIS (Netherlands)

    Brindle, R C; Ginty, A T; Jones, A; Phillips, A C; Roseboom, T J; Carroll, D; Painter, R C; de Rooij, S R

    2016-01-01

    Substantial evidence links exaggerated mental stress induced blood pressure reactivity to future hypertension, but the results for heart rate reactivity are less clear. For this reason multivariate cluster analysis was carried out to examine the relationship between heart rate and blood pressure

  18. Validation of an ANN Flow Prediction Model Using a Multi-Station Cluster Analysis

    NARCIS (Netherlands)

    Demirel, M.C.; Booij, Martijn J.; Kahya, E.

    2012-01-01

    The objective of this study is to validate a flow prediction model for a hydrometric station using a multistation criterion in addition to standard single-station performance criteria. In this contribution we used cluster analysis to identify the regional flow height, i.e., water-level patterns and

  19. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters

    NARCIS (Netherlands)

    Cimermancic, P.; Medema, Marnix; Claesen, J.; Kurika, K.; Wieland Brown, L.C.; Mavrommatis, K.; Pati, A.; Godfrey, P.A.; Koehrsen, M.; Clardy, J.; Birren, B. W.; Takano, Eriko; Sali, A.; Linington, R.G.; Fischbach, M.A.

    2014-01-01

    Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the

  20. Classification of shoulder complaints in general practice by means of cluster analysis

    NARCIS (Netherlands)

    Winters, JC; Groenier, KH; Sobel, JS; Arendzen, HH; Meyboom-de Jong, B

    1997-01-01

    Objective: To determine if a classification of shoulder complaints in general practice can be made with a cluster analysis of variables of medical history and physical examination. Method: One hundred one patients with shoulder complaints were examined upon inclusion (week 0) and after 2 weeks.

  1. Cardiovascular reactivity patterns and pathways to hypertension: a multivariate cluster analysis

    NARCIS (Netherlands)

    Brindle, R. C.; Ginty, A. T.; Jones, A.; Phillips, A. C.; Roseboom, T. J.; Carroll, D.; Painter, R. C.; de Rooij, S. R.

    2016-01-01

    Substantial evidence links exaggerated mental stress induced blood pressure reactivity to future hypertension, but the results for heart rate reactivity are less clear. For this reason multivariate cluster analysis was carried out to examine the relationship between heart rate and blood pressure

  2. Identifying Subgroups of Tinnitus Using Novel Resting State fMRI Biomarkers and Cluster Analysis

    Science.gov (United States)

    2016-10-01

    project activities, for the purpose of enhancing public understanding and increasing interest in learning and careers in science, technology, and the... Unsupervised hierarchical clustering of resting state functional connectivity data to identify patients with mild tinnitus. Poster session presented...including drafting of IRB behavioral and scanning protocols, advising on recruiting and initial data collection. She also supervised analysis of data and

  3. Student Motivation and Learning in Mathematics and Science: A Cluster Analysis

    Science.gov (United States)

    Ng, Betsy L. L.; Liu, W. C.; Wang, John C. K.

    2016-01-01

    The present study focused on an in-depth understanding of student motivation and self-regulated learning in mathematics and science through cluster analysis. It examined the different learning profiles of motivational beliefs and self-regulatory strategies in relation to perceived teacher autonomy support, basic psychological needs (i.e. autonomy,…

  4. Environmental analysis of higher brominated diphenyl ethers and decabromodiphenyl ethane.

    Science.gov (United States)

    Kierkegaard, Amelie; Sellström, Ulla; McLachlan, Michael S

    2009-01-16

    Methods for environmental analysis of higher brominated diphenyl ethers (PBDEs), in particular decabromodiphenyl ether (BDE209), and the recently discovered environmental contaminant decabromodiphenyl ethane (deBDethane) are reviewed. The extensive literature on analysis of BDE209 has identified several critical issues, including contamination of the sample, degradation of the analyte during sample preparation and GC analysis, and the selection of appropriate detection methods and surrogate standards. The limited experience with the analysis of deBDethane suggests that there are many commonalities with BDE209. The experience garnered from the analysis of BDE209 over the last 15 years will greatly facilitate progress in the analysis of deBDethane.

  5. Paternal age related schizophrenia (PARS): Latent subgroups detected by k-means clustering analysis.

    Science.gov (United States)

    Lee, Hyejoo; Malaspina, Dolores; Ahn, Hongshik; Perrin, Mary; Opler, Mark G; Kleinhaus, Karine; Harlap, Susan; Goetz, Raymond; Antonius, Daniel

    2011-05-01

    Paternal age related schizophrenia (PARS) has been proposed as a subgroup of schizophrenia with distinct etiology, pathophysiology and symptoms. This study uses a k-means clustering analysis approach to generate hypotheses about differences between PARS and other cases of schizophrenia. We studied PARS (operationally defined as not having any family history of schizophrenia among first and second-degree relatives and fathers' age at birth ≥ 35 years) in a series of schizophrenia cases recruited from a research unit. Data were available on demographic variables, symptoms (Positive and Negative Syndrome Scale; PANSS), cognitive tests (Wechsler Adult Intelligence Scale-Revised; WAIS-R) and olfaction (University of Pennsylvania Smell Identification Test; UPSIT). We conducted a series of k-means clustering analyses to identify clusters of cases containing high concentrations of PARS. Two analyses generated clusters with high concentrations of PARS cases. The first analysis (N=136; PARS=34) revealed a cluster containing 83% PARS cases, in which the patients showed a significant discrepancy between verbal and performance intelligence. The mean paternal and maternal ages were 41 and 33, respectively. The second analysis (N=123; PARS=30) revealed a cluster containing 71% PARS cases, of which 93% were females; the mean age of onset of psychosis, at 17.2, was significantly early. These results strengthen the evidence that PARS cases differ from other patients with schizophrenia. Hypothesis-generating findings suggest that features of PARS may include a discrepancy between verbal and performance intelligence, and in females, an early age of onset. These findings provide a rationale for separating these phenotypes from others in future clinical, genetic and pathophysiologic studies of schizophrenia and in considering responses to treatment. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Cluster Method Analysis of K. S. C. Image

    Science.gov (United States)

    Rodriguez, Joe, Jr.; Desai, M.

    1997-01-01

    Information obtained from satellite-based systems has moved to the forefront as a method for identifying many land cover types. Identification of different land features through remote sensing is an effective tool for regional and global assessment of geometric characteristics. Classification data acquired from remote sensing images have a wide variety of applications. In particular, analysis of remote sensing images has special applications in the classification of various types of vegetation. Results obtained from classification studies of a particular area or region contribute to a greater understanding of which parameters (ecological, temporal, etc.) affect the region being analyzed. In this paper, we distinguish between the two types of classification approaches, although the focus is on the unsupervised classification method, using 1987 Thematic Mapper (TM) images of Kennedy Space Center.

  7. Environmental analysis applied to schools. Methodologies for data acquisition

    International Nuclear Information System (INIS)

    Andriola, L.; Ceccacci, R.

    2001-01-01

    Environmental analysis is the basis of environmental management for organizations and is considered the first step in EMAS. It allows an organization to identify and deal with the issues and to gain a clear knowledge of its environmental performance. Schools can be included among such organizations. Nevertheless, the complexity of environmental issues and applicable regulations makes it very difficult for a school that wants to implement an environmental management system (EMAS, ISO 14001, etc.) to face this first step. An instrument has therefore been defined, easy to use yet complete and coherent with the reference standard, to let schools choose their process for preparing the initial environmental review. This instrument consists essentially of cards that, once completed, facilitate the drafting of the environmental analysis report [it

  8. Cluster analysis of residential heat load profiles and the role of technical and household characteristics

    DEFF Research Database (Denmark)

    Carmo, Carolina; Christensen, Toke Haunstrup

    2016-01-01

    of the temporality of the energy demand is needed. This paper contributes to this by focusing on the daily load profiles of energy demand for heating of Danish dwellings with heat pumps. Based on hourly recordings from 139 dwellings and employing cluster and regression analysis, the paper explores patterns...... (typologies) in daily heating load profiles and how these relate to socio-economic and technical characteristics of the included households. The study shows that the load profiles vary according to the external load conditions. Two main clusters were identified for both weekdays and weekends and across load...

  9. Residential patterns in older homeless adults: Results of a cluster analysis.

    Science.gov (United States)

    Lee, Christopher Thomas; Guzman, David; Ponath, Claudia; Tieu, Lina; Riley, Elise; Kushel, Margot

    2016-03-01

    Adults aged 50 and older make up half of individuals experiencing homelessness and have high rates of morbidity and mortality. They may have different life trajectories and reside in different environments than do younger homeless adults. Although the environmental risks associated with homelessness are substantial, the environments in which older homeless individuals live have not been well characterized. We classified living environments and identified associated factors in a sample of older homeless adults. From July 2013 to June 2014, we recruited a community-based sample of 350 homeless men and women aged fifty and older in Oakland, California. We administered structured interviews including assessments of health, history of homelessness, social support, and life course. Participants used a recall procedure to describe where they stayed in the prior six months. We performed cluster analysis to classify residential venues and used multinomial logistic regression to identify individual factors prior to the onset of homelessness as well as the duration of unstable housing associated with living in them. We generated four residential groups describing those who were unsheltered (n = 162), cohabited unstably with friends and family (n = 57), resided in multiple institutional settings (shelters, jails, transitional housing) (n = 88), or lived primarily in rental housing (recently homeless) (n = 43). Compared to those who were unsheltered, having social support when last stably housed was significantly associated with cohabiting and institution use. Cohabiters and renters were significantly more likely to be women and have experienced a shorter duration of homelessness. Cohabiters were significantly more likely than unsheltered participants to have experienced abuse prior to losing stable housing. 
    Pre-homeless social support appears to protect against street homelessness while low levels of social support may increase the risk for becoming homeless immediately after

  10. Hyperspectral microscopy and cluster analysis for oral cancer diagnosis

    Science.gov (United States)

    Jarman, Anneliese; Manickavasagam, Arunthathi; Hosny, Neveen; Festy, Frederic

    2017-02-01

    Oral cancer incidence has been increasing in recent years, and late detection often leads to poor prognosis. Raman spectroscopy has been identified as a valuable diagnostic tool for cancer, but its time-consuming nature has prevented its clinical use. For Raman to become a realistic aid to histopathology, a rapid pre-screening technique is required to find small regions of interest on tissue sections [1]. The aim of this work is to investigate the feasibility of hyperspectral imaging in the visible spectral range as a fast imaging technique before Raman is performed. We have built a hyperspectral microscope which captures 300 focused and intensity-corrected images with wavelengths ranging from 450 to 750 nm in around 30 minutes, with sub-micron spatial resolution and around 10 nm spectral resolution. Hyperstacks of known absorbing samples, including fluorescent dyes and dried blood droplets, show excellent results with spectrally accurate transmission spectra and concentration-dependent intensity variations. We successfully showed the presence of different components from a non-absorbent saliva droplet sample. Data analysis is the greatest hurdle to the interpretation of more complex data such as unstained tissue sections.

  11. Analysis of Health Behavior Theories for Clustering of Health Behaviors.

    Science.gov (United States)

    Choi, Seung Hee; Duffy, Sonia A

    The objective of this article was to review the utility of established behavior theories, including the Health Belief Model, Theory of Reasoned Action, Theory of Planned Behavior, Transtheoretical Model, and Health Promotion Model, for addressing multiple health behaviors among people who smoke. It is critical to design future interventions for multiple health behavior changes tailored to individuals who currently smoke, yet it has not been addressed. Five health behavior theories/models were analyzed and critically evaluated. A review of the literature included a search of PubMed and Google Scholar from 2010 to 2016. Two hundred sixty-seven articles (252 studies from the initial search and 15 studies from the references of initially identified studies) were included in the analysis. Most of the health behavior theories/models emphasize psychological and cognitive constructs that can be applied only to one specific behavior at a time, thus making them not suitable to address multiple health behaviors. However, the Health Promotion Model incorporates "related behavior factors" that can explain multiple health behaviors among persons who smoke. Future multiple behavior interventions guided by the Health Promotion Model are necessary to show the utility and applicability of the model to address multiple health behaviors.

  12. A new classification of diabetic gait pattern based on cluster analysis of biomechanical data.

    Science.gov (United States)

    Sawacha, Zimi; Guarneri, Gabriella; Avogaro, Angelo; Cobelli, Claudio

    2010-09-01

    The diabetic foot, one of the most serious complications of diabetes mellitus and a major risk factor for plantar ulceration, is determined mainly by peripheral neuropathy. Neuropathic patients exhibit decreased stability while standing as well as during dynamic conditions. A new methodology for diabetic gait pattern classification based on cluster analysis has been proposed that aims to identify groups of subjects with similar patterns of gait and verify whether three-dimensional gait data are able to distinguish diabetic gait patterns from those of the control subjects. The gait of 20 nondiabetic individuals and 46 diabetes patients with and without peripheral neuropathy was analyzed [mean age 59.0 (2.9) and 61.1 (4.4) years, mean body mass index (BMI) 24.0 (2.8) and 26.3 (2.0)]. K-means cluster analysis was applied to classify the subjects' gait patterns through the analysis of their ground reaction forces, joint and segment (trunk, hip, knee, ankle) angles, and moments. Cluster analysis classification led to the definition of four well-separated clusters: one aggregating just neuropathic subjects, one aggregating both neuropathics and non-neuropathics, one including only diabetes patients, and one including either controls or diabetic and neuropathic subjects. Cluster analysis was useful in grouping subjects with similar gait patterns and provided evidence that there were subgroups that might otherwise not be observed if a group ensemble was presented for any specific variable. In particular, we observed the presence of neuropathic subjects with a gait similar to the controls and diabetes patients with a long disease duration with a gait as altered as the neuropathic one. © 2010 Diabetes Technology Society.

  13. Multi-scale visual analysis of time-varying electrocorticography data via clustering of brain regions.

    Science.gov (United States)

    Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; Dougherty, Max; Hamann, Bernd; Weber, Gunther H

    2017-06-06

    There exists a need for effective and easy-to-use software tools supporting the analysis of complex electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases requires the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend its time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.

  14. Managing the environmental impacts of land transport: integrating environmental analysis with urban planning

    International Nuclear Information System (INIS)

    Irving, Paul; Moncrieff, Ian

    2004-01-01

    Ecological systems have limits or thresholds that vary by pollutant type, emissions sources and the sensitivity of a given location. Human health can also indicate sensitivity. Good environmental management requires any problem to be defined to obtain efficient and effective solutions. Cities are where transport activities, effects and resource management decisions are often most focussed. The New Zealand Ministry of Transport has developed two environmental management tools. The Vehicle Fleet Model (VFM) is a predictive database of the environmental performance of the New Zealand traffic fleet (and rail fleet). It calculates indices of local air quality, stormwater, and greenhouse gas emissions. The second is an analytical process based on Environmental Capacity Analysis (ECA). Information on local traffic is combined with environmental performance data from the Vehicle Fleet Model. This can be integrated within a live, geo-spatially defined analysis of the overall environmental effects within a defined local area. Variations in urban form and activity (traffic and other) that contribute to environmental effects can be tracked. This enables analysis of a range of mitigation strategies that may contribute, now or in the future, to maintaining environmental thresholds or meeting targets. A case study of the application of this approach was conducted within Waitakere City. The focus was on improving the understanding of the relative significance of stormwater contaminants derived from land transport.

  15. Managing the environmental impacts of land transport: integrating environmental analysis with urban planning.

    Science.gov (United States)

    Irving, Paul; Moncrieff, Ian

    2004-12-01

    Ecological systems have limits or thresholds that vary by pollutant type, emissions sources and the sensitivity of a given location. Human health can also indicate sensitivity. Good environmental management requires any problem to be defined to obtain efficient and effective solutions. Cities are where transport activities, effects and resource management decisions are often most focussed. The New Zealand Ministry of Transport has developed two environmental management tools. The Vehicle Fleet Model (VFM) is a predictive database of the environmental performance of the New Zealand traffic fleet (and rail fleet). It calculates indices of local air quality, stormwater, and greenhouse gas emissions. The second is an analytical process based on Environmental Capacity Analysis (ECA). Information on local traffic is combined with environmental performance data from the Vehicle Fleet Model. This can be integrated within a live, geo-spatially defined analysis of the overall environmental effects within a defined local area. Variations in urban form and activity (traffic and other) that contribute to environmental effects can be tracked. This enables analysis of a range of mitigation strategies that may contribute, now or in the future, to maintaining environmental thresholds or meeting targets. A case study of the application of this approach was conducted within Waitakere City. The focus was on improving the understanding of the relative significance of stormwater contaminants derived from land transport.

  16. An Analysis of Base Level Environmental Organizations

    Science.gov (United States)

    1991-09-01

    Success," Engineering and Services Update, 1: 1-2 (January 1991). 35. Sutermeister , Robert A. People and Productivity (Second Edition). New York: McGraw...some form. Sutermeister includes Formal Organization, which is influenced by Structure, as a major factor affecting productivity (35). In his total...Department of Defense (DoD) there is a growing commitment towards environmental quality and compliance. As of 1988, DoD had more than 5,000 people working

  17. Degradation Assessment and Fault Diagnosis for Roller Bearing Based on AR Model and Fuzzy Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Lingli Jiang

    2011-01-01

    This paper proposes a new approach combining an autoregressive (AR) model and fuzzy cluster analysis for bearing fault diagnosis and degradation assessment. The AR model is an effective approach to extracting fault features, and is generally applied to stationary signals. However, the fault vibration signals of a roller bearing are non-stationary and non-Gaussian. To address this problem, the parameters of the AR model are estimated based on higher-order cumulants. The AR parameters are then taken as the feature vectors, and fuzzy cluster analysis is applied to perform classification and pattern recognition. Experimental results show that the proposed method can be used to identify various types and severities of bearing faults. This study is significant for non-stationary and non-Gaussian signal analysis, fault diagnosis and degradation assessment.
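
    The clustering step described in this abstract can be sketched with fuzzy c-means, one common fuzzy clustering algorithm. The abstract does not say which fuzzy method the authors used, so this choice is an assumption, as are the names `memberships_for` and `fuzzy_c_means` and the made-up one-dimensional feature values standing in for AR parameter vectors.

```python
import math


def memberships_for(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([1.0 / sum((di / dk) ** power for dk in dists) for di in dists])
    return rows


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations.

    `centers` holds caller-supplied starting centers, so the run is deterministic.
    `m` > 1 is the fuzzifier; larger values give softer memberships.
    """
    dims = len(points[0])
    for _ in range(iters):
        u = memberships_for(points, centers, m)
        new_centers = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims)]
            )
        centers = new_centers
    return centers, memberships_for(points, centers, m)


# Made-up feature vectors: two well-separated groups.
features = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
final_centers, final_u = fuzzy_c_means(features, centers=[[0.0], [5.0]])
for row in final_u:
    print([round(v, 3) for v in row])
```

    Unlike a hard assignment, each feature vector receives a degree of membership in every cluster, which is what makes a graded reading such as degradation severity possible.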

  18. Pathway enrichment and co-expression cluster analysis - FANTOM5 | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available List Contact us FANTOM5 Pathway enrichment and co-expression cluster analysis Data detail Data name Pathway enri...a.nbdc01389-003.V002 No Update V1 10.18908/lsdba.nbdc01389-003.V001 - Description of data contents Pathway enri.../ File size: 86 MB Simple search URL - Data acquisition method - Data analysis method Co-expression cluster analysis Gostat enri...chment analysis Pathway enrichment analysis Number of data en...ite Policy | Contact Us Pathway enrichment and co-expression cluster analysis - FANTOM5 | LSDB Archive ...

  19. Envri Cluster - a Community-Driven Platform of European Environmental Researcher Infrastructures for Providing Common E-Solutions for Earth Science

    Science.gov (United States)

    Asmi, A.; Sorvari, S.; Kutsch, W. L.; Laj, P.

    2017-12-01

    European long-term environmental research infrastructures (often referred to as ESFRI RIs) are the core facilities providing services for scientists in their quest to understand and predict the complex Earth system and its functioning, which requires long-term efforts to identify environmental changes (trends, thresholds and resilience, interactions and feedbacks). Many of the research infrastructures were originally developed to respond to the needs of their specific research communities; however, it is clear that strong collaboration among research infrastructures is needed to serve trans-boundary research, which requires exploring scientific questions at the intersection of different scientific fields, conducting joint research projects and developing concepts, devices, and methods that can be used to integrate knowledge. European environmental research infrastructures have already worked together successfully for many years and have established a cluster, the ENVRI cluster, for their collaborative work. The ENVRI cluster acts as a collaborative platform where the RIs can jointly agree on common solutions for their operations, draft strategies and policies, and share best practices and knowledge. The supporting project for the ENVRI cluster, the ENVRIplus project, brings together 21 European research infrastructures and infrastructure networks to work on joint technical solutions, data interoperability, access management, training, strategies and dissemination efforts. The ENVRI cluster acts as a one-stop shop for multidisciplinary RI users, other collaborative initiatives, projects and programmes, and coordinates and implements jointly agreed RI strategies.

  20. Analysis of cost data in a cluster-randomized, controlled trial: comparison of methods

    DEFF Research Database (Denmark)

    Sokolowski, Ineta; Ørnbøl, Eva; Rosendal, Marianne

    We consider health care data from a cluster-randomized intervention study in primary care to test whether the average health care costs among study patients differ between the two groups. The problems of analysing cost data are that most data are severely skewed. Median instead of mean... studies have used non-valid analysis of skewed data. We propose two different methods to compare mean cost in two groups. Firstly, we use a non-parametric bootstrap method where the re-sampling takes place on two levels in order to take into account the cluster effect. Secondly, we proceed with a log-transformation of the cost data and apply the normal theory on these data. Again we try to account for the cluster effect. The performance of these two methods is investigated in a simulation study. The advantages and disadvantages of the different approaches are discussed.
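
    The two-level bootstrap described in this abstract can be sketched as below. The abstract does not give the authors' exact resampling scheme, so this is one common variant under stated assumptions: clusters are drawn with replacement, then patients are drawn with replacement inside each drawn cluster, and a percentile interval is reported. The function name and the simulated lognormal costs are hypothetical.

```python
import numpy as np

def two_level_bootstrap_mean_diff(costs_a, costs_b, n_boot=2000, seed=0):
    """Bootstrap the difference in mean cost (arm A minus arm B).

    costs_a, costs_b: lists of 1-D arrays, one array of patient costs
    per cluster. Returns the mean bootstrap difference and a 95%
    percentile interval.
    """
    rng = np.random.default_rng(seed)

    def resampled_mean(clusters):
        # Level 1: draw clusters with replacement.
        picked = rng.integers(0, len(clusters), size=len(clusters))
        values = []
        for idx in picked:
            patients = np.asarray(clusters[idx], dtype=float)
            # Level 2: draw patients with replacement inside the cluster.
            values.append(rng.choice(patients, size=patients.size, replace=True))
        return np.concatenate(values).mean()

    diffs = np.array([
        resampled_mean(costs_a) - resampled_mean(costs_b)
        for _ in range(n_boot)
    ])
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    return diffs.mean(), (lower, upper)

# Simulated, right-skewed costs: 8 clusters of 20 patients per arm.
sim = np.random.default_rng(1)
arm_a = [sim.lognormal(5.0, 1.0, size=20) for _ in range(8)]
arm_b = [sim.lognormal(5.0, 1.0, size=20) for _ in range(8)]
estimate, (ci_low, ci_high) = two_level_bootstrap_mean_diff(arm_a, arm_b, n_boot=200)
```

    Resampling whole clusters first is what preserves the within-cluster correlation; resampling patients alone would treat all observations as independent and understate the uncertainty.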

  1. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  2. CLUSTER TAXOMETRY OF ATTENTION DEFICIT/ HYPERACTIVITY DISORDER WITH LATENT CLASS AND CORRESPONDENCE ANALYSIS

    Directory of Open Access Journals (Sweden)

    DAVID A PINEDA

    2007-08-01

    Full Text Available Attention deficit/hyperactivity disorder (ADHD) has heterogeneous symptoms with diverse grades of severity. Latent class cluster analysis (LCCA) can be used to classify children, using direct data from any instrument that reports these symptoms, without previous gold standard diagnosis. One ADHD symptoms checklist and one ADHD comorbidities questionnaire were used. LCCAs were developed for each instrument, which were administered to a sample of 540 children and adolescents, aged 4-17 years, from the regular school of Manizales, Colombia. A simple correspondence analysis (SCA) was done to determine the relationships between the groups classified from both LCCAs. Six clusters were obtained from the ADHD checklist and five from the ADHD comorbidities questionnaire. SCA found four independent groups, derived from the concordances between the 11 clusters obtained by the LCCAs from both instruments. These findings suggest that LCCA and SCA can be used as accurate taxometric procedures to classify externalizing psychopathologies.

  3. Contour Cluster Shape Analysis for Building Damage Detection from Post-earthquake Airborne LiDAR

    Directory of Open Access Journals (Sweden)

    HE Meizhang

    2015-04-01

    Full Text Available Detecting damaged buildings is an obligatory step prior to evaluating earthquake casualties and economic losses. It is very difficult to detect damaged buildings accurately based on the assumption that intact roofs appear in laser data as large planar segments whereas collapsed roofs are characterized by many small segments. This paper presents a contour cluster shape similarity analysis algorithm for reliable building damage detection from the post-earthquake airborne LiDAR point cloud. First we evaluate the entropies of shape similarities between all the combinations of two contour lines within a building cluster, which quantitatively describe the shape diversity. Then the maximum entropy model is employed to divide all the clusters into intact and damaged classes. Tests on LiDAR data from the El Mayor-Cucapah earthquake rupture demonstrate the accuracy and reliability of the proposed method.

  4. 3D Building Models Segmentation Based on K-Means++ Cluster Analysis

    Science.gov (United States)

    Zhang, C.; Mao, B.

    2016-10-01

    3D mesh model segmentation has drawn increasing attention in the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compression, texture mapping, model retrieval, etc. Segmentation is therefore a key problem in 3D mesh model processing. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance depends heavily on the randomized initial seed points (i.e., centroids), and different randomized centroids can give quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.
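
    The seeding step that distinguishes K-means++ from plain K-means can be sketched as follows. This is a minimal illustration of the standard K-means++ initialisation, not the authors' segmentation pipeline; the function name `kmeans_pp_init` and the toy points (standing in for per-face mesh features) are hypothetical, and the input is assumed to contain at least `k` distinct points.

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centroids with the K-means++ rule.

    The first centroid is uniform at random; each later centroid is
    drawn with probability proportional to the squared distance from
    the nearest centroid chosen so far.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    centers = [points[rng.integers(n)]]
    # Squared distance from every point to its nearest chosen centroid.
    d2 = np.sum((points - centers[0]) ** 2, axis=1)
    for _ in range(1, k):
        probs = d2 / d2.sum()  # assumes not all points coincide
        idx = rng.choice(n, p=probs)
        centers.append(points[idx])
        d2 = np.minimum(d2, np.sum((points - points[idx]) ** 2, axis=1))
    return np.array(centers)

# Three groups of identical points: seeds must land in three different groups,
# because already-covered locations have zero selection probability.
toy_points = np.array([[0.0, 0.0]] * 5 + [[10.0, 10.0]] * 5 + [[20.0, 0.0]] * 5)
seeds = kmeans_pp_init(toy_points, k=3)
```

    Spreading the seeds apart in this way is what reduces the dependence on the random start that the abstract attributes to plain K-means; the usual K-means iterations then run from these seeds.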

  5. 3D BUILDING MODELS SEGMENTATION BASED ON K-MEANS++ CLUSTER ANALYSIS

    Directory of Open Access Journals (Sweden)

    C. Zhang

    2016-10-01

    Full Text Available 3D mesh model segmentation has drawn increasing attention in the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compression, texture mapping, model retrieval, etc. Segmentation is therefore a key problem in 3D mesh model processing. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance depends heavily on the randomized initial seed points (i.e., centroids), and different randomized centroids can give quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.

  6. Visual MRI: merging information visualization and non-parametric clustering techniques for MRI dataset analysis.

    Science.gov (United States)

    Castellani, Umberto; Cristani, Marco; Combi, Carlo; Murino, Vittorio; Sbarbati, Andrea; Marzola, Pasquina

    2008-11-01

    This paper presents Visual MRI, an innovative tool for the magnetic resonance imaging (MRI) analysis of tumoral tissues. The main goal of the analysis is to separate each magnetic resonance image into meaningful clusters, highlighting zones which are more likely related to the cancer evolution. Such non-invasive analysis serves to address novel cancer treatments, resulting in a less destabilizing and more effective type of therapy than the chemotherapy-based ones. The advancements brought by Visual MRI are two: first, it is an integration of effective information visualization (IV) techniques into a clustering framework, which separates each MRI image into a set of informative clusters; the second improvement lies in the clustering framework itself, which is derived from a recently re-discovered non-parametric grouping strategy, i.e., the mean shift. The proposed methodology merges visualization methods and data mining techniques, providing a computational framework that allows the physician to move effectively from the MRI image to the images displaying the derived parameter space. An unsupervised non-parametric clustering algorithm, derived from the mean shift paradigm and called MRI-mean shift, is the novel data mining technique proposed here. The main underlying idea of such an approach is that the parameter space is regarded as an empirical probability density function to estimate: the possible separate modes and their attraction basins represent separated clusters. The mean shift algorithm needs sensitivity threshold values to be set, which could lead to highly different segmentation results. Usually, these values are set by hand. Here, with the MRI-mean shift algorithm, we propose a strategy based on a structured optimality criterion which effectively addresses this issue, resulting in a completely unsupervised clustering framework. A linked brushing visualization technique is then used for representing clusters on the parameter space and on the MRI image
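
    The idea that density modes and their attraction basins define clusters can be sketched with plain mean shift. This is not the MRI-mean shift algorithm of the abstract: the bandwidth and merge tolerance below are set by hand, which is exactly the manual step the authors propose to automate. A Gaussian kernel is assumed, and the function names and toy data are hypothetical.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, n_iter=50):
    """Move every point uphill on a Gaussian kernel density estimate."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Squared distances between current positions and the original data.
        d2 = np.sum((shifted[:, None, :] - points[None, :, :]) ** 2, axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # Each position moves to the kernel-weighted mean of the data.
        shifted = (w @ points) / w.sum(axis=1, keepdims=True)
    return shifted

def label_by_mode(shifted, tol):
    """Group points whose converged positions lie within tol of each other."""
    labels = np.full(len(shifted), -1, dtype=int)
    modes = []
    for i, p in enumerate(shifted):
        for j, mode in enumerate(modes):
            if np.linalg.norm(p - mode) < tol:
                labels[i] = j
                break
        else:
            # No existing mode is close enough: start a new cluster.
            modes.append(p)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)

# Two compact groups in a toy 2-D parameter space.
data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0],
                 [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [9.9, 10.0]])
converged = mean_shift_modes(data, bandwidth=1.0)
labels, modes = label_by_mode(converged, tol=0.5)
```

    No number of clusters is supplied; it emerges from how many modes the density estimate has at the chosen bandwidth, which is why the choice of that value changes the segmentation so strongly.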

  7. Spatio-temporal cluster analysis of the incidence of Campylobacter cases and patients with general diarrhea in a Danish county, 1995–2004

    Directory of Open Access Journals (Sweden)

    Simonsen Jacob

    2009-02-01

    Full Text Available Abstract Campylobacter infections are the main cause of bacterial gastroenteritis in Denmark. While primarily foodborne, Campylobacter infections are also to some degree acquired through other sources which may include contact with animals or the environment, locally contaminated drinking water and more. We analyzed Campylobacter cases for clustering in space and time for the large Danish island of Funen in the period 1995–2003, under the assumption that infections caused by 'environmental' factors may show persistent clustering while foodborne infections will occur randomly in space. Input data were geo-coded datasets of the addresses of laboratory-confirmed Campylobacter cases and of the background population of Funen County. The dataset had a spatial extent of 4.900 km2. Data were aggregated into units of analysis (so-called features) of 5 km by 5 km times 1 year, and the Campylobacter incidence calculated. We used a modified form of local Moran's I to test if features with similar incidence rates occurred next to each other in space and time, and compared the observed clusters with simulated clusters. Because clusters may be caused by a high tendency among local GPs to submit stool samples, we also analyzed a dataset of all submitted stool samples for comparison. The results showed a significant persisting clustering of Campylobacter incidence rates in the Western part of Funen. Results were visualized using the Netlogo software. The underlying causes of the observed clustering are not known and will require further examination, but may be partially explained by an increased rate of stool sample submissions by physicians in the area. We hope, by this approach, to have developed a tool which will allow for analyses of geographical clusters which may in turn form a basis for further epidemiological examinations to cast light on the sources of infection.
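
    The local Moran's I statistic mentioned in this abstract can be sketched as below. The authors used a modified, space-time form that the abstract does not specify; this sketch assumes the standard purely spatial statistic with row-standardised rook (edge-sharing) neighbours on a regular grid of at least two cells. The function names and the toy incidence grids are hypothetical.

```python
import numpy as np

def rook_weights(n_rows, n_cols):
    """Row-standardised weights: each cell's edge-sharing neighbours sum to 1."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i, rr * n_cols + cc] = 1.0
    return w / w.sum(axis=1, keepdims=True)

def local_morans_i(values, weights):
    """Standard local Moran's I: I_i = (z_i / m2) * sum_j w_ij * z_j."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    m2 = np.sum(z ** 2) / x.size
    return z / m2 * (np.asarray(weights, dtype=float) @ z)

w = rook_weights(4, 4)

# High incidence in the western half, low in the eastern half:
# similar values sit next to each other, so every local I is positive.
split = np.array([[10.0, 10.0, 1.0, 1.0]] * 4)
i_split = local_morans_i(split.ravel(), w)

# Checkerboard: every neighbour is dissimilar, so every local I is negative.
rows, cols = np.indices((4, 4))
checker = ((rows + cols) % 2).astype(float)
i_checker = local_morans_i(checker.ravel(), w)
```

    Significance would normally be judged against a reference distribution, for example by recomputing the statistic on randomly permuted or simulated incidence values, in the spirit of the comparison with simulated clusters described above.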

  8. HICOSMO - X-ray analysis of a complete sample of galaxy clusters

    Science.gov (United States)

    Schellenberger, G.; Reiprich, T.

    2017-10-01

    Galaxy clusters are known to be the largest virialized objects in the Universe. Based on the theory of structure formation one can use them as cosmological probes, since they originate from collapsed overdensities in the early Universe and witness its history. The X-ray regime provides the unique possibility to measure in detail the most massive visible component, the intracluster medium. Using Chandra observations of a local sample of 64 bright clusters (HIFLUGCS) we provide total (hydrostatic) and gas mass estimates of each cluster individually. Making use of the completeness of the sample we quantify two interesting cosmological parameters by a Bayesian cosmological likelihood analysis. We find $\Omega_{M} = 0.3 \pm 0.01$ and $\sigma_{8} = 0.79 \pm 0.03$ (statistical uncertainties) using our default analysis strategy combining both a mass function analysis and the gas mass fraction results. The main sources of biases that we discuss and correct here are (1) the influence of galaxy groups (higher incompleteness in parent samples and a differing behavior of the $L_{x}$-$M$ relation), (2) the hydrostatic mass bias (as determined by recent hydrodynamical simulations), (3) the extrapolation of the total mass (comparing various methods), (4) the theoretical halo mass function and (5) other cosmological (non-negligible neutrino mass) and instrumental (calibration) effects.

  9. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo

    2014-01-01

    Full Text Available To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members in the miRNA group always had various expression levels, and some even showed larger expression divergence. Despite the dynamic expression as well as individual differences, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation of expression may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.

  10. Links between patterns of racial socialization and discrimination experiences and psychological adjustment: a cluster analysis.

    Science.gov (United States)

    Ajayi, Alex A; Syed, Moin

    2014-10-01

    This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust its members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications on psychological adjustment. Copyright © 2014 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.

  11. A clustering analysis of eddies' spatial distribution in the South China Sea

    Science.gov (United States)

    Yi, J.; Du, Y.; Wang, X.; He, Z.; Zhou, C.

    2013-02-01

    Spatial variation is important for studying the mesoscale eddies in the South China Sea (SCS). To investigate such spatial variations, this study performed a clustering analysis of the eddies' distribution using the K-means approach. Results showed that the clustering tendency of anticyclonic eddies (AEs) and cyclonic eddies (CEs) was weak but not random, and the number of clusters was shown to be greater than four. Finer clustering results showed 10 regions where AEs were densely populated and 6 regions for CEs in the SCS. Previous studies confirmed these partitions and possible generation mechanisms were related. Comparisons between AEs and CEs revealed that patterns of AE are relatively more aggregated than those of CE, and specific distinctions were summarized: (1) to the southwest of Luzon Island, AEs and CEs are generated spatially apart; AEs are likely located north of 14° N and closer to shore, while CEs are to the south and further offshore. (2) The central SCS and Nansha Trough are mostly dominated by AEs. (3) Along 112° E, clusters of AEs and CEs are located sequentially apart, and the pairs off Vietnam represent the dipole structures. (4) To the southwest of the Dongsha Islands, AEs are concentrated to the east of CEs. Overlaps of AEs and CEs in the northeastern and southern SCS were further examined considering seasonal variations. The northeastern overlap represented near-concentric distributions while the southern one was a mixed effect of seasonal variations, complex circulations and topography influences.

  12. Environmental Education in Macedonian Schools: A Comparative Analysis of Textbooks

    Science.gov (United States)

    Srbinovski, Mile

    2013-01-01

    The purpose of this article is to describe and discuss an analysis of the extent to which environmental issues are addressed in the textbooks in the schools of the Republic of Macedonia. Research has analyzed a range of textbooks (279) published in the past 15 years. Our fundamental conclusion is that the inclusion of environmental issues in the…

  13. Challenge clusters facing LCA in environmental decision-making-what we can learn from biofuels.

    Science.gov (United States)

    McManus, Marcelle C; Taylor, Caroline M; Mohr, Alison; Whittaker, Carly; Scown, Corinne D; Borrion, Aiduan Li; Glithero, Neryssa J; Yin, Yao

    Bioenergy is increasingly used to help meet greenhouse gas (GHG) and renewable energy targets. However, bioenergy's sustainability has been questioned, resulting in increasing use of life cycle assessment (LCA). Bioenergy systems are global and complex, and market forces can result in significant changes, relevant to LCA and policy. The goal of this paper is to illustrate the complexities associated with LCA, with particular focus on bioenergy and associated policy development, so that its use can more effectively inform policymakers. The review is based on the results from a series of workshops focused on bioenergy life cycle assessment. Expert submissions were compiled and categorized within the first two workshops. Over 100 issues emerged. Accounting for redundancies and close similarities in the list, this reduced to around 60 challenges, many of which are deeply interrelated. Some of these issues were then explored further at a policy-facing workshop in London, UK. The authors applied a rigorous approach to categorize the challenges identified to be at the intersection of biofuels/bioenergy LCA and policy. The credibility of LCA is core to its use in policy. Even LCAs that comply with ISO standards and policy and regulatory instruments leave a great deal of scope for interpretation and flexibility. Within the bioenergy sector, this has led to frustration and at times a lack of obvious direction. This paper identifies the main challenge clusters: overarching issues, application and practice and value and ethical judgments. Many of these are reflective of the transition from application of LCA to assess individual products or systems to the wider approach that is becoming more common. Uncertainty in impact assessment strongly influences planning and compliance due to challenges in assigning accountability, and communicating the inherent complexity and uncertainty within bioenergy is becoming of greater importance. 
    The emergence of LCA in bioenergy governance is

  14. Market segmentation for multiple option healthcare delivery systems--an application of cluster analysis.

    Science.gov (United States)

    Jarboe, G R; Gates, R H; McDaniel, C D

    1990-01-01

    Healthcare providers of multiple option plans may be confronted with special market segmentation problems. This study demonstrates how cluster analysis may be used for discovering distinct patterns of preference for multiple option plans. The availability of metric, as opposed to categorical or ordinal, data provides the ability to use sophisticated analysis techniques which may be superior to frequency distributions and cross-tabulations in revealing preference patterns.

  15. Phenotype Clustering of Breast Epithelial Cells in Confocal Images based on Nuclear Protein Distribution Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Long, Fuhui; Peng, Hanchuan; Sudar, Damir; Levievre, Sophie A.; Knowles, David W.

    2006-09-05

    Background: The distribution of the chromatin-associated proteins plays a key role in directing nuclear function. Previously, we developed an image-based method to quantify the nuclear distributions of proteins and showed that these distributions depended on the phenotype of human mammary epithelial cells. Here we describe a method that creates a hierarchical tree of the given cell phenotypes and calculates the statistical significance between them, based on the clustering analysis of nuclear protein distributions. Results: Nuclear distributions of nuclear mitotic apparatus protein were previously obtained for non-neoplastic S1 and malignant T4-2 human mammary epithelial cells cultured for up to 12 days. Cell phenotype was defined as S1 or T4-2 and the number of days in culture. A probabilistic ensemble approach was used to define a set of consensus clusters from the results of multiple traditional cluster analysis techniques applied to the nuclear distribution data. Cluster histograms were constructed to show how cells in any one phenotype were distributed across the consensus clusters. Grouping various phenotypes allowed us to build phenotype trees and calculate the statistical difference between each group. The results showed that non-neoplastic S1 cells could be distinguished from malignant T4-2 cells with 94.19 percent accuracy; that proliferating S1 cells could be distinguished from differentiated S1 cells with 92.86 percent accuracy; and showed no significant difference between the various phenotypes of T4-2 cells corresponding to increasing tumor sizes. Conclusion: This work presents a cluster analysis method that can identify significant cell phenotypes, based on the nuclear distribution of specific proteins, with high accuracy.

  16. Validation of hierarchical cluster analysis for identification of bacterial species using 42 bacterial isolates

    Science.gov (United States)

    Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.

    2015-03-01

    Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates by Gram status and genus and species. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and subsequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.
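
    The two external evaluation criteria named in this abstract, purity and the Rand index, can be computed as sketched below. These are the textbook definitions, assumed here because the abstract does not state the exact formulas used; the function names and the six-sample toy labelling are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations

def purity(true_labels, cluster_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    clusters = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        clusters.setdefault(cluster, []).append(truth)
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(true_labels)

def rand_index(true_labels, cluster_labels):
    """Fraction of sample pairs on which the two labellings agree.

    A pair agrees if it is together in both labellings or apart in both.
    """
    pairs = list(combinations(range(len(true_labels)), 2))
    agreements = sum(
        (true_labels[i] == true_labels[j]) == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Six isolates from two genera; the clustering misplaces one isolate.
genus = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 1, 1]
p = purity(genus, clusters)        # 5 of 6 samples match their cluster majority
ri = rand_index(genus, clusters)   # 10 of 15 pairs agree
```

    Both scores equal 1.0 for a clustering that reproduces the reference labels exactly. Purity alone can be inflated by using many small clusters, which is one reason to report it alongside the Rand index.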

  17. Practice-related changes in neural activation patterns investigated via wavelet-based clustering analysis

    Science.gov (United States)

    Lee, Jinae; Park, Cheolw