WorldWideScience

Sample records for hierarchical clustering results

  1. Neutrosophic Hierarchical Clustering Algoritms

    Directory of Open Access Journals (Sweden)

    Rıdvan Şahin

    2014-03-01

    Full Text Available Interval neutrosophic set (INS is a generalization of interval valued intuitionistic fuzzy set (IVIFS, whose the membership and non-membership values of elements consist of fuzzy range, while single valued neutrosophic set (SVNS is regarded as extension of intuitionistic fuzzy set (IFS. In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify an interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.

  2. Convex Clustering: An Attractive Alternative to Hierarchical Clustering

    Science.gov (United States)

    Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

    2015-01-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340

  3. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

  4. Hierarchical clustering using correlation metric and spatial continuity constraint

    Science.gov (United States)

    Stork, Christopher L.; Brewer, Luke N.

    2012-10-02

    Large data sets are analyzed by hierarchical clustering using correlation as a similarity measure. This provides results that are superior to those obtained using a Euclidean distance similarity measure. A spatial continuity constraint may be applied in hierarchical clustering analysis of images.

  5. Cluster Based Hierarchical Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, Md. Golam; Kabir, M. Hasnat; Rahim, Muhammad Sajjadur; Ullah, Shaikh Enayet

    2012-01-01

    The efficient use of energy source in a sensor node is most desirable criteria for prolong the life time of wireless sensor network. In this paper, we propose a two layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP). We introduce a new concept called head-set, consists of one active cluster head and some other associate cluster heads within a cluster. The head-set members are responsible for control and management of the network. Results show that t...

  6. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  7. Similarity maps and hierarchical clustering for annotating FT-IR spectral images.

    Science.gov (United States)

    Zhong, Qiaoyong; Yang, Chen; Großerüschkamp, Frederik; Kallenbach-Thieltges, Angela; Serocka, Peter; Gerwert, Klaus; Mosig, Axel

    2013-11-20

    Unsupervised segmentation of multi-spectral images plays an important role in annotating infrared microscopic images and is an essential step in label-free spectral histopathology. In this context, diverse clustering approaches have been utilized and evaluated in order to achieve segmentations of Fourier Transform Infrared (FT-IR) microscopic images that agree with histopathological characterization. We introduce so-called interactive similarity maps as an alternative annotation strategy for annotating infrared microscopic images. We demonstrate that segmentations obtained from interactive similarity maps lead to similarly accurate segmentations as segmentations obtained from conventionally used hierarchical clustering approaches. In order to perform this comparison on quantitative grounds, we provide a scheme that allows to identify non-horizontal cuts in dendrograms. This yields a validation scheme for hierarchical clustering approaches commonly used in infrared microscopy. We demonstrate that interactive similarity maps may identify more accurate segmentations than hierarchical clustering based approaches, and thus are a viable and due to their interactive nature attractive alternative to hierarchical clustering. Our validation scheme furthermore shows that performance of hierarchical two-means is comparable to the traditionally used Ward's clustering. As the former is much more efficient in time and memory, our results suggest another less resource demanding alternative for annotating large spectral images.

  8. Clustering-based classification of road traffic accidents using hierarchical clustering and artificial neural networks.

    Science.gov (United States)

    Taamneh, Madhar; Taamneh, Salah; Alkheder, Sharaf

    2017-09-01

    Artificial neural networks (ANNs) have been widely used in predicting the severity of road traffic crashes. All available information about previously occurred accidents is typically used for building a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, leading, in most cases, to build less accurate predictors. Hierarchical clustering is a well-known clustering method that seeks to group data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. In order to reduce the amount of variation in data, hierarchical clustering was applied on the data set to organize it into six different forms, each with different number of clusters (i.e., clusters from 1 to 6). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas only 66% of the accidents were used to build the second model, and the remaining 34% were used to test it (percentage split). Finally, the weighted average accuracy was computed for each type of models in each from of data. The results show that when testing the models using the training set, clustering prior to classification achieves (11%-16%) more accuracy than without using clustering, while the percentage split achieves (2%-5%) more accuracy. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy if both types of models are taken into account.

  9. Clinical fracture risk evaluated by hierarchical agglomerative clustering

    DEFF Research Database (Denmark)

    Kruse, C; Eiken, P; Vestergaard, P

    2017-01-01

    reimbursement, primary healthcare sector use and comorbidity of female subjects were combined. Standardized variable means, Euclidean distances and Ward's D2 method of hierarchical agglomerative clustering (HAC), were used to form the clustering object. K number of clusters was selected with the lowest cluster...

  10. Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering

    Science.gov (United States)

    Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

    2012-01-01

    Background Gravitation field algorithm (GFA) is a new optimization algorithm which is based on an imitation of natural phenomena. GFA can do well both for searching global minimum and multi-minima in computational biology. But GFA needs to be improved for increasing efficiency, and modified for applying to some discrete data problems in system biology. Method An improved GFA called IGFA was proposed in this paper. Two parts were improved in IGFA. The first one is the rule of random division, which is a reasonable strategy and makes running time shorter. The other one is rotation factor, which can improve the accuracy of IGFA. And to apply IGFA to the hierarchical clustering, the initial part and the movement operator were modified. Results Two kinds of experiments were used to test IGFA. And IGFA was applied to hierarchical clustering. The global minimum experiment was used with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). Multi-minima experiment was used with IGFA and GFA. The two experiments results were compared with each other and proved the efficiency of IGFA. IGFA is better than GFA both in accuracy and running time. For the hierarchical clustering, IGFA is used to optimize the smallest distance of genes pairs, and the results were compared with GA and SA, singular-linkage clustering, UPGMA. The efficiency of IGFA is proved. PMID:23173043

  11. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    Full Text Available In the fields of geographic information systems (GIS and remote sensing (RS, the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited due to computational complexity, noise resistant ability and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC, is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters” if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the terminate condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  12. Merging K-means with hierarchical clustering for identifying general-shaped groups.

    Science.gov (United States)

    Peterson, Anna D; Ghosh, Arka P; Maitra, Ranjan

    2018-01-01

    Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and K -means clustering are two approaches but have different strengths and weaknesses. For instance, hierarchical clustering identifies groups in a tree-like structure but suffers from computational complexity in large datasets while K -means clustering is efficient but designed to identify homogeneous spherically-shaped clusters. We present a hybrid non-parametric clustering approach that amalgamates the two methods to identify general-shaped clusters and that can be applied to larger datasets. Specifically, we first partition the dataset into spherical groups using K -means. We next merge these groups using hierarchical methods with a data-driven distance measure as a stopping criterion. Our proposal has the potential to reveal groups with general shapes and structure in a dataset. We demonstrate good performance on several simulated and real datasets.

  13. Robust Pseudo-Hierarchical Support Vector Clustering

    DEFF Research Database (Denmark)

    Hansen, Michael Sass; Sjöstrand, Karl; Olafsdóttir, Hildur

    2007-01-01

    Support vector clustering (SVC) has proven an efficient algorithm for clustering of noisy and high-dimensional data sets, with applications within many fields of research. An inherent problem, however, has been setting the parameters of the SVC algorithm. Using the recent emergence of a method...... for calculating the entire regularization path of the support vector domain description, we propose a fast method for robust pseudo-hierarchical support vector clustering (HSVC). The method is demonstrated to work well on generated data, as well as for detecting ischemic segments from multidimensional myocardial...

  14. Hierarchical modeling of cluster size in wildlife surveys

    Science.gov (United States)

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between delectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).

  15. Energy Efficient Hierarchical Clustering Approaches in Wireless Sensor Networks: A Survey

    Directory of Open Access Journals (Sweden)

    Bilal Jan

    2017-01-01

    Full Text Available Wireless sensor networks (WSN are one of the significant technologies due to their diverse applications such as health care monitoring, smart phones, military, disaster management, and other surveillance systems. Sensor nodes are usually deployed in large number that work independently in unattended harsh environments. Due to constraint resources, typically the scarce battery power, these wireless nodes are grouped into clusters for energy efficient communication. In clustering hierarchical schemes have achieved great interest for minimizing energy consumption. Hierarchical schemes are generally categorized as cluster-based and grid-based approaches. In cluster-based approaches, nodes are grouped into clusters, where a resourceful sensor node is nominated as a cluster head (CH while in grid-based approach the network is divided into confined virtual grids usually performed by the base station. This paper highlights and discusses the design challenges for cluster-based schemes, the important cluster formation parameters, and classification of hierarchical clustering protocols. Moreover, existing cluster-based and grid-based techniques are evaluated by considering certain parameters to help users in selecting appropriate technique. Furthermore, a detailed summary of these protocols is presented with their advantages, disadvantages, and applicability in particular cases.

  16. Hierarchical clustering of HPV genotype patterns in the ASCUS-LSIL triage study

    Science.gov (United States)

    Wentzensen, Nicolas; Wilson, Lauren E.; Wheeler, Cosette M.; Carreon, Joseph D.; Gravitt, Patti E.; Schiffman, Mark; Castle, Philip E.

    2010-01-01

    Anogenital cancers are associated with about 13 carcinogenic HPV types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common which complicate the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the ASCUS-LSIL triage study using unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered using complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: Cluster 1 included all CIN3 histology with abnormal cytology; Cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion (HSIL) cytology; Cluster 3 included older women with normal or low grade histology/cytology and low viral load; Cluster 4 included younger women with low grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: Group 1 included only HPV16; Group 2 included nine carcinogenic types plus non-carcinogenic HPV53 and HPV66; and Group 3 included non-carcinogenic types plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and HSIL. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease using unsupervised hierarchical clustering can address complex genotype distributions on a population level. PMID:20959485

  17. The structure of nearby clusters of galaxies Hierarchical clustering and an application to the Leo region

    CERN Document Server

    Materne, J

    1978-01-01

    A new method of classifying groups of galaxies, called hierarchical clustering, is presented as a tool for the investigation of nearby groups of galaxies. The method is free from model assumptions about the groups. The scaling of the different coordinates is necessary, and the level from which one accepts the groups as real has to be determined. Hierarchical clustering is applied to an unbiased sample of galaxies in the Leo region. Five distinct groups result which have reasonable physical properties, such as low crossing times and conservative mass-to-light ratios, and which follow a radial velocity- luminosity relation. Only 4 out of 39 galaxies were adopted as field galaxies. (27 refs).

  18. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  19. Star Cluster Structure from Hierarchical Star Formation

    Science.gov (United States)

    Grudic, Michael; Hopkins, Philip; Murray, Norman; Lamberts, Astrid; Guszejnov, David; Schmitz, Denise; Boylan-Kolchin, Michael

    2018-01-01

    Young massive star clusters (YMCs) spanning 104-108 M⊙ in mass generally have similar radial surface density profiles, with an outer power-law index typically between -2 and -3. This similarity suggests that they are shaped by scale-free physics at formation. Recent multi-physics MHD simulations of YMC formation have also produced populations of YMCs with this type of surface density profile, allowing us to narrow down the physics necessary to form a YMC with properties as observed. We show that the shallow density profiles of YMCs are a natural result of phase-space mixing that occurs as they assemble from the clumpy, hierarchically-clustered configuration imprinted by the star formation process. We develop physical intuition for this process via analytic arguments and collisionless N-body experiments, elucidating the connection between star formation physics and star cluster structure. This has implications for the early-time structure and evolution of proto-globular clusters, and prospects for simulating their formation in the FIRE cosmological zoom-in simulations.

  20. A Hierarchical Clustering Methodology for the Estimation of Toxicity

    Science.gov (United States)

    A Quantitative Structure Activity Relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural sim...

  1. Hierarchical Control for Multiple DC Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    This paper presents a distributed hierarchical control framework to ensure reliable operation of dc Microgrid (MG) clusters. In this hierarchy, primary control is used to regulate the common bus voltage inside each MG locally. An adaptive droop method is proposed for this level which determines...

  2. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    Science.gov (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidian and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.

  3. Hierarchical video summarization based on context clustering

    Science.gov (United States)

    Tseng, Belle L.; Smith, John R.

    2003-11-01

    A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.

  4. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

    Science.gov (United States)

    Yau, Christopher; Holmes, Chris

    2011-07-01

    We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.

  5. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    Full Text Available This study was carried out to assess the physicochemical quality river Varuna inVaranasi,India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of relationship between physicochemical parameters. Hierarchical Cluster analysis was also performed to determine the sources of pollution in the river Varuna. The result showed quite high value of DO, Nitrate, BOD, COD and Total Alkalinity, above the BIS permissible limit. The results of correlation analysis identified key water parameters as pH, electrical conductivity, total alkalinity and nitrate, which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of total 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about the river water quality.International Journal of Environment Vol. 5 (1 2016,  pp: 32-44

  6. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  7. Which, When, and How: Hierarchical Clustering with Human–Machine Cooperation

    Directory of Open Access Journals (Sweden)

    Huanyang Zheng

    2016-12-01

    Full Text Available Human–Machine Cooperations (HMCs can balance the advantages and disadvantages of human computation (accurate but costly and machine computation (cheap but inaccurate. This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors in its current clustering results. We are interested in the machine’s strategy on handling the question operations, in terms of three problems: (1 Which question should the machine ask? (2 When should the machine ask the question (early or late? (3 How does the machine adjust the clustering result, if the machine’s mistake is found by the human? Based on the insights of these problems, an efficient algorithm is proposed with five implementation variations. Experiments on image clusterings show that the proposed algorithm can improve the clustering accuracy with few question operations.

  8. The efficiency of average linkage hierarchical clustering algorithm associated multi-scale bootstrap resampling in identifying homogeneous precipitation catchments

    Science.gov (United States)

    Chuan, Zun Liang; Ismail, Noriszura; Shinyie, Wendy Ling; Lit Ken, Tan; Fam, Soo-Fen; Senawi, Azlyna; Yusoff, Wan Nur Syahidah Wan

    2018-04-01

    Due to the limited of historical precipitation records, agglomerative hierarchical clustering algorithms widely used to extrapolate information from gauged to ungauged precipitation catchments in yielding a more reliable projection of extreme hydro-meteorological events such as extreme precipitation events. However, identifying the optimum number of homogeneous precipitation catchments accurately based on the dendrogram resulted using agglomerative hierarchical algorithms are very subjective. The main objective of this study is to propose an efficient regionalized algorithm to identify the homogeneous precipitation catchments for non-stationary precipitation time series. The homogeneous precipitation catchments are identified using average linkage hierarchical clustering algorithm associated multi-scale bootstrap resampling, while uncentered correlation coefficient as the similarity measure. The regionalized homogeneous precipitation is consolidated using K-sample Anderson Darling non-parametric test. The analysis result shows the proposed regionalized algorithm performed more better compared to the proposed agglomerative hierarchical clustering algorithm in previous studies.

  9. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There are growing needs for quick preview of video contents for the purpose of improving accessibility of video archives as well as reducing network traffics. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoid a repetition of computationally-intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  10. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    Science.gov (United States)

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  11. Hierarchical clusters of phytoplankton variables in dammed water bodies

    Science.gov (United States)

    Silva, Eliana Costa e.; Lopes, Isabel Cristina; Correia, Aldina; Gonçalves, A. Manuela

    2017-06-01

    In this paper a dataset containing biological variables of the water column of several Portuguese reservoirs is analyzed. Hierarchical cluster analysis is used to obtain clusters of phytoplankton variables of the phylum Cyanophyta, with the objective of validating the classification of Portuguese reservoirs previewly presented in [1] which were divided into three clusters: (1) Interior Tagus and Aguieira; (2) Douro; and (3) Other rivers. Now three new clusters of Cyanophyta variables were found. Kruskal-Wallis and Mann-Whitney tests are used to compare the now obtained Cyanophyta clusters and the previous Reservoirs clusters, in order to validate the classification of the water quality of reservoirs. The amount of Cyanophyta algae present in the reservoirs from the three clusters is significantly different, which validates the previous classification.

  12. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    Science.gov (United States)

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogenous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogenous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and were comprised of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.

  13. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  14. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiment. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Radar Emission Sources Identification Based on Hierarchical Agglomerative Clustering for Large Data Sets

    Directory of Open Access Journals (Sweden)

    Janusz Dudczyk

    2016-01-01

    Full Text Available More advanced recognition methods, which may recognize particular copies of radars of the same type, are called identification. The identification process of radar devices is a more specialized task which requires methods based on the analysis of distinctive features. These features are distinguished from the signals coming from the identified devices. Such a process is called Specific Emitter Identification (SEI. The identification of radar emission sources with the use of classic techniques based on the statistical analysis of basic measurable parameters of a signal such as Radio Frequency, Amplitude, Pulse Width, or Pulse Repetition Interval is not sufficient for SEI problems. This paper presents the method of hierarchical data clustering which is used in the process of radar identification. The Hierarchical Agglomerative Clustering Algorithm (HACA based on Generalized Agglomerative Scheme (GAS implemented and used in the research method is parameterized; therefore, it is possible to compare the results. The results of clustering are presented in dendrograms in this paper. The received results of grouping and identification based on HACA are compared with other SEI methods in order to assess the degree of their usefulness and effectiveness for systems of ESM/ELINT class.

  16. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Adamo, A.; Messa, M. [Dept. of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Kim, H. [Gemini Observatory, La Serena (Chile); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Hts., NY (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Dale, D. A. [Dept. of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Durham University, Durham (United Kingdom); Grebel, E. K.; Shabani, F. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120 Heidelberg (Germany); Johnson, K. E. [Dept. of Astronomy, University of Virginia, Charlottesville, VA (United States); Kahre, L. [Dept. of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kennicutt, R. C. [Institute of Astronomy, University of Cambridge, Cambridge (United Kingdom); Pellerin, A. [Dept. of Physics and Astronomy, State University of New York at Geneseo, Geneseo NY (United States); Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Smith, L. J. [European Space Agency/Space Telescope Science Institute, Baltimore, MD (United States); Thilker, D., E-mail: kgrasha@astro.umass.edu [Dept. of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD (United States)

    2017-05-10

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  17. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Science.gov (United States)

    Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Messa, M.; Pellerin, A.; Ryon, J. E.; Smith, L. J.; Shabani, F.; Thilker, D.; Ubeda, L.

    2017-05-01

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3-15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ˜40-60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  18. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    International Nuclear Information System (INIS)

    Grasha, K.; Calzetti, D.; Adamo, A.; Messa, M.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Shabani, F.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Pellerin, A.; Ryon, J. E.; Ubeda, L.; Smith, L. J.; Thilker, D.

    2017-01-01

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  19. Prediction of Solvent Physical Properties using the Hierarchical Clustering Method

    Science.gov (United States)

    Recently a QSAR (Quantitative Structure Activity Relationship) method, the hierarchical clustering method, was developed to estimate acute toxicity values for large, diverse datasets. This methodology has now been applied to the estimate solvent physical properties including sur...

  20. Determination of genetic structure of germplasm collections: are traditional hierarchical clustering methods appropriate for molecular marker data?

    NARCIS (Netherlands)

    Odong, T.L.; Heerwaarden, van J.; Jansen, J.; Hintum, van T.J.L.; Eeuwijk, van F.A.

    2011-01-01

    Despite the availability of newer approaches, traditional hierarchical clustering remains very popular in genetic diversity studies in plants. However, little is known about its suitability for molecular marker data. We studied the performance of traditional hierarchical clustering techniques using

  1. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected form coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which are widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.

  2. Novel density-based and hierarchical density-based clustering algorithms for uncertain data.

    Science.gov (United States)

    Zhang, Xianchao; Liu, Han; Zhang, Xiaotong

    2017-09-01

    Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we firstly propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN from the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulties for clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing

  3. Technique for fast and efficient hierarchical clustering

    Science.gov (United States)

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.

  4. Multi-documents summarization based on clustering of learning object using hierarchical clustering

    Science.gov (United States)

    Mustamiin, M.; Budi, I.; Santoso, H. B.

    2018-03-01

    The Open Educational Resources (OER) is a portal of teaching, learning and research resources that is available in public domain and freely accessible. Learning contents or Learning Objects (LO) are granular and can be reused for constructing new learning materials. LO ontology-based searching techniques can be used to search for LO in the Indonesia OER. In this research, LO from search results are used as an ingredient to create new learning materials according to the topic searched by users. Summarizing-based grouping of LO use Hierarchical Agglomerative Clustering (HAC) with the dependency context to the user’s query which has an average value F-Measure of 0.487, while summarizing by K-Means F-Measure only has an average value of 0.336.

  5. The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

    KAUST Repository

    Euá n, Carolina; Ombao, Hernando; Ortega, Joaquí n

    2018-01-01

    We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms

  6. Evolutionary-Hierarchical Bases of the Formation of Cluster Model of Innovation Economic Development

    Directory of Open Access Journals (Sweden)

    Yuliya Vladimirovna Dubrovskaya

    2016-10-01

    Full Text Available The functioning of a modern economic system is based on the interaction of objects of different hierarchical levels. Thus, the problem of the study of innovation processes taking into account the mutual influence of the activities of these economic actors becomes important. The paper dwells evolutionary basis for the formation of models of innovation development on the basis of micro and macroeconomic analysis. Most of the concepts recognized that despite a big number of diverse models, the coordination of the relations between economic agents is of crucial importance for the successful innovation development. According to the results of the evolutionary-hierarchical analysis, the authors reveal key phases of the development of forms of business cooperation, science and government in the domestic economy. It has become the starting point of the conception of the characteristics of the interaction in the cluster models of innovation development of the economy. Considerable expectancies on improvement of the national innovative system are connected with the development of cluster and network structures. The main objective of government authorities is the formation of mechanisms and institutions that will foster cooperation between members of the clusters. The article explains that the clusters cannot become the factors in the growth of the national economy, not being an effective tool for interaction between the actors of the regional innovative systems.

  7. Hierarchical Adaptive Means (HAM) clustering for hardware-efficient, unsupervised and real-time spike sorting.

    Science.gov (United States)

    Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G

    2014-09-30

    This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. An Energy Efficient Cooperative Hierarchical MIMO Clustering Scheme for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2011-12-01

    Full Text Available In this work, we present an energy efficient hierarchical cooperative clustering scheme for wireless sensor networks. Communication cost is a crucial factor in depleting the energy of sensor nodes. In the proposed scheme, nodes cooperate to form clusters at each level of network hierarchy ensuring maximal coverage and minimal energy expenditure with relatively uniform distribution of load within the network. Performance is enhanced by cooperative multiple-input multiple-output (MIMO communication ensuring energy efficiency for WSN deployments over large geographical areas. We test our scheme using TOSSIM and compare the proposed scheme with cooperative multiple-input multiple-output (CMIMO clustering scheme and traditional multihop Single-Input-Single-Output (SISO routing approach. Performance is evaluated on the basis of number of clusters, number of hops, energy consumption and network lifetime. Experimental results show significant energy conservation and increase in network lifetime as compared to existing schemes.

  9. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    Full Text Available In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 106 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4 where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the

  10. Hierarchical Network Design

    DEFF Research Database (Denmark)

    Thomadsen, Tommy

    2005-01-01

    Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...... with changing and increasing demands. Two-layer networks consist of one backbone network, which interconnects cluster networks. The clusters consist of nodes and links, which connect the nodes. One node in each cluster is a hub node, and the backbone interconnects the hub nodes of each cluster and thus...... the clusters. The design of hierarchical networks involves clustering of nodes, hub selection, and network design, i.e. selection of links and routing of ows. Hierarchical networks have been in use for decades, but integrated design of these networks has only been considered for very special types of networks...

  11. Hierarchical clustering into groups of human brain regions according to elemental composition

    International Nuclear Information System (INIS)

    Stedman, J.D.; Spyrou, N.M.

    1998-01-01

    Thirteen brain regions were dissected from both hemispheres of fifteen 'normal' ageing subjects (8 females, 7 males) of mean age 79±7 years. Elemental compositions were determined by simultaneous application of particle induced X-ray emission (PIXE) and Rutherford backscattering (RBS) analyses using a 2 MeV, 4 nA proton beam scanned over 4 mm 2 of the sample surface. Elemental concentrations were found to be dependent upon the brain region and hemisphere studied. Hierarchical cluster analysis was applied to group the brain regions according to the sample concentrations of eight elements. The resulting dendrogram is presented and its clusters related to the sample compositions of grey and white matter. (author)

  12. The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

    KAUST Repository

    Euán, Carolina

    2018-04-12

    We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The extent of similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. At each step of the algorithm, every time two clusters merge, a new spectral density is estimated using the whole information present in both clusters, which is representative of all the series in the new cluster. The method is implemented in an R package HSMClust. We present two applications of the HSM method, one to data coming from wave-height measurements in oceanography and the other to electroencefalogram (EEG) data.

  13. Hierarchical Star Formation in Turbulent Media: Evidence from Young Star Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Heights, NY (United States); Adamo, A.; Messa, M. [Department of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Aloisi, A.; Bright, S. N.; Lee, J. C.; Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Cook, D. O. [California Institute of Technology, 1200 East California Boulevard, Pasadena, CA (United States); Dale, D. A. [Department of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Department of Physics, Durham University, Durham (United Kingdom); Gallagher III, J. S. [Department of Astronomy, University of Wisconsin–Madison, Madison, WI (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Grebel, E. K. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120, Heidelberg (Germany); Kahre, L. [Department of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kim, H. [Gemini Observatory, La Serena (Chile); Krumholz, M. R., E-mail: kgrasha@astro.umass.edu [Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611 (Australia)

    2017-06-10

    We present an analysis of the positions and ages of young star clusters in eight local galaxies to investigate the connection between the age difference and separation of cluster pairs. We find that star clusters do not form uniformly but instead are distributed so that the age difference increases with the cluster pair separation to the 0.25–0.6 power, and that the maximum size over which star formation is physically correlated ranges from ∼200 pc to ∼1 kpc. The observed trends between age difference and separation suggest that cluster formation is hierarchical both in space and time: clusters that are close to each other are more similar in age than clusters born further apart. The temporal correlations between stellar aggregates have slopes that are consistent with predictions of turbulence acting as the primary driver of star formation. The velocity associated with the maximum size is proportional to the galaxy’s shear, suggesting that the galactic environment influences the maximum size of the star-forming structures.

  14. Inferring hierarchical clustering structures by deterministic annealing

    International Nuclear Information System (INIS)

    Hofmann, T.; Buhmann, J.M.

    1996-01-01

    The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree evaluation and we derive a non-greedy technique for tree growing. Applying the principles of maximum entropy and minimum cross entropy, a deterministic annealing algorithm is derived in a meanfield approximation. This technique allows us to canonically superimpose tree structures and to fit parameters to averaged or open-quote fuzzified close-quote trees

  15. Intensity-based hierarchical clustering in CT-scans: application to interactive segmentation in cardiology

    Science.gov (United States)

    Hadida, Jonathan; Desrosiers, Christian; Duong, Luc

    2011-03-01

    The segmentation of anatomical structures in Computed Tomography Angiography (CTA) is a pre-operative task useful in image guided surgery. Even though very robust and precise methods have been developed to help achieving a reliable segmentation (level sets, active contours, etc), it remains very time consuming both in terms of manual interactions and in terms of computation time. The goal of this study is to present a fast method to find coarse anatomical structures in CTA with few parameters, based on hierarchical clustering. The algorithm is organized as follows: first, a fast non-parametric histogram clustering method is proposed to compute a piecewise constant mask. A second step then indexes all the space-connected regions in the piecewise constant mask. Finally, a hierarchical clustering is achieved to build a graph representing the connections between the various regions in the piecewise constant mask. This step builds up a structural knowledge about the image. Several interactive features for segmentation are presented, for instance association or disassociation of anatomical structures. A comparison with the Mean-Shift algorithm is presented.

  16. The use of hierarchical clustering for the design of optimized monitoring networks

    Science.gov (United States)

    Soares, Joana; Makar, Paul Andrew; Aklilu, Yayne; Akingunola, Ayodeji

    2018-05-01

    Associativity analysis is a powerful tool to deal with large-scale datasets by clustering the data on the basis of (dis)similarity and can be used to assess the efficacy and design of air quality monitoring networks. We describe here our use of Kolmogorov-Zurbenko filtering and hierarchical clustering of NO2 and SO2 passive and continuous monitoring data to analyse and optimize air quality networks for these species in the province of Alberta, Canada. The methodology applied in this study assesses dissimilarity between monitoring station time series based on two metrics: 1 - R, R being the Pearson correlation coefficient, and the Euclidean distance; we find that both should be used in evaluating monitoring site similarity. We have combined the analytic power of hierarchical clustering with the spatial information provided by deterministic air quality model results, using the gridded time series of model output as potential station locations, as a proxy for assessing monitoring network design and for network optimization. We demonstrate that clustering results depend on the air contaminant analysed, reflecting the difference in the respective emission sources of SO2 and NO2 in the region under study. Our work shows that much of the signal identifying the sources of NO2 and SO2 emissions resides in shorter timescales (hourly to daily) due to short-term variation of concentrations and that longer-term averages in data collection may lose the information needed to identify local sources. However, the methodology identifies stations mainly influenced by seasonality, if larger timescales (weekly to monthly) are considered. We have performed the first dissimilarity analysis based on gridded air quality model output and have shown that the methodology is capable of generating maps of subregions within which a single station will represent the entire subregion, to a given level of dissimilarity. We have also shown that our approach is capable of identifying different

  17. A data-driven approach to estimating the number of clusters in hierarchical clustering [version 1; referees: 2 approved, 1 approved with reservations

    Directory of Open Access Journals (Sweden)

    Antoine E. Zambelli

    2016-12-01

    Full Text Available DNA microarray and gene expression problems often require a researcher to perform clustering on their data in a bid to better understand its structure. In cases where the number of clusters is not known, one can resort to hierarchical clustering methods. However, there currently exist very few automated algorithms for determining the true number of clusters in the data. We propose two new methods (mode and maximum difference for estimating the number of clusters in a hierarchical clustering framework to create a fully automated process with no human intervention. These methods are compared to the established elbow and gap statistic algorithms using simulated datasets and the Biobase Gene ExpressionSet. We also explore a data mixing procedure inspired by cross validation techniques. We find that the overall performance of the maximum difference method is comparable or greater to that of the gap statistic in multi-cluster scenarios, and achieves that performance at a fraction of the computational cost. This method also responds well to our mixing procedure, which opens the door to future research. We conclude that both the mode and maximum difference methods warrant further study related to their mixing and cross-validation potential. We particularly recommend the use of the maximum difference method in multi-cluster scenarios given its accuracy and execution times, and present it as an alternative to existing algorithms.

  18. Prediction of in vitro and in vivo oestrogen receptor activity using hierarchical clustering

    Science.gov (United States)

    In this study, hierarchical clustering classification models were developed to predict in vitro and in vivo oestrogen receptor (ER) activity. Classification models were developed for binding, agonist, and antagonist in vitro ER activity and for mouse in vivo uterotrophic ER bindi...

  19. Kendall’s tau and agglomerative clustering for structure determination of hierarchical Archimedean copulas

    Directory of Open Access Journals (Sweden)

    Górecki J.

    2017-01-01

    Full Text Available Several successful approaches to structure determination of hierarchical Archimedean copulas (HACs proposed in the literature rely on agglomerative clustering and Kendall’s correlation coefficient. However, there has not been presented any theoretical proof justifying such approaches. This work fills this gap and introduces a theorem showing that, given the matrix of the pairwise Kendall correlation coefficients corresponding to a HAC, its structure can be recovered by an agglomerative clustering technique.

  20. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Full Text Available Communication base stations generate massive data every day, these base station logs play an important value in mining of the business circles. This paper use data mining technology and hierarchical clustering algorithm to group the scope of business circle for the base station by recording the data of these base stations.Through analyzing the data of different business circle based on feature extraction and comparing different business circle category characteristics, which can choose a suitable area for operators of commercial marketing.

  1. A study of hierarchical clustering of galaxies in an expanding universe

    Science.gov (United States)

    Porter, D. H.

    The nonlinear hierarchical clustering of galaxies in an Einstein-deSitter (Omega = 1), initially white noise mass fluctuations (n = 0) model universe is investigated and shown to be in contradiction with previous results. The model is done in terms of an 11,000-body numerical simulation. The independent statics of 0.72 million particles are used to simulte the boundary conditions. A new method for integrating the Newtonian N-body gravity equations, which has controllable accuracy, incorporates a recursive center of mass reduction, and regularizes two body encounters is used to do the simulation. The coordinate system used here is well suited for the investigation of galaxy clustering, incorporating the independent positions and velocities of an arbitrary number of particles into a logarithmic hierarchy of center of mass nodes. The boundary for the simulation is created by using this hierarchy to map the independent statics of 0.72 million particles into just 4,000 particles. This method for simulating the boundary conditions also has controllable accuracy.

  2. Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space

    OpenAIRE

    Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

    2008-01-01

    Motivation: UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. Application: We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without ex...

  3. Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”

    Directory of Open Access Journals (Sweden)

    Refat Aljumily

    2015-09-01

    Full Text Available A few literary scholars have long claimed that Shakespeare did not write some of his best plays (history plays and tragedies and proposed at one time or another various suspect authorship candidates. Most modern-day scholars of Shakespeare have rejected this claim, arguing that strong evidence that Shakespeare wrote the plays and poems being his name appears on them as the author. This has caused and led to an ongoing scholarly academic debate for quite some long time. Stylometry is a fast-growing field often used to attribute authorship to anonymous or disputed texts. Stylometric attempts to resolve this literary puzzle have raised interesting questions over the past few years. The following paper contributes to “the Shakespeare authorship question” by using a mathematically-based methodology to examine the hypothesis that Shakespeare wrote all the disputed plays traditionally attributed to him. More specifically, the mathematically based methodology used here is based on Mean Proximity, as a linear hierarchical clustering method, and on Principal Components Analysis, as a non-hierarchical linear clustering method. It is also based, for the first time in the domain, on Self-Organizing Map U-Matrix and Voronoi Map, as non-linear clustering methods to cover the possibility that our data contains significant non-linearities. Vector Space Model (VSM is used to convert texts into vectors in a high dimensional space. The aim of which is to compare the degrees of similarity within and between limited samples of text (the disputed plays. The various works and plays assumed to have been written by Shakespeare and possible authors notably, Sir Francis Bacon, Christopher Marlowe, John Fletcher, and Thomas Kyd, where “similarity” is defined in terms of correlation/distance coefficient measure based on the frequency of usage profiles of function words, word bi-grams, and character triple-grams. The claim that Shakespeare authored all the disputed

  4. K-means clustering for optimal partitioning and dynamic load balancing of parallel hierarchical N-body simulations

    International Nuclear Information System (INIS)

    Marzouk, Youssef M.; Ghoniem, Ahmed F.

    2005-01-01

    A number of complex physical problems can be approached through N-body simulation, from fluid flow at high Reynolds number to gravitational astrophysics and molecular dynamics. In all these applications, direct summation is prohibitively expensive for large N and thus hierarchical methods are employed for fast summation. This work introduces new algorithms, based on k-means clustering, for partitioning parallel hierarchical N-body interactions. We demonstrate that the number of particle-cluster interactions and the order at which they are performed are directly affected by partition geometry. Weighted k-means partitions minimize the sum of clusters' second moments and create well-localized domains, and thus reduce the computational cost of N-body approximations by enabling the use of lower-order approximations and fewer cells. We also introduce compatible techniques for dynamic load balancing, including adaptive scaling of cluster volumes and adaptive redistribution of cluster centroids. We demonstrate the performance of these algorithms by constructing a parallel treecode for vortex particle simulations, based on the serial variable-order Cartesian code developed by Lindsay and Krasny [Journal of Computational Physics 172 (2) (2001) 879-907]. The method is applied to vortex simulations of a transverse jet. Results show outstanding parallel efficiencies even at high concurrencies, with velocity evaluation errors maintained at or below their serial values; on a realistic distribution of 1.2 million vortex particles, we observe a parallel efficiency of 98% on 1024 processors. Excellent load balance is achieved even in the face of several obstacles, such as an irregular, time-evolving particle distribution containing a range of length scales and the continual introduction of new vortex particles throughout the domain. Moreover, results suggest that k-means yields a more efficient partition of the domain than a global oct-tree

  5. Hierarchical clustering analysis of blood plasma lipidomics profiles from mono- and dizygotic twin families

    NARCIS (Netherlands)

    Draisma, H.H.; Reijmers, T.H.; Meulman, J.J.; Greef, J. van der; Hankemeier, T.; Boomsma, D.I.

    2013-01-01

    Twin and family studies are typically used to elucidate the relative contribution of genetic and environmental variation to phenotypic variation. Here, we apply a quantitative genetic method based on hierarchical clustering, to blood plasma lipidomics data obtained in a healthy cohort consisting of

  6. Implementation of hierarchical clustering using k-mer sparse matrix to analyze MERS-CoV genetic relationship

    Science.gov (United States)

    Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.

    2017-07-01

    Hierarchical clustering is one of effective methods in creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is k-mer method. Generally, k-mer is more efficient than some distance matrix calculation techniques. The steps of k-mer method are started from creating k-mer sparse matrix, and followed by creating k-mer singular value vectors. The last step is computing the distance amongst vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV is coming from Egypt. Moreover, we found that the MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factor.

  7. A hierarchical clustering scheme approach to assessment of IP-network traffic using detrended fluctuation analysis

    Science.gov (United States)

    Takuma, Takehisa; Masugi, Masao

    2009-03-01

    This paper presents an approach to the assessment of IP-network traffic in terms of the time variation of self-similarity. To get a comprehensive view in analyzing the degree of long-range dependence (LRD) of IP-network traffic, we use a hierarchical clustering scheme, which provides a way to classify high-dimensional data with a tree-like structure. Also, in the LRD-based analysis, we employ detrended fluctuation analysis (DFA), which is applicable to the analysis of long-range power-law correlations or LRD in non-stationary time-series signals. Based on sequential measurements of IP-network traffic at two locations, this paper derives corresponding values for the LRD-related parameter α that reflects the degree of LRD of measured data. In performing the hierarchical clustering scheme, we use three parameters: the α value, average throughput, and the proportion of network traffic that exceeds 80% of network bandwidth for each measured data set. We visually confirm that the traffic data can be classified in accordance with the network traffic properties, resulting in that the combined depiction of the LRD and other factors can give us an effective assessment of network conditions at different times.

  8. On the Disruption of Star Clusters in a Hierarchical Interstellar Medium

    Science.gov (United States)

    Elmegreen, Bruce G.; Hunter, Deidre A.

    2010-03-01

    The distribution of the number of clusters as a function of mass M and age T suggests that clusters get eroded or dispersed in a regular way over time, such that the cluster number decreases inversely as an approximate power law with T within each fixed interval of M. This power law is inconsistent with standard dispersal mechanisms such as cluster evaporation and cloud collisions. In the conventional interpretation, it requires the unlikely situation where diverse mechanisms stitch together over time in a way that is independent of environment or M. Here, we consider another model in which the large-scale distribution of gas in each star-forming region plays an important role. We note that star clusters form with positional and temporal correlations in giant cloud complexes, and suggest that these complexes dominate the tidal force and collisional influence on a cluster during its first several hundred million years. Because the cloud complex density decreases regularly with position from the cluster birth site, the harassment and collision rates between the cluster and the cloud pieces decrease regularly with age as the cluster drifts. This decrease is typically a power law of the form required to explain the mass-age distribution. We reproduce this distribution for a variety of cases, including rapid disruption, slow erosion, combinations of these two, cluster-cloud collisions, cluster disruption by hierarchical disassembly, and partial cluster disruption. We also consider apparent cluster mass loss by fading below the surface brightness limit of a survey. In all cases, the observed log M-log T diagram can be reproduced under reasonable assumptions.

  9. ON THE DISRUPTION OF STAR CLUSTERS IN A HIERARCHICAL INTERSTELLAR MEDIUM

    International Nuclear Information System (INIS)

    Elmegreen, Bruce G.; Hunter, Deidre A.

    2010-01-01

    The distribution of the number of clusters as a function of mass M and age T suggests that clusters get eroded or dispersed in a regular way over time, such that the cluster number decreases inversely as an approximate power law with T within each fixed interval of M. This power law is inconsistent with standard dispersal mechanisms such as cluster evaporation and cloud collisions. In the conventional interpretation, it requires the unlikely situation where diverse mechanisms stitch together over time in a way that is independent of environment or M. Here, we consider another model in which the large-scale distribution of gas in each star-forming region plays an important role. We note that star clusters form with positional and temporal correlations in giant cloud complexes, and suggest that these complexes dominate the tidal force and collisional influence on a cluster during its first several hundred million years. Because the cloud complex density decreases regularly with position from the cluster birth site, the harassment and collision rates between the cluster and the cloud pieces decrease regularly with age as the cluster drifts. This decrease is typically a power law of the form required to explain the mass-age distribution. We reproduce this distribution for a variety of cases, including rapid disruption, slow erosion, combinations of these two, cluster-cloud collisions, cluster disruption by hierarchical disassembly, and partial cluster disruption. We also consider apparent cluster mass loss by fading below the surface brightness limit of a survey. In all cases, the observed log M-log T diagram can be reproduced under reasonable assumptions.

  10. The Hierarchical Clustering of Tax Burden in the EU27

    Directory of Open Access Journals (Sweden)

    Simkova Nikola

    2015-09-01

    Full Text Available The issue of taxation has become more important due to a significant share of the government revenue. There are several ways of expressing the tax burden of countries. This paper describes the traditional approach as a share of tax revenue to GDP which is applied to the total taxation and the capital taxation as a part of tax systems affecting investment decisions. The implicit tax rate on capital created by Eurostat also offers a possible explanation of the tax burden on capital, so its components are analysed in detail. This study uses one of the econometric methods called the hierarchical clustering. The data on which the clustering is based comprises countries in the EU27 for the period of 1995 – 2012. The aim of this paper is to reveal clusters of countries in the EU27 with similar tax burden or tax changes. The findings suggest that mainly newly acceding countries (2004 and 2007 are in a group of countries with a low tax burden which tried to encourage investors by favourable tax rates. On the other hand, there are mostly countries from the original EU15. Some clusters may be explained by similar historical development, geographic and demographic characteristics.

  11. ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time.

    Science.gov (United States)

    Cai, Yunpeng; Sun, Yijun

    2011-08-01

    Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree constructed using a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits a quasilinear time and space complexity comparable to greedy heuristic clustering algorithms, while achieving a similar accuracy to the standard hierarchical clustering algorithm.

  12. Hierarchical Clustering of Large Databases and Classification of Antibiotics at High Noise Levels

    Directory of Open Access Journals (Sweden)

    Alexander V. Yarkov

    2008-12-01

    Full Text Available A new algorithm for divisive hierarchical clustering of chemical compounds based on 2D structural fragments is suggested. The algorithm is deterministic, and given a random ordering of the input, will always give the same clustering and can process a database up to 2 million records on a standard PC. The algorithm was used for classification of 1,183 antibiotics mixed with 999,994 random chemical structures. Similarity threshold, at which best separation of active and non active compounds took place, was estimated as 0.6. 85.7% of the antibiotics were successfully classified at this threshold with 0.4% of inaccurate compounds. A .sdf file was created with the probe molecules for clustering of external databases.

  13. A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Xiaowei Li

    2017-01-01

    Full Text Available A large number of studies demonstrated that major depressive disorder (MDD is characterized by the alterations in brain functional connections which is also identifiable during the brain’s “resting-state.” But, in the present study, the approach of constructing functional connectivity is often biased by the choice of the threshold. Besides, more attention was paid to the number and length of links in brain networks, and the clustering partitioning of nodes was unclear. Therefore, minimum spanning tree (MST analysis and the hierarchical clustering were first used for the depression disease in this study. Resting-state electroencephalogram (EEG sources were assessed from 15 healthy and 23 major depressive subjects. Then the coherence, MST, and the hierarchical clustering were obtained. In the theta band, coherence analysis showed that the EEG coherence of the MDD patients was significantly higher than that of the healthy controls especially in the left temporal region. The MST results indicated the higher leaf fraction in the depressed group. Compared with the normal group, the major depressive patients lost clustering in frontal regions. Our findings suggested that there was a stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions for MDD controls.

  14. Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters.

    Science.gov (United States)

    Hensman, James; Lawrence, Neil D; Rattray, Magnus

    2013-08-20

    Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering.The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.

  15. On hierarchical solutions to the BBGKY hierarchy

    Science.gov (United States)

    Hamilton, A. J. S.

    1988-01-01

    It is thought that the gravitational clustering of galaxies in the universe may approach a scale-invariant, hierarchical form in the small separation, large-clustering regime. Past attempts to solve the Born-Bogoliubov-Green-Kirkwood-Yvon (BBGKY) hierarchy in this regime have assumed a certain separable hierarchical form for the higher order correlation functions of galaxies in phase space. It is shown here that such separable solutions to the BBGKY equations must satisfy the condition that the clustered component of the solution has cluster-cluster correlations equal to galaxy-galaxy correlations to all orders. The solutions also admit the presence of an arbitrary unclustered component, which plays no dyamical role in the large-clustering regime. These results are a particular property of the specific separable model assumed for the correlation functions in phase space, not an intrinsic property of spatially hierarchical solutions to the BBGKY hierarchy. The observed distribution of galaxies does not satisfy the required conditions. The disagreement between theory and observation may be traced, at least in part, to initial conditions which, if Gaussian, already have cluster correlations greater than galaxy correlations.

  16. CHIMERA: Top-down model for hierarchical, overlapping and directed cluster structures in directed and weighted complex networks

    Science.gov (United States)

    Franke, R.

    2016-11-01

    In many networks discovered in biology, medicine, neuroscience and other disciplines special properties like a certain degree distribution and hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy and interactions on a mesoscale. It is not trivial choosing an appropriate detection algorithm because there are multiple network, cluster and algorithmic properties to be considered. Edges can be weighted and/or directed, clusters overlap or build a hierarchy in several ways. Algorithms differ not only in runtime, memory requirements but also in allowed network and cluster properties. They are based on a specific definition of what a cluster is, too. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. But what is left is a precise sandbox model to build hierarchical, overlapping and directed clusters for undirected or directed, binary or weighted complex random networks on basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis) which will be introduced and described here for the first time.

  17. Hierarchical Cluster Analysis of Semicircular Canal and Otolith Deficits in Bilateral Vestibulopathy

    Directory of Open Access Journals (Sweden)

    Alexander A. Tarnutzer

    2018-04-01

    Full Text Available BackgroundGait imbalance and oscillopsia are frequent complaints of bilateral vestibular loss (BLV. Video-head-impulse testing (vHIT of all six semicircular canals (SCCs has demonstrated varying involvement of the different canals. Sparing of anterior-canal function has been linked to aminoglycoside-related vestibulopathy and Menière’s disease. We hypothesized that utricular and saccular impairment [assessed by vestibular-evoked myogenic potentials (VEMPs] may be disease-specific also, possibly facilitating the differential diagnosis.MethodsWe searched our vHIT database (n = 3,271 for patients with bilaterally impaired SCC function who also received ocular VEMPs (oVEMPs and cervical VEMPs (cVEMPs and identified 101 patients. oVEMP/cVEMP latencies above the 95th percentile and peak-to-peak amplitudes below the 5th percentile of normal were considered abnormal. Frequency of impairment of vestibular end organs (horizontal/anterior/posterior SCC, utriculus/sacculus was analyzed with hierarchical cluster analysis and correlated with the underlying etiology.ResultsRates of utricular and saccular loss of function were similar (87.1 vs. 78.2%, p = 0.136, Fisher’s exact test. oVEMP abnormalities were found more frequent in aminoglycoside-related bilateral vestibular loss (BVL compared with Menière’s disease (91.7 vs. 54.6%, p = 0.039. Hierarchical cluster analysis indicated distinct patterns of vestibular end-organ impairment, showing that the results for the same end-organs on both sides are more similar than to other end-organs. Relative sparing of anterior-canal function was reflected in late merging with the other end-organs, emphasizing their distinct state. An anatomically corresponding pattern of SCC/otolith hypofunction was present in 60.4% (oVEMPs vs. horizontal SCCs, 34.7% (oVEMPs vs. anterior SCCs, and 48.5% (cVEMPs vs. posterior SCCs of cases. Average (±1 SD number of damaged sensors was 6.8 ± 2.2 out of 10

  18. Prioritizing the risk of plant pests by clustering methods; self-organising maps, k-means and hierarchical clustering

    Directory of Open Access Journals (Sweden)

    Susan Worner

    2013-09-01

    Full Text Available For greater preparedness, pest risk assessors are required to prioritise long lists of pest species with potential to establish and cause significant impact in an endangered area. Such prioritization is often qualitative, subjective, and sometimes biased, relying mostly on expert and stakeholder consultation. In recent years, cluster based analyses have been used to investigate regional pest species assemblages or pest profiles to indicate the risk of new organism establishment. Such an approach is based on the premise that the co-occurrence of well-known global invasive pest species in a region is not random, and that the pest species profile or assemblage integrates complex functional relationships that are difficult to tease apart. In other words, the assemblage can help identify and prioritise species that pose a threat in a target region. A computational intelligence method called a Kohonen self-organizing map (SOM, a type of artificial neural network, was the first clustering method applied to analyse assemblages of invasive pests. The SOM is a well known dimension reduction and visualization method especially useful for high dimensional data that more conventional clustering methods may not analyse suitably. Like all clustering algorithms, the SOM can give details of clusters that identify regions with similar pest assemblages, possible donor and recipient regions. More important, however SOM connection weights that result from the analysis can be used to rank the strength of association of each species within each regional assemblage. Species with high weights that are not already established in the target region are identified as high risk. However, the SOM analysis is only the first step in a process to assess risk to be used alongside or incorporated within other measures. Here we illustrate the application of SOM analyses in a range of contexts in invasive species risk assessment, and discuss other clustering methods such as k

  19. Hierarchical ordering with partial pairwise hierarchical relationships on the macaque brain data sets.

    Directory of Open Access Journals (Sweden)

    Woosang Lim

    Full Text Available Hierarchical organizations of information processing in the brain networks have been known to exist and widely studied. To find proper hierarchical structures in the macaque brain, the traditional methods need the entire pairwise hierarchical relationships between cortical areas. In this paper, we present a new method that discovers hierarchical structures of macaque brain networks by using partial information of pairwise hierarchical relationships. Our method uses a graph-based manifold learning to exploit inherent relationship, and computes pseudo distances of hierarchical levels for every pair of cortical areas. Then, we compute hierarchy levels of all cortical areas by minimizing the sum of squared hierarchical distance errors with the hierarchical information of few cortical areas. We evaluate our method on the macaque brain data sets whose true hierarchical levels are known as the FV91 model. The experimental results show that hierarchy levels computed by our method are similar to the FV91 model, and its errors are much smaller than the errors of hierarchical clustering approaches.

  20. A Clustering Routing Protocol for Mobile Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Jinke Huang

    2016-01-01

    Full Text Available The dynamic topology of a mobile ad hoc network poses a real challenge in the design of hierarchical routing protocol, which combines proactive with reactive routing protocols and takes advantages of both. And as an essential technique of hierarchical routing protocol, clustering of nodes provides an efficient method of establishing a hierarchical structure in mobile ad hoc networks. In this paper, we designed a novel clustering algorithm and a corresponding hierarchical routing protocol for large-scale mobile ad hoc networks. Each cluster is composed of a cluster head, several cluster gateway nodes, several cluster guest nodes, and other cluster members. The proposed routing protocol uses proactive protocol between nodes within individual clusters and reactive protocol between clusters. Simulation results show that the proposed clustering algorithm and hierarchical routing protocol provide superior performance with several advantages over existing clustering algorithm and routing protocol, respectively.

  1. Efficient visible light photocatalytic NO{sub x} removal with cationic Ag clusters-grafted (BiO){sub 2}CO{sub 3} hierarchical superstructures

    Energy Technology Data Exchange (ETDEWEB)

    Feng, Xin [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Wendong [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Deng, Hua [State Key Joint Laboratory of Environment Simulation and Pollution Control, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085 (China); Ni, Zilin [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Dong, Fan, E-mail: dfctbu@126.com [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Yuxin, E-mail: zhangyuxin@cqu.edu.cn [College of Materials Science and Engineering, National Key Laboratory of Fundamental Science of Micro/Nano-Devices and System Technology, Chongqing University, Chongqing 400044 (China)

    2017-01-15

    Graphical abstract: The cationic Ag clusters-grafted (BiO){sub 2}CO{sub 3} hierarchical superstructures exhibits highly enhanced visible light photocatalytic air purification through an interfacial charge transfer process induced by Ag clusters. - Highlights: • Microstructural optimization and surface cluster-grafting were firstly combined. • Cationic Ag clusters were grafted on the surface of (BiO){sub 2}CO{sub 3} superstructures. • The Ag clusters-grafted BHS displayed enhanced visible light photocatalysis. • Direct interfacial charge transfer (IFCT) from BHS to Ag clusters was proposed. • The charge transfer process and the dominant reactive species were revealed. - Abstract: A facile method was developed to graft cationic Ag clusters on (BiO){sub 2}CO{sub 3} hierarchical superstructures (BHS) surface to improve their visible light activity. Significantly, the resultant Ag clusters-grafted BHS displayed a highly enhanced visible light photocatalytic performance for NOx removal due to the direct interfacial charge transfer (IFCT) from BHS to Ag clusters. The chemical and coordination state of the cationic Ag clusters was determined with the extended X-ray absorption fine structure (EXAFS) and a theoretical structure model was proposed for this unique Ag clusters. The charge transfer process and the dominant reactive species (·OH) were revealed on the basis of electron spin resonance (ESR) trapping. A new photocatalysis mechanism of Ag clusters-grafted BHS under visible light involving IFCT process was uncovered. In addition, the cationic Ag clusters-grafted BHS also demonstrated high photochemical and structural stability under repeated photocatalysis runs. The perspective of enhancing photocatalysis through combination of microstructural optimization and IFCT could provide a new avenue for the developing efficient visible light photocatalysts.

  2. Using hierarchical clustering methods to classify motor activities of COPD patients from wearable sensor data

    Directory of Open Access Journals (Sweden)

    Reilly John J

    2005-06-01

    Full Text Available Abstract Background Advances in miniature sensor technology have led to the development of wearable systems that allow one to monitor motor activities in the field. A variety of classifiers have been proposed in the past, but little has been done toward developing systematic approaches to assess the feasibility of discriminating the motor tasks of interest and to guide the choice of the classifier architecture. Methods A technique is introduced to address this problem according to a hierarchical framework and its use is demonstrated for the application of detecting motor activities in patients with chronic obstructive pulmonary disease (COPD undergoing pulmonary rehabilitation. Accelerometers were used to collect data for 10 different classes of activity. Features were extracted to capture essential properties of the data set and reduce the dimensionality of the problem at hand. Cluster measures were utilized to find natural groupings in the data set and then construct a hierarchy of the relationships between clusters to guide the process of merging clusters that are too similar to distinguish reliably. It provides a means to assess whether the benefits of merging for performance of a classifier outweigh the loss of resolution incurred through merging. Results Analysis of the COPD data set demonstrated that motor tasks related to ambulation can be reliably discriminated from tasks performed in a seated position with the legs in motion or stationary using two features derived from one accelerometer. Classifying motor tasks within the category of activities related to ambulation requires more advanced techniques. While in certain cases all the tasks could be accurately classified, in others merging clusters associated with different motor tasks was necessary. When merging clusters, it was found that the proposed method could lead to more than 12% improvement in classifier accuracy while retaining resolution of 4 tasks. Conclusion Hierarchical

  3. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.

    2014-01-01

    For the improvement of qualitative and quantitative traits, existence of variability has prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed for correlation,agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for future breeding program. Correlation analysis revealed significant positive association between yield and yield components like fruit diameter, single fruit weight and number of fruits plant-1. Principal component (PC) analysis depicted first three PCs with Eigen-value higher than 1 contributing 81.72% of total variability for different traits. The PC-I showed positive factor loadings for all the traits except number of fruits plant-1. The contribution of single fruit weight and fruit diameter was highest in PC-1. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistics confirmed highest distance between cluster- III and cluster-V while maximum similarity was observed in cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programme. (author)

  4. An Algorithm for Inspecting Self Check-in Airline Luggage Based on Hierarchical Clustering and Cube-fitting

    Directory of Open Access Journals (Sweden)

    Gao Qingji

    2014-04-01

    Full Text Available Airport passengers are required to put only one baggage each time in the check-in self-service so that the baggage can be detected and identified successfully. In order to automatically get the number of baggage that had been put on the conveyor belt, dual laser rangefinders are used to scan the outer contour of luggage in this paper. The algorithm based on hierarchical clustering and cube-fitting is proposed to inspect the number and dimension of airline luggage. Firstly, the point cloud is projected to vertical direction. By the analysis of one-dimensional clustering, the number and height of luggage will be quickly computed. Secondly, the method of nearest hierarchical clustering is applied to divide the point cloud if the above cannot be distinguished. It can preferably solve the difficult issue like crossing or overlapping pieces of baggage. Finally, the point cloud is projected to the horizontal plane. By rotating point cloud based on the centre, its minimum bounding rectangle (MBR is obtained. The length and width of luggage are got form MBR. Many experiments in different cases have been done to verify the effectiveness of the algorithm.

  5. Biomolecule-Assisted Hydrothermal Synthesis and Self-Assembly of Bi2Te3 Nanostring-Cluster Hierarchical Structure

    DEFF Research Database (Denmark)

    Mi, Jianli; Lock, Nina; Sun, Ting

    2010-01-01

    A simple biomolecule-assisted hydrothermal approach has been developed for the fabrication of Bi2Te3 thermoelectric nanomaterials. The product has a nanostring-cluster hierarchical structure which is composed of ordered and aligned platelet-like crystals. The platelets are100 nm in diameter...

  6. SDN‐Based Hierarchical Agglomerative Clustering Algorithm for Interference Mitigation in Ultra‐Dense Small Cell Networks

    Directory of Open Access Journals (Sweden)

    Guang Yang

    2018-04-01

    Full Text Available Ultra‐dense small cell networks (UD‐SCNs have been identified as a promising scheme for next‐generation wireless networks capable of meeting the ever‐increasing demand for higher transmission rates and better quality of service. However, UD‐SCNs will inevitably suffer from severe interference among the small cell base stations, which will lower their spectral efficiency. In this paper, we propose a software‐defined networking (SDN‐based hierarchical agglomerative clustering (SDN‐HAC framework, which leverages SDN to centrally control all sub‐channels in the network, and decides on cluster merging using a similarity criterion based on a suitability function. We evaluate the proposed algorithm through simulation. The obtained results show that the proposed algorithm performs well and improves system payoff by 18.19% and 436.34% when compared with the traditional network architecture algorithms and non‐cooperative scenarios, respectively.

  7. A supplier selection using a hybrid grey based hierarchical clustering and artificial bee colony

    Directory of Open Access Journals (Sweden)

    Farshad Faezy Razi

    2014-06-01

    Full Text Available Selection of one or a combination of the most suitable potential providers and outsourcing problem is the most important strategies in logistics and supply chain management. In this paper, selection of an optimal combination of suppliers in inventory and supply chain management are studied and analyzed via multiple attribute decision making approach, data mining and evolutionary optimization algorithms. For supplier selection in supply chain, hierarchical clustering according to the studied indexes first clusters suppliers. Then, according to its cluster, each supplier is evaluated through Grey Relational Analysis. Then the combination of suppliers’ Pareto optimal rank and costs are obtained using Artificial Bee Colony meta-heuristic algorithm. A case study is conducted for a better description of a new algorithm to select a multiple source of suppliers.

  8. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Science.gov (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard

    2015-01-01

    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean Distance as the similarity measure to determine the clusters. Contingency tables, X2 tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, ppeople living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.

  10. A Performance-Prediction Model for PIC Applications on Clusters of Symmetric MultiProcessors: Validation with Hierarchical HPF+OpenMP Implementation

    Directory of Open Access Journals (Sweden)

    Sergio Briguglio

    2003-01-01

    Full Text Available A performance-prediction model is presented, which describes different hierarchical workload decomposition strategies for particle in cell (PIC codes on Clusters of Symmetric MultiProcessors. The devised workload decomposition is hierarchically structured: a higher-level decomposition among the computational nodes, and a lower-level one among the processors of each computational node. Several decomposition strategies are evaluated by means of the prediction model, with respect to the memory occupancy, the parallelization efficiency and the required programming effort. Such strategies have been implemented by integrating the high-level languages High Performance Fortran (at the inter-node stage and OpenMP (at the intra-node one. The details of these implementations are presented, and the experimental values of parallelization efficiency are compared with the predicted results.

  11. A Negative Selection Algorithm Based on Hierarchical Clustering of Self Set and its Application in Anomaly Detection

    Directory of Open Access Journals (Sweden)

    Wen Chen

    2011-08-01

    Full Text Available A negative selection algorithm based on the hierarchical clustering of self set HC-RNSA is introduced in this paper. Several strategies are applied to improve the algorithm performance. First, the self data set is replaced by the self cluster centers to compare with the detector candidates in each cluster level. As the number of self clusters is much less than the self set size, the detector generation efficiency is improved. Second, during the detector generation process, the detector candidates are restricted to the lower coverage space to reduce detector redundancy. In the article, the problem that the distances between antigens coverage to a constant value in the high dimensional space is analyzed, accordingly the Principle Component Analysis (PCA method is used to reduce the data dimension, and the fractional distance function is employed to enhance the distinctiveness between the self and non-self antigens. The detector generation procedure is terminated when the expected non-self coverage is reached. The theory analysis and experimental results demonstrate that the detection rate of HC-RNSA is higher than that of the traditional negative selection algorithms while the false alarm rate and time cost are reduced.

  12. A comparison of hierarchical cluster analysis and league table rankings as methods for analysis and presentation of district health system performance data in Uganda.

    Science.gov (United States)

    Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart

    2016-03-01

    In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis using SPSS version 20 and hierarchical cluster analysis, utilizing Wards' method was used. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include: perceived unfairness, as it did not take into consideration district peculiarities; and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the beginning point for identifying factors behind the observed performance of districts. Although league table ranking emphasize summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind observed performance of the different clusters. Other countries especially low-income countries that share many similarities with Uganda can learn from these experiences. © The Author 2015

  13. Characterizing the course of back pain after osteoporotic vertebral fracture: a hierarchical cluster analysis of a prospective cohort study.

    Science.gov (United States)

    Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki

    2017-09-23

    This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS for 128 patients was used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significant high baseline pain intensity, higher degree of angular instability, and higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.

  14. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Full Text Available Abstract Background The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from by giving more consideration to local climate variations.

  15. Comparison of multianalyte proficiency test results by sum of ranking differences, principal component analysis, and hierarchical cluster analysis.

    Science.gov (United States)

    Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša

    2013-10-01

    Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing of the SRD applicability contained the results reported during one of the proficiency tests (PTs) organized by EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, the SRD was also tested as a discriminant method alternative to existing average performance scores used to compare mutlianalyte PT results. SRD should be used along with the z scores--the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using seven folds cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH-concentrations are row-scaled, (i.e., z scores are analyzed as input for ranking) SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. The SRD technique is general in nature, i.e., it can

  16. Relation between financial market structure and the real economy: comparison between clustering methods.

    Science.gov (United States)

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

    2015-01-01

    We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover,we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].

  17. Relation between financial market structure and the real economy: comparison between clustering methods.

    Directory of Open Access Journals (Sweden)

    Nicoló Musmeci

    Full Text Available We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover,we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].

  18. NOVEL CONTEXT-AWARE CLUSTERING WITH HIERARCHICAL ADDRESSING (CCHA) FOR THE INTERNET OF THINGS (IoT)

    DEFF Research Database (Denmark)

    Mahalle, Parikshit N.; Prasad, Neeli R.; Prasad, Ramjee

    2013-01-01

    As computing technology becomes more tightly coupled into dynamic and mobile world of the Internet of Things (IoT), security mechanism becomes more stringent, less flexible and intrusive. Scalability issue in the IoT makes Identity Management (IdM) of ubiquitous things more challenging. Forming ad......-hoc network, interaction between these nomadic devices to provide seamless service extend the need of new identi-ties to the things, addressing and IdM in the IoT. New identities and identifier format to alleviate the perfor-mance issue is introduced in this paper. This paper pre-sents novel Context......-aware Clustering with Hierarchical Addressing (CCHA) scheme for the things with new identifier format. Simulation results shows that CCHA achieves better performance with less energy expendi-ture, less end-to-end delay and more throughput. Results also show that CCHA significantly reduces the failure probability...

  19. Hierarchical video summarization

    Science.gov (United States)

    Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

    1998-12-01

    We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.

  20. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN, self-organizing maps (SOM, alpha-cut fuzzy c-means (α-FCM, and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.

  1. A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.

    Science.gov (United States)

    Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing

    2017-01-01

    An important objective of wireless sensor network is to prolong the network life cycle, and topology control is of great significance for extending the network life cycle. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, first, fuzzy clustering algorithm is used to initial clustering for sensor nodes according to geographical locations, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and distance factors of wireless sensor network. Finally, the cluster head nodes in hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method achieved the purpose of reducing the mortality rate of nodes and extending the network life cycle.

  2. Permutation Tests of Hierarchical Cluster Analyses of Carrion Communities and Their Potential Use in Forensic Entomology.

    Science.gov (United States)

    van der Ham, Joris L

    2016-05-19

    Forensic entomologists can use carrion communities' ecological succession data to estimate the postmortem interval (PMI). Permutation tests of hierarchical cluster analyses of these data provide a conceptual method to estimate part of the PMI, the post-colonization interval (post-CI). This multivariate approach produces a baseline of statistically distinct clusters that reflect changes in the carrion community composition during the decomposition process. Carrion community samples of unknown post-CIs are compared with these baseline clusters to estimate the post-CI. In this short communication, I use data from previously published studies to demonstrate the conceptual feasibility of this multivariate approach. Analyses of these data produce series of significantly distinct clusters, which represent carrion communities during 1- to 20-day periods of the decomposition process. For 33 carrion community samples, collected over an 11-day period, this approach correctly estimated the post-CI within an average range of 3.1 days. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. Application of hierarchical clustering method to classify of space-time rainfall patterns

    Science.gov (United States)

    Yu, Hwa-Lung; Chang, Tu-Je

    2010-05-01

    Understanding the local precipitation patterns is essential to the water resources management and flooding mitigation. The precipitation patterns can vary in space and time depending upon the factors from different spatial scales such as local topological changes and macroscopic atmospheric circulation. The spatiotemporal variation of precipitation in Taiwan is significant due to its complex terrain and its location at west pacific and subtropical area, where is the boundary between the pacific ocean and Asia continent with the complex interactions among the climatic processes. This study characterizes local-scale precipitation patterns by classifying the historical space-time precipitation records. We applied the hierarchical ascending clustering method to analyze the precipitation records from 1960 to 2008 at the six rainfall stations located in Lan-yang catchment at the northeast of the island. Our results identify the four primary space-time precipitation types which may result from distinct driving forces from the changes of atmospheric variables and topology at different space-time scales. This study also presents an important application of the statistical downscaling to combine large-scale upper-air circulation with local space-time precipitation patterns.

  4. Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

    Science.gov (United States)

    Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

    2014-11-01

    Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  5. Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space.

    Science.gov (United States)

    Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

    2008-07-01

    UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without explicitly requiring all dissimilarities in memory. The algorithms are general and are applicable to any dataset. We present a data-dependent characterization of hardness and clustering efficiency. The presented concepts are applicable to any agglomerative clustering formulation. We apply our algorithm to the entire collection of protein sequences, to automatically build a comprehensive evolutionary-driven hierarchy of proteins from sequence alone. The newly created tree captures protein families better than state-of-the-art large-scale methods such as CluSTr, ProtoNet4 or single-linkage clustering. We demonstrate that leveraging the entire mass embodied in all sequence similarities allows to significantly improve on current protein family clusterings which are unable to directly tackle the sheer mass of this data. Furthermore, we argue that non-metric constraints are an inherent complexity of the sequence space and should not be overlooked. The robustness of UPGMA allows significant improvement, especially for multidomain proteins, and for large or divergent families. A comprehensive tree built from all UniProt sequence similarities, together with navigation and classification tools will be made available as part of the ProtoNet service. A C++ implementation of the algorithm is available on request.

  6. Hierarchical Control for Multiple DC-Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    DC microgrids (MGs) have gained research interest during the recent years because of many potential advantages as compared to the ac system. To ensure reliable operation of a low-voltage dc MG as well as its intelligent operation with the other DC MGs, a hierarchical control is proposed in this p......DC microgrids (MGs) have gained research interest during the recent years because of many potential advantages as compared to the ac system. To ensure reliable operation of a low-voltage dc MG as well as its intelligent operation with the other DC MGs, a hierarchical control is proposed...

  7. The Case for a Hierarchical Cosmology

    Science.gov (United States)

    Vaucouleurs, G. de

    1970-01-01

    The development of modern theoretical cosmology is presented and some questionable assumptions of orthodox cosmology are pointed out. Suggests that recent observations indicate that hierarchical clustering is a basic factor in cosmology. The implications of hierarchical models of the universe are considered. Bibliography. (LC)

  8. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis.Comprised of 10 chapters, this book begins with an introduction to the subject o

  9. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name of group of multivariate techniques whose principal purpose is to distinguish similar entities from the characteristics they process.To study this analysis, there are several algorithms that can be used. Therefore, this topic focuses to discuss the algorithms, such as, similarity measures, and hierarchical clustering which includes single linkage, complete linkage and average linkage method. also, non-hierarchical clustering method, which is popular name K -mean method ' will be discussed. Finally, this paper will be described the advantages and disadvantages of every methods

  10. CBHRP: A Cluster Based Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, M. G.; Kabir, M. Hasnat; Rahim, M. Sajjadur; Ullah, Sk. Enayet

    2012-01-01

    A new two layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP) is proposed in this paper. It is an extension of LEACH routing protocol. We introduce cluster head-set idea for cluster-based routing where several clusters are formed with the deployed sensors to collect information from target field. On rotation basis, a head-set member receives data from the neighbor nodes and transmits the aggregated results to the distance base station. This protocol ...

  11. Clustering, Hierarchical Organization, and the Topography of Abstract and Concrete Nouns

    Directory of Open Access Journals (Sweden)

    Joshua eTroche

    2014-04-01

    Full Text Available The empirical study of language has historically relied heavily upon concrete word stimuli. By definition, concrete words evoke salient perceptual associations that fit well within feature-based, sensorimotor models of word meaning. In contrast, many theorists argue that abstract words are disembodied in that their meaning is mediated through language. We investigated word meaning as distributed in multidimensional space using hierarchical cluster analysis. Participants (N=365 rated target words (n=400 English nouns across 12 cognitive dimensions (e.g., polarity, ease of teaching, emotional valence. Factor reduction revealed three latent factors, corresponding roughly to perceptual salience, affective association, and magnitude. We plotted the original 400 words for the three latent factors. Abstract and concrete words showed overlap in their topography but also differentiated themselves in semantic space. This topographic approach to word meaning offers a unique perspective to word concreteness.

  12. Clustering, hierarchical organization, and the topography of abstract and concrete nouns.

    Science.gov (United States)

    Troche, Joshua; Crutch, Sebastian; Reilly, Jamie

    2014-01-01

    The empirical study of language has historically relied heavily upon concrete word stimuli. By definition, concrete words evoke salient perceptual associations that fit well within feature-based, sensorimotor models of word meaning. In contrast, many theorists argue that abstract words are "disembodied" in that their meaning is mediated through language. We investigated word meaning as distributed in multidimensional space using hierarchical cluster analysis. Participants (N = 365) rated target words (n = 400 English nouns) across 12 cognitive dimensions (e.g., polarity, ease of teaching, emotional valence). Factor reduction revealed three latent factors, corresponding roughly to perceptual salience, affective association, and magnitude. We plotted the original 400 words for the three latent factors. Abstract and concrete words showed overlap in their topography but also differentiated themselves in semantic space. This topographic approach to word meaning offers a unique perspective to word concreteness.

  13. BioCluster: Tool for Identification and Clustering of Enterobacteriaceae Based on Biochemical Data

    Directory of Open Access Journals (Sweden)

    Ahmed Abdullah

    2015-06-01

    Full Text Available Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity in calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC and the Improved Hierarchical Clustering (IHC, a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in result of 1–47 biochemical tests within this Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.

  14. Clustering of resting state networks.

    Directory of Open Access Journals (Sweden)

    Megan H Lee

    Full Text Available The goal of the study was to demonstrate a hierarchical structure of resting state activity in the healthy brain using a data-driven clustering algorithm.The fuzzy-c-means clustering algorithm was applied to resting state fMRI data in cortical and subcortical gray matter from two groups acquired separately, one of 17 healthy individuals and the second of 21 healthy individuals. Different numbers of clusters and different starting conditions were used. A cluster dispersion measure determined the optimal numbers of clusters. An inner product metric provided a measure of similarity between different clusters. The two cluster result found the task-negative and task-positive systems. The cluster dispersion measure was minimized with seven and eleven clusters. Each of the clusters in the seven and eleven cluster result was associated with either the task-negative or task-positive system. Applying the algorithm to find seven clusters recovered previously described resting state networks, including the default mode network, frontoparietal control network, ventral and dorsal attention networks, somatomotor, visual, and language networks. The language and ventral attention networks had significant subcortical involvement. This parcellation was consistently found in a large majority of algorithm runs under different conditions and was robust to different methods of initialization.The clustering of resting state activity using different optimal numbers of clusters identified resting state networks comparable to previously obtained results. This work reinforces the observation that resting state networks are hierarchically organized.

  15. Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

    DEFF Research Database (Denmark)

    Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte

    2006-01-01

    Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods...... analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset...

  16. Hierarchical Controlled Remote State Preparation by Using a Four-Qubit Cluster State

    Science.gov (United States)

    Ma, Peng-Cheng; Chen, Gui-Bin; Li, Xiao-Wei; Zhan, You-Bang

    2018-02-01

    We propose a scheme for hierarchical controlled remote preparation of an arbitrary single-qubit state via a four-qubit cluster state as the quantum channel. In this scheme, a sender wishes to help three agents to remotely prepare a quantum state, respectively. The three agents are divided into two grades, that is, an agent is in the upper grade and other two agents are in the lower grade. In this process of remote state preparation, the agent of the upper grade only needs the assistance of any one of the other two agents for recovering the sender's original state, while an agent of the lower grade needs the collaboration of all the other two agents. In other words, the agents of two grades have different authorities to reconstruct sender's original state.

  17. Hierarchical Controlled Remote State Preparation by Using a Four-Qubit Cluster State

    Science.gov (United States)

    Ma, Peng-Cheng; Chen, Gui-Bin; Li, Xiao-Wei; Zhan, You-Bang

    2018-06-01

    We propose a scheme for hierarchical controlled remote preparation of an arbitrary single-qubit state via a four-qubit cluster state as the quantum channel. In this scheme, a sender wishes to help three agents to remotely prepare a quantum state, respectively. The three agents are divided into two grades, that is, an agent is in the upper grade and other two agents are in the lower grade. In this process of remote state preparation, the agent of the upper grade only needs the assistance of any one of the other two agents for recovering the sender's original state, while an agent of the lower grade needs the collaboration of all the other two agents. In other words, the agents of two grades have different authorities to reconstruct sender's original state.

  18. Hydrothermal synthesis and photoluminescent properties of hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower-like clusters

    Science.gov (United States)

    Amurisana, Bao.; Zhiqiang, Song.; Haschaolu, O.; Yi, Chen; Tegus, O.

    2018-02-01

    3D hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower clusters were successfully prepared on glass slide substrate by a simple, economical hydrothermal process with the assistance of disodium ethylenediaminetetraacetic acid (Na2H2L, where L4- = (CH2COO)2N(CH2)2N(CH2COO)24-). In this process, Na2H2L was used as both a chelating agent and a structure-director. The hierarchical flower clusters have an average diameter of 7-12 μm and are composed of well-aligned microrods. The influence of the molar ratio of Na2H2L/Gd3+ and reaction time on the morphology was systematically studied. A possible crystal growth and formation mechanism of hierarchical flower clusters is proposed based on the evolution of morphology as a function of reaction time. The self-assembled GdPO4·H2O:Ln3+ superstructures exhibit strong orange-red (Eu3+, 5D0 → 7F1), green (Tb3+, 5D4 → 7F5) and near ultraviolet emissions (Ce3+, 5d → 7F5/2) under ultraviolet excitation, respectively. This study may provide a new channel for building hierarchically superstructued oxide micro/nanomaterials with optical and new properties.

  19. Micromechanics of hierarchical materials

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon, Jr.

    2012-01-01

    A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites...... with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them......, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...

  20. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  1. Hierarchical silica particles by dynamic multicomponent assembly

    DEFF Research Database (Denmark)

    Wu, Z. W.; Hu, Q. Y.; Pang, J. B.

    2005-01-01

    Abstract: Aerosol-assisted assembly of mesoporous silica particles with hierarchically controllable pore structure has been prepared using cetyltrimethylammonium bromide (CTAB) and poly(propylene oxide) (PPO, H[OCH(CH3)CH2],OH) as co-templates. Addition of the hydrophobic PPO significantly...... influences the delicate hydrophilic-hydrophobic balance in the well-studied CTAB-silicate co-assembling system, resulting in various mesostructures (such as hexagonal, lamellar, and hierarchical structure). The co-assembly of CTAB, silicate clusters, and a low-molecular-weight PPO (average M-n 425) results...... in a uniform lamellar structure, while the use of a high-molecular-weight PPO (average M-n 2000), which is more hydrophobic, leads to the formation of hierarchical pore structure that contains meso-meso or meso-macro pore structure. The role of PPO additives on the mesostructure evolution in the CTAB...

  2. Discovering hierarchical structure in normal relational data

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten

    2014-01-01

    -parametric generative model for hierarchical clustering of similarity based on multifurcating Gibbs fragmentation trees. This allows us to infer and display the posterior distribution of hierarchical structures that comply with the data. We demonstrate the utility of our method on synthetic data and data of functional...

  3. DATA CLASSIFICATION WITH NEURAL CLASSIFIER USING RADIAL BASIS FUNCTION WITH DATA REDUCTION USING HIERARCHICAL CLUSTERING

    Directory of Open Access Journals (Sweden)

    M. Safish Mary

    2012-04-01

    Full Text Available Classification of large amount of data is a time consuming process but crucial for analysis and decision making. Radial Basis Function networks are widely used for classification and regression analysis. In this paper, we have studied the performance of RBF neural networks to classify the sales of cars based on the demand, using kernel density estimation algorithm which produces classification accuracy comparable to data classification accuracy provided by support vector machines. In this paper, we have proposed a new instance based data selection method where redundant instances are removed with help of a threshold thus improving the time complexity with improved classification accuracy. The instance based selection of the data set will help reduce the number of clusters formed thereby reduces the number of centers considered for building the RBF network. Further the efficiency of the training is improved by applying a hierarchical clustering technique to reduce the number of clusters formed at every step. The paper explains the algorithm used for classification and for conditioning the data. It also explains the complexities involved in classification of sales data for analysis and decision-making.

  4. Competitive cluster growth in complex networks.

    Science.gov (United States)

    Moreira, André A; Paula, Demétrius R; Costa Filho, Raimundo N; Andrade, José S

    2006-06-01

    In this work we propose an idealized model for competitive cluster growth in complex networks. Each cluster can be thought of as a fraction of a community that shares some common opinion. Our results show that the cluster size distribution depends on the particular choice for the topology of the network of contacts among the agents. As an application, we show that the cluster size distributions obtained when the growth process is performed on hierarchical networks, e.g., the Apollonian network, have a scaling form similar to what has been observed for the distribution of a number of votes in an electoral process. We suggest that this similarity may be due to the fact that social networks involved in the electoral process may also possess an underlining hierarchical structure.

  5. Examination of Clustering in Eutectic Microstrcture

    Directory of Open Access Journals (Sweden)

    Bortnyik K.

    2017-06-01

    Full Text Available The eutectic microstructures are complex microstructures and a hard work to describe it with few numbers. The eutectics builds up eutectic cells. In the cells the phases are clustered. With the development of big databases the data mining also develops, and produces a lot of method to handling the large datasets, and earns information from the sets. One typical method is the clustering, which finds the groups in the datasets. In this article a partitioning and a hierarchical clustering is applied to eutectic structures to find the clusters. In the case of AlMn alloy the K-means algorithm work well, and find the eutectic cells. In the case of ductile cast iron the hierarchical clustering works better. With the combination of the partitioning and hierarchical clustering with the image transformation, an effective method is developed for clustering the objects in the microstructures.

  6. A local adaptive algorithm for emerging scale-free hierarchical networks

    International Nuclear Information System (INIS)

    Gomez Portillo, I J; Gleiser, P M

    2010-01-01

    In this work we study a growing network model with chaotic dynamical units that evolves using a local adaptive rewiring algorithm. Using numerical simulations we show that the model allows for the emergence of hierarchical networks. First, we show that the networks that emerge with the algorithm present a wide degree distribution that can be fitted by a power law function, and thus are scale-free networks. Using the LaNet-vi visualization tool we present a graphical representation that reveals a central core formed only by hubs, and also show the presence of a preferential attachment mechanism. In order to present a quantitative analysis of the hierarchical structure we analyze the clustering coefficient. In particular, we show that as the network grows the clustering becomes independent of system size, and also presents a power law decay as a function of the degree. Finally, we compare our results with a similar version of the model that has continuous non-linear phase oscillators as dynamical units. The results show that local interactions play a fundamental role in the emergence of hierarchical networks.

  7. bcl::Cluster : A method for clustering biological molecules coupled with visualization in the Pymol Molecular Graphics System.

    Science.gov (United States)

    Alexander, Nathan; Woetzel, Nils; Meiler, Jens

    2011-02-01

    Clustering algorithms are used as data analysis tools in a wide variety of applications in Biology. Clustering has become especially important in protein structure prediction and virtual high throughput screening methods. In protein structure prediction, clustering is used to structure the conformational space of thousands of protein models. In virtual high throughput screening, databases with millions of drug-like molecules are organized by structural similarity, e.g. common scaffolds. The tree-like dendrogram structure obtained from hierarchical clustering can provide a qualitative overview of the results, which is important for focusing detailed analysis. However, in practice it is difficult to relate specific components of the dendrogram directly back to the objects of which it is comprised and to display all desired information within the two dimensions of the dendrogram. The current work presents a hierarchical agglomerative clustering method termed bcl::Cluster. bcl::Cluster utilizes the Pymol Molecular Graphics System to graphically depict dendrograms in three dimensions. This allows simultaneous display of relevant biological molecules as well as additional information about the clusters and the members comprising them.

  8. Preliminary results from the hierarchical glitch pipeline

    International Nuclear Information System (INIS)

    Mukherjee, Soma

    2007-01-01

    This paper reports on the preliminary results obtained from the hierarchical glitch classification pipeline on LIGO data. The pipeline that has been under construction for the past year is now complete and end-to-end tested. It is ready to generate analysis results on a daily basis. The details of the pipeline, the classification algorithms employed and the results obtained with one days analysis on the gravitational wave and several auxiliary and environmental channels from all three LIGO detectors are discussed

  9. Hierarchical sets: analyzing pangenome structure through scalable set visualizations

    Science.gov (United States)

    2017-01-01

    Abstract Motivation: The increase in available microbial genome sequences has resulted in an increase in the size of the pangenomes being analyzed. Current pangenome visualizations are not intended for the pangenome sizes possible today and new approaches are necessary in order to convert the increase in available information to increase in knowledge. As the pangenome data structure is essentially a collection of sets we explore the potential for scalable set visualization as a tool for pangenome analysis. Results: We present a new hierarchical clustering algorithm based on set arithmetics that optimizes the intersection sizes along the branches. The intersection and union sizes along the hierarchy are visualized using a composite dendrogram and icicle plot, which, in pangenome context, shows the evolution of pangenome and core size along the evolutionary hierarchy. Outlying elements, i.e. elements whose presence pattern do not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that is not represented by the hierarchy. We illustrate the utility of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. Availability and Implementation: The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project.org/web/packages/hierarchicalSets) Contact: thomasp85@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28130242

  10. Clustering Dycom

    KAUST Repository

    Minku, Leandro L.

    2017-10-06

    Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom\\'s predictive performance. Aims: This paper investigates whether clustering methods can be used to help finding good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom\\'s CC subsets.

  11. Chemical Fingerprint and Quantitative Analysis for the Quality Evaluation of Platycladi cacumen by Ultra-performance Liquid Chromatography Coupled with Hierarchical Cluster Analysis.

    Science.gov (United States)

    Shan, Mingqiu; Li, Sam Fong Yau; Yu, Sheng; Qian, Yan; Guo, Shuchen; Zhang, Li; Ding, Anwei

    2018-01-01

    Platycladi cacumen (dried twigs and leaves of Platycladus orientalis (L.) Franco) is a frequently utilized Chinese medicinal herb. To evaluate the quality of the phytomedcine, an ultra-performance liquid chromatographic method with diode array detection was established for chemical fingerprinting and quantitative analysis. In this study, 27 batches of P. cacumen from different regions were collected for analysis. A chemical fingerprint with 20 common peaks was obtained using Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2004A). Among these 20 components, seven flavonoids (myricitrin, isoquercitrin, quercitrin, afzelin, cupressuflavone, amentoflavone and hinokiflavone) were identified and determined simultaneously. In the method validation, the seven analytes showed good regressions (R ≥ 0.9995) within linear ranges and good recoveries from 96.4% to 103.3%. Furthermore, with the contents of these seven flavonoids, hierarchical clustering analysis was applied to distinguish the 27 batches into five groups. The chemometric results showed that these groups were almost consistent with geographical positions and climatic conditions of the production regions. Integrating fingerprint analysis, simultaneous determination and hierarchical clustering analysis, the established method is rapid, sensitive, accurate and readily applicable, and also provides a significant foundation for quality control of P. cacumen efficiently. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function-hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology and the association of individual genes or proteins with these concepts (e.g., GO terms, our method will assign a Hierarchical Modularity Score (HMS to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a bi-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  13. Genome-wide decoding of hierarchical modular structure of transcriptional regulation by cis-element and expression clustering.

    Science.gov (United States)

    Leyfer, Dmitriy; Weng, Zhiping

    2005-09-01

    A holistic approach to the study of cellular processes is identifying both gene-expression changes and regulatory elements promoting such changes. Cellular regulatory processes can be viewed as transcriptional modules (TMs), groups of coexpressed genes regulated by groups of transcription factors (TFs). We set out to devise a method that would identify TMs while avoiding arbitrary thresholds on TM sizes and number. Assuming that gene expression is determined by TFs that bind to the gene's promoter, clustering of genes based on TF binding sites (cis-elements) should create gene groups similar to those obtained by gene expression clustering. Intersections between the expression and cis-element-based gene clusters reveal TMs. Statistical significance assigned to each TM allows identification of regulatory units of any size. Our method correctly identifies the number and sizes of TMs on simulated datasets. We demonstrate that yeast experimental TMs are biologically relevant by comparing them with MIPS and GO categories. Our modules are in statistically significant agreement with TMs from other research groups. This work suggests that there is no preferential division of biological processes into regulatory units; each degree of partitioning exhibits a slice of biological network revealing hierarchical modular organization of transcriptional regulation.

  14. Chimera states in networks of logistic maps with hierarchical connectivities

    Science.gov (United States)

    zur Bonsen, Alexander; Omelchenko, Iryna; Zakharova, Anna; Schöll, Eckehard

    2018-04-01

    Chimera states are complex spatiotemporal patterns consisting of coexisting domains of coherence and incoherence. We study networks of nonlocally coupled logistic maps and analyze systematically how the dilution of the network links influences the appearance of chimera patterns. The network connectivities are constructed using an iterative Cantor algorithm to generate fractal (hierarchical) connectivities. Increasing the hierarchical level of iteration, we compare the resulting spatiotemporal patterns. We demonstrate that a high clustering coefficient and symmetry of the base pattern promotes chimera states, and asymmetric connectivities result in complex nested chimera patterns.

  15. Performance quantification of clustering algorithms for false positive removal in fMRI by ROC curves

    Directory of Open Access Journals (Sweden)

    André Salles Cunha Peres

    Full Text Available Abstract Introduction Functional magnetic resonance imaging (fMRI is a non-invasive technique that allows the detection of specific cerebral functions in humans based on hemodynamic changes. The contrast changes are about 5%, making visual inspection impossible. Thus, statistic strategies are applied to infer which brain region is engaged in a task. However, the traditional methods like general linear model and cross-correlation utilize voxel-wise calculation, introducing a lot of false-positive data. So, in this work we tested post-processing cluster algorithms to diminish the false-positives. Methods In this study, three clustering algorithms (the hierarchical cluster, k-means and self-organizing maps were tested and compared for false-positive removal in the post-processing of cross-correlation analyses. Results Our results showed that the hierarchical cluster presented the best performance to remove the false positives in fMRI, being 2.3 times more accurate than k-means, and 1.9 times more accurate than self-organizing maps. Conclusion The hierarchical cluster presented the best performance in false-positive removal because it uses the inconsistency coefficient threshold, while k-means and self-organizing maps utilize a priori cluster number (centroids and neurons number; thus, the hierarchical cluster avoids clustering scattered voxels, as the inconsistency coefficient threshold allows only the voxels to be clustered that are at a minimum distance to some cluster.

  16. "Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping out and Hierarchical Cluster Analysis"

    Directory of Open Access Journals (Sweden)

    Alex J. Bowers

    2010-05-01

    Full Text Available School personnel currently lack an effective method to pattern and visually interpret disaggregated achievement data collected on students as a means to help inform decision making. This study, through the examination of longitudinal K-12 teacher assigned grading histories for entire cohorts of students from a school district (n=188, demonstrates a novel application of hierarchical cluster analysis and pattern visualization in which all data points collected on every student in a cohort can be patterned, visualized and interpreted to aid in data driven decision making by teachers and administrators. Additionally, as a proof-of-concept study, overall schooling outcomes, such as student dropout or taking a college entrance exam, are identified from the data patterns and compared to past methods of dropout identification as one example of the usefulness of the method. Hierarchical cluster analysis correctly identified over 80% of the students who dropped out using the entire student grade history patterns from either K-12 or K-8.

  17. BULGELESS GIANT GALAXIES CHALLENGE OUR PICTURE OF GALAXY FORMATION BY HIERARCHICAL CLUSTERING ,

    International Nuclear Information System (INIS)

    Kormendy, John; Cornell, Mark E.; Drory, Niv; Bender, Ralf

    2010-01-01

    To better understand the prevalence of bulgeless galaxies in the nearby field, we dissect giant Sc-Scd galaxies with Hubble Space Telescope (HST) photometry and Hobby-Eberly Telescope (HET) spectroscopy. We use the HET High Resolution Spectrograph (resolution R ≡ λ/FWHM ≅ 15, 000) to measure stellar velocity dispersions in the nuclear star clusters and (pseudo)bulges of the pure-disk galaxies M 33, M 101, NGC 3338, NGC 3810, NGC 6503, and NGC 6946. The dispersions range from 20 ± 1 km s -1 in the nucleus of M 33 to 78 ± 2 km s -1 in the pseudobulge of NGC 3338. We use HST archive images to measure the brightness profiles of the nuclei and (pseudo)bulges in M 101, NGC 6503, and NGC 6946 and hence to estimate their masses. The results imply small mass-to-light ratios consistent with young stellar populations. These observations lead to two conclusions. (1) Upper limits on the masses of any supermassive black holes are M . ∼ 6 M sun in M 101 and M . ∼ 6 M sun in NGC 6503. (2) We show that the above galaxies contain only tiny pseudobulges that make up ∼ circ > 150 km s -1 , including M 101, NGC 6946, IC 342, and our Galaxy, show no evidence for a classical bulge. Four may contain small classical bulges that contribute 5%-12% of the light of the galaxy. Only four of the 19 giant galaxies are ellipticals or have classical bulges that contribute ∼1/3 of the galaxy light. We conclude that pure-disk galaxies are far from rare. It is hard to understand how bulgeless galaxies could form as the quiescent tail of a distribution of merger histories. Recognition of pseudobulges makes the biggest problem with cold dark matter galaxy formation more acute: How can hierarchical clustering make so many giant, pure-disk galaxies with no evidence for merger-built bulges? Finally, we emphasize that this problem is a strong function of environment: the Virgo cluster is not a puzzle, because more than 2/3 of its stellar mass is in merger remnants.

  18. Revisiting the variation of clustering coefficient of biological networks suggests new modular structure

    Directory of Open Access Journals (Sweden)

    Hao Dapeng

    2012-05-01

    Full Text Available Abstract Background A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling. Although several studies have suggested other possible origins of this signature, it is still widely used nowadays to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further and systematical investigation of this signature for different types of biological networks is necessary. Results We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity is actually the reflection of spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the origin that the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidences for repulsion between hubs being the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression network, the clustering coefficient doesn’t show dependence of degree. Conclusions Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to “deterministic model” of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.

  19. The experimental results on the quality of clustering diverse set of data using a modified algorithm chameleon

    Directory of Open Access Journals (Sweden)

    Татьяна Борисовна Шатовская

    2015-03-01

    Full Text Available In this work results of modified Chameleon algorithm are discussed. Hierarchical multilevel algorithms consist of several stages: building the graph, coarsening, partitioning, recovering. Exploring of clustering quality for different data sets with different combinations of algorithms on different stages of the algorithm is the main aim of the article. And also aim is improving the construction phase through the optimization algorithm of choice k in the building the graph k-nearest neighbors

  20. Topology of the correlation networks among major currencies using hierarchical structure methods

    Science.gov (United States)

    Keskin, Mustafa; Deviren, Bayram; Kocakaplan, Yusuf

    2011-02-01

    We studied the topology of correlation networks among 34 major currencies using the concept of a minimal spanning tree and hierarchical tree for the full years of 2007-2008 when major economic turbulence occurred. We used the USD (US Dollar) and the TL (Turkish Lira) as numeraires in which the USD was the major currency and the TL was the minor currency. We derived a hierarchical organization and constructed minimal spanning trees (MSTs) and hierarchical trees (HTs) for the full years of 2007, 2008 and for the 2007-2008 period. We performed a technique to associate a value of reliability to the links of MSTs and HTs by using bootstrap replicas of data. We also used the average linkage cluster analysis for obtaining the hierarchical trees in the case of the TL as the numeraire. These trees are useful tools for understanding and detecting the global structure, taxonomy and hierarchy in financial data. We illustrated how the minimal spanning trees and their related hierarchical trees developed over a period of time. From these trees we identified different clusters of currencies according to their proximity and economic ties. The clustered structure of the currencies and the key currency in each cluster were obtained and we found that the clusters matched nicely with the geographical regions of corresponding countries in the world such as Asia or Europe. As expected the key currencies were generally those showing major economic activity.

  1. Hierarchical Sets: Analyzing Pangenome Structure through Scalable Set Visualizations

    DEFF Research Database (Denmark)

    Pedersen, Thomas Lin

    2017-01-01

    of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https...

  2. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  3. Quantum annealing for combinatorial clustering

    Science.gov (United States)

    Kumar, Vaibhaw; Bass, Gideon; Tomlin, Casey; Dulny, Joseph

    2018-02-01

    Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver "qbsolv." The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.

  4. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Full Text Available Abstract Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs to variation in features of the trajectories of the state variables (outputs throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR, where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR and ordinary least squares (OLS regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback

  5. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights.We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  6. Revisiting the variation of clustering coefficient of biological networks suggests new modular structure.

    Science.gov (United States)

    Hao, Dapeng; Ren, Cong; Li, Chuanxing

    2012-05-01

    A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling). Although several studies have suggested other possible origins of this signature, it is still widely used nowadays to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further and systematical investigation of this signature for different types of biological networks is necessary. We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity is actually the reflection of spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the origin that the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidences for repulsion between hubs being the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression network, the clustering coefficient doesn't show dependence of degree. Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to "deterministic model" of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.

  7. D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. This is due to assure that new opening stores are at their strategic location to attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since the business of franchising and chain stores in urban areas runs within high rise multi-level buildings, a three-dimensional (3D) method is prominently required in order to locate and identify the surrounding information such as at which level of the franchise unit will be located or is the franchise unit located is at the best level for visibility purposes. One of the common used analyses used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach substantially showed an improvement of response time analysis compared to existing approaches of spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations building and franchising unit. The results are presented in this paper. Another advantage of this structure is that it also offers a minimal overlap and coverage among nodes which can reduce repetitive data entry.

  8. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs and the hypomethylation of the megabase-sized partially methylated domains (PMDs are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMR revealed tumor-specific hypermethylated clusters and differential methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI was found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma dataset. They were characterized with differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expressions of these genes were significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evident of more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  9. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    Science.gov (United States)

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets has concerned great awareness among molecular biologist, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A search made with heuristic for standard biological process measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using proximity measure of improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify association between the clusters. Moreover, a GL-PCPHC applies pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for GL-PCPHC model with standard benchmark gene expression datasets extracted from UCI repository and GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.

  10. Tune Your Brown Clustering, Please

    DEFF Research Database (Denmark)

    Derczynski, Leon; Chester, Sean; Bøgh, Kenneth Sejdenfaden

    2015-01-01

    Brown clustering, an unsupervised hierarchical clustering technique based on ngram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly...

  11. Hierarchical structure of the European countries based on debts as a percentage of GDP during the 2000-2011 period

    Science.gov (United States)

    Kantar, Ersin; Deviren, Bayram; Keskin, Mustafa

    2014-11-01

    We investigate hierarchical structures of the European countries by using debt as a percentage of Gross Domestic Product (GDP) of the countries as they change over a certain period of time. We obtain the topological properties among the countries based on debt as a percentage of GDP of European countries over the period 2000-2011 by using the concept of hierarchical structure methods (minimal spanning tree, (MST) and hierarchical tree, (HT)). This period is also divided into two sub-periods related to 2004 enlargement of the European Union, namely 2000-2004 and 2005-2011, in order to test various time-window and observe the temporal evolution. The bootstrap techniques is applied to see a value of statistical reliability of the links of the MSTs and HTs. The clustering linkage procedure is also used to observe the cluster structure more clearly. From the structural topologies of these trees, we identify different clusters of countries according to their level of debts. Our results show that by the debt crisis, the less and most affected Eurozone’s economies are formed as a cluster with each other in the MSTs and hierarchical trees.

  12. Tractography segmentation using a hierarchical Dirichlet processes mixture model.

    Science.gov (United States)

    Wang, Xiaogang; Grimson, W Eric L; Westin, Carl-Fredrik

    2011-01-01

    In this paper, we propose a new nonparametric Bayesian framework to cluster white matter fiber tracts into bundles using a hierarchical Dirichlet processes mixture (HDPM) model. The number of clusters is automatically learned driven by data with a Dirichlet process (DP) prior instead of being manually specified. After the models of bundles have been learned from training data without supervision, they can be used as priors to cluster/classify fibers of new subjects for comparison across subjects. When clustering fibers of new subjects, new clusters can be created for structures not observed in the training data. Our approach does not require computing pairwise distances between fibers and can cluster a huge set of fibers across multiple subjects. We present results on several data sets, the largest of which has more than 120,000 fibers. Copyright © 2010 Elsevier Inc. All rights reserved.

  13. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    Science.gov (United States)

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for

  14. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    Science.gov (United States)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton, et a1 describes an approach for producing hierarchical segmentations (called HSEG) and gave a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g. Horowitz and T. Pavlidis, [3]). In addition, HSEG optionally interjects between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG s computational requirements. To counteract this, a computationally efficient recursive, divide-and-conquer, implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG s recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic

  15. A new application of hierarchical cluster analysis to investigate organic peaks in bulk mass spectra obtained with an Aerodyne Aerosol Mass Spectrometer

    Science.gov (United States)

    Middlebrook, A. M.; Marcolli, C.; Canagaratna, M. R.; Worsnop, D. R.; Bahreini, R.; de Gouw, J. A.; Warneke, C.; Goldan, P. D.; Kuster, W. C.; Williams, E. J.; Lerner, B. M.; Roberts, J. M.; Meagher, J. F.; Fehsenfeld, F. C.; Marchewka, M. L.; Bertman, S. B.

    2006-12-01

    We applied hierarchical cluster analysis to an Aerodyne aerosol mass spectrometer (AMS) bulk mass spectral dataset collected aboard the NOAA research vessel Ronald H. Brown during the 2002 New England Air Quality Study off the east coast of the United States. Emphasizing the organic peaks, the cluster analysis yielded a series of categories that are distinguishable with respect to their mass spectra and their occurrence as a function of time. The differences between the categories mainly arise from relative intensity changes rather than from the presence or absence of specific peaks. The most frequent category exhibits a strong signal at m/z 44 and represents oxidized organic matter probably originating from both anthropogenic as well as biogenic sources. On the basis of spectral and trace gas correlations, the second most common category with strong signals at m/z 29, 43, and 44 contains contributions from isoprene oxidation products. The third through the fifth most common categories have peak patterns characteristic of monoterpene oxidation products and were most frequently observed when air masses from monoterpene rich regions were sampled. Taken together, the second through the fifth most common categories represent on average 17% of the total organic mass that stems likely from biogenic sources during the ship's cruise. These numbers have to be viewed as lower limits since the most common category was attributed to anthropogenic sources for this calculation. The cluster analysis was also very effective in identifying a few contaminated mass spectra that were not removed during pre-processing. This study demonstrates that hierarchical clustering is a useful tool to analyze the complex patterns of the organic peaks in bulk aerosol mass spectra from a field study.

  16. Energy Aware Cluster Based Routing Scheme For Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    Roy Sohini

    2015-09-01

    Full Text Available Wireless Sensor Network (WSN has emerged as an important supplement to the modern wireless communication systems due to its wide range of applications. The recent researches are facing the various challenges of the sensor network more gracefully. However, energy efficiency has still remained a matter of concern for the researches. Meeting the countless security needs, timely data delivery and taking a quick action, efficient route selection and multi-path routing etc. can only be achieved at the cost of energy. Hierarchical routing is more useful in this regard. The proposed algorithm Energy Aware Cluster Based Routing Scheme (EACBRS aims at conserving energy with the help of hierarchical routing by calculating the optimum number of cluster heads for the network, selecting energy-efficient route to the sink and by offering congestion control. Simulation results prove that EACBRS performs better than existing hierarchical routing algorithms like Distributed Energy-Efficient Clustering (DEEC algorithm for heterogeneous wireless sensor networks and Energy Efficient Heterogeneous Clustered scheme for Wireless Sensor Network (EEHC.

  17. Using Cluster Analysis to Compartmentalize a Large Managed Wetland Based on Physical, Biological, and Climatic Geospatial Attributes.

    Science.gov (United States)

    Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael

    2018-04-27

    Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.

  18. Complexity of major UK companies between 2006 and 2010: Hierarchical structure method approach

    Science.gov (United States)

    Ulusoy, Tolga; Keskin, Mustafa; Shirvani, Ayoub; Deviren, Bayram; Kantar, Ersin; Çaǧrı Dönmez, Cem

    2012-11-01

    This study reports on topology of the top 40 UK companies that have been analysed for predictive verification of markets for the period 2006-2010, applying the concept of minimal spanning tree and hierarchical tree (HT) analysis. Construction of the minimal spanning tree (MST) and the hierarchical tree (HT) is confined to a brief description of the methodology and a definition of the correlation function between a pair of companies based on the London Stock Exchange (LSE) index in order to quantify synchronization between the companies. A derivation of hierarchical organization and the construction of minimal-spanning and hierarchical trees for the 2006-2008 and 2008-2010 periods have been used and the results validate the predictive verification of applied semantics. The trees are known as useful tools to perceive and detect the global structure, taxonomy and hierarchy in financial data. From these trees, two different clusters of companies in 2006 were detected. They also show three clusters in 2008 and two between 2008 and 2010, according to their proximity. The clusters match each other as regards their common production activities or their strong interrelationship. The key companies are generally given by major economic activities as expected. This work gives a comparative approach between MST and HT methods from statistical physics and information theory with analysis of financial markets that may give new valuable and useful information of the financial market dynamics.

  19. Brightest Cluster Galaxies in REXCESS Clusters

    Science.gov (United States)

    Haarsma, Deborah B.; Leisman, L.; Bruch, S.; Donahue, M.

    2009-01-01

    Most galaxy clusters contain a Brightest Cluster Galaxy (BCG) which is larger than the other cluster ellipticals and has a more extended profile. In the hierarchical model, the BCG forms through many galaxy mergers in the crowded center of the cluster, and thus its properties give insight into the assembly of the cluster as a whole. In this project, we are working with the Representative XMM-Newton Cluster Structure Survey (REXCESS) team (Boehringer et al 2007) to study BCGs in 33 X-ray luminous galaxy clusters, 0.055 < z < 0.183. We are imaging the BCGs in R band at the Southern Observatory for Astrophysical Research (SOAR) in Chile. In this poster, we discuss our methods and give preliminary measurements of the BCG magnitudes, morphology, and stellar mass. We compare these BCG properties with the properties of their host clusters, particularly of the X-ray emitting gas.

  20. Cluster evolution

    International Nuclear Information System (INIS)

    Schaeffer, R.

    1987-01-01

    The galaxy and cluster luminosity functions are constructed from a model of the mass distribution based on hierarchical clustering at an epoch where the matter distribution is non-linear. These luminosity functions are seen to reproduce the present distribution of objects as can be inferred from the observations. They can be used to deduce the redshift dependence of the cluster distribution and to extrapolate the observations towards the past. The predicted evolution of the cluster distribution is quite strong, although somewhat less rapid than predicted by the linear theory

  1. Data clustering theory, algorithms, and applications

    CERN Document Server

    Gan, Guojun; Wu, Jianhong

    2007-01-01

    Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results. The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB® programming languages.

  2. MAP-Based Underdetermined Blind Source Separation of Convolutive Mixtures by Hierarchical Clustering and -Norm Minimization

    Directory of Open Access Journals (Sweden)

    Kellermann Walter

    2007-01-01

    Full Text Available We address the problem of underdetermined BSS. While most previous approaches are designed for instantaneous mixtures, we propose a time-frequency-domain algorithm for convolutive mixtures. We adopt a two-step method based on a general maximum a posteriori (MAP approach. In the first step, we estimate the mixing matrix based on hierarchical clustering, assuming that the source signals are sufficiently sparse. The algorithm works directly on the complex-valued data in the time-frequency domain and shows better convergence than algorithms based on self-organizing maps. The assumption of Laplacian priors for the source signals in the second step leads to an algorithm for estimating the source signals. It involves the -norm minimization of complex numbers because of the use of the time-frequency-domain approach. We compare a combinatorial approach initially designed for real numbers with a second-order cone programming (SOCP approach designed for complex numbers. We found that although the former approach is not theoretically justified for complex numbers, its results are comparable to, or even better than, the SOCP solution. The advantage is a lower computational cost for problems with low input/output dimensions.

  3. 3D NEAREST NEIGHBOUR SEARCH USING A CLUSTERED HIERARCHICAL TREE STRUCTURE

    Directory of Open Access Journals (Sweden)

    A. Suhaibah

    2016-06-01

    Full Text Available Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. This is due to assure that new opening stores are at their strategic location to attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since the business of franchising and chain stores in urban areas runs within high rise multi-level buildings, a three-dimensional (3D method is prominently required in order to locate and identify the surrounding information such as at which level of the franchise unit will be located or is the franchise unit located is at the best level for visibility purposes. One of the common used analyses used for retrieving the surrounding information is Nearest Neighbour (NN analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach substantially showed an improvement of response time analysis compared to existing approaches of spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations building and franchising unit. The results are presented in this paper. Another advantage of this structure is that it also offers a minimal overlap and coverage among nodes which can reduce repetitive data entry.

  4. Dynamic Hierarchical Energy-Efficient Method Based on Combinatorial Optimization for Wireless Sensor Networks.

    Science.gov (United States)

    Chang, Yuchao; Tang, Hongying; Cheng, Yongbo; Zhao, Qin; Yuan, Baoqing Li andXiaobing

    2017-07-19

    Routing protocols based on topology control are significantly important for improving network longevity in wireless sensor networks (WSNs). Traditionally, some WSN routing protocols distribute uneven network traffic load to sensor nodes, which is not optimal for improving network longevity. Differently to conventional WSN routing protocols, we propose a dynamic hierarchical protocol based on combinatorial optimization (DHCO) to balance energy consumption of sensor nodes and to improve WSN longevity. For each sensor node, the DHCO algorithm obtains the optimal route by establishing a feasible routing set instead of selecting the cluster head or the next hop node. The process of obtaining the optimal route can be formulated as a combinatorial optimization problem. Specifically, the DHCO algorithm is carried out by the following procedures. It employs a hierarchy-based connection mechanism to construct a hierarchical network structure in which each sensor node is assigned to a special hierarchical subset; it utilizes the combinatorial optimization theory to establish the feasible routing set for each sensor node, and takes advantage of the maximum-minimum criterion to obtain their optimal routes to the base station. Various results of simulation experiments show effectiveness and superiority of the DHCO algorithm in comparison with state-of-the-art WSN routing algorithms, including low-energy adaptive clustering hierarchy (LEACH), hybrid energy-efficient distributed clustering (HEED), genetic protocol-based self-organizing network clustering (GASONeC), and double cost function-based routing (DCFR) algorithms.

  5. A Multidimensional and Multimembership Clustering Method for Social Networks and Its Application in Customer Relationship Management

    Directory of Open Access Journals (Sweden)

    Peixin Zhao

    2013-01-01

    Full Text Available Community detection in social networks plays an important role in cluster analysis. Many traditional techniques for one-dimensional problems have been proven inadequate for high-dimensional or mixed type datasets due to the data sparseness and attribute redundancy. In this paper we propose a graph-based clustering method for multidimensional datasets. This novel method has two distinguished features: nonbinary hierarchical tree and the multi-membership clusters. The nonbinary hierarchical tree clearly highlights meaningful clusters, while the multimembership feature may provide more useful service strategies. Experimental results on the customer relationship management confirm the effectiveness of the new method.

  6. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application called voice classification which has its important role in grouping unlabelled voice samples, however, has not been widely studied in research. Lately voice classification is found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms are proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time wrap transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box have been applied for voice verification and voice identification. Two datasets, one that is generated synthetically and the other one empirically collected from past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.

  7. Modular networks with hierarchical organization

    Indian Academy of Sciences (India)

    Several networks occurring in real life have modular structures that are arranged in a hierarchical fashion. In this paper, we have proposed a model for such networks, using a stochastic generation method. Using this model we show that, the scaling relation between the clustering and degree of the nodes is not a necessary ...

  8. Efficient Record Linkage Algorithms Using Complete Linkage Clustering.

    Science.gov (United States)

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Data from different agencies share data of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low-accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy consuming reasonable run times.

  9. Cluster Physics with Merging Galaxy Clusters

    Directory of Open Access Journals (Sweden)

    Sandor M. Molnar

    2016-02-01

    Full Text Available Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard LCDM model, where the total density is dominated by the cosmological constant ($Lambda$ and the matter density by cold dark matter (CDM, structure formation is hierarchical, and clusters grow mostly by merging.Mergers of two massive clusters are the most energetic events in the universe after the Big Bang,hence they provide a unique laboratory to study cluster physics.The two main mass components in clusters behave differently during collisions:the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulenceare developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thusour review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clustersis to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses.New high spatial and spectral resolution ground and space based telescopeswill come online in the near future. Motivated by these new opportunities,we briefly discuss methods which will be feasible in the near future in studying merging clusters.

  10. Evaluation of Hierarchical Clustering Algorithms for Document Datasets

    National Research Council Canada - National Science Library

    Zhao, Ying; Karypis, George

    2002-01-01

    Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters...

  11. Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots.

    Science.gov (United States)

    Hagiwara, Yoshinobu; Inoue, Masakazu; Kobayashi, Hiroyoshi; Taniguchi, Tadahiro

    2018-01-01

    In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information e.g., vision, position and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., "I am in my home" and "I am in front of the table," a hierarchical structure of spatial concepts is necessary in order for human support robots to communicate smoothly with users. The proposed method enables a robot to form hierarchical spatial concepts by categorizing multimodal information using hierarchical multimodal latent Dirichlet allocation (hMLDA). Object recognition results using convolutional neural network (CNN), hierarchical k-means clustering result of self-position estimated by Monte Carlo localization (MCL), and a set of location names are used, respectively, as features in vision, position, and word information. Experiments in forming hierarchical spatial concepts and evaluating how the proposed method can predict unobserved location names and position categories are performed using a robot in the real world. Results verify that, relative to comparable baseline methods, the proposed method enables a robot to predict location names and position categories closer to predictions made by humans. As an application example of the proposed method in a home environment, a demonstration in which a human support robot moves to an instructed place based on human speech instructions is achieved based on the formed hierarchical spatial concept.

  12. Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots

    Directory of Open Access Journals (Sweden)

    Yoshinobu Hagiwara

    2018-03-01

    Full Text Available In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information e.g., vision, position and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., “I am in my home” and “I am in front of the table,” a hierarchical structure of spatial concepts is necessary in order for human support robots to communicate smoothly with users. The proposed method enables a robot to form hierarchical spatial concepts by categorizing multimodal information using hierarchical multimodal latent Dirichlet allocation (hMLDA. Object recognition results using convolutional neural network (CNN, hierarchical k-means clustering result of self-position estimated by Monte Carlo localization (MCL, and a set of location names are used, respectively, as features in vision, position, and word information. Experiments in forming hierarchical spatial concepts and evaluating how the proposed method can predict unobserved location names and position categories are performed using a robot in the real world. Results verify that, relative to comparable baseline methods, the proposed method enables a robot to predict location names and position categories closer to predictions made by humans. As an application example of the proposed method in a home environment, a demonstration in which a human support robot moves to an instructed place based on human speech instructions is achieved based on the formed hierarchical spatial concept.

  13. A CLUSTER IN THE MAKING: ALMA REVEALS THE INITIAL CONDITIONS FOR HIGH-MASS CLUSTER FORMATION

    International Nuclear Information System (INIS)

    Rathborne, J. M.; Contreras, Y.; Longmore, S. N.; Bastian, N.; Jackson, J. M.; Alves, J. F.; Bally, J.; Foster, J. B.; Garay, G.; Kruijssen, J. M. D.; Testi, L.; Walsh, A. J.

    2015-01-01

    G0.253+0.016 is a molecular clump that appears to be on the verge of forming a high-mass cluster: its extremely low dust temperature, high mass, and high density, combined with its lack of prevalent star formation, make it an excellent candidate for an Arches-like cluster in a very early stage of formation. Here we present new Atacama Large Millimeter/Sub-millimeter Array observations of its small-scale (∼0.07 pc) 3 mm dust continuum and molecular line emission from 17 different species that probe a range of distinct physical and chemical conditions. The data reveal a complex network of emission features with a complicated velocity structure: there is emission on all spatial scales, the morphology of which ranges from small, compact regions to extended, filamentary structures that are seen in both emission and absorption. The dust column density is well traced by molecules with higher excitation energies and critical densities, consistent with a clump that has a denser interior. A statistical analysis supports the idea that turbulence shapes the observed gas structure within G0.253+0.016. We find a clear break in the turbulent power spectrum derived from the optically thin dust continuum emission at a spatial scale of ∼0.1 pc, which may correspond to the spatial scale at which gravity has overcome the thermal pressure. We suggest that G0.253+0.016 is on the verge of forming a cluster from hierarchical, filamentary structures that arise from a highly turbulent medium. Although the stellar distribution within high-mass Arches-like clusters is compact, centrally condensed, and smooth, the observed gas distribution within G0.253+0.016 is extended, with no high-mass central concentration, and has a complex, hierarchical structure. If this clump gives rise to a high-mass cluster and its stars are formed from this initially hierarchical gas structure, then the resulting cluster must evolve into a centrally condensed structure via a dynamical process

  14. Rotasi Varimax dan Median Hirarki Cluster Pada Program Raskin di Kabupaten Lombok Barat

    Directory of Open Access Journals (Sweden)

    Desy Komalasari

    2015-11-01

    Full Text Available The granting rice program for poor households (Raskin is one of the West Lombok regency government programs for village poverty. The effectiveness of the program relating to 14 criteria for the poor households Raskin recipients (RTS-PM. The 14 criteria have been grouped into several factors using varimax rotation factor analysis, while the RTS-PM have been grouped using hierarchical median cluster analysis. Four factors obtained based on the analysis. First factor was the house existence, the second factor was the financial ability, the third factor was the house existing facilities, and the four factor was the education of the household head and the purchasing power of clothing. The clustering results using hierarchical median cluster analysis formed 3 clusters. The first cluster contains the RTS-PM which have been grouped into first factor; the second cluster contains the RTS-PM which have been grouped into second and third factor; and the third cluster contains the RTS-PM which have been grouped into fourth factor.

  15. Rotasi Varimax dan Median Hirarki Cluster Pada Program Raskin di Kabupaten Lombok Barat

    Directory of Open Access Journals (Sweden)

    Desy Komalasari

    2015-06-01

    Full Text Available The granting rice program for poor households (Raskin is one of the West Lombok regency government programs for village poverty. The effectiveness of the program relating to 14 criteria for the poor households Raskin recipients (RTS-PM. The 14 criteria have been grouped into several factors using varimax rotation factor analysis, while the RTS-PM have been grouped using hierarchical median cluster analysis. Four factors obtained based on the analysis. First factor was the house existence, the second factor was the financial ability, the third factor was the house existing facilities, and the four factor was the education of the household head and the purchasing power of clothing. The clustering results using hierarchical median cluster analysis formed 3 clusters. The first cluster contains the RTS-PM which have been grouped into first factor; the second cluster contains the RTS-PM which have been grouped into second and third factor; and the third cluster contains the RTS-PM which have been grouped into fourth factor.

  16. Peringkasan Tweet Berdasarkan Trending Topic Twitter Dengan Pembobotan TF-IDF dan Single Linkage AngglomerativeHierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Annisa Annisa

    2016-10-01

    Full Text Available Trending topic is a feature provided by twitter that informs something widely discussed by users in a particular time. The form of a trending topic is a hashtag and can be selected by clicking. However, the number of tweets for each trending topics can be very large, so it will be difficult if we want to know all the contents. So, in order to make easy when reading the topic, a small number of tweets can be selected as the main idea of the topic. In this study, we applied the Agglomerative Single Linkage Hierarchical Clustering by calculating the TF-IDF value for each word in advance. We used 100 trending topics, where each topic consists of 50 tweets in Indonesian. For testing, we provided 30 trending topics which consist of 2 until 9 sub-topics. The result is that each trending topics can be summarized into shorter text contains 2 until 9 tweets. We were able to summarize 1 trending topics exactly same as the topic summarized by human expert. However, the rest of topics corresponded partially with human expert.

  17. Intercalating graphene with clusters of Fe3O4 nanocrystals for electrochemical supercapacitors

    Science.gov (United States)

    Ke, Qingqing; Tang, Chunhua; Liu, Yanqiong; Liu, Huajun; Wang, John

    2014-04-01

    A hierarchical nanostructure consisting of graphene sheets intercalated by clusters of Fe3O4 nanocystals is developed for high-performance supercapacitor electrode. Here we show that the negatively charged graphene oxide (GO) and positively charged Fe3O4 clusters enable a strong electrostatic interaction, generating a hierarchical 3D nanostructure, which gives rise to the intercalated composites through a rational hydrothermal process. The electrocapacitive behavior of the resultant composites is systematically investigated by cyclic voltammeter and galvanostatic charge-discharge techniques, where a positive synergistic effect between graphene and Fe3O4 clusters is identified. A maximum specific capacitance of 169 F g-1 is achieved in the Fe3O4 clusters decorated with effectively reduced graphene oxide (Fe3O4-rGO-12h), which is much higher than those of rGO (101 F g-1) and Fe3O4 (68 F g-1) at the current density of 1 Ag-1. Moreover, this intercalated hierarchical nanostructure demonstrates a good capacitance retention, retaining over 88% of the initial capacity after 1000 cycles.

  18. Extension of mixture-of-experts networks for binary classification of hierarchical data.

    Science.gov (United States)

    Ng, Shu-Kay; McLachlan, Geoffrey J

    2007-09-01

    For many applied problems in the context of medically relevant artificial intelligence, the data collected exhibit a hierarchical or clustered structure. Ignoring the interdependence between hierarchical data can result in misleading classification. In this paper, we extend the mechanism for mixture-of-experts (ME) networks for binary classification of hierarchical data. Another extension is to quantify cluster-specific information on data hierarchy by random effects via the generalized linear mixed-effects model (GLMM). The extension of ME networks is implemented by allowing for correlation in the hierarchical data in both the gating and expert networks via the GLMM. The proposed model is illustrated using a real thyroid disease data set. In our study, we consider 7652 thyroid diagnosis records from 1984 to early 1987 with complete information on 20 attribute values. We obtain 10 independent random splits of the data into a training set and a test set in the proportions 85% and 15%. The test sets are used to assess the generalization performance of the proposed model, based on the percentage of misclassifications. For comparison, the results obtained from the ME network with independence assumption are also included. With the thyroid disease data, the misclassification rate on test sets for the extended ME network is 8.9%, compared to 13.9% for the ME network. In addition, based on model selection methods described in Section 2, a network with two experts is selected. These two expert networks can be considered as modeling two groups of patients with high and low incidence rates. Significant variation among the predicted cluster-specific random effects is detected in the patient group with low incidence rate. It is shown that the extended ME network outperforms the ME network for binary classification of hierarchical data. With the thyroid disease data, useful information on the relative log odds of patients with diagnosed conditions at different periods can be

  19. iHAT: interactive Hierarchical Aggregation Table for Genetic Association Data

    Directory of Open Access Journals (Sweden)

    Heinrich Julian

    2012-05-01

    Full Text Available Abstract In the search for single-nucleotide polymorphisms which influence the observable phenotype, genome wide association studies have become an important technique for the identification of associations between genotype and phenotype of a diverse set of sequence-based data. We present a methodology for the visual assessment of single-nucleotide polymorphisms using interactive hierarchical aggregation techniques combined with methods known from traditional sequence browsers and cluster heatmaps. Our tool, the interactive Hierarchical Aggregation Table (iHAT, facilitates the visualization of multiple sequence alignments, associated metadata, and hierarchical clusterings. Different color maps and aggregation strategies as well as filtering options support the user in finding correlations between sequences and metadata. Similar to other visualizations such as parallel coordinates or heatmaps, iHAT relies on the human pattern-recognition ability for spotting patterns that might indicate correlation or anticorrelation. We demonstrate iHAT using artificial and real-world datasets for DNA and protein association studies as well as expression Quantitative Trait Locus data.

  20. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time.

    Science.gov (United States)

    Cai, Yunpeng; Zheng, Wei; Yao, Jin; Yang, Yujie; Mai, Volker; Mao, Qi; Sun, Yijun

    2017-04-01

    The rapid development of sequencing technology has led to an explosive accumulation of genomic sequence data. Clustering is often the first step to perform in sequence analysis, and hierarchical clustering is one of the most commonly used approaches for this purpose. However, it is currently computationally expensive to perform hierarchical clustering of extremely large sequence datasets due to its quadratic time and space complexities. In this paper we developed a new algorithm called ESPRIT-Forest for parallel hierarchical clustering of sequences. The algorithm achieves subquadratic time and space complexity and maintains a high clustering accuracy comparable to the standard method. The basic idea is to organize sequences into a pseudo-metric based partitioning tree for sub-linear time searching of nearest neighbors, and then use a new multiple-pair merging criterion to construct clusters in parallel using multiple threads. The new algorithm was tested on the human microbiome project (HMP) dataset, currently one of the largest published microbial 16S rRNA sequence dataset. Our experiment demonstrated that with the power of parallel computing it is now compu- tationally feasible to perform hierarchical clustering analysis of tens of millions of sequences. The software is available at http://www.acsu.buffalo.edu/∼yijunsun/lab/ESPRIT-Forest.html.

  1. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. The cluster analysis reveals the internal structure of the data, group the separate observations on the degree of their similarity. The review provides a definition of the basic concepts of cluster analysis, and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, Kohonen networks algorithms. Examples are the use of these algorithms in biomedical research.

  2. Dynamical processes in space: Cluster results

    Directory of Open Access Journals (Sweden)

    C. P. Escoubet

    2013-06-01

    Full Text Available After 12 years of operations, the Cluster mission continues to successfully fulfil its scientific objectives. The main goal of the Cluster mission, comprised of four identical spacecraft, is to study in three dimensions small-scale plasma structures in key plasma regions of the Earth's environment: solar wind and bow shock, magnetopause, polar cusps, magnetotail, plasmasphere and auroral zone. During the course of the mission, the relative distance between the four spacecraft has been varied from 20 km to 36 000 km to study the scientific regions of interest at different scales. Since summer 2005, new multi-scale constellations have been implemented, wherein three spacecraft (C1, C2, C3 are separated by 10 000 km, while the fourth one (C4 is at a variable distance ranging between 20 km and 10 000 km from C3. Recent observations were conducted in the auroral acceleration region with the spacecraft separated by 1000s km. We present highlights of the results obtained during the last 12 years on collisionless shocks, magnetopause waves, magnetotail dynamics, plasmaspheric structures, and the auroral acceleration region. In addition, we highlight Cluster results on understanding the impact of Coronal Mass Ejections (CME on the Earth environment. We will also present Cluster data accessibility through the Cluster Science Data System (CSDS, and the Cluster Active Archive (CAA, which was implemented to provide a permanent and public archive of high resolution Cluster data from all instruments.

  3. Short-Term Wind Power Forecasting Based on Clustering Pre-Calculated CFD Method

    Directory of Open Access Journals (Sweden)

    Yimei Wang

    2018-04-01

    Full Text Available To meet the increasing wind power forecasting (WPF demands of newly built wind farms without historical data, physical WPF methods are widely used. The computational fluid dynamics (CFD pre-calculated flow fields (CPFF-based WPF is a promising physical approach, which can balance well the competing demands of computational efficiency and accuracy. To enhance its adaptability for wind farms in complex terrain, a WPF method combining wind turbine clustering with CPFF is first proposed where the wind turbines in the wind farm are clustered and a forecasting is undertaken for each cluster. K-means, hierarchical agglomerative and spectral analysis methods are used to establish the wind turbine clustering models. The Silhouette Coefficient, Calinski-Harabaz index and within-between index are proposed as criteria to evaluate the effectiveness of the established clustering models. Based on different clustering methods and schemes, various clustering databases are built for clustering pre-calculated CFD (CPCC-based short-term WPF. For the wind farm case studied, clustering evaluation criteria show that hierarchical agglomerative clustering has reasonable results, spectral clustering is better and K-means gives the best performance. The WPF results produced by different clustering databases also prove the effectiveness of the three evaluation criteria in turn. The newly developed CPCC model has a much higher WPF accuracy than the CPFF model without using clustering techniques, both on temporal and spatial scales. The research provides supports for both the development and improvement of short-term physical WPF systems.

  4. Hierarchical MAS based control strategy for microgrid

    Energy Technology Data Exchange (ETDEWEB)

    Xiao, Z.; Li, T.; Huang, M.; Shi, J.; Yang, J.; Yu, J. [School of Information Science and Engineering, Yunnan University, Kunming 650091 (China); Xiao, Z. [School of Electrical and Electronic Engineering, Nanyang Technological University, Western Catchment Area, 639798 (Singapore); Wu, W. [Communication Branch of Yunnan Power Grid Corporation, Kunming, Yunnan 650217 (China)

    2010-09-15

    Microgrids have become a hot topic driven by the dual pressures of environmental protection concerns and the energy crisis. In this paper, a challenge for the distributed control of a modern electric grid incorporating clusters of residential microgrids is elaborated and a hierarchical multi-agent system (MAS) is proposed as a solution. The issues of how to realize the hierarchical MAS and how to improve coordination and control strategies are discussed. Based on MATLAB and ZEUS platforms, bilateral switching between grid-connected mode and island mode is performed under control of the proposed MAS to enhance and support its effectiveness. (authors)

  5. Hierarchical cluster analysis of technical replicates to identify interferents in untargeted mass spectrometry metabolomics.

    Science.gov (United States)

    Caesar, Lindsay K; Kvalheim, Olav M; Cech, Nadja B

    2018-08-27

    Mass spectral data sets often contain experimental artefacts, and data filtering prior to statistical analysis is crucial to extract reliable information. This is particularly true in untargeted metabolomics analyses, where the analyte(s) of interest are not known a priori. It is often assumed that chemical interferents (i.e. solvent contaminants such as plasticizers) are consistent across samples, and can be removed by background subtraction from blank injections. On the contrary, it is shown here that chemical contaminants may vary in abundance across each injection, potentially leading to their misidentification as relevant sample components. With this metabolomics study, we demonstrate the effectiveness of hierarchical cluster analysis (HCA) of replicate injections (technical replicates) as a methodology to identify chemical interferents and reduce their contaminating contribution to metabolomics models. Pools of metabolites with varying complexity were prepared from the botanical Angelica keiskei Koidzumi and spiked with known metabolites. Each set of pools was analyzed in triplicate and at multiple concentrations using ultraperformance liquid chromatography coupled to mass spectrometry (UPLC-MS). Before filtering, HCA failed to cluster replicates in the data sets. To identify contaminant peaks, we developed a filtering process that evaluated the relative peak area variance of each variable within triplicate injections. These interferent peaks were found across all samples, but did not show consistent peak area from injection to injection, even when evaluating the same chemical sample. This filtering process identified 128 ions that appear to originate from the UPLC-MS system. Data sets collected for a high number of pools with comparatively simple chemical composition were highly influenced by these chemical interferents, as were samples that were analyzed at a low concentration. When chemical interferent masses were removed, technical replicates clustered in

  6. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

    Science.gov (United States)

    Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

    2014-01-01

    Meta-analysis of previous studies evaluating associations between content of elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples from 45 Greenlandic Inuit, median age 60 years and from 71 Danes, median age 61 years. Cirrhotic liver samples from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis, creating a dual dendrogram, one clustering element contents according to calculated similarities, one clustering elements according to correlation coefficients between the element contents, both using Euclidian distance and Ward Procedure. One dendrogram separated subjects in 7 clusters showing no differences in ethnicity, gender or age. The analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram clustered elements in four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br. Zn with S and K. Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyses content of elements separately one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.

  7. An Analysis of Turkey's PISA 2015 Results Using Two-Level Hierarchical Linear Modelling

    Science.gov (United States)

    Atas, Dogu; Karadag, Özge

    2017-01-01

    In the field of education, most of the data collected are multi-level structured. Cities, city based schools, school based classes and finally students in the classrooms constitute a hierarchical structure. Hierarchical linear models give more accurate results compared to standard models when the data set has a structure going far as individuals,…

  8. Parallel Implementation of the Recursive Approximation of an Unsupervised Hierarchical Segmentation Algorithm. Chapter 5

    Science.gov (United States)

    Tilton, James C.; Plaza, Antonio J. (Editor); Chang, Chein-I. (Editor)

    2008-01-01

    The hierarchical image segmentation algorithm (referred to as HSEG) is a hybrid of hierarchical step-wise optimization (HSWO) and constrained spectral clustering that produces a hierarchical set of image segmentations. HSWO is an iterative approach to region grooving segmentation in which the optimal image segmentation is found at N(sub R) regions, given a segmentation at N(sub R+1) regions. HSEG's addition of constrained spectral clustering makes it a computationally intensive algorithm, for all but, the smallest of images. To counteract this, a computationally efficient recursive approximation of HSEG (called RHSEG) has been devised. Further improvements in processing speed are obtained through a parallel implementation of RHSEG. This chapter describes this parallel implementation and demonstrates its computational efficiency on a Landsat Thematic Mapper test scene.

  9. Dynamic Hierarchical Sleep Scheduling for Wireless Ad-Hoc Sensor Networks

    OpenAIRE

    Chih-Yu Wen; Ying-Chih Chen

    2009-01-01

    This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show t...

  10. Ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    International Nuclear Information System (INIS)

    Wu, Shengming; Xia, Tian; Wang, Jingping; Lu, Feifei; Xu, Chunbo; Zhang, Xianfa; Huo, Lihua; Zhao, Hui

    2017-01-01

    Graphical abstract: Ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment. When tested as anode materials for LIBs, UMCN-HCs achieve high reversible capacity, good long cycling life, and rate capability. - Highlights: • UMCN-HCs show high capacity, excellent stability, and good rate capability. • UMCN-HCs retain a capacity of 1067 mAh g"−"1 after 100 cycles at 100 mA g"−"1. • UMCN-HCs deliver a capacity of 507 mAh g"−"1 after 500 cycles at 2 A g"−"1. - Abstract: Herein, Ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment at 600 °C in air. The products consist of cluster-like Co_3O_4 microarchitectures, which are assembled by numerous ultrathin mesoporous Co_3O_4 nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g"−"1 at a current density of 100 mA g"−"1 after 100 cycles. Even at 2 A g"−"1, a stable capacity as high as 507 mAh g"−"1 can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous characteristic make lithium-ion transfer more easily at the interface between the active electrode and the electrolyte.

  11. A Simple Hierarchical Pooling Data Structure for Loop Closure

    Science.gov (United States)

    2016-10-16

    performance empirically on the KITTI [9], Oxford [6] and TUM RGB- D [29] datasets, as well as demonstrate extensions to general image retrieval on the...of a BoW where each word is an element of a dictionary of descriptors obtained off-line by hierarchical k-means clustering, with each word weighted by...to the inverse docu- ment frequency. This standard pipeline, with different clustering procedures to generate the dictionary and different features

  12. Hierarchical structures of correlations networks among Turkey’s exports and imports by currencies

    Science.gov (United States)

    Kocakaplan, Yusuf; Deviren, Bayram; Keskin, Mustafa

    2012-12-01

    We have examined the hierarchical structures of correlations networks among Turkey’s exports and imports by currencies for the 1996-2010 periods, using the concept of a minimal spanning tree (MST) and hierarchical tree (HT) which depend on the concept of ultrametricity. These trees are useful tools for understanding and detecting the global structure, taxonomy and hierarchy in financial markets. We derived a hierarchical organization and build the MSTs and HTs during the 1996-2001 and 2002-2010 periods. The reason for studying two different sub-periods, namely 1996-2001 and 2002-2010, is that the Euro (EUR) came into use in 2001, and some countries have made their exports and imports with Turkey via the EUR since 2002, and in order to test various time-windows and observe temporal evolution. We have carried out bootstrap analysis to associate a value of the statistical reliability to the links of the MSTs and HTs. We have also used the average linkage cluster analysis (ALCA) to observe the cluster structure more clearly. Moreover, we have obtained the bidimensional minimal spanning tree (BMST) due to economic trade being a bidimensional problem. From the structural topologies of these trees, we have identified different clusters of currencies according to their proximity and economic ties. Our results show that some currencies are more important within the network, due to a tighter connection with other currencies. We have also found that the obtained currencies play a key role for Turkey’s exports and imports and have important implications for the design of portfolio and investment strategies.

  13. Statistical measures of galaxy clustering

    International Nuclear Information System (INIS)

    Porter, D.H.

    1988-01-01

    Consideration is given to the large-scale distribution of galaxies and ways in which this distribution may be statistically measured. Galaxy clustering is hierarchical in nature, so that the positions of clusters of galaxies are themselves spatially clustered. A simple identification of groups of galaxies would be an inadequate description of the true richness of galaxy clustering. Current observations of the large-scale structure of the universe and modern theories of cosmology may be studied with a statistical description of the spatial and velocity distributions of galaxies. 8 refs

  14. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    Science.gov (United States)

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  15. Cities and regions in Britain through hierarchical percolation

    Science.gov (United States)

    Arcaute, Elsa; Molinero, Carlos; Hatna, Erez; Murcio, Roberto; Vargas-Ruiz, Camilo; Masucci, A. Paolo; Batty, Michael

    2016-04-01

    Urban systems present hierarchical structures at many different scales. These are observed as administrative regional delimitations which are the outcome of complex geographical, political and historical processes which leave almost indelible footprints on infrastructure such as the street network. In this work, we uncover a set of hierarchies in Britain at different scales using percolation theory on the street network and on its intersections which are the primary points of interaction and urban agglomeration. At the larger scales, the observed hierarchical structures can be interpreted as regional fractures of Britain, observed in various forms, from natural boundaries, such as National Parks, to regional divisions based on social class and wealth such as the well-known North-South divide. At smaller scales, cities are generated through recursive percolations on each of the emerging regional clusters. We examine the evolution of the morphology of the system as a whole, by measuring the fractal dimension of the clusters at each distance threshold in the percolation. We observe that this reaches a maximum plateau at a specific distance. The clusters defined at this distance threshold are in excellent correspondence with the boundaries of cities recovered from satellite images, and from previous methods using population density.

  16. Clustering User Behavior in Scientific Collections

    OpenAIRE

    Blixhavn, Øystein Hoel

    2014-01-01

    This master thesis looks at how clustering techniques can be appliedto a collection of scientific documents. Approximately one year of serverlogs from the CERN Document Server (CDS) are analyzed and preprocessed.Based on the findings of this analysis, and a review of thecurrent state of the art, three different clustering methods are selectedfor further work: Simple k-Means, Hierarchical Agglomerative Clustering(HAC) and Graph Partitioning. In addition, a custom, agglomerativeclustering algor...

  17. A new hierarchical method to find community structure in networks

    Science.gov (United States)

    Saoud, Bilal; Moussaoui, Abdelouahab

    2018-04-01

    Community structure is very important to understand a network which represents a context. Many community detection methods have been proposed like hierarchical methods. In our study, we propose a new hierarchical method for community detection in networks based on genetic algorithm. In this method we use genetic algorithm to split a network into two networks which maximize the modularity. Each new network represents a cluster (community). Then we repeat the splitting process until we get one node at each cluster. We use the modularity function to measure the strength of the community structure found by our method, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our method are highly effective at discovering community structure in both computer-generated and real-world network data.

  18. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Directory of Open Access Journals (Sweden)

    Morris John H

    2011-11-01

    Full Text Available Abstract Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view, k-means, k-medoid, SCPS, AutoSOME, and native (Java MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions The Cytoscape plugin cluster

  19. Coherence-based Time Series Clustering for Brain Connectivity Visualization

    KAUST Repository

    Euan, Carolina

    2017-11-19

    We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by

  20. Coherence-based Time Series Clustering for Brain Connectivity Visualization

    KAUST Repository

    Euan, Carolina; Sun, Ying; Ombao, Hernando

    2017-01-01

    We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by

  1. TWO-STAGE CHARACTER CLASSIFICATION : A COMBINED APPROACH OF CLUSTERING AND SUPPORT VECTOR CLASSIFIERS

    NARCIS (Netherlands)

    Vuurpijl, L.; Schomaker, L.

    2000-01-01

    This paper describes a two-stage classification method for (1) classification of isolated characters and (2) verification of the classification result. Character prototypes are generated using hierarchical clustering. For those prototypes known to sometimes produce wrong classification results, a

  2. Programming Hierarchical Self-Assembly of Patchy Particles into Colloidal Crystals via Colloidal Molecules.

    Science.gov (United States)

    Morphew, Daniel; Shaw, James; Avins, Christopher; Chakrabarti, Dwaipayan

    2018-03-27

    Colloidal self-assembly is a promising bottom-up route to a wide variety of three-dimensional structures, from clusters to crystals. Programming hierarchical self-assembly of colloidal building blocks, which can give rise to structures ordered at multiple levels to rival biological complexity, poses a multiscale design problem. Here we explore a generic design principle that exploits a hierarchy of interaction strengths and employ this design principle in computer simulations to demonstrate the hierarchical self-assembly of triblock patchy colloidal particles into two distinct colloidal crystals. We obtain cubic diamond and body-centered cubic crystals via distinct clusters of uniform size and shape, namely, tetrahedra and octahedra, respectively. Such a conceptual design framework has the potential to reliably encode hierarchical self-assembly of colloidal particles into a high level of sophistication. Moreover, the design framework underpins a bottom-up route to cubic diamond colloidal crystals, which have remained elusive despite being much sought after for their attractive photonic applications.

  3. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

    Science.gov (United States)

    Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

    2012-07-13

    Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Our results

  4. Not all stars form in clusters - measuring the kinematics of OB associations with Gaia

    Science.gov (United States)

    Ward, Jacob L.; Kruijssen, J. M. Diederik

    2018-04-01

    It is often stated that star clusters are the fundamental units of star formation and that most (if not all) stars form in dense stellar clusters. In this monolithic formation scenario, low-density OB associations are formed from the expansion of gravitationally bound clusters following gas expulsion due to stellar feedback. N-body simulations of this process show that OB associations formed this way retain signs of expansion and elevated radial anisotropy over tens of Myr. However, recent theoretical and observational studies suggest that star formation is a hierarchical process, following the fractal nature of natal molecular clouds and allowing the formation of large-scale associations in situ. We distinguish between these two scenarios by characterizing the kinematics of OB associations using the Tycho-Gaia Astrometric Solution catalogue. To this end, we quantify four key kinematic diagnostics: the number ratio of stars with positive radial velocities to those with negative radial velocities, the median radial velocity, the median radial velocity normalized by the tangential velocity, and the radial anisotropy parameter. Each quantity presents a useful diagnostic of whether the association was more compact in the past. We compare these diagnostics to models representing random motion and the expanding products of monolithic cluster formation. None of these diagnostics show evidence of expansion, either from a single cluster or multiple clusters, and the observed kinematics are better represented by a random velocity distribution. This result favours the hierarchical star formation model in which a minority of stars forms in bound clusters and large-scale, hierarchically structured associations are formed in situ.

  5. Result Diversification Based on Query-Specific Cluster Ranking

    NARCIS (Netherlands)

    J. He (Jiyin); E. Meij; M. de Rijke (Maarten)

    2011-01-01

    htmlabstractResult diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking,

  6. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa

    2004-01-01

    Full Text Available This work performs a data driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series. Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.

  7. Recommending the heterogeneous cluster type multi-processor system computing

    International Nuclear Information System (INIS)

    Iijima, Nobukazu

    2010-01-01

    Real-time reactor simulator had been developed by reusing the equipment of the Musashi reactor and its performance improvement became indispensable for research tools to increase sampling rate with introduction of arithmetic units using multi-Digital Signal Processor(DSP) system (cluster). In order to realize the heterogeneous cluster type multi-processor system computing, combination of two kinds of Control Processor (CP) s, Cluster Control Processor (CCP) and System Control Processor (SCP), were proposed with Large System Control Processor (LSCP) for hierarchical cluster if needed. Faster computing performance of this system was well evaluated by simulation results for simultaneous execution of plural jobs and also pipeline processing between clusters, which showed the system led to effective use of existing system and enhancement of the cost performance. (T. Tanaka)

  8. Gene expression data clustering and it’s application in differential analysis of leukemia

    Directory of Open Access Journals (Sweden)

    M. Vahedi

    2008-02-01

    Full Text Available Introduction: DNA microarray technique is one of the most important categories in bioinformatics,which allows the possibility of monitoring thousands of expressed genes has been resulted in creatinggiant data bases of gene expression data, recently. Statistical analysis of such databases includednormalization, clustering, classification and etc.Materials and Methods: Golub et al (1999 collected data bases of leukemia based on the method ofoligonucleotide. The data is on the internet. In this paper, we analyzed gene expression data. It wasclustered by several methods including multi-dimensional scaling, hierarchical and non-hierarchicalclustering. Data set included 20 Acute Lymphoblastic Leukemia (ALL patients and 14 Acute MyeloidLeukemia (AML patients. The results of tow methods of clustering were compared with regard to realgrouping (ALL & AML. R software was used for data analysis.Results: Specificity and sensitivity of divisive hierarchical clustering in diagnosing of ALL patientswere 75% and 92%, respectively. Specificity and sensitivity of partitioning around medoids indiagnosing of ALL patients were 90% and 93%, respectively. These results showed a wellaccomplishment of both methods of clustering. It is considerable that, due to clustering methodsresults, one of the samples was placed in ALL groups, which was in AML group in clinical test.Conclusion: With regard to concordance of the results with real grouping of data, therefore we canuse these methods in the cases where we don't have accurate information of real grouping of data.Moreover, Results of clustering might distinct subgroups of data in such a way that would be necessaryfor concordance with clinical outcomes, laboratory results and so on.

  9. Result diversification based on query-specific cluster ranking

    NARCIS (Netherlands)

    He, J.; Meij, E.; de Rijke, M.

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification

  10. Clusters of Tourism Consumers in Romania

    Directory of Open Access Journals (Sweden)

    Pelau Corina

    2018-03-01

    Full Text Available The analysis and determination of typologies of tourism consumers has been a major concern for scientists, specialists and companies as well. Knowing the demographic and motivational factors that determine consumers to buy tourism products can have a major impact on the marketing strategy by a more efficient targeting of customers. This article presents the results of a research that aims to determine the factors which influence the buying decision for tourism products and the clusters of consumers resulted from these factors. 90 persons have been surveyed pursuing the determination of the most important factors for buying a tourism product and the correlation between them. The factor analysis and the cluster analysis have been applied with the help of the SPSS program. The results of the factor analysis group the items into six factors. In a second phase, the consumers have been divided into three categories based on a hierarchical Ward cluster analysis. The three clusters have been defined and analyzed and recommendations for the future research have been given.

  11. Assessment of the quality of water by hierarchical cluster and variance analyses of the Koudiat Medouar Watershed, East Algeria

    Science.gov (United States)

    Tiri, Ammar; Lahbari, Noureddine; Boudoukha, Abderrahmane

    2017-12-01

    The assessment of surface water in Koudiat Medouar watershed is very important especially when it comes to pollution of the dam waters by discharges of wastewater from neighboring towns in Oued Timgad, who poured into the basin of the dam, and agricultural lands located along the Oued Reboa. To this end, the multivariable method was used to evaluate the spatial and temporal variation of the water surface quality of the Koudiat Medouar dam, eastern Algeria. The stiff diagram has identified two main hydrochemical facies. The first facies Mg-HCO3 is reflected in the first sampling station (Oued Reboa) and in the second one (Oued Timgad), while the second facies Mg-SO4 is reflected in the third station (Basin Dam). The results obtained by the analysis of variance show that in the three stations all parameters are significant, except for Na, K and HCO3 in the first station (Oued Reboa) and the EC in the second station (Oued Timgad) and at the end NO3 and pH in the third station (Basin Dam). Q-mode hierarchical cluster analysis showed that two main groups in each sampling station. The chemistry of major ions (Mg, Ca, HCO3 and SO4) within the three stations results from anthropogenic impacts and water-rock interaction sources.

  12. Ultrathin mesoporous Co{sub 3}O{sub 4} nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Shengming [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xia, Tian, E-mail: xiatian@hlju.edu.cn [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Wang, Jingping [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Lu, Feifei [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xu, Chunbo [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Zhang, Xianfa; Huo, Lihua [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Zhao, Hui, E-mail: zhaohui98@yahoo.com [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China)

    2017-06-01

    Graphical abstract: Ultrathin mesoporous Co{sub 3}O{sub 4} nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment. When tested as anode materials for LIBs, UMCN-HCs achieve high reversible capacity, good long cycling life, and rate capability. - Highlights: • UMCN-HCs show high capacity, excellent stability, and good rate capability. • UMCN-HCs retain a capacity of 1067 mAh g{sup −1} after 100 cycles at 100 mA g{sup −1}. • UMCN-HCs deliver a capacity of 507 mAh g{sup −1} after 500 cycles at 2 A g{sup −1}. - Abstract: Herein, Ultrathin mesoporous Co{sub 3}O{sub 4} nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment at 600 °C in air. The products consist of cluster-like Co{sub 3}O{sub 4} microarchitectures, which are assembled by numerous ultrathin mesoporous Co{sub 3}O{sub 4} nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g{sup −1} at a current density of 100 mA g{sup −1} after 100 cycles. Even at 2 A g{sup −1}, a stable capacity as high as 507 mAh g{sup −1} can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous characteristic make lithium-ion transfer more easily at the interface between the active electrode and the electrolyte.

  13. Energy Aware Clustering Algorithms for Wireless Sensor Networks

    Science.gov (United States)

    Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian

    2011-09-01

    The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire networks is mainly considered in the design. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and also compare these clustering algorithms based on metrics such as clustering distribution, cluster's load balancing, Cluster Head's (CH) selection strategy, CH's role rotation, node mobility, clusters overlapping, intra-cluster communications, reliability, security and location awareness.

  14. Hierarchical structure in the distribution of galaxies

    International Nuclear Information System (INIS)

    Schulman, L.S.; Seiden, P.E.; Technion - Israel Institute of Technology, Haifa; IBM Thomas J. Watson Research Center, Yorktown Heights, NY)

    1986-01-01

    The distribution of galaxies has a hierarchical structure with power-law correlations. This is usually thought to arise from gravity alone acting on an originally uniform distributioon. If, however, the original process of galaxy formation occurs through the stimulated birth of one galaxy due to a nearby recently formed galaxy, and if this process occurs near its percolation threshold, then a hierarchical structure with power-law correlations arises at the time of galaxy formation. If subsequent gravitational evolution within an expanding cosmology is such as to retain power-law correlations, the initial r exp -1 dropoff can steepen to the observed r exp -1.8. The distribution of galaxies obtained by this process produces clustering and voids, as observed. 23 references

  15. Study on distributed re-clustering algorithm for moblie wireless sensor networks

    Directory of Open Access Journals (Sweden)

    XU Chaojie

    2016-04-01

    Full Text Available In mobile wireless sensor networks,node mobility influences the topology of the hierarchically clustered network,thus affects packet delivery ratio and energy consumption of communications in clusters.To reduce the influence of node mobility,a distributed re-clustering algorithm is proposed in this paper.In this algorithm,basing on the clustered network,nodes estimate their current locations with particle algorithm and predict the most possible locations of next time basing on the mobility model.Each boundary node of a cluster periodically estimates the need for re-clustering and re-cluster itself to the optimal cluster through communicating with the cluster headers when needed.The simulation results indicate that,with small re-clustering periods,the proposed algorithm can be effective to keep appropriate communication distance and outperforms existing schemes on packet delivery ratio and energy consumption.

  16. New Heterogeneous Clustering Protocol for Prolonging Wireless Sensor Networks Lifetime

    Directory of Open Access Journals (Sweden)

    Md. Golam Rashed

    2014-06-01

    Full Text Available Clustering in wireless sensor networks is one of the crucial methods for increasing of network lifetime. The network characteristics of existing classical clustering protocols for wireless sensor network are homogeneous. Clustering protocols fail to maintain the stability of the system, especially when nodes are heterogeneous. We have seen that the behavior of Heterogeneous-Hierarchical Energy Aware Routing Protocol (H-HEARP becomes very unstable once the first node dies, especially in the presence of node heterogeneity. In this paper we assume a new clustering protocol whose network characteristics is heterogeneous for prolonging of network lifetime. The computer simulation results demonstrate that the proposed clustering algorithm outperforms than other clustering algorithms in terms of the time interval before the death of the first node (we refer to as stability period. The simulation results also show the high performance of the proposed clustering algorithm for higher values of extra energy brought by more powerful nodes.

  17. A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering

    Science.gov (United States)

    Chahine, Firas Safwan

    2012-01-01

    Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithm, but has a major…

  18. Investigating the provenance of iron artifacts of the Royal Iron Factory of Sao Joao de Ipanema by hierarchical cluster analysis of EDS microanalyses of slag inclusions

    Energy Technology Data Exchange (ETDEWEB)

    Mamani-Calcina, Elmer Antonio; Landgraf, Fernando Jose Gomes; Azevedo, Cesar Roberto de Farias, E-mail: c.azevedo@usp.br [Universidade de Sao Paulo (USP), Sao Paulo, SP (Brazil). Escola Politecnica. Departmento de Engenharia Metalurgica e de Materiais

    2017-01-15

    Microstructural characterization techniques, including EDX (Energy Dispersive X-ray Analysis) microanalyses, were used to investigate the slag inclusions in the microstructure of ferrous artifacts of the Royal Iron Factory of Sao Joao de Ipanema (first steel plant of Brazil, XIX century), the D. Pedro II Bridge (located in Bahia, assembled in XIX century and produced in Scotland) and the archaeological sites of Sao Miguel de Missoes (Rio Grande do Sul, Brazil, production site of iron artifacts, the XVIII century) and Afonso Sardinha (Sao Paulo, Brazil production site of iron artifacts, XVI century). The microanalyses results of the main micro constituents of the microstructure of the slag inclusions were investigated by hierarchical cluster analysis and the dendrogram with the microanalyses results of the wüstite phase (using as critical variables the contents of MnO, MgO, Al{sub 2}O{sub 3}, V{sub 2}O{sub 5} and TiO{sub 2}) allowed the identification of four clusters, which successfully represented the samples of the four investigated sites (Ipanema, Sardinha, Missoes and Bahia). Finally, the comparatively low volumetric fraction of slag inclusions in the samples of Ipanema (∼1%) suggested the existence of technological expertise at the iron making processing in the Royal Iron Factory of Sao Joao de Ipanema. (author)

  19. Mining the National Career Assessment Examination Result Using Clustering Algorithm

    Science.gov (United States)

    Pagudpud, M. V.; Palaoag, T. T.; Padirayon, L. M.

    2018-03-01

    Education is an essential process today which elicits authorities to discover and establish innovative strategies for educational improvement. This study applied data mining using clustering technique for knowledge extraction from the National Career Assessment Examination (NCAE) result in the Division of Quirino. The NCAE is an examination given to all grade 9 students in the Philippines to assess their aptitudes in the different domains. Clustering the students is helpful in identifying students’ learning considerations. With the use of the RapidMiner tool, clustering algorithms such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN), k-means, k-medoid, expectation maximization clustering, and support vector clustering algorithms were analyzed. The silhouette indexes of the said clustering algorithms were compared, and the result showed that the k-means algorithm with k = 3 and silhouette index equal to 0.196 is the most appropriate clustering algorithm to group the students. Three groups were formed having 477 students in the determined group (cluster 0), 310 proficient students (cluster 1) and 396 developing students (cluster 2). The data mining technique used in this study is essential in extracting useful information from the NCAE result to better understand the abilities of students which in turn is a good basis for adopting teaching strategies.

  20. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Full Text Available Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts from large-scale assessment programs has received considerable attention but very few studies evaluated the effect of hierarchical structure of data on DIF detection for polytomously scored items. Methods. The ordinal logistic regression (OLR and hierarchical ordinal logistic regression (HOLR were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter with a fully crossed design were considered in the simulation study. Furthermore, data of Pediatric Quality of Life Inventory™ (PedsQL™ 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  1. Interactive visual exploration and refinement of cluster assignments.

    Science.gov (United States)

    Kern, Michael; Lex, Alexander; Gehlenborg, Nils; Johnson, Chris R

    2017-09-12

    With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.

  2. Hierarchical functional modularity in the resting-state human brain.

    Science.gov (United States)

    Ferrarini, Luca; Veer, Ilya M; Baerends, Evelinda; van Tol, Marie-José; Renken, Remco J; van der Wee, Nic J A; Veltman, Dirk J; Aleman, André; Zitman, Frans G; Penninx, Brenda W J H; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, Julien

    2009-07-01

    Functional magnetic resonance imaging (fMRI) studies have shown that anatomically distinct brain regions are functionally connected during the resting state. Basic topological properties in the brain functional connectivity (BFC) map have highlighted the BFC's small-world topology. Modularity, a more advanced topological property, has been hypothesized to be evolutionary advantageous, contributing to adaptive aspects of anatomical and functional brain connectivity. However, current definitions of modularity for complex networks focus on nonoverlapping clusters, and are seriously limited by disregarding inclusive relationships. Therefore, BFC's modularity has been mainly qualitatively investigated. Here, we introduce a new definition of modularity, based on a recently improved clustering measurement, which overcomes limitations of previous definitions, and apply it to the study of BFC in resting state fMRI of 53 healthy subjects. Results show hierarchical functional modularity in the brain. Copyright 2009 Wiley-Liss, Inc

  3. Typing of unknown microorganisms based on quantitative analysis of fatty acids by mass spectrometry and hierarchical clustering

    Energy Technology Data Exchange (ETDEWEB)

    Li Tingting; Dai Ling; Li Lun; Hu Xuejiao; Dong Linjie; Li Jianjian; Salim, Sule Khalfan; Fu Jieying [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China); Zhong Hongying, E-mail: hyzhong@mail.ccnu.edu.cn [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China)

    2011-01-17

    Rapid identification of unknown microorganisms of clinical and agricultural importance is not only critical for accurate diagnosis of infections but also essential for appropriate and prompt treatment. We describe here a rapid method for microorganisms typing based on quantitative analysis of fatty acids by iFAT approach (Isotope-coded Fatty Acid Transmethylation). In this work, lyophilized cell lysates were directly mixed with 0.5 M NaOH solution in d3-methanol and n-hexane. After 1 min of ultrasonication, the top n-hexane layer was combined with a mixture of standard d0-methanol derived fatty acid methylesters with known concentration. Measurement of intensity ratios of d3/d0 labeled fragment ion and molecular ion pairs at the corresponding target fatty acids provides a quantitative basis for hierarchical clustering. In the resultant dendrogram, the Euclidean distance between unknown species and known species quantitatively reveals their differences or shared similarities in fatty acid related pathways. It is of particular interest to apply this method for typing fungal species because fungi has distinguished lipid biosynthetic pathways that have been targeted for lots of drugs or fungicides compared with bacteria and animals. The proposed method has no dependence on the availability of genome or proteome databases. Therefore, it is can be applicable for a broad range of unknown microorganisms or mutant species.

  4. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    We show that the use of non-hierarchical analysis allows us to interpret the reasoning of students solving different mathematical problems using Algebra, and to separate them into different groups, that can be recognised and characterised by common traits in their answers, without any prior knowledge on the part of the ...

  5. Investigation on IMCP based clustering in LTE-M communication for smart metering applications

    Directory of Open Access Journals (Sweden)

    Kartik Vishal Deshpande

    2017-06-01

    Full Text Available Machine to Machine (M2M is foreseen as an emerging technology for smart metering applications where devices communicate seamlessly for information transfer. The M2M communication makes use of long term evolution (LTE as its backbone network and it results in long-term evolution for machine type communication (LTE-M network. As huge number of M2M devices is to be handled by single eNB (evolved Node B, clustering is exploited for efficient processing of the network. This paper investigates the proposed Improved M2M Clustering Process (IMCP based clustering technique and it is compared with two well-known clustering algorithms, namely, Low Energy Adaptive Clustering Hierarchical (LEACH and Energy Aware Multihop Multipath Hierarchical (EAMMH techniques. Further, the IMCP algorithm is analyzed with two-tier and three-tier M2M systems for various mobility conditions. The proposed IMCP algorithm improves the last node death by 63.15% and 51.61% as compared to LEACH and EAMMH, respectively. Further, the average energy of each node in IMCP is increased by 89.85% and 81.15%, as compared to LEACH and EAMMH, respectively.

  6. Hierarchical drivers of reef-fish metacommunity structure.

    Science.gov (United States)

    MacNeil, M Aaron; Graham, Nicholas A J; Polunin, Nicholas V C; Kulbicki, Michel; Galzin, René; Harmelin-Vivien, Mireille; Rushton, Steven P

    2009-01-01

    multiple spatial scales; and (3) inter-atoll connectedness was poorly correlated with the nonrandom clustering of reef-fish species. These results demonstrate the importance of modeling hierarchical data and processes in understanding reef-fish metacommunity structure.

  7. The Auroral Field-aligned Acceleration - Cluster Results

    Science.gov (United States)

    Vaivads, A.; Cluster Auroral Team

    The four Cluster satellites cross the auroral field lines at altitudes well above most of acceleration region. Thus, the orbit is appropriate for studies of the generator side of this region. We consider the energy transport towards the acceleration region and different mechanisms for generating the potential drop. Using data from Cluster we can also for the first time study the dynamics of the generator on a minute scale. We present data from a few auroral field crossings where Cluster are in conjunction with DMSP satellites. We use electric and magnetic field data to estimate electrostatic po- tential along the satellite orbit, Poynting flux as well as the presence of plasma waves. These we can compare with data from particle and wave instruments on Cluster and on low latitude satellites to try to make a consistent picture of the acceleration region formation in these cases. Preliminary results show close agreement both between in- tegrated potential values at Cluster and electron peak energies at DMSP as well as close agreement between the integrated Poynting flux values at Cluster and the elec- tron energy flux at DMSP. At the end we draw a parallels between auroral electron acceleration and electron acceleration at the magnetopause.

  8. Open source clustering software.

    Science.gov (United States)

    de Hoon, M J L; Imoto, S; Nolan, J; Miyano, S

    2004-06-12

    We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.

  9. Hierarchical organization in the temporal structure of infant-direct speech and song.

    Science.gov (United States)

    Falk, Simone; Kello, Christopher T

    2017-06-01

    Caregivers alter the temporal structure of their utterances when talking and singing to infants compared with adult communication. The present study tested whether temporal variability in infant-directed registers serves to emphasize the hierarchical temporal structure of speech. Fifteen German-speaking mothers sang a play song and told a story to their 6-months-old infants, or to an adult. Recordings were analyzed using a recently developed method that determines the degree of nested clustering of temporal events in speech. Events were defined as peaks in the amplitude envelope, and clusters of various sizes related to periods of acoustic speech energy at varying timescales. Infant-directed speech and song clearly showed greater event clustering compared with adult-directed registers, at multiple timescales of hundreds of milliseconds to tens of seconds. We discuss the relation of this newly discovered acoustic property to temporal variability in linguistic units and its potential implications for parent-infant communication and infants learning the hierarchical structures of speech and language. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Canonical PSO Based K-Means Clustering Approach for Real Datasets.

    Science.gov (United States)

    Dey, Lopamudra; Chakraborty, Sanjay

    2014-01-01

    "Clustering" the significance and application of this technique is spread over various fields. Clustering is an unsupervised process in data mining, that is why the proper evaluation of the results and measuring the compactness and separability of the clusters are important issues. The procedure of evaluating the results of a clustering algorithm is known as cluster validity measure. Different types of indexes are used to solve different types of problems and indices selection depends on the kind of available data. This paper first proposes Canonical PSO based K-means clustering algorithm and also analyses some important clustering indices (intercluster, intracluster) and then evaluates the effects of those indices on real-time air pollution database, wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and Hierarchical clustering algorithms. This paper also describes the nature of the clusters and finally compares the performances of these clustering algorithms according to the validity assessment. It also defines which algorithm will be more desirable among all these algorithms to make proper compact clusters on this particular real life datasets. It actually deals with the behaviour of these clustering algorithms with respect to validation indexes and represents their results of evaluation in terms of mathematical and graphical forms.

  11. Experimental results of some cluster tests in NSRR

    International Nuclear Information System (INIS)

    Kobayashi, Shinsho; Ohnishi, Nobuaki; Yoshimura, Tomio; Lussie, W.G.

    1978-01-01

    The NSRR programme is in progress in JAERI using a pulsed reactor to evaluate the behavior of reactor fuels under reactivity accident conditions. This report describes briefly the experimental results and preliminary analysis of two cluster tests. In the cluster configuration of five fuel rods, the power distribution in outer fuel rods are not symmetric due to neutron absorption in central fuel rod. The cladding temperature on the exterior boundaries of the cluster is higher than that in interior. Good agreement was obtained between the calculated and measured cladding temperature histories. In the 3.8$ excess reactivity test, cluster averaged energy deposition of 237 cal/g.UO 2 , cladding melting and deformation were limited to the portions of the fuel rods that were on the exterior boundaries of the cluster. (auth.)

  12. Dynamic Hierarchical Sleep Scheduling for Wireless Ad-Hoc Sensor Networks

    Directory of Open Access Journals (Sweden)

    Chih-Yu Wen

    2009-05-01

    Full Text Available This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show that the proposed algorithms provide efficient network power control and can achieve high scalability in wireless sensor networks.

  13. Dynamic hierarchical sleep scheduling for wireless ad-hoc sensor networks.

    Science.gov (United States)

    Wen, Chih-Yu; Chen, Ying-Chih

    2009-01-01

    This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show that the proposed algorithms provide efficient network power control and can achieve high scalability in wireless sensor networks.

  14. XML documents cluster research based on frequent subpatterns

    Science.gov (United States)

    Ding, Tienan; Li, Wei; Li, Xiongfei

    2015-12-01

    XML data is widely used in the information exchange field of Internet, and XML document data clustering is the hot research topic. In the XML document clustering process, measure differences between two XML documents is time costly, and impact the efficiency of XML document clustering. This paper proposed an XML documents clustering method based on frequent patterns of XML document dataset, first proposed a coding tree structure for encoding the XML document, and translate frequent pattern mining from XML documents into frequent pattern mining from string. Further, using the cosine similarity calculation method and cohesive hierarchical clustering method for XML document dataset by frequent patterns. Because of frequent patterns are subsets of the original XML document data, so the time consumption of XML document similarity measure is reduced. The experiment runs on synthetic dataset and the real datasets, the experimental result shows that our method is efficient.

  15. A Cluster Analysis of Personality Style in Adults with ADHD

    Science.gov (United States)

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita

    2008-01-01

    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  16. A Comparison of Two Approaches to Beta-Flexible Clustering.

    Science.gov (United States)

    Belbin, Lee; And Others

    1992-01-01

    A method for hierarchical agglomerative polythetic (multivariate) clustering, based on unweighted pair group using arithmetic averages (UPGMA) is compared with the original beta-flexible technique, a weighted average method. Reasons the flexible UPGMA strategy is recommended are discussed, focusing on the ability to recover cluster structure over…

  17. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.

  18. Clustering results - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Clustering results Data detail Data name Clustering results DOI 10.18908/lsdba...se Update History of This Database Site Policy | Contact Us Clustering results - Gclust Server | LSDB Archive ...

  19. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    Science.gov (United States)

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ(2) analysis (PD and DH, P cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarch clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Hierarchical and Complex System Entropy Clustering Analysis Based Validation for Traditional Chinese Medicine Syndrome Patterns of Chronic Atrophic Gastritis.

    Science.gov (United States)

    Zhang, Yin; Liu, Yue; Li, Yannan; Zhao, Xia; Zhuo, Lin; Zhou, Ajian; Zhang, Li; Su, Zeqi; Chen, Cen; Du, Shiyu; Liu, Daming; Ding, Xia

    2018-03-22

    Chronic atrophic gastritis (CAG) is the precancerous stage of gastric carcinoma. Traditional Chinese Medicine (TCM) has been widely used in treating CAG. This study aimed to reveal core pathogenesis of CAG by validating the TCM syndrome patterns and provide evidence for optimization of treatment strategies. This is a cross-sectional study conducted in 4 hospitals in China. Hierarchical clustering analysis (HCA) and complex system entropy clustering analysis (CSECA) were performed, respectively, to achieve syndrome pattern validation. Based on HCA, 15 common factors were assigned to 6 syndrome patterns: liver depression and spleen deficiency and blood stasis in the stomach collateral, internal harassment of phlegm-heat and blood stasis in the stomach collateral, phlegm-turbidity internal obstruction, spleen yang deficiency, internal harassment of phlegm-heat and spleen deficiency, and spleen qi deficiency. By CSECA, 22 common factors were assigned to 7 syndrome patterns: qi deficiency, qi stagnation, blood stasis, phlegm turbidity, heat, yang deficiency, and yin deficiency. Combination of qi deficiency, qi stagnation, blood stasis, phlegm turbidity, heat, yang deficiency, and yin deficiency may play a crucial role in CAG pathogenesis. In accord with this, treatment strategies by TCM herbal prescriptions should be targeted to regulating qi, activating blood, resolving turbidity, clearing heat, removing toxin, nourishing yin, and warming yang. Further explorations are needed to verify and expand the current conclusions.

  1. Clustering Methods Application for Customer Segmentation to Manage Advertisement Campaign

    Directory of Open Access Journals (Sweden)

    Maciej Kutera

    2010-10-01

    Full Text Available Clustering methods are recently so advanced elaborated algorithms for large collection data analysis that they have been already included today to data mining methods. Clustering methods are nowadays larger and larger group of methods, very quickly evolving and having more and more various applications. In the article, our research concerning usefulness of clustering methods in customer segmentation to manage advertisement campaign is presented. We introduce results obtained by using four selected methods which have been chosen because their peculiarities suggested their applicability to our purposes. One of the analyzed method k-means clustering with random selected initial cluster seeds gave very good results in customer segmentation to manage advertisement campaign and these results were presented in details in the article. In contrast one of the methods (hierarchical average linkage was found useless in customer segmentation. Further investigations concerning benefits of clustering methods in customer segmentation to manage advertisement campaign is worth continuing, particularly that finding solutions in this field can give measurable profits for marketing activity.

  2. The Hierarchical Trend Model for property valuation and local price indices

    NARCIS (Netherlands)

    Francke, M.K.; Vos, G.A.

    2002-01-01

    This paper presents a hierarchical trend model (HTM) for selling prices of houses, addressing three main problems: the spatial and temporal dependence of selling prices and the dependency of price index changes on housing quality. In this model the general price trend, cluster-level price trends,

  3. Cluster Oriented Spatio Temporal Multidimensional Data Visualization of Earthquakes in Indonesia

    Directory of Open Access Journals (Sweden)

    Mohammad Nur Shodiq

    2016-03-01

    Full Text Available Spatio temporal data clustering is challenge task. The result of clustering data are utilized to investigate the seismic parameters. Seismic parameters are used to describe the characteristics of earthquake behavior. One of the effective technique to study multidimensional spatio temporal data is visualization. But, visualization of multidimensional data is complicated problem. Because, this analysis consists of observed data cluster and seismic parameters. In this paper, we propose a visualization system, called as IES (Indonesia Earthquake System, for cluster analysis, spatio temporal analysis, and visualize the multidimensional data of seismic parameters. We analyze the cluster analysis by using automatic clustering, that consists of get optimal number of cluster and Hierarchical K-means clustering. We explore the visual cluster and multidimensional data in low dimensional space visualization. We made experiment with observed data, that consists of seismic data around Indonesian archipelago during 2004 to 2014. Keywords: Clustering, visualization, multidimensional data, seismic parameters.

  4. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2004-01-01

    Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC method, is proposed for identifying clusters in experiment data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noises, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchic clustering method, the -mean clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999.

  5. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  6. Data Clustering

    Science.gov (United States)

    Wagstaff, Kiri L.

    2012-03-01

    clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, xi ∈ RD. The clustering goal is to produce an organization P of the items in X that optimizes an objective function f : P -> R, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure d : X x X -> R of the distance between two items. A partitioning algorithm produces a set of clusters P = {c1, . . . , ck} such that the clusters are nonoverlapping (c_i intersected with c_j = empty set, i != j) subsets of the data set (Union_i c_i=X). Hierarchical algorithms produce a series of partitions P = {p1, . . . , pn }. For a complete hierarchy, the number of partitions n’= n, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster c_j is represented by a model m_j , such as the cluster center or a Gaussian distribution. The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a

  7. Using hierarchical clustering of secreted protein families to classify and rank candidate effectors of rust fungi.

    Directory of Open Access Journals (Sweden)

    Diane G O Saunders

    Full Text Available Rust fungi are obligate biotrophic pathogens that cause considerable damage on crop plants. Puccinia graminis f. sp. tritici, the causal agent of wheat stem rust, and Melampsora larici-populina, the poplar leaf rust pathogen, have strong deleterious impacts on wheat and poplar wood production, respectively. Filamentous pathogens such as rust fungi secrete molecules called disease effectors that act as modulators of host cell physiology and can suppress or trigger host immunity. Current knowledge on effectors from other filamentous plant pathogens can be exploited for the characterisation of effectors in the genome of recently sequenced rust fungi. We designed a comprehensive in silico analysis pipeline to identify the putative effector repertoire from the genome of two plant pathogenic rust fungi. The pipeline is based on the observation that known effector proteins from filamentous pathogens have at least one of the following properties: (i contain a secretion signal, (ii are encoded by in planta induced genes, (iii have similarity to haustorial proteins, (iv are small and cysteine rich, (v contain a known effector motif or a nuclear localization signal, (vi are encoded by genes with long intergenic regions, (vii contain internal repeats, and (viii do not contain PFAM domains, except those associated with pathogenicity. We used Markov clustering and hierarchical clustering to classify protein families of rust pathogens and rank them according to their likelihood of being effectors. Using this approach, we identified eight families of candidate effectors that we consider of high value for functional characterization. This study revealed a diverse set of candidate effectors, including families of haustorial expressed secreted proteins and small cysteine-rich proteins. This comprehensive classification of candidate effectors from these devastating rust pathogens is an initial step towards probing plant germplasm for novel resistance components.

  8. Evolution of galaxy cluster scaling and structural properties from XMM observations: probing the physics of structure formation

    International Nuclear Information System (INIS)

    Anokhin, Sergey

    2008-01-01

    Clusters of galaxies are the largest gravitationally bound objects in the Universe. It is possible to study the hierarchical structure formation based on these youngest objects in the Universe. In order to complete the results found with hot clusters, we choose the cold distant galaxy clusters selected from The Southern SHARC catalogue. In the same time, we studied archived galaxy clusters to test the theory and treatment analysis. To study these weak cluster of galaxies, we optimized our treatment analysis: in particular, searching for the best background subtraction and modeling it for our surface brightness profile and spectra. Our results are in a good agreement with Scaling Relation obtained from hot galaxy clusters. (author) [fr

  9. Intraclass Correlation Coefficients in Hierarchical Designs: Evaluation Using Latent Variable Modeling

    Science.gov (United States)

    Raykov, Tenko

    2011-01-01

    Interval estimation of intraclass correlation coefficients in hierarchical designs is discussed within a latent variable modeling framework. A method accomplishing this aim is outlined, which is applicable in two-level studies where participants (or generally lower-order units) are clustered within higher-order units. The procedure can also be…

  10. A method for identifying hierarchical sub-networks / modules and weighting network links based on their similarity in sub-network / module affiliation

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2016-06-01

    Full Text Available Some networks, including biological networks, consist of hierarchical sub-networks / modules. Based on my previous study, in present study a method for both identifying hierarchical sub-networks / modules and weighting network links is proposed. It is based on the cluster analysis in which between-node similarity in sets of adjacency nodes is used. Two matrices, linkWeightMat and linkClusterIDs, are achieved by using the algorithm. Two links with both the same weight in linkWeightMat and the same cluster ID in linkClusterIDs belong to the same sub-network / module. Two links with the same weight in linkWeightMat but different cluster IDs in linkClusterIDs belong to two sub-networks / modules at the same hirarchical level. However, a link with an unique cluster ID in linkClusterIDs does not belong to any sub-networks / modules. A sub-network / module of the greater weight is the more connected sub-network / modules. Matlab codes of the algorithm are presented.

  11. Clustering for data mining a data recovery approach

    CERN Document Server

    Mirkin, Boris

    2005-01-01

    Often considered more as an art than a science, the field of clustering has been dominated by learning through examples and by techniques chosen almost through trial-and-error. Even the most popular clustering methods--K-Means for partitioning the data set and Ward's method for hierarchical clustering--have lacked the theoretical attention that would establish a firm relationship between the two methods and relevant interpretation aids.Rather than the traditional set of ad hoc techniques, Clustering for Data Mining: A Data Recovery Approach presents a theory that not only closes gaps in K-Mean

  12. Clustering cliques for graph-based summarization of the biomedical research literature

    DEFF Research Database (Denmark)

    Zhang, Han; Fiszman, Marcelo; Shin, Dongwook

    2013-01-01

    Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts).Results: Sem......Rep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm...

  13. Worldwide clustering of the corruption perception

    Science.gov (United States)

    Paulus, Michal; Kristoufek, Ladislav

    2015-06-01

    We inspect a possible clustering structure of the corruption perception among 134 countries. Using the average linkage clustering, we uncover a well-defined hierarchy in the relationships among countries. Four main clusters are identified and they suggest that countries worldwide can be quite well separated according to their perception of corruption. Moreover, we find a strong connection between corruption levels and a stage of development inside the clusters. The ranking of countries according to their corruption perfectly copies the ranking according to the economic performance measured by the gross domestic product per capita of the member states. To the best of our knowledge, this study is the first one to present an application of hierarchical and clustering methods to the specific case of corruption.

  14. Avoiding Boundary Estimates in Hierarchical Linear Models through Weakly Informative Priors

    Science.gov (United States)

    Chung, Yeojin; Rabe-Hesketh, Sophia; Gelman, Andrew; Dorie, Vincent; Liu, Jinchen

    2012-01-01

    Hierarchical or multilevel linear models are widely used for longitudinal or cross-sectional data on students nested in classes and schools, and are particularly important for estimating treatment effects in cluster-randomized trials, multi-site trials, and meta-analyses. The models can allow for variation in treatment effects, as well as…

  15. Forecasting building energy consumption with hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)

    2010-11-15

    There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are presented and compared with regular method of NNs, which indicates that GA-HANFIS model possesses better performance than NNs in terms of their forecasting accuracy. (author)

  16. Exploitation of Clustering Techniques in Transactional Healthcare Data

    Directory of Open Access Journals (Sweden)

    Naeem Ahmad Mahoto

    2014-03-01

    Full Text Available Healthcare service centres equipped with electronic health systems have improved their resources as well as treatment processes. The dynamic nature of healthcare data of each individual makes it complex and difficult for physicians to manually mediate them; therefore, automatic techniques are essential to manage the quality and standardization of treatment procedures. Exploratory data analysis, patternanalysis and grouping of data is managed using clustering techniques, which work as an unsupervised classification. A number of healthcare applications are developed that use several data mining techniques for classification, clustering and extracting useful information from healthcare data. The challenging issue in this domain is to select adequate data mining algorithm for optimal results. This paper exploits three different clustering algorithms: DBSCAN (Density-Based Clustering, agglomerative hierarchical and k-means in real transactional healthcare data of diabetic patients (taken as case study to analyse their performance in large and dispersed healthcare data. The best solution of cluster sets among the exploited algorithms is evaluated using clustering quality indexes and is selected to identify the possible subgroups of patients having similar treatment patterns

  17. Conceptual hierarchical modeling to describe wetland plant community organization

    Science.gov (United States)

    Little, A.M.; Guntenspergen, G.R.; Allen, T.F.H.

    2010-01-01

    Using multivariate analysis, we created a hierarchical modeling process that describes how differently-scaled environmental factors interact to affect wetland-scale plant community organization in a system of small, isolated wetlands on Mount Desert Island, Maine. We followed the procedure: 1) delineate wetland groups using cluster analysis, 2) identify differently scaled environmental gradients using non-metric multidimensional scaling, 3) order gradient hierarchical levels according to spatiotem-poral scale of fluctuation, and 4) assemble hierarchical model using group relationships with ordination axes and post-hoc tests of environmental differences. Using this process, we determined 1) large wetland size and poor surface water chemistry led to the development of shrub fen wetland vegetation, 2) Sphagnum and water chemistry differences affected fen vs. marsh / sedge meadows status within small wetlands, and 3) small-scale hydrologic differences explained transitions between forested vs. non-forested and marsh vs. sedge meadow vegetation. This hierarchical modeling process can help explain how upper level contextual processes constrain biotic community response to lower-level environmental changes. It creates models with more nuanced spatiotemporal complexity than classification and regression tree procedures. Using this process, wetland scientists will be able to generate more generalizable theories of plant community organization, and useful management models. ?? Society of Wetland Scientists 2009.

  18. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    International Nuclear Information System (INIS)

    Ellefsen, Karl J.; Smith, David B.

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.

  19. Hierarchical architecture of active knits

    International Nuclear Information System (INIS)

    Abel, Julianna; Luntz, Jonathan; Brei, Diann

    2013-01-01

    Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture—active knits—provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions are displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm. (paper)

  20. Self-similar hierarchical energetics in the ICM of massive galaxy clusters

    Science.gov (United States)

    Miniati, Francesco; Beresnyak, Andrey

    Massive galaxy clusters (GC) are filled with a hot, turbulent and magnetised intra-cluster medium (ICM). They are still forming under the action of gravitational instability, which drives supersonic mass accretion flows. These partially dissipate into heat through a complex network of large scale shocks, and partly excite giant turbulent eddies and cascade. Turbulence dissipation not only contributes to heating of the ICM but also amplifies magnetic energy by way of dynamo action. The pattern of gravitational energy turning into kinetic, thermal, turbulent and magnetic is a fundamental feature of GC hydrodynamics but quantitative modelling has remained a challenge. In this contribution we present results from a recent high resolution, fully cosmological numerical simulation of a massive Coma-like galaxy cluster in which the time dependent turbulent motions of the ICM are resolved (Miniati 2014) and their statistical properties are quantified for the first time (Miniati 2015, Beresnyak & Miniati 2015). We combine these results with independent state-of-the art numerical simulations of MHD turbulence (Beresnyak 2012), which shows that in the nonlinear regime of turbulent dynamo (for magnetic Prandtl numbers>~ 1) the growth rate of the magnetic energy corresponds to a fraction CE ~= 4 - 5 × 10-2 of the turbulent dissipation rate. We thus determine without adjustable parameters the thermal, turbulent and magnetic history of giant GC (Miniati & Beresnyak 2015). We find that the energy components of the ICM are ordered according to a permanent hierarchy, in which the sonic Mach number at the turbulent injection scale is of order unity, the beta of the plasma of order forty and the ratio of turbulent injection scale to Alfvén scale is of order one hundred. These dimensionless numbers remain virtually unaltered throughout the cluster's history, despite evolution of each individual component and the drive towards equipartition of the turbulent dynamo, thus revealing a new

  1. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    Apple

    Keywords: algebraic thinking; cluster analysis; mathematics education; quantitative analysis. Introduction. Extensive ..... C1, C2 and C3 represent the three centroids of the three clusters formed. .... 6ALd. All these strategies are algebraic and 'high- ... 1995), of the didactical aspects related to teaching .... Brazil, 18-23 July.

  2. A rapid ATR-FTIR spectroscopic method for detection of sibutramine adulteration in tea and coffee based on hierarchical cluster and principal component analyses.

    Science.gov (United States)

    Cebi, Nur; Yilmaz, Mustafa Tahsin; Sagdic, Osman

    2017-08-15

    Sibutramine may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. Considering public health and legal regulations, there is an urgent need for effective, rapid and reliable techniques to detect sibutramine in dietetic herbal foods, teas and dietary supplements. This research comprehensively explored, for the first time, detection of sibutramine in green tea, green coffee and mixed herbal tea using ATR-FTIR spectroscopic technique combined with chemometrics. Hierarchical cluster analysis and PCA principle component analysis techniques were employed in spectral range (2746-2656cm -1 ) for classification and discrimination through Euclidian distance and Ward's algorithm. Unadulterated and adulterated samples were classified and discriminated with respect to their sibutramine contents with perfect accuracy without any false prediction. The results suggest that existence of the active substance could be successfully determined at the levels in the range of 0.375-12mg in totally 1.75g of green tea, green coffee and mixed herbal tea by using FTIR-ATR technique combined with chemometrics. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Robustness of Multiple Clustering Algorithms on Hyperspectral Images

    National Research Council Canada - National Science Library

    Williams, Jason P

    2007-01-01

    .... Various clustering algorithms were employed, including a hierarchical method, ISODATA, K-means, and X-means, and were used on a simple two dimensional dataset in order to discover potential problems with the algorithms...

  4. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Full Text Available Abstract Background Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  5. Spatial-area selective retrieval of multiple object-place associations in a hierarchical cognitive map formed by theta phase coding.

    Science.gov (United States)

    Sato, Naoyuki; Yamaguchi, Yoko

    2009-06-01

    The human cognitive map is known to be hierarchically organized consisting of a set of perceptually clustered landmarks. Patient studies have demonstrated that these cognitive maps are maintained by the hippocampus, while the neural dynamics are still poorly understood. The authors have shown that the neural dynamic "theta phase precession" observed in the rodent hippocampus may be capable of forming hierarchical cognitive maps in humans. In the model, a visual input sequence consisting of object and scene features in the central and peripheral visual fields, respectively, results in the formation of a hierarchical cognitive map for object-place associations. Surprisingly, it is possible for such a complex memory structure to be formed in a few seconds. In this paper, we evaluate the memory retrieval of object-place associations in the hierarchical network formed by theta phase precession. The results show that multiple object-place associations can be retrieved with the initial cue of a scene input. Importantly, according to the wide-to-narrow unidirectional connections among scene units, the spatial area for object-place retrieval can be controlled by the spatial area of the initial cue input. These results indicate that the hierarchical cognitive maps have computational advantages on a spatial-area selective retrieval of multiple object-place associations. Theta phase precession dynamics is suggested as a fundamental neural mechanism of the human cognitive map.

  6. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    International Nuclear Information System (INIS)

    Harmon, S; Wendelberger, B; Jeraj, R

    2014-01-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [ 18 F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principle component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-ofinterest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering using variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (JI mean = 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement to the full-gene set (JI =1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as number of clusters increased (JI=0.9231 and 0.5769 for K=2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range : 0.2301–1). Conclusion: Using commonly-used clustering algorithms, we found

  7. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Harmon, S; Wendelberger, B [University of Wisconsin-Madison, Madison, WI (United States); Jeraj, R [University of Wisconsin-Madison, Madison, WI (United States); University of Ljubljana (Slovenia)

    2014-06-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [{sup 18}F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principle component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-ofinterest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering using variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (JI{sub mean}= 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement to the full-gene set (JI =1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as number of clusters increased (JI=0.9231 and 0.5769 for K=2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI{sub range}: 0.2301–1). Conclusion: Using commonly-used clustering algorithms

  8. Self Organizing Maps to efficiently cluster and functionally interpret protein conformational ensembles

    Directory of Open Access Journals (Sweden)

    Fabio Stella

    2013-09-01

    Full Text Available An approach that combines Self-Organizing maps, hierarchical clustering and network components is presented, aimed at comparing protein conformational ensembles obtained from multiple Molecular Dynamic simulations. As a first result the original ensembles can be summarized by using only the representative conformations of the clusters obtained. In addition the network components analysis allows to discover and interpret the dynamic behavior of the conformations won by each neuron. The results showed the ability of this approach to efficiently derive a functional interpretation of the protein dynamics described by the original conformational ensemble, highlighting its potential as a support for protein engineering.

  9. Hierarchical modularization of biochemical pathways using fuzzy-c means clustering.

    Science.gov (United States)

    de Luis Balaguer, Maria A; Williams, Cranos M

    2014-08-01

    Biological systems that are representative of regulatory, metabolic, or signaling pathways can be highly complex. Mathematical models that describe such systems inherit this complexity. As a result, these models can often fail to provide a path toward the intuitive comprehension of these systems. More coarse information that allows a perceptive insight of the system is sometimes needed in combination with the model to understand control hierarchies or lower level functional relationships. In this paper, we present a method to identify relationships between components of dynamic models of biochemical pathways that reside in different functional groups. We find primary relationships and secondary relationships. The secondary relationships reveal connections that are present in the system, which current techniques that only identify primary relationships are unable to show. We also identify how relationships between components dynamically change over time. This results in a method that provides the hierarchy of the relationships among components, which can help us to understand the low level functional structure of the system and to elucidate potential hierarchical control. As a proof of concept, we apply the algorithm to the epidermal growth factor signal transduction pathway, and to the C3 photosynthesis pathway. We identify primary relationships among components that are in agreement with previous computational decomposition studies, and identify secondary relationships that uncover connections among components that current computational approaches were unable to reveal.

  10. Hierarchical clustering of Alzheimer and 'normal' brains using elemental concentrations and glucose metabolism determined by PIXE, INAA and PET

    International Nuclear Information System (INIS)

    Cutts, D.A.; Spyrou, N.M.

    2001-01-01

    Brain tissue samples, obtained from the Alzheimer Disease Brain Bank, Institute of Psychiatry, London, were taken from both left and right hemispheres of three regions of the cerebrum, namely the frontal, parietal and occipital lobes for both Alzheimer and 'normal' subjects. Trace element concentrations in the frontal lobe were determined for twenty six Alzheimer (15 male, 11 female) and twenty six 'normal' (8 male, 18 female) brain tissue samples. In the parietal lobe ten Alzheimer (2 male, 8 female) and ten 'normal' (8 male, 2 female) samples were taken along with ten Alzheimer (4 male, 6 female) and ten 'normal' (6 male, 4 female) from the occipital lobe. For the frontal lobe trace element concentrations were determined using proton induced X-ray emission (PIXE) analysis while in parietal and occipital regions instrumental neutron activation analysis (INAA) was used. Additionally eighteen Alzheimer (9 male, 9 female) and eighteen age matched 'normal' (8 male, 10 female) living subjects were examined using positron emission tomography (PET) in order to determine regional cerebral metabolic rates of glucose (rCMRGlu). The rCMRGlu of 36 regions of the brain was investigated including frontal, occipital and parietal lobes as in the trace element study. Hierarchical cluster analysis was applied to the trace element and glucose metabolism data to discover which variables in the resulting dendrograms displayed the most significant separation between Alzheimer and 'normal' subjects. (author)

  11. Hierarchical temporal structure in music, speech and animal vocalizations: jazz is like a conversation, humpbacks sing like hermit thrushes.

    Science.gov (United States)

    Kello, Christopher T; Bella, Simone Dalla; Médé, Butovens; Balasubramaniam, Ramesh

    2017-10-01

    Humans talk, sing and play music. Some species of birds and whales sing long and complex songs. All these behaviours and sounds exhibit hierarchical structure-syllables and notes are positioned within words and musical phrases, words and motives in sentences and musical phrases, and so on. We developed a new method to measure and compare hierarchical temporal structures in speech, song and music. The method identifies temporal events as peaks in the sound amplitude envelope, and quantifies event clustering across a range of timescales using Allan factor (AF) variance. AF variances were analysed and compared for over 200 different recordings from more than 16 different categories of signals, including recordings of speech in different contexts and languages, musical compositions and performances from different genres. Non-human vocalizations from two bird species and two types of marine mammals were also analysed for comparison. The resulting patterns of AF variance across timescales were distinct to each of four natural categories of complex sound: speech, popular music, classical music and complex animal vocalizations. Comparisons within and across categories indicated that nested clustering in longer timescales was more prominent when prosodic variation was greater, and when sounds came from interactions among individuals, including interactions between speakers, musicians, and even killer whales. Nested clustering also was more prominent for music compared with speech, and reflected beat structure for popular music and self-similarity across timescales for classical music. In summary, hierarchical temporal structures reflect the behavioural and social processes underlying complex vocalizations and musical performances. © 2017 The Author(s).

  12. A Hybrid Fuzzy Multi-hop Unequal Clustering Algorithm for Dense Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Shawkat K. Guirguis

    2017-01-01

    Full Text Available Clustering is carried out to explore and solve power dissipation problem in wireless sensor network (WSN. Hierarchical network architecture, based on clustering, can reduce energy consumption, balance traffic load, improve scalability, and prolong network lifetime. However, clustering faces two main challenges: hotspot problem and searching for effective techniques to perform clustering. This paper introduces a fuzzy unequal clustering technique for heterogeneous dense WSNs to determine both final cluster heads and their radii. Proposed fuzzy system blends three effective parameters together which are: the distance to the base station, the density of the cluster, and the deviation of the noders residual energy from the average network energy. Our objectives are achieving gain for network lifetime, energy distribution, and energy consumption. To evaluate the proposed algorithm, WSN clustering based routing algorithms are analyzed, simulated, and compared with obtained results. These protocols are LEACH, SEP, HEED, EEUC, and MOFCA.

  13. Identifying Hierarchical and Overlapping Protein Complexes Based on Essential Protein-Protein Interactions and “Seed-Expanding” Method

    Directory of Open Access Journals (Sweden)

    Jun Ren

    2014-01-01

    Full Text Available Many evidences have demonstrated that protein complexes are overlapping and hierarchically organized in PPI networks. Meanwhile, the large size of PPI network wants complex detection methods have low time complexity. Up to now, few methods can identify overlapping and hierarchical protein complexes in a PPI network quickly. In this paper, a novel method, called MCSE, is proposed based on λ-module and “seed-expanding.” First, it chooses seeds as essential PPIs or edges with high edge clustering values. Then, it identifies protein complexes by expanding each seed to a λ-module. MCSE is suitable for large PPI networks because of its low time complexity. MCSE can identify overlapping protein complexes naturally because a protein can be visited by different seeds. MCSE uses the parameter λ_th to control the range of seed expanding and can detect a hierarchical organization of protein complexes by tuning the value of λ_th. Experimental results of S. cerevisiae show that this hierarchical organization is similar to that of known complexes in MIPS database. The experimental results also show that MCSE outperforms other previous competing algorithms, such as CPM, CMC, Core-Attachment, Dpclus, HC-PIN, MCL, and NFC, in terms of the functional enrichment and matching with known protein complexes.

  14. A new clustering algorithm for scanning electron microscope images

    Science.gov (United States)

    Yousef, Amr; Duraisamy, Prakash; Karim, Mohammad

    2016-04-01

    A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may be produced with insufficient brightness, anomalous contrast, jagged edges, and poor quality due to low signal-to-noise ratio, grained topography and poor surface details. The segmentation of the SEM images is a tackling problems in the presence of the previously mentioned distortions. In this paper, we are stressing on the clustering of these type of images. In that sense, we evaluate the performance of the well-known unsupervised clustering and classification techniques such as connectivity based clustering (hierarchical clustering), centroid-based clustering, distribution-based clustering and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of images and compare its results against these regular techniques in terms of clustering validation metrics.

  15. Clustering and Bayesian hierarchical modeling for the definition of informative prior distributions in hydrogeology

    Science.gov (United States)

    Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.

    2017-12-01

    In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.

  16. Clusters of Galaxies

    Science.gov (United States)

    Huchtmeier, W. K.; Richter, O. G.; Materne, J.

    1981-09-01

    The large-scale structure of the universe is dominated by clustering. Most galaxies seem to be members of pairs, groups, clusters, and superclusters. To that degree we are able to recognize a hierarchical structure of the universe. Our local group of galaxies (LG) is centred on two large spiral galaxies: the Andromeda nebula and our own galaxy. Three sr:naller galaxies - like M 33 - and at least 23 dwarf galaxies (KraanKorteweg and Tammann, 1979, Astronomische Nachrichten, 300, 181) can be found in the evironment of these two large galaxies. Neighbouring groups have comparable sizes (about 1 Mpc in extent) and comparable numbers of bright members. Small dwarf galaxies cannot at present be observed at great distances.

  17. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    Science.gov (United States)

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weakness and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.

  18. Calculations of light scattering matrices for stochastic ensembles of nanosphere clusters

    International Nuclear Information System (INIS)

    Bunkin, N.F.; Shkirin, A.V.; Suyazov, N.V.; Starosvetskiy, A.V.

    2013-01-01

    Results of the calculation of the light scattering matrices for systems of stochastic nanosphere clusters are presented. A mathematical model of spherical particle clustering with allowance for cluster–cluster aggregation is used. The fractal properties of cluster structures are explored at different values of the model parameter that governs cluster–cluster interaction. General properties of the light scattering matrices of nanosphere-cluster ensembles as dependent on their mean fractal dimension have been found. The scattering-matrix calculations were performed for finite samples of 10 3 random clusters, made up of polydisperse spherical nanoparticles, having lognormal size distribution with the effective radius 50 nm and effective variance 0.02; the mean number of monomers in a cluster and its standard deviation were set to 500 and 70, respectively. The implemented computation environment, modeling the scattering matrices for overall sequences of clusters, is based upon T-matrix program code for a given single cluster of spheres, which was developed in [1]. The ensemble-averaged results have been compared with orientation-averaged ones calculated for individual clusters. -- Highlights: ► We suggested a hierarchical model of cluster growth allowing for cluster–cluster aggregation. ► We analyzed the light scattering by whole ensembles of nanosphere clusters. ► We studied the evolution of the light scattering matrix when changing the fractal dimension

  19. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius

    2017-07-03

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  20. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius; Huser, Raphaë l; Prasad, Avinash

    2017-01-01

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  1. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  2. A semantics-based method for clustering of Chinese web search results

    Science.gov (United States)

    Zhang, Hui; Wang, Deqing; Wang, Li; Bi, Zhuming; Chen, Yong

    2014-01-01

    Information explosion is a critical challenge to the development of modern information systems. In particular, when the application of an information system is over the Internet, the amount of information over the web has been increasing exponentially and rapidly. Search engines, such as Google and Baidu, are essential tools for people to find the information from the Internet. Valuable information, however, is still likely submerged in the ocean of search results from those tools. By clustering the results into different groups based on subjects automatically, a search engine with the clustering feature allows users to select most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use the HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate snippets' semantic similarities. Finally, we improve the Chameleon algorithm to cluster snippets. Extensive experimental results have shown that the proposed algorithm has outperformed over the suffix tree clustering method and other traditional clustering methods.

  3. Hierarchical cluster analysis and chemical characterisation of Myrtus communis L. essential oil from Yemen region and its antimicrobial, antioxidant and anti-colorectal adenocarcinoma properties.

    Science.gov (United States)

    Anwar, Sirajudheen; Crouch, Rebecca A; Awadh Ali, Nasser A; Al-Fatimi, Mohamed A; Setzer, William N; Wessjohann, Ludger

    2017-09-01

    The hydrodistilled essential oil obtained from the dried leaves of Myrtus communis, collected in Yemen, was analysed by GC-MS. Forty-one compounds were identified, representing 96.3% of the total oil. The major constituents of essential oil were oxygenated monoterpenoids (87.1%), linalool (29.1%), 1,8-cineole (18.4%), α-terpineol (10.8%), geraniol (7.3%) and linalyl acetate (7.4%). The essential oil was assessed for its antimicrobial activity using a disc diffusion assay and resulted in moderate to potent antibacterial and antifungal activities targeting mainly Bacillus subtilis, Staphylococcus aureus and Candida albicans. The oil moderately reduced the diphenylpicrylhydrazyl radical (IC 50  = 4.2 μL/mL or 4.1 mg/mL). In vitro cytotoxicity evaluation against HT29 (human colonic adenocarcinoma cells) showed that the essential oil exhibited a moderate antitumor effect with IC 50 of 110 ± 4 μg/mL. Hierarchical cluster analysis of M. communis has been carried out based on the chemical compositions of 99 samples reported in the literature, including Yemeni sample.

  4. A Link-Based Cluster Ensemble Approach For Improved Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    P.Balaji

    2015-01-01

    Full Text Available Abstract It is difficult from possibilities to select a most suitable effective way of clustering algorithm and its dataset for a defined set of gene expression data because we have a huge number of ways and huge number of gene expressions. At present many researchers are preferring to use hierarchical clustering in different forms this is no more totally optimal. Cluster ensemble research can solve this type of problem by automatically merging multiple data partitions from a wide range of different clusterings of any dimensions to improve both the quality and robustness of the clustering result. But we have many existing ensemble approaches using an association matrix to condense sample-cluster and co-occurrence statistics and relations within the ensemble are encapsulated only at raw level while the existing among clusters are totally discriminated. Finding these missing associations can greatly expand the capability of those ensemble methodologies for microarray data clustering. We propose general K-means cluster ensemble approach for the clustering of general categorical data into required number of partitions.

  5. Exploring the individual patterns of spiritual well-being in people newly diagnosed with advanced cancer: a cluster analysis.

    Science.gov (United States)

    Bai, Mei; Dixon, Jane; Williams, Anna-Leila; Jeon, Sangchoon; Lazenby, Mark; McCorkle, Ruth

    2016-11-01

    Research shows that spiritual well-being correlates positively with quality of life (QOL) for people with cancer, whereas contradictory findings are frequently reported with respect to the differentiated associations between dimensions of spiritual well-being, namely peace, meaning and faith, and QOL. This study aimed to examine individual patterns of spiritual well-being among patients newly diagnosed with advanced cancer. Cluster analysis was based on the twelve items of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale at Time 1. A combination of hierarchical and k-means (non-hierarchical) clustering methods was employed to jointly determine the number of clusters. Self-rated health, depressive symptoms, peace, meaning and faith, and overall QOL were compared at Time 1 and Time 2. Hierarchical and k-means clustering methods both suggested four clusters. Comparison of the four clusters supported statistically significant and clinically meaningful differences in QOL outcomes among clusters while revealing contrasting relations of faith with QOL. Cluster 1, Cluster 3, and Cluster 4 represented high, medium, and low levels of overall QOL, respectively, with correspondingly high, medium, and low levels of peace, meaning, and faith. Cluster 2 was distinguished from other clusters by its medium levels of overall QOL, peace, and meaning and low level of faith. This study provides empirical support for individual difference in response to a newly diagnosed cancer and brings into focus conceptual and methodological challenges associated with the measure of spiritual well-being, which may partly contribute to the attenuated relation between faith and QOL.

  6. Hierarchical Cluster Analysis of Three-Dimensional Reconstructions of Unbiased Sampled Microglia Shows not Continuous Morphological Changes from Stage 1 to 2 after Multiple Dengue Infections in Callithrix penicillata

    Science.gov (United States)

    Diniz, Daniel G.; Silva, Geane O.; Naves, Thaís B.; Fernandes, Taiany N.; Araújo, Sanderson C.; Diniz, José A. P.; de Farias, Luis H. S.; Sosthenes, Marcia C. K.; Diniz, Cristovam G.; Anthony, Daniel C.; da Costa Vasconcelos, Pedro F.; Picanço Diniz, Cristovam W.

    2016-01-01

    It is known that microglial morphology and function are related, but few studies have explored the subtleties of microglial morphological changes in response to specific pathogens. In the present report we quantitated microglia morphological changes in a monkey model of dengue disease with virus CNS invasion. To mimic multiple infections that usually occur in endemic areas, where higher dengue infection incidence and abundant mosquito vectors carrying different serotypes coexist, subjects received once a week subcutaneous injections of DENV3 (genotype III)-infected culture supernatant followed 24 h later by an injection of anti-DENV2 antibody. Control animals received either weekly anti-DENV2 antibodies, or no injections. Brain sections were immunolabeled for DENV3 antigens and IBA-1. Random and systematic microglial samples were taken from the polymorphic layer of dentate gyrus for 3-D reconstructions, where we found intense immunostaining for TNFα and DENV3 virus antigens. We submitted all bi- or multimodal morphological parameters of microglia to hierarchical cluster analysis and found two major morphological phenotypes designated types I and II. Compared to type I (stage 1), type II microglia were more complex; displaying higher number of nodes, processes and trees and larger surface area and volumes (stage 2). Type II microglia were found only in infected monkeys, whereas type I microglia was found in both control and infected subjects. Hierarchical cluster analysis of morphological parameters of 3-D reconstructions of random and systematic selected samples in control and ADE dengue infected monkeys suggests that microglia morphological changes from stage 1 to stage 2 may not be continuous. PMID:27047345

  7. Coping profiles, perceived stress and health-related behaviors: a cluster analysis approach.

    Science.gov (United States)

    Doron, Julie; Trouillet, Raphael; Maneveau, Anaïs; Ninot, Grégory; Neveu, Dorine

    2015-03-01

    Using cluster analytical procedure, this study aimed (i) to determine whether people could be differentiated on the basis of coping profiles (or unique combinations of coping strategies); and (ii) to examine the relationships between these profiles and perceived stress and health-related behaviors. A sample of 578 French students (345 females, 233 males; M(age)= 21.78, SD(age)= 2.21) completed the Perceived Stress Scale-14 ( Bruchon-Schweitzer, 2002), the Brief COPE ( Muller and Spitz, 2003) and a series of items measuring health-related behaviors. A two-phased cluster analytic procedure (i.e. hierarchical and non-hierarchical-k-means) was employed to derive clusters of coping strategy profiles. The results yielded four distinctive coping profiles: High Copers, Adaptive Copers, Avoidant Copers and Low Copers. The results showed that clusters differed significantly in perceived stress and health-related behaviors. High Copers and Avoidant Copers displayed higher levels of perceived stress and engaged more in unhealthy behavior, compared with Adaptive Copers and Low Copers who reported lower levels of stress and engaged more in healthy behaviors. These findings suggested that individuals' relative reliance on some strategies and de-emphasis on others may be a more advantageous way of understanding the manner in which individuals cope with stress. Therefore, cluster analysis approach may provide an advantage over more traditional statistical techniques by identifying distinct coping profiles that might best benefit from interventions. Future research should consider coping profiles to provide a deeper understanding of the relationships between coping strategies and health outcomes and to identify risk groups. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  8. Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

    Science.gov (United States)

    Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

    2012-07-01

    Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.

  9. Bulgarian clusters under development: Political framework and results

    Directory of Open Access Journals (Sweden)

    Bankova Yovka

    2011-01-01

    Full Text Available The idea of clusters is not new but nowadays clusters are in a highlight again. Through cluster policies the countries aim at raising their national competitiveness. The paper deals with two objectives - discussion and evaluation of the strategic framework for clusters in Bulgaria and an analysis of the state of Bulgarian clusters. The paper presents briefly general issues concerning the national competitiveness and clusters as being one of the possible instruments to achieve a sustainable competitiveness. The practice of the policy in the EU in the field of clusters is the basis for conclusions about the role of the governments. The second part deals with the strategic framework for the cluster initiatives in Bulgaria and with a selection of indicators about the SMEs and clusters in the country. On this basis a conclusion about the development stage of Bulgarian clusters is derived.

  10. CATCHprofiles: Clustering and Alignment Tool for ChIP Profiles

    DEFF Research Database (Denmark)

    G. G. Nielsen, Fiona; Galschiøt Markus, Kasper; Møllegaard Friborg, Rune

    2012-01-01

    IP-profiling data and detect potentially meaningful patterns, the areas of enrichment must be aligned and clustered, which is an algorithmically and computationally challenging task. We have developed CATCHprofiles, a novel tool for exhaustive pattern detection in ChIP profiling data. CATCHprofiles is built upon...... a computationally efficient implementation for the exhaustive alignment and hierarchical clustering of ChIP profiling data. The tool features a graphical interface for examination and browsing of the clustering results. CATCHprofiles requires no prior knowledge about functional sites, detects known binding patterns...... it an invaluable tool for explorative research based on ChIP profiling data. CATCHprofiles and the CATCH algorithm run on all platforms and is available for free through the CATCH website: http://catch.cmbi.ru.nl/. User support is available by subscribing to the mailing list catch-users@bioinformatics.org....

  11. Coronal Mass Ejection Data Clustering and Visualization of Decision Trees

    Science.gov (United States)

    Ma, Ruizhe; Angryk, Rafal A.; Riley, Pete; Filali Boubrahimi, Soukaina

    2018-05-01

    Coronal mass ejections (CMEs) can be categorized as either “magnetic clouds” (MCs) or non-MCs. Features such as a large magnetic field, low plasma-beta, and low proton temperature suggest that a CME event is also an MC event; however, so far there is neither a definitive method nor an automatic process to distinguish the two. Human labeling is time-consuming, and results can fluctuate owing to the imprecise definition of such events. In this study, we approach the problem of MC and non-MC distinction from a time series data analysis perspective and show how clustering can shed some light on this problem. Although many algorithms exist for traditional data clustering in the Euclidean space, they are not well suited for time series data. Problems such as inadequate distance measure, inaccurate cluster center description, and lack of intuitive cluster representations need to be addressed for effective time series clustering. Our data analysis in this work is twofold: clustering and visualization. For clustering we compared the results from the popular hierarchical agglomerative clustering technique to a distance density clustering heuristic we developed previously for time series data clustering. In both cases, dynamic time warping will be used for similarity measure. For classification as well as visualization, we use decision trees to aggregate single-dimensional clustering results to form a multidimensional time series decision tree, with averaged time series to present each decision. In this study, we achieved modest accuracy and, more importantly, an intuitive interpretation of how different parameters contribute to an MC event.

  12. Formal And Informal Macro-Regional Transport Clusters As A Primary Step In The Design And Implementation Of Cluster-Based Strategies

    Directory of Open Access Journals (Sweden)

    Nežerenko Olga

    2015-09-01

    Full Text Available The aim of the study is the identification of a formal macro-regional transport and logistics cluster and its development trends on a macro-regional level in 2007-2011 by means of the hierarchical cluster analysis. The central approach of the study is based on two concepts: 1 the concept of formal and informal macro-regions, and 2 the concept of clustering which is based on the similarities shared by the countries of a macro-region and tightly related to the concept of macro-region. The authors seek to answer the question whether the formation of a formal transport cluster could provide the BSR a stable competitive position in the global transportation and logistics market.

  13. Evolution of the cluster X-ray luminosity function

    DEFF Research Database (Denmark)

    Mullis, C.R.; Vikhlinin, A.; Henry, J.P.

    2004-01-01

    We report measurements of the cluster X-ray luminosity function out to z = 0.8 based on the final sample of 201 galaxy systems from the 160 Square Degree ROSAT Cluster Survey. There is little evidence for any measurable change in cluster abundance out to z similar to 0.6 at luminosities of less...... than a few times 10(44) h(50)(-2) ergs s(-1) (0.5 - 2.0 keV). However, for 0.6 cluster deficit using integrated number counts...... independently confirm the presence of evolution. Whereas the bulk of the cluster population does not evolve, the most luminous and presumably most massive structures evolve appreciably between z = 0.8 and the present. Interpreted in the context of hierarchical structure formation, we are probing sufficiently...

  14. Methodology сomparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    Full Text Available The article is devoted to researching of the possibilities of applying multidimensional statistical analysis in the study of industrial production on the basis of comparing its growth rates and structure with other developed and developing countries of the world. The purpose of this article is to determine the optimal set of statistical methods and the results of their application to industrial production data, which would give the best access to the analysis of the result.Data includes such indicators as output, output, gross value added, the number of employed and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industry of the countrys of the Customs Union, the United States, Japan and Erope in 2005-2015. As the research tool used as the simplest methods of transformation, graphical and tabular visualization of data, and methods of statistical analysis. In particular, based on a specialized software package (SPSS, the main components method, discriminant analysis, hierarchical methods of cluster analysis, Ward’s method and k-means were applied.The application of the method of principal components to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. Thus, for example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing, high-tech industries and consumer goods (medium-technology sectors. At the same time, as a result of comparison of the results of application of cluster analysis to the initial data and data obtained on the basis of the principal components method, it was established that clustering industrial production data on the basis of new factors significantly improves the results of clustering.As a result of analyzing the parameters of

  15. HIERARCHICAL FRAGMENTATION AND JET-LIKE OUTFLOWS IN IRDC G28.34+0.06: A GROWING MASSIVE PROTOSTAR CLUSTER

    International Nuclear Information System (INIS)

    Wang Ke; Wu Yuefang; Zhang Huawei; Zhang Qizhou

    2011-01-01

    We present Submillimeter Array (SMA) λ = 0.88 mm observations of an infrared dark cloud G28.34+0.06. Located in the quiescent southern part of the G28.34 cloud, the region of interest is a massive (>10 3 M sun ) molecular clump P1 with a luminosity of ∼10 3 L sun , where our previous SMA observations at 1.3 mm have revealed a string of five dust cores of 22-64 M sun along the 1 pc IR-dark filament. The cores are well aligned at a position angle (P.A.) of 48 deg. and regularly spaced at an average projected separation of 0.16 pc. The new high-resolution, high-sensitivity 0.88 mm image further resolves the five cores into 10 compact condensations of 1.4-10.6 M sun , with sizes of a few thousand AU. The spatial structure at clump (∼1 pc) and core (∼0.1 pc) scales indicates a hierarchical fragmentation. While the clump fragmentation is consistent with a cylindrical collapse, the observed fragment masses are much larger than the expected thermal Jeans masses. All the cores are driving CO (3-2) outflows up to 38 km s -1 , the majority of which are bipolar, jet-like outflows. The moderate luminosity of the P1 clump sets a limit on the mass of protostars of 3-7 M sun . Because of the large reservoir of dense molecular gas in the immediate medium and ongoing accretion as evident by the jet-like outflows, we speculate that P1 will grow and eventually form a massive star cluster. This study provides a first glimpse of massive, clustered star formation that currently undergoes through an intermediate-mass stage.

  16. Delineation of Stenotrophomonas maltophilia isolates from cystic fibrosis patients by fatty acid methyl ester profiles and matrix-assisted laser desorption/ionization time-of-flight mass spectra using hierarchical cluster analysis and principal component analysis.

    Science.gov (United States)

    Vidigal, Pedrina Gonçalves; Mosel, Frank; Koehling, Hedda Luise; Mueller, Karl Dieter; Buer, Jan; Rath, Peter Michael; Steinmann, Joerg

    2014-12-01

    Stenotrophomonas maltophilia is an opportunist multidrug-resistant pathogen that causes a wide range of nosocomial infections. Various cystic fibrosis (CF) centres have reported an increasing prevalence of S. maltophilia colonization/infection among patients with this disease. The purpose of this study was to assess specific fingerprints of S. maltophilia isolates from CF patients (n = 71) by investigating fatty acid methyl esters (FAMEs) through gas chromatography (GC) and highly abundant proteins by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), and to compare them with isolates obtained from intensive care unit (ICU) patients (n = 20) and the environment (n = 11). Principal component analysis (PCA) of GC-FAME patterns did not reveal a clustering corresponding to distinct CF, ICU or environmental types. Based on the peak area index, it was observed that S. maltophilia isolates from CF patients produced significantly higher amounts of fatty acids in comparison with ICU patients and the environmental isolates. Hierarchical cluster analysis (HCA) based on the MALDI-TOF MS peak profiles of S. maltophilia revealed the presence of five large clusters, suggesting a high phenotypic diversity. Although HCA of MALDI-TOF mass spectra did not result in distinct clusters predominantly composed of CF isolates, PCA revealed the presence of a distinct cluster composed of S. maltophilia isolates from CF patients. Our data suggest that S. maltophilia colonizing CF patients tend to modify not only their fatty acid patterns but also their protein patterns as a response to adaptation in the unfavourable environment of the CF lung. © 2014 The Authors.

  17. IDENTIFICAÇÃO DE CLUSTERS INTERNACIONAIS COM BASE NAS DIMENSÕES CULTURAIS DE HOFSTEDE. / Identification of international clusters based on the hofstede’s cultural dimensions

    Directory of Open Access Journals (Sweden)

    Valderí de Castro Alcântara1

    2012-08-01

    , K-Means Cluster Analysis and Discriminant Analysis to determine and validate groupings of countries based on Hofstede’s cultural dimensions (Distance Index, Individualism, Masculinity and Uncertainty Avoidance Index. The results led to four clusters: Cluster 1 - countries with masculine culture and individualistic; Cluster 2 - collectivistic and uncertainty averse; Cluster 3 - feminine culture and low hierarchical distance and Cluster 4 - culturewith high hierarchical distance and propensity to uncertainty.

  18. A study of hierarchical structure on South China industrial electricity-consumption correlation

    Science.gov (United States)

    Yao, Can-Zhong; Lin, Ji-Nan; Liu, Xiao-Feng

    2016-02-01

    Based on industrial electricity-consumption data of five southern provinces of China from 2005 to 2013, we study the industrial correlation mechanism with MST (minimal spanning tree) and HT (hierarchical tree) models. First, we comparatively analyze the industrial electricity-consumption correlation structure in pre-crisis and after-crisis period using MST model and Bootstrap technique of statistical reliability test of links. Results exhibit that all industrial electricity-consumption trees of five southern provinces of China in pre-crisis and after-crisis time are in formation of chain, and the "center-periphery structure" of those chain-like trees is consistent with industrial specialization in classical industrial chain theory. Additionally, the industrial structure of some provinces is reorganized and transferred in pre-crisis and after-crisis time. Further, the comparative analysis with hierarchical tree and Bootstrap technique demonstrates that as for both observations of GD and overall NF, the industrial electricity-consumption correlation is non-significant clustered in pre-crisis period, whereas it turns significant clustered in after-crisis time. Therefore we propose that in perspective of electricity-consumption, their industrial structures are directed to optimized organization and global correlation. Finally, the analysis of distance of HTs verifies that industrial reorganization and development may strengthen market integration, coordination and correlation of industrial production. Except GZ, other four provinces have a shorter distance of industrial electricity-consumption correlation in after-crisis period, revealing a better performance of regional specialization and integration.

  19. Synchronous Firefly Algorithm for Cluster Head Selection in WSN

    Directory of Open Access Journals (Sweden)

    Madhusudhanan Baskaran

    2015-01-01

    Full Text Available Wireless Sensor Network (WSN consists of small low-cost, low-power multifunctional nodes interconnected to efficiently aggregate and transmit data to sink. Cluster-based approaches use some nodes as Cluster Heads (CHs and organize WSNs efficiently for aggregation of data and energy saving. A CH conveys information gathered by cluster nodes and aggregates/compresses data before transmitting it to a sink. However, this additional responsibility of the node results in a higher energy drain leading to uneven network degradation. Low Energy Adaptive Clustering Hierarchy (LEACH offsets this by probabilistically rotating cluster heads role among nodes with energy above a set threshold. CH selection in WSN is NP-Hard as optimal data aggregation with efficient energy savings cannot be solved in polynomial time. In this work, a modified firefly heuristic, synchronous firefly algorithm, is proposed to improve the network performance. Extensive simulation shows the proposed technique to perform well compared to LEACH and energy-efficient hierarchical clustering. Simulations show the effectiveness of the proposed method in decreasing the packet loss ratio by an average of 9.63% and improving the energy efficiency of the network when compared to LEACH and EEHC.

  20. tclust: An R Package for a Trimming Approach to Cluster Analysis

    Directory of Open Access Journals (Sweden)

    2012-04-01

    Full Text Available Outlying data can heavily influence standard clustering methods. At the same time, clustering principles can be useful when robustifying statistical procedures. These two reasons motivate the development of feasible robust model-based clustering approaches. With this in mind, an R package for performing non-hierarchical robust clustering, called tclust, is presented here. Instead of trying to “fit” noisy data, a proportion α of the most outlying observations is trimmed. The tclust package efficiently handles different cluster scatter constraints. Graphical exploratory tools are also provided to help the user make sensible choices for the trimming proportion as well as the number of clusters to search for.

  1. Cluster fusion-fission dynamics in the Singapore stock exchange

    Science.gov (United States)

    Teh, Boon Kin; Cheong, Siew Ann

    2015-10-01

    In this paper, we investigate how the cross-correlations between stocks in the Singapore stock exchange (SGX) evolve over 2008 and 2009 within overlapping one-month time windows. In particular, we examine how these cross-correlations change before, during, and after the Sep-Oct 2008 Lehman Brothers Crisis. To do this, we extend the complete-linkage hierarchical clustering algorithm, to obtain robust clusters of stocks with stronger intracluster correlations, and weaker intercluster correlations. After we identify the robust clusters in all time windows, we visualize how these change in the form of a fusion-fission diagram. Such a diagram depicts graphically how the cluster sizes evolve, the exchange of stocks between clusters, as well as how strongly the clusters mix. From the fusion-fission diagram, we see a giant cluster growing and disintegrating in the SGX, up till the Lehman Brothers Crisis in September 2008 and the market crashes of October 2008. After the Lehman Brothers Crisis, clusters in the SGX remain small for few months before giant clusters emerge once again. In the aftermath of the crisis, we also find strong mixing of component stocks between clusters. As a result, the correlation between initially strongly-correlated pairs of stocks decay exponentially with average life time of about a month. These observations impact strongly how portfolios and trading strategies should be formulated.

  2. XCluSim: a visual analytics tool for interactively comparing multiple clustering results of bioinformatics data

    Science.gov (United States)

    2015-01-01

    Background Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious. Results In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician. Conclusions Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results. PMID:26328893

  3. A roadmap of clustering algorithms: finding a match for a biomedical application.

    Science.gov (United States)

    Andreopoulos, Bill; An, Aijun; Wang, Xiaogang; Schroeder, Michael

    2009-05-01

    Clustering is ubiquitously applied in bioinformatics with hierarchical clustering and k-means partitioning being the most popular methods. Numerous improvements of these two clustering methods have been introduced, as well as completely different approaches such as grid-based, density-based and model-based clustering. For improved bioinformatics analysis of data, it is important to match clusterings to the requirements of a biomedical application. In this article, we present a set of desirable clustering features that are used as evaluation criteria for clustering algorithms. We review 40 different clustering algorithms of all approaches and datatypes. We compare algorithms on the basis of desirable clustering features, and outline algorithms' benefits and drawbacks as a basis for matching them to biomedical applications.

  4. Accurate detection of hierarchical communities in complex networks based on nonlinear dynamical evolution

    Science.gov (United States)

    Zhuo, Zhao; Cai, Shi-Min; Tang, Ming; Lai, Ying-Cheng

    2018-04-01

    One of the most challenging problems in network science is to accurately detect communities at distinct hierarchical scales. Most existing methods are based on structural analysis and manipulation, which are NP-hard. We articulate an alternative, dynamical evolution-based approach to the problem. The basic principle is to computationally implement a nonlinear dynamical process on all nodes in the network with a general coupling scheme, creating a networked dynamical system. Under a proper system setting and with an adjustable control parameter, the community structure of the network would "come out" or emerge naturally from the dynamical evolution of the system. As the control parameter is systematically varied, the community hierarchies at different scales can be revealed. As a concrete example of this general principle, we exploit clustered synchronization as a dynamical mechanism through which the hierarchical community structure can be uncovered. In particular, for quite arbitrary choices of the nonlinear nodal dynamics and coupling scheme, decreasing the coupling parameter from the global synchronization regime, in which the dynamical states of all nodes are perfectly synchronized, can lead to a weaker type of synchronization organized as clusters. We demonstrate the existence of optimal choices of the coupling parameter for which the synchronization clusters encode accurate information about the hierarchical community structure of the network. We test and validate our method using a standard class of benchmark modular networks with two distinct hierarchies of communities and a number of empirical networks arising from the real world. Our method is computationally extremely efficient, eliminating completely the NP-hard difficulty associated with previous methods. The basic principle of exploiting dynamical evolution to uncover hidden community organizations at different scales represents a "game-change" type of approach to addressing the problem of community

  5. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) are very successful, but recently suffer the slow updating problem because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest to split some large superfamilies into smaller clusters. © 2016 Elsevier B.V.

  6. Au55, a stable glassy cluster: results of ab initio calculations

    Directory of Open Access Journals (Sweden)

    Dieter Vollath

    2017-10-01

    Full Text Available Structure and properties of small nanoparticles are still under discussion. Moreover, some thermodynamic properties and the structural behavior still remain partially unknown. One of the best investigated nanoparticles is the Au55 cluster, which has been analyzed experimentally and theoretically. However, up to now, the results of these studies are still inconsistent. Consequently, we have carried out the present ab initio study of the Au55 cluster, using up-to-date computational concepts, in order to clarify these issues. Our calculations have confirmed the experimental result that the thermodynamically most stable structure is not crystalline, but it is glassy. The non-crystalline structure of this cluster was validated by comparison of the coordination numbers with those of a crystalline cluster. It was found that, in contrast to bulk materials, glass formation is connected to an energy release that is close to the melting enthalpy of bulk gold. Additionally, the surface energy of this cluster was calculated using two different theoretical approaches resulting in values close to the surface energy for bulk gold. It shall be emphasized that it is now possible to give a confidence interval for the value of the surface energy.

  7. Clustering Suicide Attempters: Impulsive-Ambivalent, Well-Planned, or Frequent.

    Science.gov (United States)

    Lopez-Castroman, Jorge; Nogue, Erika; Guillaume, Sebastien; Picot, Marie Christine; Courtet, Philippe

    2016-06-01

    Attempts to predict suicidal behavior within high-risk populations have so far shown insufficient accuracy. Although several psychosocial and clinical features have been consistently associated with suicide attempts, investigations of latent structure in well-characterized populations of suicide attempters are lacking. We analyzed a sample of 1,009 hospitalized suicide attempters that were recruited between 1999 and 2012. Eleven clinically relevant items related to the characteristics of suicidal behavior were submitted to a Hierarchical Ascendant Classification. Phenotypic profiles were compared between the resulting clusters. A decisional tree was constructed to facilitate the differentiation of individuals classified within the first 2 clusters. Most individuals were included in a cluster characterized by less lethal means and planning ("impulse-ambivalent"). A second cluster featured more carefully planned attempts ("well-planned"), more alcohol or drug use before the attempt, and more precautions to avoid interruptions. Finally, a small, third cluster included individuals reporting more attempts ("frequent"), more often serious or violent attempts, and an earlier age at first attempt. Differences across clusters by demographic and clinical characteristics were also found, particularly with the third cluster whose participants had experienced high levels of childhood abuse. Cluster analysis consistently supported 3 distinct clusters of individuals with specific features in their suicidal behaviors and phenotypic profiles that could help clinicians to better focus prevention strategies. © Copyright 2016 Physicians Postgraduate Press, Inc.

  8. Clusters and Groups of Galaxies : International Meeting

    CERN Document Server

    Giuricin, G; Mezzetti, M

    1984-01-01

    The large-scale structure of the Universe and systems Clusters, and Groups of galaxies are topics like Superclusters, They fully justify the meeting on "Clusters of great interest. and Groups of Galaxies". The topics covered included the spatial distribution and the clustering of galaxies; the properties of Superclusters, Clusters and Groups of galaxies; radio and X-ray observations; the problem of unseen matter; theories concerning hierarchical clustering, pancakes, cluster and galaxy formation and evolution. The meeting was held at the International Center for Theoretical Physics in Trieste (Italy) from September 13 to September 16, 1983. It was attended by about 150 participants from 22 nations who presented 67 invited lectures (il) and contributed papers (cp), and 45 poster papers (pp). The Scientific Organizing Committee consisted of F. Bertola, P. Biermann, A. Cavaliere, N. Dallaporta, D. Gerba1, M. Hack, J . V . Peach, D. Sciama (Chairman), G. Setti, M. Tarenghi. We are particularly indebted to D. Scia...

  9. Stochastic clustering of material surface under high-heat plasma load

    Science.gov (United States)

    Budaev, Viacheslav P.

    2017-11-01

    The results of a study of a surface formed by high-temperature plasma loads on various materials such as tungsten, carbon and stainless steel are presented. High-temperature plasma irradiation leads to an inhomogeneous stochastic clustering of the surface with self-similar granularity - fractality on the scale from nanoscale to macroscales. Cauliflower-like structure of tungsten and carbon materials are formed under high heat plasma load in fusion devices. The statistical characteristics of hierarchical granularity and scale invariance are estimated. They differ qualitatively from the roughness of the ordinary Brownian surface, which is possibly due to the universal mechanisms of stochastic clustering of material surface under the influence of high-temperature plasma.

  10. Intracluster age gradients in numerous young stellar clusters

    Science.gov (United States)

    Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.

    2018-05-01

    The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc-1. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - has a strong foundation with other observational studies and with the astrophysical models like the global hierarchical collapse model of Vázquez-Semadeni et al.

  11. Simultaneous determination of 19 flavonoids in commercial trollflowers by using high-performance liquid chromatography and classification of samples by hierarchical clustering analysis.

    Science.gov (United States)

    Song, Zhiling; Hashi, Yuki; Sun, Hongyang; Liang, Yi; Lan, Yuexiang; Wang, Hong; Chen, Shizhong

    2013-12-01

    The flowers of Trollius species, named Jin Lianhua in Chinese, are widely used traditional Chinese herbs with vital biological activity that has been used for several decades in China to treat upper respiratory infections, pharyngitis, tonsillitis, and bronchitis. We developed a rapid and reliable method for simultaneous quantitative analysis of 19 flavonoids in trollflowers by using high-performance liquid chromatography (HPLC). Chromatography was performed on Inertsil ODS-3 C18 column, with gradient elution methanol-acetonitrile-water with 0.02% (v/v) formic acid. Content determination was used to evaluate the quality of commercial trollflowers from different regions in China, while three Trollius species (Trollius chinensis Bunge, Trollius ledebouri Reichb, Trollius buddae Schipcz) were explicitly distinguished by using hierarchical clustering analysis. The linearity, precision, accuracy, limit of detection, and limit of quantification were validated for the quantification method, which proved sensitive, accurate and reproducible indicating that the proposed approach was applicable for the routine analysis and quality control of trollflowers. © 2013.

  12. Superhydrophobic polytetrafluoroethylene thin films with hierarchical roughness deposited using a single step vapor phase technique

    International Nuclear Information System (INIS)

    Gupta, Sushant; Arjunan, Arul Chakkaravarthi; Deshpande, Sameer; Seal, Sudipta; Singh, Deepika; Singh, Rajiv K.

    2009-01-01

    Superhydrophobic polytetrafluoroethylene films with hierarchical surface roughness were deposited using pulse electron deposition technique. We were able to modulate roughness of the deposited films by controlling the beam energy and hence the electron penetration depth. The films deposited at higher beam energy showed contact angle as high as 166 o . The scanning electron and atomic force microscope studies revealed clustered growth and two level sub-micron asperities on films deposited at higher energies. Such dual-scale hierarchical roughness and heterogeneities at the water-surface interface was attributed to the observed contact angle and thus its superhydrophobic nature.

  13. Superhydrophobic polytetrafluoroethylene thin films with hierarchical roughness deposited using a single step vapor phase technique

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, Sushant, E-mail: sushant3@ufl.ed [Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611 (United States); Arjunan, Arul Chakkaravarthi [Sinmat Incorporated, 2153 SE Hawthorne Road, 129, Gainesville, Florida 32641 (United States); Deshpande, Sameer; Seal, Sudipta [Advanced Material Processing and Analysis Center, University of Central Florida, Orlando, Florida 32816 (United States); Singh, Deepika [Sinmat Incorporated, 2153 SE Hawthorne Road, 129, Gainesville, Florida 32641 (United States); Singh, Rajiv K. [Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611 (United States)

    2009-06-30

    Superhydrophobic polytetrafluoroethylene films with hierarchical surface roughness were deposited using pulse electron deposition technique. We were able to modulate roughness of the deposited films by controlling the beam energy and hence the electron penetration depth. The films deposited at higher beam energy showed contact angle as high as 166{sup o}. The scanning electron and atomic force microscope studies revealed clustered growth and two level sub-micron asperities on films deposited at higher energies. Such dual-scale hierarchical roughness and heterogeneities at the water-surface interface was attributed to the observed contact angle and thus its superhydrophobic nature.

  14. Investigation of major international and Turkish companies via hierarchical methods and bootstrap approach

    Science.gov (United States)

    Kantar, E.; Deviren, B.; Keskin, M.

    2011-11-01

    We present a study, within the scope of econophysics, of the hierarchical structure of 98 among the largest international companies including 18 among the largest Turkish companies, namely Banks, Automobile, Software-hardware, Telecommunication Services, Energy and the Oil-Gas sectors, viewed as a network of interacting companies. We analyze the daily time series data of the Boerse-Frankfurt and Istanbul Stock Exchange. We examine the topological properties among the companies over the period 2006-2010 by using the concept of hierarchical structure methods (the minimal spanning tree (MST) and the hierarchical tree (HT)). The period is divided into three subperiods, namely 2006-2007, 2008 which was the year of global economic crisis, and 2009-2010, in order to test various time-windows and observe temporal evolution. We carry out bootstrap analyses to associate the value of statistical reliability to the links of the MSTs and HTs. We also use average linkage clustering analysis (ALCA) in order to better observe the cluster structure. From these studies, we find that the interactions among the Banks/Energy sectors and the other sectors were reduced after the global economic crisis; hence the effects of the Banks and Energy sectors on the correlations of all companies were decreased. Telecommunication Services were also greatly affected by the crisis. We also observed that the Automobile and Banks sectors, including Turkish companies as well as some companies from the USA, Japan and Germany were strongly correlated with each other in all periods.

  15. Processing of hierarchical syntactic structure in music.

    Science.gov (United States)

    Koelsch, Stefan; Rohrmeier, Martin; Torrecuso, Renzo; Jentschke, Sebastian

    2013-09-17

    Hierarchical structure with nested nonlocal dependencies is a key feature of human language and can be identified theoretically in most pieces of tonal music. However, previous studies have argued against the perception of such structures in music. Here, we show processing of nonlocal dependencies in music. We presented chorales by J. S. Bach and modified versions in which the hierarchical structure was rendered irregular whereas the local structure was kept intact. Brain electric responses differed between regular and irregular hierarchical structures, in both musicians and nonmusicians. This finding indicates that, when listening to music, humans apply cognitive processes that are capable of dealing with long-distance dependencies resulting from hierarchically organized syntactic structures. Our results reveal that a brain mechanism fundamental for syntactic processing is engaged during the perception of music, indicating that processing of hierarchical structure with nested nonlocal dependencies is not just a key component of human language, but a multidomain capacity of human cognition.

  16. THE XMM CLUSTER SURVEY: THE BUILD-UP OF STELLAR MASS IN BRIGHTEST CLUSTER GALAXIES AT HIGH REDSHIFT

    International Nuclear Information System (INIS)

    Stott, J. P.; Collins, C. A.; Hilton, M.; Capozzi, D.; Sahlen, M.; Lloyd-Davies, E.; Hosmer, M.; Liddle, A. R.; Mehrtens, N.; Romer, A. K.; Miller, C. J.; Stanford, S. A.; Viana, P. T. P.; Davidson, M.; Hoyle, B.; Kay, S. T.; Nichol, R. C.

    2010-01-01

    We present deep J- and K s -band photometry of 20 high redshift galaxy clusters between z = 0.8 and1.5, 19 of which are observed with the MOIRCS instrument on the Subaru telescope. By using near-infrared light as a proxy for stellar mass we find the surprising result that the average stellar mass of Brightest Cluster Galaxies (BCGs) has remained constant at ∼9 x 10 11 M sun since z ∼ 1.5. We investigate the effect on this result of differing star formation histories generated by three well-known and independent stellar population codes and find it to be robust for reasonable, physically motivated choices of age and metallicity. By performing Monte Carlo simulations we find that the result is unaffected by any correlation between BCG mass and cluster mass in either the observed or model clusters. The large stellar masses imply that the assemblage of these galaxies took place at the same time as the initial burst of star formation. This result leads us to conclude that dry merging has had little effect on the average stellar mass of BCGs over the last 9-10 Gyr in stark contrast to the predictions of semi-analytic models, based on the hierarchical merging of dark matter halos, which predict a more protracted mass build-up over a Hubble time. However, we discuss that there is potential for reconciliation between observation and theory if there is a significant growth of material in the intracluster light over the same period.

  17. Hierarchical clustering of gene expression patterns in the Eomes + lineage of excitatory neurons during early neocortical development

    Directory of Open Access Journals (Sweden)

    Cameron David A

    2012-08-01

    Full Text Available Abstract Background Cortical neurons display dynamic patterns of gene expression during the coincident processes of differentiation and migration through the developing cerebrum. To identify genes selectively expressed by the Eomes + (Tbr2 lineage of excitatory cortical neurons, GFP-expressing cells from Tg(Eomes::eGFP Gsat embryos were isolated to > 99% purity and profiled. Results We report the identification, validation and spatial grouping of genes selectively expressed within the Eomes + cortical excitatory neuron lineage during early cortical development. In these neurons 475 genes were expressed ≥ 3-fold, and 534 genes ≤ 3-fold, compared to the reference population of neuronal precursors. Of the up-regulated genes, 328 were represented at the Genepaint in situ hybridization database and 317 (97% were validated as having spatial expression patterns consistent with the lineage of differentiating excitatory neurons. A novel approach for quantifying in situ hybridization patterns (QISP across the cerebral wall was developed that allowed the hierarchical clustering of genes into putative co-regulated groups. Forty four candidate genes were identified that show spatial expression with Intermediate Precursor Cells, 49 candidate genes show spatial expression with Multipolar Neurons, while the remaining 224 genes achieved peak expression in the developing cortical plate. Conclusions This analysis of differentiating excitatory neurons revealed the expression patterns of 37 transcription factors, many chemotropic signaling molecules (including the Semaphorin, Netrin and Slit signaling pathways, and unexpected evidence for non-canonical neurotransmitter signaling and changes in mechanisms of glucose metabolism. Over half of the 317 identified genes are associated with neuronal disease making these findings a valuable resource for studies of neurological development and disease.

  18. A hybrid clustering approach to recognition of protein families in 114 microbial genomes

    Directory of Open Access Journals (Sweden)

    Gogarten J Peter

    2004-04-01

    Full Text Available Abstract Background Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function. Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families. Results We describe a hybrid approach to sequence-based clustering of proteins that combines the advantages of standard and Markov clustering. We have implemented this hybrid approach over a relational database environment, and describe its application to clustering a large subset of PDB, and to 328577 proteins from 114 fully sequenced microbial genomes. To demonstrate utility with difficult problems, we show that hybrid clustering allows us to constitute the paralogous family of ATP synthase F1 rotary motor subunits into a single, biologically interpretable hierarchical grouping that was not accessible using either single-linkage or Markov clustering alone. We describe validation of this method by hybrid clustering of PDB and mapping SCOP families and domains onto the resulting clusters. Conclusion Hybrid (Markov followed by single-linkage clustering combines the advantages of the Markov Cluster algorithm (avoidance of non-specific clusters resulting from matches to promiscuous domains and single-linkage clustering (preservation of topological information as a function of threshold. Within the individual Markov clusters, single-linkage clustering is a more-precise instrument, discerning sub-clusters of biological relevance. Our hybrid approach thus provides a computationally efficient

  19. Clusters of galaxies as tools in observational cosmology : results from x-ray analysis

    International Nuclear Information System (INIS)

    Weratschnig, J.M.

    2009-01-01

    Clusters of galaxies are the largest gravitationally bound structures in the universe. They can be used as ideal tools to study large scale structure formation (e.g. when studying merger clusters) and provide highly interesting environments to analyse several characteristic interaction processes (like ram pressure stripping of galaxies, magnetic fields). In this dissertation thesis, we have studied several clusters of galaxies using X-ray observations. To obtain scientific results, we have applied different data reduction and analysis methods. With a combination of morphological and spectral analysis, the merger cluster Abell 514 was studied in much detail. It has a highly interesting morphology and shows signs for an ongoing merger as well as a shock. using a new method to detect substructure, we have analysed several clusters to determine whether any substructure is present in the X-ray image. This hints towards a real structure in the distribution of the intra-cluster medium (ICM) and is evidence for ongoing mergers. The results from this analysis are extensively used with the cluster of galaxies Abell S1136. Here, we study the ICM distribution and compare its structure with the spatial distribution of star forming galaxies. Cluster magnetic fields are another important topic of my thesis. They can be studied in Radio observations, which can be put into relation with results from X-ray observations. using observational data from several clusters, we could support the theory that cluster magnetic fields are frozen into the ICM. (author)

  20. Geographical Characterization of Tunisian Olive Tree Leaves (cv. Chemlali) Using HPLC-ESI-TOF and IT/MS Fingerprinting with Hierarchical Cluster Analysis

    Science.gov (United States)

    Arráez Román, David; Gómez Caravaca, Ana María; Zarrouk, Mokhtar

    2018-01-01

    The olive plant has been extensively studied for its nutritional value, whereas its leaves have been specifically recognized as a processing by-product. Leaves are considered by-products of olive farming, representing a significant material arriving to the olive mill. They have been considered for centuries as an important herbal remedy in Mediterranean countries. Their beneficial properties are generally attributed to the presence of a range of phytochemicals such as secoiridoids, triterpenes, lignans, and flavonoids. With the aim to study the impact of geographical location on the phenolic compounds, Olea europaea leaves were handpicked from the Tunisian cultivar “Chemlali” from nine regions in the north, center, and south of Tunisia. The ground leaves were then extracted with methanol : water 80% (v/v) and analyzed by using high-performance liquid chromatography coupled to electrospray time of flight and ion trap mass spectrometry analyzers. A total of 38 compounds could be identified. Their contents showed significant variation among samples from different regions. Hierarchical cluster analysis was applied to highlight similarities in the phytochemical composition observed between the samples of different regions. PMID:29725553

  1. Hopfield-K-Means clustering algorithm: A proposal for the segmentation of electricity customers

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, Jose J.; Aguado, Jose A.; Martin, F.; Munoz, F.; Rodriguez, A.; Ruiz, Jose E. [Department of Electrical Engineering, University of Malaga, C/ Dr. Ortiz Ramos, sn., Escuela de Ingenierias, 29071 Malaga (Spain)

    2011-02-15

    Customer classification aims at providing electric utilities with a volume of information to enable them to establish different types of tariffs. Several methods have been used to segment electricity customers, including, among others, the hierarchical clustering, Modified Follow the Leader and K-Means methods. These, however, entail problems with the pre-allocation of the number of clusters (Follow the Leader), randomness of the solution (K-Means) and improvement of the solution obtained (hierarchical algorithm). Another segmentation method used is Hopfield's autonomous recurrent neural network, although the solution obtained only guarantees that it is a local minimum. In this paper, we present the Hopfield-K-Means algorithm in order to overcome these limitations. This approach eliminates the randomness of the initial solution provided by K-Means based algorithms and it moves closer to the global optimun. The proposed algorithm is also compared against other customer segmentation and characterization techniques, on the basis of relative validation indexes. Finally, the results obtained by this algorithm with a set of 230 electricity customers (residential, industrial and administrative) are presented. (author)

  2. Hopfield-K-Means clustering algorithm: A proposal for the segmentation of electricity customers

    International Nuclear Information System (INIS)

    Lopez, Jose J.; Aguado, Jose A.; Martin, F.; Munoz, F.; Rodriguez, A.; Ruiz, Jose E.

    2011-01-01

    Customer classification aims at providing electric utilities with a volume of information to enable them to establish different types of tariffs. Several methods have been used to segment electricity customers, including, among others, the hierarchical clustering, Modified Follow the Leader and K-Means methods. These, however, entail problems with the pre-allocation of the number of clusters (Follow the Leader), randomness of the solution (K-Means) and improvement of the solution obtained (hierarchical algorithm). Another segmentation method used is Hopfield's autonomous recurrent neural network, although the solution obtained only guarantees that it is a local minimum. In this paper, we present the Hopfield-K-Means algorithm in order to overcome these limitations. This approach eliminates the randomness of the initial solution provided by K-Means based algorithms and it moves closer to the global optimun. The proposed algorithm is also compared against other customer segmentation and characterization techniques, on the basis of relative validation indexes. Finally, the results obtained by this algorithm with a set of 230 electricity customers (residential, industrial and administrative) are presented. (author)

  3. Modulated modularity clustering as an exploratory tool for functional genomic inference.

    Directory of Open Access Journals (Sweden)

    Eric A Stone

    2009-05-01

    Full Text Available In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC, seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.

  4. DOCUMENT REPRESENTATION FOR CLUSTERING OF SCIENTIFIC ABSTRACTS

    Directory of Open Access Journals (Sweden)

    S. V. Popova

    2014-01-01

    Full Text Available The key issue of the present paper is clustering of narrow-domain short texts, such as scientific abstracts. The work is based on the observations made when improving the performance of key phrase extraction algorithm. An extended stop-words list was used that was built automatically for the purposes of key phrase extraction and gave the possibility for a considerable quality enhancement of the phrases extracted from scientific publications. A description of the stop- words list creation procedure is given. The main objective is to investigate the possibilities to increase the performance and/or speed of clustering by the above-mentioned list of stop-words as well as information about lexeme parts of speech. In the latter case a vocabulary is applied for the document representation, which contains not all the words that occurred in the collection, but only nouns and adjectives or their sequences encountered in the documents. Two base clustering algorithms are applied: k-means and hierarchical clustering (average agglomerative method. The results show that the use of an extended stop-words list and adjective-noun document representation makes it possible to improve the performance and speed of k-means clustering. In a similar case for average agglomerative method a decline in performance quality may be observed. It is shown that the use of adjective-noun sequences for document representation lowers the clustering quality for both algorithms and can be justified only when a considerable reduction of feature space dimensionality is necessary.

  5. Comparisons of Flow Patterns over a Hierarchical and a Non-hierarchical Surface in Relation to Biofouling Control

    Directory of Open Access Journals (Sweden)

    Bin Ahmad Fawzan Mohammed Ridha

    2018-01-01

    Full Text Available Biofouling can be defined as unwanted deposition and development of organisms on submerged surfaces. It is a major problem as it causes water contamination, infrastructures damage and increase in maintenance and operational cost especially in the shipping industry. There are a few methods that can prevent this problem. One of the most effective methods which is using chemicals particularly Tributyltin has been banned due to adverse effects on the environment. One of the non-toxic methods found to be effective is surface modification which involves altering the surface topography so that it becomes a low-fouling or a non-stick surface to biofouling organisms. Current literature suggested that non-hierarchical topographies has lower antifouling performance compared to hierarchical topographies. It is still unclear if the effects of the flow on these topographies could have aided in their antifouling properties. This research will use Computational Fluid Dynamics (CFD simulations to study the flow on these two topographies which also involves comparison study of the topographies used. According to the results obtained, it is shown that hierarchical topography has higher antifouling performance compared to non-hierarchical topography. This is because the fluid characteristics at the hierarchical topography is more favorable in controlling biofouling. In addition, hierarchical topography has higher wall shear stress distribution compared to non-hierarchical topography

  6. Dynamics of the baryonic component in hierarchical clustering universes

    Science.gov (United States)

    Navarro, Julio

    1993-01-01

    I present self-consistent 3-D simulations of the formation of virialized systems containing both gas and dark matter in a flat universe. A fully Lagrangian code based on the Smoothed Particle Hydrodynamics technique and a tree data structure has been used to evolve regions of comoving radius 2-3 Mpc. Tidal effects are included by coarse-sampling the density of the outer regions up to a radius approx. 20 Mpc. Initial conditions are set at high redshift (z greater than 7) using a standard Cold Dark Matter perturbation spectrum and a baryon mass fraction of 10 percent (omega(sub b) = 0.1). Simulations in which the gas evolves either adiabatically or radiates energy at a rate determined locally by its cooling function were performed. This allows us to investigate with the same set of simulations the importance of radiative losses in the formation of galaxies and the equilibrium structure of virialized systems where cooling is very inefficient. In the absence of radiative losses, the simulations can be rescaled to the density and radius typical of galaxy clusters. A summary of the main results is presented.

  7. Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems.

    Directory of Open Access Journals (Sweden)

    Martin Rosvall

    Full Text Available To comprehend the hierarchical organization of large integrated systems, we introduce the hierarchical map equation, which reveals multilevel structures in networks. In this information-theoretic approach, we exploit the duality between compression and pattern detection; by compressing a description of a random walker as a proxy for real flow on a network, we find regularities in the network that induce this system-wide flow. Finding the shortest multilevel description of the random walker therefore gives us the best hierarchical clustering of the network--the optimal number of levels and modular partition at each level--with respect to the dynamics on the network. With a novel search algorithm, we extract and illustrate the rich multilevel organization of several large social and biological networks. For example, from the global air traffic network we uncover countries and continents, and from the pattern of scientific communication we reveal more than 100 scientific fields organized in four major disciplines: life sciences, physical sciences, ecology and earth sciences, and social sciences. In general, we find shallow hierarchical structures in globally interconnected systems, such as neural networks, and rich multilevel organizations in systems with highly separated regions, such as road networks.

  8. Planck intermediate results: VIII. Filaments between interacting clusters

    DEFF Research Database (Denmark)

    Bartlett, J.G.; Cardoso, J.-F.; Castex, G.

    2013-01-01

    of a fraction of these missing baryons between pairs of galaxy clusters. Methods. Cluster pairs are good candidates for searching for the hotter and denser phase of the intergalactic medium (which is more easily observed through the SZ effect). Using an X-ray catalogue of clusters and the Planck data, we...

  9. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. To identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement fraction of exhaled nitric oxide. Blood eosinophilic count, serum total IgE and periostin levels were determined. Two-step cluster approach, hierarchical clustering method and k-mean analysis were used for identification of the clusters. We have identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function, Cluster 2 (n=13) - late-onset, atopic asthma, Cluster 3 (n=6) - late-onset, aspirin sensitivity, eosinophilic asthma, and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with greatest force for differentiation in our study were: age of asthma onset, duration of diseases, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drugs hypersensitivity, baseline FEV1/FVC and symptoms severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be an useful tool for phenotyping of disease and personalized approach to the treatment of patients.

  10. ProtoBee: Hierarchical classification and annotation of the honey bee proteome

    OpenAIRE

    Kaplan, Noam; Linial, Michal

    2006-01-01

    The recently sequenced genome of the honey bee (Apis mellifera) has produced 10,157 predicted protein sequences, calling for a computational effort to extract biological insights from them. We have applied an unsupervised hierarchical protein-clustering method, which was previously used in the ProtoNet system, to nearly 200,000 proteins consisting of the predicted honey bee proteins, the SWISS-PROT protein database, and the complete set of proteins of the mouse (Mus musculus) and the fruit fl...

  11. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    Science.gov (United States)

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. This study shows that SCC is an alternative to the Pearson

  12. The Assessment of Hydrogen Energy Systems for Fuel Cell Vehicles Using Principal Componenet Analysis and Cluster Analysis

    DEFF Research Database (Denmark)

    Ren, Jingzheng; Tan, Shiyu; Dong, Lichun

    2012-01-01

    and analysis of the hydrogen systems is meaningful for decision makers to select the best scenario. principal component analysis (PCA) has been used to evaluate the integrated performance of different hydrogen energy systems and select the best scenario, and hierarchical cluster analysis (CA) has been used...... for transportation of hydrogen, hydrogen gas tank for the storage of hydrogen at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles has been recognized as the best scenario. Also, the clustering results calculated by CA are consistent with those determined by PCA, denoting...

  13. Ensemble clustering in visual working memory biases location memories and reduces the Weber noise of relative positions.

    Science.gov (United States)

    Lew, Timothy F; Vul, Edward

    2015-01-01

    People seem to compute the ensemble statistics of objects and use this information to support the recall of individual objects in visual working memory. However, there are many different ways that hierarchical structure might be encoded. We examined the format of structured memories by asking subjects to recall the locations of objects arranged in different spatial clustering structures. Consistent with previous investigations of structured visual memory, subjects recalled objects biased toward the center of their clusters. Subjects also recalled locations more accurately when they were arranged in fewer clusters containing more objects, suggesting that subjects used the clustering structure of objects to aid recall. Furthermore, subjects had more difficulty recalling larger relative distances, consistent with subjects encoding the positions of objects relative to clusters and recalling them with magnitude-proportional (Weber) noise. Our results suggest that clustering improved the fidelity of recall by biasing the recall of locations toward cluster centers to compensate for uncertainty and by reducing the magnitude of encoded relative distances.

  14. An evaluation of centrality measures used in cluster analysis

    Science.gov (United States)

    Engström, Christopher; Silvestrov, Sergei

    2014-12-01

    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large as they often are in for example bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work have been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as the above mentioned and others such as PageRank and related measures.

  15. D Partition-Based Clustering for Supply Chain Data Management

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2015-10-01

    Supply Chain Management (SCM) is the management of the products and goods flow from its origin point to point of consumption. During the process of SCM, information and dataset gathered for this application is massive and complex. This is due to its several processes such as procurement, product development and commercialization, physical distribution, outsourcing and partnerships. For a practical application, SCM datasets need to be managed and maintained to serve a better service to its three main categories; distributor, customer and supplier. To manage these datasets, a structure of data constellation is used to accommodate the data into the spatial database. However, the situation in geospatial database creates few problems, for example the performance of the database deteriorate especially during the query operation. We strongly believe that a more practical hierarchical tree structure is required for efficient process of SCM. Besides that, three-dimensional approach is required for the management of SCM datasets since it involve with the multi-level location such as shop lots and residential apartments. 3D R-Tree has been increasingly used for 3D geospatial database management due to its simplicity and extendibility. However, it suffers from serious overlaps between nodes. In this paper, we proposed a partition-based clustering for the construction of a hierarchical tree structure. Several datasets are tested using the proposed method and the percentage of the overlapping nodes and volume coverage are computed and compared with the original 3D R-Tree and other practical approaches. The experiments demonstrated in this paper substantiated that the hierarchical structure of the proposed partitionbased clustering is capable of preserving minimal overlap and coverage. The query performance was tested using 300,000 points of a SCM dataset and the results are presented in this paper. This paper also discusses the outlook of the structure for future reference.

  16. MCBT: Multi-Hop Cluster Based Stable Backbone Trees for Data Collection and Dissemination in WSNs

    Directory of Open Access Journals (Sweden)

    Tae-Jin Lee

    2009-07-01

    Full Text Available We propose a stable backbone tree construction algorithm using multi-hop clusters for wireless sensor networks (WSNs. The hierarchical cluster structure has advantages in data fusion and aggregation. Energy consumption can be decreased by managing nodes with cluster heads. Backbone nodes, which are responsible for performing and managing multi-hop communication, can reduce the communication overhead such as control traffic and minimize the number of active nodes. Previous backbone construction algorithms, such as Hierarchical Cluster-based Data Dissemination (HCDD and Multicluster, Mobile, Multimedia radio network (MMM, consume energy quickly. They are designed without regard to appropriate factors such as residual energy and degree (the number of connections or edges to other nodes of a node for WSNs. Thus, the network is quickly disconnected or has to reconstruct a backbone. We propose a distributed algorithm to create a stable backbone by selecting the nodes with higher energy or degree as the cluster heads. This increases the overall network lifetime. Moreover, the proposed method balances energy consumption by distributing the traffic load among nodes around the cluster head. In the simulation, the proposed scheme outperforms previous clustering schemes in terms of the average and the standard deviation of residual energy or degree of backbone nodes, the average residual energy of backbone nodes after disseminating the sensed data, and the network lifetime.

  17. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  18. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR, it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  19. Self-assembled biomimetic superhydrophobic hierarchical arrays.

    Science.gov (United States)

    Yang, Hongta; Dou, Xuan; Fang, Yin; Jiang, Peng

    2013-09-01

    Here, we report a simple and inexpensive bottom-up technology for fabricating superhydrophobic coatings with hierarchical micro-/nano-structures, which are inspired by the binary periodic structure found on the superhydrophobic compound eyes of some insects (e.g., mosquitoes and moths). Binary colloidal arrays consisting of exemplary large (4 and 30 μm) and small (300 nm) silica spheres are first assembled by a scalable Langmuir-Blodgett (LB) technology in a layer-by-layer manner. After surface modification with fluorosilanes, the self-assembled hierarchical particle arrays become superhydrophobic with an apparent water contact angle (CA) larger than 150°. The throughput of the resulting superhydrophobic coatings with hierarchical structures can be significantly improved by templating the binary periodic structures of the LB-assembled colloidal arrays into UV-curable fluoropolymers by a soft lithography approach. Superhydrophobic perfluoroether acrylate hierarchical arrays with large CAs and small CA hysteresis can be faithfully replicated onto various substrates. Both experiments and theoretical calculations based on the Cassie's dewetting model demonstrate the importance of the hierarchical structure in achieving the final superhydrophobic surface states. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Clustering of near clusters versus cluster compactness

    International Nuclear Information System (INIS)

    Yu Gao; Yipeng Jing

    1989-01-01

    The clustering properties of near Zwicky clusters are studied by using the two-point angular correlation function. The angular correlation functions for compact and medium compact clusters, for open clusters, and for all near Zwicky clusters are estimated. The results show much stronger clustering for compact and medium compact clusters than for open clusters, and that open clusters have nearly the same clustering strength as galaxies. A detailed study of the compactness-dependence of correlation function strength is worth investigating. (author)

  1. Adaptive clustering procedure for continuous gravitational wave searches

    Science.gov (United States)

    Singh, Avneet; Papa, Maria Alessandra; Eggenstein, Heinz-Bernd; Walsh, Sinéad

    2017-10-01

    In hierarchical searches for continuous gravitational waves, clustering of candidates is an important post-processing step because it reduces the number of noise candidates that are followed up at successive stages [J. Aasi et al., Phys. Rev. Lett. 88, 102002 (2013), 10.1103/PhysRevD.88.102002; B. Behnke, M. A. Papa, and R. Prix, Phys. Rev. D 91, 064007 (2015), 10.1103/PhysRevD.91.064007; M. A. Papa et al., Phys. Rev. D 94, 122006 (2016), 10.1103/PhysRevD.94.122006]. Previous clustering procedures bundled together nearby candidates ascribing them to the same root cause (be it a signal or a disturbance), based on a predefined cluster volume. In this paper, we present a procedure that adapts the cluster volume to the data itself and checks for consistency of such volume with what is expected from a signal. This significantly improves the noise rejection capabilities at fixed detection threshold, and at fixed computing resources for the follow-up stages, this results in an overall more sensitive search. This new procedure was employed in the first Einstein@Home search on data from the first science run of the advanced LIGO detectors (O1) [LIGO Scientific Collaboration and Virgo Collaboration, arXiv:1707.02669 [Phys. Rev. D (to be published)

  2. Eigenspaces of networks reveal the overlapping and hierarchical community structure more precisely

    International Nuclear Information System (INIS)

    Ma, Xiaoke; Gao, Lin; Yong, Xuerong

    2010-01-01

    Identifying community structure is fundamental for revealing the structure–functionality relationship in complex networks, and spectral algorithms have been shown to be powerful for this purpose. In a traditional spectral algorithm, each vertex of a network is embedded into a spectral space by making use of the eigenvectors of the adjacency matrix or Laplacian matrix of the graph. In this paper, a novel spectral approach for revealing the overlapping and hierarchical community structure of complex networks is proposed by not only using the eigenvalues and eigenvectors but also the properties of eigenspaces of the networks involved. This gives us a better characterization of community. We first show that the communicability between a pair of vertices can be rewritten in term of eigenspaces of a network. An agglomerative clustering algorithm is then presented to discover the hierarchical communities using the communicability matrix. Finally, these overlapping vertices are discovered with the corresponding eigenspaces, based on the fact that the vertices more densely connected amongst one another are more likely to be linked through short cycles. Compared with the traditional spectral algorithms, our algorithm can identify both the overlapping and hierarchical community without increasing the time complexity O(n 3 ), where n is the size of the network. Furthermore, our algorithm can also distinguish the overlapping vertices from bridges. The method is tested by applying it to some computer-generated and real-world networks. The experimental results indicate that our algorithm can reveal community structure more precisely than the traditional spectral approaches

  3. Comparison of two schemes for automatic keyword extraction from MEDLINE for functional gene clustering.

    Science.gov (United States)

    Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray

    2004-01-01

    One of the key challenges of microarray studies is to derive biological insights from the unprecedented quatities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describes the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TDFIDF weighting scheme, but only 23 og 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.

  4. Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.

    Science.gov (United States)

    Lu, Xiaoqiang; Chen, Yaxiong; Li, Xuelong

    Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep learning architectures can learn more effective image representation features. However, these methods only use semantic features to generate hash codes by shallow projection but ignore texture details. In this paper, we proposed a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), to exploit hierarchical recurrent neural network to generate effective hash codes. There are three contributions of this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information, in which, we leverage hierarchical convolutional features to construct image pyramid representation. Second, our proposed deep network can exploit directly convolutional feature maps as input to preserve the spatial structure of convolutional feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into the discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep

  5. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Full Text Available Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2. A clinical group of subjects with perinatal depression (PND, 55 subjects was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3 and an “apparently common” one (cluster 2. The first cluster (39.5% collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95% includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5% shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  6. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Science.gov (United States)

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  7. Hierarchical screening for multiple mental disorders.

    Science.gov (United States)

    Batterham, Philip J; Calear, Alison L; Sunderland, Matthew; Carragher, Natacha; Christensen, Helen; Mackinnon, Andrew J

    2013-10-01

    There is a need for brief, accurate screening when assessing multiple mental disorders. Two-stage hierarchical screening, consisting of brief pre-screening followed by a battery of disorder-specific scales for those who meet diagnostic criteria, may increase the efficiency of screening without sacrificing precision. This study tested whether more efficient screening could be gained using two-stage hierarchical screening than by administering multiple separate tests. Two Australian adult samples (N=1990) with high rates of psychopathology were recruited using Facebook advertising to examine four methods of hierarchical screening for four mental disorders: major depressive disorder, generalised anxiety disorder, panic disorder and social phobia. Using K6 scores to determine whether full screening was required did not increase screening efficiency. However, pre-screening based on two decision tree approaches or item gating led to considerable reductions in the mean number of items presented per disorder screened, with estimated item reductions of up to 54%. The sensitivity of these hierarchical methods approached 100% relative to the full screening battery. Further testing of the hierarchical screening approach based on clinical criteria and in other samples is warranted. The results demonstrate that a two-phase hierarchical approach to screening multiple mental disorders leads to considerable increases efficiency gains without reducing accuracy. Screening programs should take advantage of prescreeners based on gating items or decision trees to reduce the burden on respondents. © 2013 Elsevier B.V. All rights reserved.

  8. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Full Text Available Abstract Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involve decisions on how to; handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered, missing value imputation (2, standardization of data (2, gene selection (19 or clustering method (11. The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieves the highest adjusted Rand. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  9. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involve decisions on how to; handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieves the highest adjusted Rand. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is

  10. Hierarchical Solution of the Traveling Salesman Problem with Random Dyadic Tilings

    Science.gov (United States)

    Kalmár-Nagy, Tamás; Bak, Bendegúz Dezső

    We propose a hierarchical heuristic approach for solving the Traveling Salesman Problem (TSP) in the unit square. The points are partitioned with a random dyadic tiling and clusters are formed by the points located in the same tile. Each cluster is represented by its geometrical barycenter and a “coarse” TSP solution is calculated for these barycenters. Midpoints are placed at the middle of each edge in the coarse solution. Near-optimal (or optimal) minimum tours are computed for each cluster. The tours are concatenated using the midpoints yielding a solution for the original TSP. The method is tested on random TSPs (independent, identically distributed points in the unit square) up to 10,000 points as well as on a popular benchmark problem (att532 — coordinates of 532 American cities). Our solutions are 8-13% longer than the optimal ones. We also present an optimization algorithm for the partitioning to improve our solutions. This algorithm further reduces the solution errors (by several percent using 1000 iteration steps). The numerical experiments demonstrate the viability of the approach.

  11. Detecting Hierarchical Structure in Networks

    DEFF Research Database (Denmark)

    Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard

    2012-01-01

    Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data has focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose...... a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure....... On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account the network....

  12. Hierarchical effects on target detection and conflict monitoring

    Science.gov (United States)

    Cao, Bihua; Gao, Feng; Ren, Maofang; Li, Fuhong

    2016-01-01

    Previous neuroimaging studies have demonstrated a hierarchical functional structure of the frontal cortices of the human brain, but the temporal course and the electrophysiological signature of the hierarchical representation remains unaddressed. In the present study, twenty-one volunteers were asked to perform a nested cue-target task, while their scalp potentials were recorded. The results showed that: (1) in comparison with the lower-level hierarchical targets, the higher-level targets elicited a larger N2 component (220–350 ms) at the frontal sites, and a smaller P3 component (350–500 ms) across the frontal and parietal sites; (2) conflict-related negativity (non-target minus target) was greater for the lower-level hierarchy than the higher-level, reflecting a more intensive process of conflict monitoring at the final step of target detection. These results imply that decision making, context updating, and conflict monitoring differ among different hierarchical levels of abstraction. PMID:27561989

  13. Multi-objective hierarchical genetic algorithms for multilevel redundancy allocation optimization

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Ranjan [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: ranjan.k@ks3.ecs.kyoto-u.ac.jp; Izui, Kazuhiro [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: izui@prec.kyoto-u.ac.jp; Yoshimura, Masataka [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: yoshimura@prec.kyoto-u.ac.jp; Nishiwaki, Shinji [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: shinji@prec.kyoto-u.ac.jp

    2009-04-15

    Multilevel redundancy allocation optimization problems (MRAOPs) occur frequently when attempting to maximize the system reliability of a hierarchical system, and almost all complex engineering systems are hierarchical. Despite their practical significance, limited research has been done concerning the solving of simple MRAOPs. These problems are not only NP hard but also involve hierarchical design variables. Genetic algorithms (GAs) have been applied in solving MRAOPs, since they are computationally efficient in solving such problems, unlike exact methods, but their applications has been confined to single-objective formulation of MRAOPs. This paper proposes a multi-objective formulation of MRAOPs and a methodology for solving such problems. In this methodology, a hierarchical GA framework for multi-objective optimization is proposed by introducing hierarchical genotype encoding for design variables. In addition, we implement the proposed approach by integrating the hierarchical genotype encoding scheme with two popular multi-objective genetic algorithms (MOGAs)-the strength Pareto evolutionary genetic algorithm (SPEA2) and the non-dominated sorting genetic algorithm (NSGA-II). In the provided numerical examples, the proposed multi-objective hierarchical approach is applied to solve two hierarchical MRAOPs, a 4- and a 3-level problems. The proposed method is compared with a single-objective optimization method that uses a hierarchical genetic algorithm (HGA), also applied to solve the 3- and 4-level problems. The results show that a multi-objective hierarchical GA (MOHGA) that includes elitism and mechanism for diversity preserving performed better than a single-objective GA that only uses elitism, when solving large-scale MRAOPs. Additionally, the experimental results show that the proposed method with NSGA-II outperformed the proposed method with SPEA2 in finding useful Pareto optimal solution sets.

  14. Multi-objective hierarchical genetic algorithms for multilevel redundancy allocation optimization

    International Nuclear Information System (INIS)

    Kumar, Ranjan; Izui, Kazuhiro; Yoshimura, Masataka; Nishiwaki, Shinji

    2009-01-01

    Multilevel redundancy allocation optimization problems (MRAOPs) occur frequently when attempting to maximize the system reliability of a hierarchical system, and almost all complex engineering systems are hierarchical. Despite their practical significance, limited research has been done concerning the solving of simple MRAOPs. These problems are not only NP hard but also involve hierarchical design variables. Genetic algorithms (GAs) have been applied in solving MRAOPs, since they are computationally efficient in solving such problems, unlike exact methods, but their applications has been confined to single-objective formulation of MRAOPs. This paper proposes a multi-objective formulation of MRAOPs and a methodology for solving such problems. In this methodology, a hierarchical GA framework for multi-objective optimization is proposed by introducing hierarchical genotype encoding for design variables. In addition, we implement the proposed approach by integrating the hierarchical genotype encoding scheme with two popular multi-objective genetic algorithms (MOGAs)-the strength Pareto evolutionary genetic algorithm (SPEA2) and the non-dominated sorting genetic algorithm (NSGA-II). In the provided numerical examples, the proposed multi-objective hierarchical approach is applied to solve two hierarchical MRAOPs, a 4- and a 3-level problems. The proposed method is compared with a single-objective optimization method that uses a hierarchical genetic algorithm (HGA), also applied to solve the 3- and 4-level problems. The results show that a multi-objective hierarchical GA (MOHGA) that includes elitism and mechanism for diversity preserving performed better than a single-objective GA that only uses elitism, when solving large-scale MRAOPs. Additionally, the experimental results show that the proposed method with NSGA-II outperformed the proposed method with SPEA2 in finding useful Pareto optimal solution sets

  15. Constructing a graph of connections in clustering algorithm of complex objects

    Directory of Open Access Journals (Sweden)

    Татьяна Шатовская

    2015-05-01

    Full Text Available The article describes the results of modifying the algorithm Chameleon. Hierarchical multi-level algorithm consists of several phases: the construction of the count, coarsening, the separation and recovery. Each phase can be used various approaches and algorithms. The main aim of the work is to study the quality of the clustering of different sets of data using a set of algorithms combinations at different stages of the algorithm and improve the stage of construction by the optimization algorithm of k choice in the graph construction of k of nearest neighbors

  16. Open-Source Sequence Clustering Methods Improve the State Of the Art.

    Science.gov (United States)

    Kopylova, Evguenia; Navas-Molina, Jose A; Mercier, Céline; Xu, Zhenjiang Zech; Mahé, Frédéric; He, Yan; Zhou, Hong-Wei; Rognes, Torbjørn; Caporaso, J Gregory; Knight, Rob

    2016-01-01

    Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH's most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release. IMPORTANCE Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014, http

  17. An Integrated Intrusion Detection Model of Cluster-Based Wireless Sensor Network.

    Science.gov (United States)

    Sun, Xuemei; Yan, Bo; Zhang, Xinzhong; Rong, Chuitian

    2015-01-01

    Considering wireless sensor network characteristics, this paper combines anomaly and mis-use detection and proposes an integrated detection model of cluster-based wireless sensor network, aiming at enhancing detection rate and reducing false rate. Adaboost algorithm with hierarchical structures is used for anomaly detection of sensor nodes, cluster-head nodes and Sink nodes. Cultural-Algorithm and Artificial-Fish-Swarm-Algorithm optimized Back Propagation is applied to mis-use detection of Sink node. Plenty of simulation demonstrates that this integrated model has a strong performance of intrusion detection.

  18. Catalysis with hierarchical zeolites

    DEFF Research Database (Denmark)

    Holm, Martin Spangsberg; Taarning, Esben; Egeblad, Kresten

    2011-01-01

    Hierarchical (or mesoporous) zeolites have attracted significant attention during the first decade of the 21st century, and so far this interest continues to increase. There have already been several reviews giving detailed accounts of the developments emphasizing different aspects of this research...... topic. Until now, the main reason for developing hierarchical zeolites has been to achieve heterogeneous catalysts with improved performance but this particular facet has not yet been reviewed in detail. Thus, the present paper summaries and categorizes the catalytic studies utilizing hierarchical...... zeolites that have been reported hitherto. Prototypical examples from some of the different categories of catalytic reactions that have been studied using hierarchical zeolite catalysts are highlighted. This clearly illustrates the different ways that improved performance can be achieved with this family...

  19. The Design of Cluster Randomized Trials with Random Cross-Classifications

    Science.gov (United States)

    Moerbeek, Mirjam; Safarkhani, Maryam

    2018-01-01

    Data from cluster randomized trials do not always have a pure hierarchical structure. For instance, students are nested within schools that may be crossed by neighborhoods, and soldiers are nested within army units that may be crossed by mental health-care professionals. It is important that the random cross-classification is taken into account…

  20. Hierarchical prisoner’s dilemma in hierarchical game for resource competition

    Science.gov (United States)

    Fujimoto, Yuma; Sagawa, Takahiro; Kaneko, Kunihiko

    2017-07-01

    Dilemmas in cooperation are one of the major concerns in game theory. In a public goods game, each individual cooperates by paying a cost or defecting without paying it, and receives a reward from the group out of the collected cost. Thus, defecting is beneficial for each individual, while cooperation is beneficial for the group. Now, groups (say, countries) consisting of individuals also play games. To study such a multi-level game, we introduce a hierarchical game in which multiple groups compete for limited resources by utilizing the collected cost in each group, where the power to appropriate resources increases with the population of the group. Analyzing this hierarchical game, we found a hierarchical prisoner’s dilemma, in which groups choose the defecting policy (say, armament) as a Nash strategy to optimize each group’s benefit, while cooperation optimizes the total benefit. On the other hand, for each individual, refusing to pay the cost (say, tax) is a Nash strategy, which turns out to be a cooperation policy for the group, thus leading to a hierarchical dilemma. Here the group reward increases with the group size. However, we find that there exists an optimal group size that maximizes the individual payoff. Furthermore, when the population asymmetry between two groups is large, the smaller group will choose a cooperation policy (say, disarmament) to avoid excessive response from the larger group, and the prisoner’s dilemma between the groups is resolved. Accordingly, the relevance of this hierarchical game on policy selection in society and the optimal size of human or animal groups are discussed.

  1. Cognitive Clusters in Specific Learning Disorder.

    Science.gov (United States)

    Poletti, Michele; Carretta, Elisa; Bonvicini, Laura; Giorgi-Rossi, Paolo

    The heterogeneity among children with learning disabilities still represents a barrier and a challenge in their conceptualization. Although a dimensional approach has been gaining support, the categorical approach is still the most adopted, as in the recent fifth edition of the Diagnostic and Statistical Manual of Mental Disorders. The introduction of the single overarching diagnostic category of specific learning disorder (SLD) could underemphasize interindividual clinical differences regarding intracategory cognitive functioning and learning proficiency, according to current models of multiple cognitive deficits at the basis of neurodevelopmental disorders. The characterization of specific cognitive profiles associated with an already manifest SLD could help identify possible early cognitive markers of SLD risk and distinct trajectories of atypical cognitive development leading to SLD. In this perspective, we applied a cluster analysis to identify groups of children with a Diagnostic and Statistical Manual-based diagnosis of SLD with similar cognitive profiles and to describe the association between clusters and SLD subtypes. A sample of 205 children with a diagnosis of SLD were enrolled. Cluster analyses (agglomerative hierarchical and nonhierarchical iterative clustering technique) were used successively on 10 core subtests of the Wechsler Intelligence Scale for Children-Fourth Edition. The 4-cluster solution was adopted, and external validation found differences in terms of SLD subtype frequencies and learning proficiency among clusters. Clinical implications of these findings are discussed, tracing directions for further studies.

  2. The outskirts of the Coma cluster

    Science.gov (United States)

    Gavazzi, Giuseppe

    Evolved Coma-like clusters of galaxies are constituted of relaxed cores composed of ''old'' early-type galaxies, embedded in large-scale structures, mostly constituted of unevolved (late-type) systems. According to the hierarchical theory of cluster formation the central regions are being fed with unevolved, low-mass systems infalling from the surroundings that are gradually transformed into elliptical/S0 galaxies by tidal galaxy-galaxy and galaxy-cluster interactions, taking place at some boundary distance. The Coma cluster, the most studied of all local clusters, provides us with the ideal test-bed for such an evolutionary study because of the completeness of the photometric and kinematic information already at hands. The field of view of the planned GALEX observations is not big enough to include the boundary interface where most transformations processes are expected to take place, including the truncation of the current star formation. We propose to complete the outskirt of Coma with an additional corona of 11 GALEX imaging fields of 1500 sec exposure each, matching the deepness (UV_{AB}=23.5 mag) of the fields observed in guarantee time. Given the priority of the target, we also propose one optional Central pointing that includes one bright star marginally exceeding the detector brightness limit.

  3. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes. Results We developed Nearest Neighbor Networks (NNN, a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the

  4. Complex networks as an emerging property of hierarchical preferential attachment

    Science.gov (United States)

    Hébert-Dufresne, Laurent; Laurence, Edward; Allard, Antoine; Young, Jean-Gabriel; Dubé, Louis J.

    2015-12-01

    Real complex systems are not rigidly structured; no clear rules or blueprints exist for their construction. Yet, amidst their apparent randomness, complex structural properties universally emerge. We propose that an important class of complex systems can be modeled as an organization of many embedded levels (potentially infinite in number), all of them following the same universal growth principle known as preferential attachment. We give examples of such hierarchy in real systems, for instance, in the pyramid of production entities of the film industry. More importantly, we show how real complex networks can be interpreted as a projection of our model, from which their scale independence, their clustering, their hierarchy, their fractality, and their navigability naturally emerge. Our results suggest that complex networks, viewed as growing systems, can be quite simple, and that the apparent complexity of their structure is largely a reflection of their unobserved hierarchical nature.

  5. Identifying the ideal profile of French yogurts for different clusters of consumers.

    Science.gov (United States)

    Masson, M; Saint-Eve, A; Delarue, J; Blumenthal, D

    2016-05-01

    Identifying the sensory properties that affect consumer preferences for food products is an important feature of product development. Different methods, such as external preference mapping or partial least squares regression, are used to establish relationships between sensory data and consumer preferences and to identify sensory attributes that drive consumer preferences, by highlighting optimum products. Plain French yogurts were evaluated by a sensory profiling method performed by 12 trained judges. In parallel, 180 consumers were asked to score their overall liking and complete a cognitive restraint questionnaire. After hierarchical cluster analysis on the liking scores, preference mapping using a quadratic regression model was performed. Five clusters of consumers were identified as a function of different preference patterns. Contrary to our expectations, fat levels were not discriminating. For each cluster, the results of preference mapping enabled the identification of optimum products. A comparison of the 5 sensory profiles revealed numerous differences between key sensory attributes. For example, one consumer cluster had a strong preference for products perceived as very thick, grainy, but with a less flowing texture, less sticky, whey presence and color, in contrast to other clusters. In addition, each segment of consumers was characterized according to the results of the cognitive restraint questionnaire. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  6. Anti-inflammatory and anti-angiogenic activities in vitro of eight diterpenes from Daphne genkwa based on hierarchical cluster and principal component analysis.

    Science.gov (United States)

    Wang, Ling; Lan, Xin-Yi; Ji, Jun; Zhang, Chun-Feng; Li, Fei; Wang, Chong-Zhi; Yuan, Chun-Su

    2018-06-01

    Rheumatoid arthritis (RA) is one of the most prevalent chronic inflammatory and angiogenic diseases. The aim of this study was to evaluate the anti-inflammatory and anti-angiogenic activities in vitro of eight diterpenoids isolated from Daphne genkwa. LC-MS was used to identify diterpenes isolated from D. genkwa. The anti-inflammatory and anti-angiogenic activities of eight diterpenoids were evaluated on LPS-induced macrophage RAW264.7 cells and TNF-α-stimulated human umbilical vein endothelial cells (HUVECs) using hierarchical cluster analysis (HCA) and principal component analysis (PCA). The eight diterpenes isolated from D. genkwa were identified as yuanhuaphnin, isoyuanhuacine, 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, yuanhuagine, isoyuanhuadine, yuanhuadine, yuanhuaoate C and yuanhuacine. All the eight diterpenes significantly down-regulated the excessive secretion of TNF-α, IL-6, IL-1β and NO in LPS-induced RAW264.7 macrophages. However, only 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl markedly reduced production of VEGF, MMP-3, ICAM and VCAM in TNF-α-stimulated HUVECs. HCA obtained 4 clusters, containing 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, isoyuanhuacine, isoyuanhuadine and five other compounds. PCA showed that the ranking of diterpenes sorted by efficacy from highest to lowest was 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, yuanhuaphnin, isoyuanhuacine, yuanhuacine, yuanhuaoate C, yuanhuagine, isoyuanhuadine, yuanhuadine. In conclusion, eight diterpenes isolated from D. genkwa showed different levels of activity in LPS-induced RAW264.7 cells and TNF-α-stimulated HUVECs. The comprehensive evaluation of activity by HCA and PCA indicated that of the eight diterpenes, 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl was the best, and can be developed as a new drug for RA therapy.

  7. ClustOfVar: An R Package for the Clustering of Variables

    Directory of Open Access Journals (Sweden)

    Marie Chavent

    2012-09-01

    Full Text Available Clustering of variables is as a way to arrange variables into homogeneous clusters, i.e., groups of variables which are strongly related to each other and thus bring the same information. These approaches can then be useful for dimension reduction and variable selection. Several specific methods have been developed for the clustering of numerical variables. However concerning qualitative variables or mixtures of quantitative and qualitative variables, far fewer methods have been proposed. The R package ClustOfVar was specifically developed for this purpose. The homogeneity criterion of a cluster is defined as the sum of correlation ratios (for qualitative variables and squared correlations (for quantitative variables to a synthetic quantitative variable, summarizing ``as good as possible'' the variables in the cluster. This synthetic variable is the first principal component obtained with the PCAMIX method. Two clustering algorithms are proposed to optimize the homogeneity criterion: iterative relocation algorithm and ascendant hierarchical clustering. We also propose a bootstrap approach in order to determine suitable numbers of clusters. We illustrate the methodologies and the associated package on small datasets.

  8. Applying Clustering Methods in Drawing Maps of Science: Case Study of the Map For Urban Management Science

    Directory of Open Access Journals (Sweden)

    Mohammad Abuei Ardakan

    2010-04-01

    Full Text Available The present paper offers a basic introduction to data clustering and demonstrates the application of clustering methods in drawing maps of science. All approaches towards classification and clustering of information are briefly discussed. Their application to the process of visualization of conceptual information and drawing of science maps are illustrated by reviewing similar researches in this field. By implementing aggregated hierarchical clustering algorithm, which is an algorithm based on complete-link method, the map for urban management science as an emerging, interdisciplinary scientific field is analyzed and reviewed.

  9. Enhancement of Adaptive Cluster Hierarchical Routing Protocol using Distance and Energy for Wireless Sensor Networks

    International Nuclear Information System (INIS)

    Nawar, N.M.; Soliman, S.E.; Kelash, H.M.; Ayad, N.M.

    2014-01-01

    The application of wireless networking is widely used in nuclear applications. This includes reactor control and fire dedication system. This paper is devoted to the application of this concept in the intrusion system of the Radioisotope Production Facility (RPF) of the Egyptian Atomic Energy Authority. This includes the tracking, monitoring and control components of this system. The design and implementation of wireless sensor networks has become a hot area of research due to the extensive use of sensor networks to enable applications that connect the physical world to the virtual world [1-2]. The original LEACH is named a communication protocol (clustering-based); the extended LEACH’s stochastic cluster head selection algorithm by a deterministic component. Depending on the network configuration an increase of network lifetime can be accomplished [3]. The proposed routing mechanisms after enhancement divide the nodes into clusters. A cluster head performs its task which is considerably more energy-intensive than the rest of the nodes inside sensor network. So, nodes rotate tasks at different rounds between a cluster head and other sensors throughout the lifetime of the network to balance the energy dissipation [4-5].The performance improvement when using routing protocol after enhancement of the algorithm which takes into consideration the distance and the remaining energy for choosing the cluster head by obtains from the advertise message. Network Simulator (Ns2 simulator) is used to prove that LEACH after enhancement performs better than the original LEACH protocol in terms of Average Energy, Network Life Time, Delay, Throughput and Overhead.

  10. A bio-inspired N-doped porous carbon electrocatalyst with hierarchical superstructure for efficient oxygen reduction reaction

    Science.gov (United States)

    Miao, Yue-E.; Yan, Jiajie; Ouyang, Yue; Lu, Hengyi; Lai, Feili; Wu, Yue; Liu, Tianxi

    2018-06-01

    The bio-inspired hierarchical "grape cluster" superstructure provides an effective integration of one-dimensional carbon nanofibers (CNF) with isolated carbonaceous nanoparticles into three-dimensional (3D) conductive frameworks for efficient electron and mass transfer. Herein, a 3D N-doped porous carbon electrocatalyst consisting of carbon nanofibers with grape-like N-doped hollow carbon particles (CNF@NC) has been prepared through a simple electrospinning strategy combined with in-situ growth and carbonization processes. Such a bio-inspired hierarchically organized conductive network largely facilitates both the mass diffusion and electron transfer during the oxygen reduction reactions (ORR). Therefore, the metal-free CNF@NC catalyst demonstrates superior catalytic activity with an absolute four-electron transfer mechanism, strong methanol tolerance and good long-term stability towards ORR in alkaline media.

  11. Towards Clustering of Mobile and Smartwatch Accelerometer Data for Physical Activity Recognition

    Directory of Open Access Journals (Sweden)

    Chelsea Dobbins

    2018-06-01

    Full Text Available Mobile and wearable devices now have a greater capability of sensing human activity ubiquitously and unobtrusively through advancements in miniaturization and sensing abilities. However, outstanding issues remain around the energy restrictions of these devices when processing large sets of data. This paper presents our approach that uses feature selection to refine the clustering of accelerometer data to detect physical activity. This also has a positive effect on the computational burden that is associated with processing large sets of data, as energy efficiency and resource use is decreased because less data is processed by the clustering algorithms. Raw accelerometer data, obtained from smartphones and smartwatches, have been preprocessed to extract both time and frequency domain features. Principle component analysis feature selection (PCAFS and correlation feature selection (CFS have been used to remove redundant features. The reduced feature sets have then been evaluated against three widely used clustering algorithms, including hierarchical clustering analysis (HCA, k-means, and density-based spatial clustering of applications with noise (DBSCAN. Using the reduced feature sets resulted in improved separability, reduced uncertainty, and improved efficiency compared with the baseline, which utilized all features. Overall, the CFS approach in conjunction with HCA produced higher Dunn Index results of 9.7001 for the phone and 5.1438 for the watch features, which is an improvement over the baseline. The results of this comparative study of feature selection and clustering, with the specific algorithms used, has not been performed previously and provides an optimistic and usable approach to recognize activities using either a smartphone or smartwatch.

  12. Clustering of comb and propolis waxes based on the distribution of aliphatic constituents

    Directory of Open Access Journals (Sweden)

    Custodio Angela R.

    2003-01-01

    Full Text Available Chemical composition data for 41 samples of propolis waxes and 9 samples of comb waxes of Apis mellifera collected mainly in Brazil were treated using Principal Component Analysis (PCA and Hierarchical Cluster Analysis (HCA. For chemometrical analysis, the distribution of hydrocarbons and residues of alcohols and carboxylic acids of monoesters were considered. The clustering obtained revealed chemical affinities and differences not previously grasped by simple eye-inspection of the data. No consistent differences were detected between comb and propolis waxes. These and previous results suggest that hydrocarbons, carboxylic acids, aliphatic alcohols and esters from both comb and propolis waxes are bee-produced compounds and, hence, the differences detected between one and another region are dependent on genetic factors related to the insects rather than the local flora. The samples analyzed were split into two main clusters, one of them comprising exclusively material collected in the State of São Paulo. The results are discussed with respect to the africanization of honeybees that first took place in that State and therefrom irradiated to other parts of Brazil.

  13. Cluster Analysis of the Newcastle Electronic Corpus of Tyneside English: A Comparison of Methods

    NARCIS (Netherlands)

    Moisl, Hermann; Jones, Valerie M.

    2005-01-01

    This article examines the feasibility of an empirical approach to sociolinguistic analysis of the Newcastle Electronic Corpus of Tyneside English using exploratory multivariate methods. It addresses a known problem with one class of such methods, hierarchical cluster analysis—that different

  14. THE METALLICITY BIMODALITY OF GLOBULAR CLUSTER SYSTEMS: A TEST OF GALAXY ASSEMBLY AND OF THE EVOLUTION OF THE GALAXY MASS-METALLICITY RELATION

    International Nuclear Information System (INIS)

    Tonini, Chiara

    2013-01-01

    We build a theoretical model to study the origin of the globular cluster metallicity bimodality in the hierarchical galaxy assembly scenario. The model is based on empirical relations such as the galaxy mass-metallicity relation [O/H]-M star as a function of redshift, and on the observed galaxy stellar mass function up to redshift z ∼ 4. We make use of the theoretical merger rates as a function of mass and redshift from the Millennium simulation to build galaxy merger trees. We derive a new galaxy [Fe/H]-M star relation as a function of redshift, and by assuming that globular clusters share the metallicity of their original parent galaxy at the time of their formation, we populate the merger tree with globular clusters. We perform a series of Monte Carlo simulations of the galaxy hierarchical assembly, and study the properties of the final globular cluster population as a function of galaxy mass, assembly and star formation history, and under different assumptions for the evolution of the galaxy mass-metallicity relation. The main results and predictions of the model are the following. (1) The hierarchical clustering scenario naturally predicts a metallicity bimodality in the galaxy globular cluster population, where the metal-rich subpopulation is composed of globular clusters formed in the galaxy main progenitor around redshift z ∼ 2, and the metal-poor subpopulation is composed of clusters accreted from satellites, and formed at redshifts z ∼ 3-4. (2) The model reproduces the observed relations by Peng et al. for the metallicities of the metal-rich and metal-poor globular cluster subpopulations as a function of galaxy mass; the positions of the metal-poor and metal-rich peaks depend exclusively on the evolution of the galaxy mass-metallicity relation and the [O/Fe], both of which can be constrained by this method. In particular, we find that the galaxy [O/Fe] evolves linearly with redshift from a value of ∼0.5 at redshift z ∼ 4 to a value of ∼0.1 at

  15. How hierarchical is language use?

    Science.gov (United States)

    Frank, Stefan L.; Bod, Rens; Christiansen, Morten H.

    2012-01-01

    It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent neurophysiological, behavioural and computational studies show that sequential sentence structure has considerable explanatory power and that hierarchical processing is often not involved. In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions. If linguistic phenomena can be explained by sequential rather than hierarchical structure, this will have considerable impact in a wide range of fields, such as linguistics, ethology, cognitive neuroscience, psychology and computer science. PMID:22977157

  16. Road Network Selection Based on Road Hierarchical Structure Control

    Directory of Open Access Journals (Sweden)

    HE Haiwei

    2015-04-01

    Full Text Available A new road network selection method based on hierarchical structure is studied. Firstly, road network is built as strokes which are then classified into hierarchical collections according to the criteria of betweenness centrality value (BC value. Secondly, the hierarchical structure of the strokes is enhanced using structural characteristic identification technique. Thirdly, the importance calculation model was established according to the relationships among the hierarchical structure of the strokes. Finally, the importance values of strokes are got supported with the model's hierarchical calculation, and with which the road network is selected. Tests are done to verify the advantage of this method by comparing it with other common stroke-oriented methods using three kinds of typical road network data. Comparision of the results show that this method had few need to semantic data, and could eliminate the negative influence of edge strokes caused by the criteria of BC value well. So, it is better to maintain the global hierarchical structure of road network, and suitable to meet with the selection of various kinds of road network at the same time.

  17. Hierarchical model generation for architecture reconstruction using laser-scanned point clouds

    Science.gov (United States)

    Ning, Xiaojuan; Wang, Yinghui; Zhang, Xiaopeng

    2014-06-01

    Architecture reconstruction using terrestrial laser scanner is a prevalent and challenging research topic. We introduce an automatic, hierarchical architecture generation framework to produce full geometry of architecture based on a novel combination of facade structures detection, detailed windows propagation, and hierarchical model consolidation. Our method highlights the generation of geometric models automatically fitting the design information of the architecture from sparse, incomplete, and noisy point clouds. First, the planar regions detected in raw point clouds are interpreted as three-dimensional clusters. Then, the boundary of each region extracted by projecting the points into its corresponding two-dimensional plane is classified to obtain detailed shape structure elements (e.g., windows and doors). Finally, a polyhedron model is generated by calculating the proposed local structure model, consolidated structure model, and detailed window model. Experiments on modeling the scanned real-life buildings demonstrate the advantages of our method, in which the reconstructed models not only correspond to the information of architectural design accurately, but also satisfy the requirements for visualization and analysis.

  18. Cluster Analysis as an Analytical Tool of Population Policy

    Directory of Open Access Journals (Sweden)

    Oksana Mikhaylovna Shubat

    2017-12-01

    Full Text Available The predicted negative trends in Russian demography (falling birth rates, population decline actualize the need to strengthen measures of family and population policy. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance for segmentation of Russian regions. Clustering is based on four variables, which allowed assessing the family institution in the region. The authors used the data of Federal State Statistics Service from 2010 to 2015. Clustering and profiling of each segment has allowed forming a model of Russian regions depending on the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures in each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution in the regions. This suggests that a unified approach to population problems’ solving is far from being effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could develop the adaptation of multidimensional classification methods to the analysis of the population problems in Russian regions. In particular, the algorithms of nonparametric cluster analysis may be of relevance in future studies.

  19. Hierarchical Context Modeling for Video Event Recognition.

    Science.gov (United States)

    Wang, Xiaoyang; Ji, Qiang

    2016-10-11

    Current video event recognition research remains largely target-centered. For real-world surveillance videos, targetcentered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.

  20. Hierarchical analysis of dependency in metabolic networks.

    Science.gov (United States)

    Gagneur, Julien; Jackson, David B; Casari, Georg

    2003-05-22

    Elucidation of metabolic networks for an increasing number of organisms reveals that even small networks can contain thousands of reactions and chemical species. The intimate connectivity between components complicates their decomposition into biologically meaningful sub-networks. Moreover, traditional higher-order representations of metabolic networks as metabolic pathways, suffers from the lack of rigorous definition, yielding pathways of disparate content and size. We introduce a hierarchical representation that emphasizes the gross organization of metabolic networks in largely independent pathways and sub-systems at several levels of independence. The approach highlights the coupling of different pathways and the shared compounds responsible for those couplings. By assessing our results on Escherichia coli (E.coli metabolic reactions, Genetic Circuits Research Group, University of California, San Diego, http://gcrg.ucsd.edu/organisms/ecoli.html, 'model v 1.01. reactions') against accepted biochemical annotations, we provide the first systematic synopsis of an organism's metabolism. Comparison with operons of E.coli shows that low-level clusters are reflected in genome organization and gene regulation. Source code, data sets and supplementary information are available at http://www.mas.ecp.fr/labo/equipe/gagneur/hierarchy/hierarchy.html

  1. Using cluster analysis and a classification and regression tree model to developed cover types in the Sky Islands of southeastern Arizona [Abstract

    Science.gov (United States)

    Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey

    2005-01-01

    The objective of this study was to develop a rule based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such system we qualitatively and quantitatively compared a hierarchical (Ward’s) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups and plots...

  2. Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Gulnaz Ahmed

    2017-02-01

    Full Text Available The longer network lifetime of Wireless Sensor Networks (WSNs is a goal which is directly related to energy consumption. This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. The hierarchal clustering architecture is the best choice for these kind of issues. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs selection for WSNs. In our proposed model, we introduce a simple strategy for the optimal number of cluster heads selection to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulation using five quality measures, namely: the lifetime of the network, stable and unstable region in the lifetime of the network, throughput of the network, the number of cluster heads in the network, and the transmission time of the network to analyze the proposed model. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED clustering, Artificial Bee Colony (ABC, Zone Based Routing (ZBR, and Centralized Energy Efficient Clustering (CEEC using the above-discussed quality metrics and found that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps greater than SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that the MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and the network throughput.

  3. Edge Principal Components and Squash Clustering: Using the Special Structure of Phylogenetic Placement Data for Sample Comparison

    Science.gov (United States)

    Matsen IV, Frederick A.; Evans, Steven N.

    2013-01-01

    Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415

  4. AN UPDATED CATALOG OF M33 CLUSTERS AND CANDIDATES: UBVRI PHOTOMETRY AND SOME STATISTICAL RESULTS

    International Nuclear Information System (INIS)

    Ma Jun

    2012-01-01

    We present UBVRI photometry for 392 star clusters and candidates in the field of M33, which are selected from the most recent star cluster catalog. In this catalog, the authors listed star clusters' parameters such as cluster positions, magnitudes, colors in the UBVRIJHK s filters, and so on. However, a large fraction of objects in this catalog do not have previously published photometry. Photometry is performed using archival images from the Local Group Galaxies Survey, which covers 0.8 deg 2 along the major axis of M33. Detailed comparisons show that, in general, our photometry is consistent with previous measurements. Positions (right ascension and declination) for some clusters are corrected here. Combined with previous literature, ours constitute a large sample of M33 star clusters. Based on this cluster sample, we present some statistical results: none of the youngest M33 clusters (∼10 7 yr) have masses approaching 10 5 M ☉ ; roughly half the star clusters are consistent with the 10 4 -10 5 M ☉ mass models; the continuous distribution of star clusters along the model line indicates that M33 star clusters have been formed continuously from the epoch of the first star cluster formation until recent times; and there are ∼50 star clusters which are overlapped with the Galactic globular clusters on the color-color diagram, and these clusters are old globular cluster candidates in M33.

  5. TESTING STRICT HYDROSTATIC EQUILIBRIUM IN SIMULATED CLUSTERS OF GALAXIES: IMPLICATIONS FOR A1689

    International Nuclear Information System (INIS)

    Molnar, S. M.; Umetsu, K.; Chiu, I.-N.; Chen, P.; Hearn, N.; Broadhurst, T.; Bryan, G.; Shang, C.

    2010-01-01

    Accurate mass determination of clusters of galaxies is crucial if they are to be used as cosmological probes. However, there are some discrepancies between cluster masses determined based on gravitational lensing and X-ray observations assuming strict hydrostatic equilibrium (i.e., the equilibrium gas pressure is provided entirely by thermal pressure). Cosmological simulations suggest that turbulent gas motions remaining from hierarchical structure formation may provide a significant contribution to the equilibrium pressure in clusters. We analyze a sample of massive clusters of galaxies drawn from high-resolution cosmological simulations and find a significant contribution (20%-45%) from non-thermal pressure near the center of relaxed clusters, and, in accord with previous studies, a minimum contribution at about 0.1 R vir , growing to about 30%-45% at the virial radius, R vir . Our results strongly suggest that relaxed clusters should have significant non-thermal support in their core region. As an example, we test the validity of strict hydrostatic equilibrium in the well-studied massive galaxy cluster A1689 using the latest high-resolution gravitational lensing and X-ray observations. We find a contribution of about 40% from non-thermal pressure within the core region of A1689, suggesting an alternate explanation for the mass discrepancy: the strict hydrostatic equilibrium is not valid in this region.

  6. Exploring hierarchical and overlapping modular structure in the yeast protein interaction network

    Directory of Open Access Journals (Sweden)

    Zhao Yi

    2010-12-01

    Full Text Available Abstract Background Developing effective strategies to reveal modular structures in protein interaction networks is crucial for better understanding of molecular mechanisms of underlying biological processes. In this paper, we propose a new density-based algorithm (ADHOC for clustering vertices of a protein interaction network using a novel subgraph density measurement. Results By statistically evaluating several independent criteria, we found that ADHOC could significantly improve the outcome as compared with five previously reported density-dependent methods. We further applied ADHOC to investigate the hierarchical and overlapping modular structure in the yeast PPI network. Our method could effectively detect both protein modules and the overlaps between them, and thus greatly promote the precise prediction of protein functions. Moreover, by further assaying the intermodule layer of the yeast PPI network, we classified hubs into two types, module hubs and inter-module hubs. Each type presents distinct characteristics both in network topology and biological functions, which could conduce to the better understanding of relationship between network architecture and biological implications. Conclusions Our proposed algorithm based on the novel subgraph density measurement makes it possible to more precisely detect hierarchical and overlapping modular structures in protein interaction networks. In addition, our method also shows a strong robustness against the noise in network, which is quite critical for analyzing such a high noise network.

  7. Hierarchical capillary adhesion of microcantilevers or hairs

    International Nuclear Information System (INIS)

    Liu Jianlin; Feng Xiqiao; Xia Re; Zhao Hongping

    2007-01-01

    As a result of capillary forces, animal hairs, carbon nanotubes or nanowires of a periodically or randomly distributed array often assemble into hierarchical structures. In this paper, the energy method is adopted to analyse the capillary adhesion of microsized hairs, which are modelled as clamped microcantilevers wetted by liquids. The critical conditions for capillary adhesion of two hairs, three hairs or two bundles of hairs are derived in terms of Young's contact angle, elastic modulus and geometric sizes of the beams. Then, the hierarchical capillary adhesion of hairs is addressed. It is found that for multiple hairs or microcantilevers, the system tends to take a hierarchical structure as a result of the minimization of the total potential energy of the system. The level number of structural hierarchy increases with the increase in the number of hairs if they are sufficiently long. Additionally, we performed experiments to verify our theoretical solutions for the adhesion of microbeams

  8. A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure.

    Science.gov (United States)

    Balzer, Laura B; Zheng, Wenjing; van der Laan, Mark J; Petersen, Maya L

    2018-01-01

    We often seek to estimate the impact of an exposure naturally occurring or randomly assigned at the cluster-level. For example, the literature on neighborhood determinants of health continues to grow. Likewise, community randomized trials are applied to learn about real-world implementation, sustainability, and population effects of interventions with proven individual-level efficacy. In these settings, individual-level outcomes are correlated due to shared cluster-level factors, including the exposure, as well as social or biological interactions between individuals. To flexibly and efficiently estimate the effect of a cluster-level exposure, we present two targeted maximum likelihood estimators (TMLEs). The first TMLE is developed under a non-parametric causal model, which allows for arbitrary interactions between individuals within a cluster. These interactions include direct transmission of the outcome (i.e. contagion) and influence of one individual's covariates on another's outcome (i.e. covariate interference). The second TMLE is developed under a causal sub-model assuming the cluster-level and individual-specific covariates are sufficient to control for confounding. Simulations compare the alternative estimators and illustrate the potential gains from pairing individual-level risk factors and outcomes during estimation, while avoiding unwarranted assumptions. Our results suggest that estimation under the sub-model can result in bias and misleading inference in an observational setting. Incorporating working assumptions during estimation is more robust than assuming they hold in the underlying causal model. We illustrate our approach with an application to HIV prevention and treatment.

  9. A Comprehensive Survey on Hierarchical-Based Routing Protocols for Mobile Wireless Sensor Networks: Review, Taxonomy, and Future Directions

    Directory of Open Access Journals (Sweden)

    Nabil Sabor

    2017-01-01

    Full Text Available Introducing mobility to Wireless Sensor Networks (WSNs puts new challenges particularly in designing of routing protocols. Mobility can be applied to the sensor nodes and/or the sink node in the network. Many routing protocols have been developed to support the mobility of WSNs. These protocols are divided depending on the routing structure into hierarchical-based, flat-based, and location-based routing protocols. However, the hierarchical-based routing protocols outperform the other routing types in saving energy, scalability, and extending lifetime of Mobile WSNs (MWSNs. Selecting an appropriate hierarchical routing protocol for specific applications is an important and difficult task. Therefore, this paper focuses on reviewing some of the recently hierarchical-based routing protocols that are developed in the last five years for MWSNs. This survey divides the hierarchical-based routing protocols into two broad groups, namely, classical-based and optimized-based routing protocols. Also, we present a detailed classification of the reviewed protocols according to the routing approach, control manner, mobile element, mobility pattern, network architecture, clustering attributes, protocol operation, path establishment, communication paradigm, energy model, protocol objectives, and applications. Moreover, a comparison between the reviewed protocols is investigated in this survey depending on delay, network size, energy-efficiency, and scalability while mentioning the advantages and drawbacks of each protocol. Finally, we summarize and conclude the paper with future directions.

  10. Cluster Analysis of the Newcastle Electronic Corpus of Tyneside English: In A Comparison of Methods

    NARCIS (Netherlands)

    Moisl, Hermann; Jones, Valerie M.

    2005-01-01

    This article examines the feasibility of an empirical approach to sociolinguistic analysis of the Newcastle Electronic Corpus of Tyneside English using exploratory multivariate methods. It addresses a known problem with one class of such methods, hierarchical cluster analysis—that different

  11. CLOSE STELLAR ENCOUNTERS IN YOUNG, SUBSTRUCTURED, DISSOLVING STAR CLUSTERS: STATISTICS AND EFFECTS ON PLANETARY SYSTEMS

    International Nuclear Information System (INIS)

    Craig, Jonathan; Krumholz, Mark R.

    2013-01-01

    Both simulations and observations indicate that stars form in filamentary, hierarchically clustered associations, most of which disperse into their galactic field once feedback destroys their parent clouds. However, during their early evolution in these substructured environments, stars can undergo close encounters with one another that might have significant impacts on their protoplanetary disks or young planetary systems. We perform N-body simulations of the early evolution of dissolving, substructured clusters with a wide range of properties, with the aim of quantifying the expected number and orbital element distributions of encounters as a function of cluster properties. We show that the presence of substructure both boosts the encounter rate and modifies the distribution of encounter velocities compared to what would be expected for a dynamically relaxed cluster. However, the boost only lasts for a dynamical time, and as a result the overall number of encounters expected remains low enough that gravitational stripping is unlikely to be a significant effect for the vast majority of star-forming environments in the Galaxy. We briefly discuss the implications of this result for models of the origin of the solar system, and of free-floating planets. We also provide tabulated encounter rates and orbital element distributions suitable for inclusion in population synthesis models of planet formation in a clustered environment.

  12. CLOSE STELLAR ENCOUNTERS IN YOUNG, SUBSTRUCTURED, DISSOLVING STAR CLUSTERS: STATISTICS AND EFFECTS ON PLANETARY SYSTEMS

    Energy Technology Data Exchange (ETDEWEB)

    Craig, Jonathan; Krumholz, Mark R., E-mail: krumholz@ucolick.org [Department of Astronomy and Astrophysics, University of California, Santa Cruz, CA 95064 (United States)

    2013-06-01

    Both simulations and observations indicate that stars form in filamentary, hierarchically clustered associations, most of which disperse into their galactic field once feedback destroys their parent clouds. However, during their early evolution in these substructured environments, stars can undergo close encounters with one another that might have significant impacts on their protoplanetary disks or young planetary systems. We perform N-body simulations of the early evolution of dissolving, substructured clusters with a wide range of properties, with the aim of quantifying the expected number and orbital element distributions of encounters as a function of cluster properties. We show that the presence of substructure both boosts the encounter rate and modifies the distribution of encounter velocities compared to what would be expected for a dynamically relaxed cluster. However, the boost only lasts for a dynamical time, and as a result the overall number of encounters expected remains low enough that gravitational stripping is unlikely to be a significant effect for the vast majority of star-forming environments in the Galaxy. We briefly discuss the implications of this result for models of the origin of the solar system, and of free-floating planets. We also provide tabulated encounter rates and orbital element distributions suitable for inclusion in population synthesis models of planet formation in a clustered environment.

  13. Close Stellar Encounters in Young, Substructured, Dissolving Star Clusters: Statistics and Effects on Planetary Systems

    Science.gov (United States)

    Craig, Jonathan; Krumholz, Mark R.

    2013-06-01

    Both simulations and observations indicate that stars form in filamentary, hierarchically clustered associations, most of which disperse into their galactic field once feedback destroys their parent clouds. However, during their early evolution in these substructured environments, stars can undergo close encounters with one another that might have significant impacts on their protoplanetary disks or young planetary systems. We perform N-body simulations of the early evolution of dissolving, substructured clusters with a wide range of properties, with the aim of quantifying the expected number and orbital element distributions of encounters as a function of cluster properties. We show that the presence of substructure both boosts the encounter rate and modifies the distribution of encounter velocities compared to what would be expected for a dynamically relaxed cluster. However, the boost only lasts for a dynamical time, and as a result the overall number of encounters expected remains low enough that gravitational stripping is unlikely to be a significant effect for the vast majority of star-forming environments in the Galaxy. We briefly discuss the implications of this result for models of the origin of the solar system, and of free-floating planets. We also provide tabulated encounter rates and orbital element distributions suitable for inclusion in population synthesis models of planet formation in a clustered environment.

  14. HLM in Cluster-Randomised Trials--Measuring Efficacy across Diverse Populations of Learners

    Science.gov (United States)

    Hegedus, Stephen; Tapper, John; Dalton, Sara; Sloane, Finbarr

    2013-01-01

    We describe the application of Hierarchical Linear Modelling (HLM) in a cluster-randomised study to examine learning algebraic concepts and procedures in an innovative, technology-rich environment in the US. HLM is applied to measure the impact of such treatment on learning and on contextual variables. We provide a detailed description of such…

  15. Sleep, Dietary, and Exercise Behavioral Clusters Among Truck Drivers With Obesity: Implications for Interventions.

    Science.gov (United States)

    Olson, Ryan; Thompson, Sharon V; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L; Anger, W Kent; Bodner, Todd; Hammer, Leslie B; Hohn, Elliot; Perrin, Nancy A

    2016-03-01

    The objectives of the study were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health safety, and psychosocial factors. Participants' (n = 452, body mass index M = 37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior covariation. Cluster differences were tested with generalized estimating equations. Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in body mass index. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures.

  16. Performance Analysis of Entropy Methods on K Means in Clustering Process

    Science.gov (United States)

    Dicky Syahputra Lubis, Mhd.; Mawengkang, Herman; Suwilo, Saib

    2017-12-01

    K Means is a non-hierarchical data clustering method that attempts to partition existing data into one or more clusters / groups. This method partitions the data into clusters / groups so that data that have the same characteristics are grouped into the same cluster and data that have different characteristics are grouped into other groups.The purpose of this data clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize the variation between clusters. However, the main disadvantage of this method is that the number k is often not known before. Furthermore, a randomly chosen starting point may cause two points to approach the distance to be determined as two centroids. Therefore, for the determination of the starting point in K Means used entropy method where this method is a method that can be used to determine a weight and take a decision from a set of alternatives. Entropy is able to investigate the harmony in discrimination among a multitude of data sets. Using Entropy criteria with the highest value variations will get the highest weight. Given this entropy method can help K Means work process in determining the starting point which is usually determined at random. Thus the process of clustering on K Means can be more quickly known by helping the entropy method where the iteration process is faster than the K Means Standard process. Where the postoperative patient dataset of the UCI Repository Machine Learning used and using only 12 data as an example of its calculations is obtained by entropy method only with 2 times iteration can get the desired end result.

  17. MOTIVATIONAL CLUSTER PROFILES OF ADOLESCENT ATHLETES: AN EXAMINATION OF DIFFERENCES IN PHYSICAL-SELF PERCEPTION

    Directory of Open Access Journals (Sweden)

    Emine Çağlar

    2010-06-01

    Full Text Available The primary purpose of the present study was to identify motivational profiles of adolescent athletes using cluster analysis in non-Western culture. A second purpose was to examine relationships between physical self-perception differences of adolescent athletes and motivational profiles. One hundred and thirty six male (Mage = 17.46, SD = 1.25 years and 80 female adolescent athletes (Mage = 17.61, SD = 1.19 years from a variety of team sports including basketball, soccer, volleyball, and handball volunteered to participate in this study. The Sport Motivation Scale (SMS and Physical Self-Perception Profile (PSPP were administered to all participants. Hierarchical cluster analysis revealed a four-cluster solution for this sample: amotivated, low motivated, moderate motivated, and highly motivated. A 4 x 5 (Cluster x PSPP Subscales MANOVA revealed no significant main effect of motivational clusters on physical self-perception levels (p > 0.05. As a result, findings of the present study showed that motivational types of the adolescent athletes constituted four different motivational clusters. Highly and moderate motivated athletes consistently scored higher than amotivated athletes on the perceived sport competence, physical condition, and physical self-worth subscales of PSPP. This study identified motivational profiles of competitive youth-sport participants

  18. SATELLITE DWARF GALAXIES IN A HIERARCHICAL UNIVERSE: THE PREVALENCE OF DWARF-DWARF MAJOR MERGERS

    OpenAIRE

    Deason, A; Wetzel, A; Garrison-Kimmel, S

    2014-01-01

    Mergers are a common phenomenon in hierarchical structure formation, especially for massive galaxies and clusters, but their importance for dwarf galaxies in the Local Group remains poorly understood. We investigate the frequency of major mergers between dwarf galaxies in the Local Group using the ELVIS suite of cosmological zoom-in dissipationless simulations of Milky Way- and M31-like host halos. We find that ~10% of satellite dwarf galaxies with M_star > 10^6 M_sun that are within the host...

  19. Morphological quantification of hierarchical geomaterials by X-ray nano-CT bridges the gap from nano to micro length scales

    KAUST Repository

    Brisard, S.

    2012-01-30

    Morphological quantification of the complex structure of hierarchical geomaterials is of great relevance for Earth science and environmental engineering, among others. To date, methods that quantify the 3D morphology on length scales ranging from a few tens of nanometers to several hun-dred nanometers have had limited success. We demonstrate, for the first time, that it is possible to go beyond visualization and to extract quantitative morphological information from X-ray images in the aforementioned length scales. As examples, two different hierarchical geomaterials exhibiting complex porous structures ranging from nanometer to macroscopic scale are studied: a flocculated clay water suspension and two hydrated cement pastes. We show that from a single projection image it is possible to perform a direct computation of the ultra-small angle-scattering spectra. The predictions matched very well the experimental data obtained by the best ultra-small angle-scattering experimental setups as observed for the cement paste. In this context, we demonstrate that the structure of flocculated clay suspension exhibit two well-distinct regimes of aggregation, a dense mass fractal aggregation at short distance and a more open structure at large distance, which can be generated by a 3D reaction limited cluster-cluster aggregation process. For the first time, a high-resolution 3D image of fibrillar cement paste cluster was obtained from limited angle nanotomography.

  20. EVOLUTION AND DISTRIBUTION OF MAGNETIC FIELDS FROM ACTIVE GALACTIC NUCLEI IN GALAXY CLUSTERS. II. THE EFFECTS OF CLUSTER SIZE AND DYNAMICAL STATE

    International Nuclear Information System (INIS)

    Xu Hao; Li Hui; Collins, David C.; Li, Shengtai; Norman, Michael L.

    2011-01-01

    Theory and simulations suggest that magnetic fields from radio jets and lobes powered by their central super massive black holes can be an important source of magnetic fields in the galaxy clusters. This is Paper II in a series of studies where we present self-consistent high-resolution adaptive mesh refinement cosmological magnetohydrodynamic simulations that simultaneously follow the formation of a galaxy cluster and evolution of magnetic fields ejected by an active galactic nucleus. We studied 12 different galaxy clusters with virial masses ranging from 1 x 10 14 to 2 x 10 15 M sun . In this work, we examine the effects of the mass and merger history on the final magnetic properties. We find that the evolution of magnetic fields is qualitatively similar to those of previous studies. In most clusters, the injected magnetic fields can be transported throughout the cluster and be further amplified by the intracluster medium (ICM) turbulence during the cluster formation process with hierarchical mergers, while the amplification history and the magnetic field distribution depend on the cluster formation and magnetism history. This can be very different for different clusters. The total magnetic energies in these clusters are between 4 x 10 57 and 10 61 erg, which is mainly decided by the cluster mass, scaling approximately with the square of the total mass. Dynamically older relaxed clusters usually have more magnetic fields in their ICM. The dynamically very young clusters may be magnetized weakly since there is not enough time for magnetic fields to be amplified.

  1. Stability of glassy hierarchical networks

    Science.gov (United States)

    Zamani, M.; Camargo-Forero, L.; Vicsek, T.

    2018-02-01

    The structure of interactions in most animal and human societies can be best represented by complex hierarchical networks. In order to maintain close-to-optimal function both stability and adaptability are necessary. Here we investigate the stability of hierarchical networks that emerge from the simulations of an organization type with an efficiency function reminiscent of the Hamiltonian of spin glasses. Using this quantitative approach we find a number of expected (from everyday observations) and highly non-trivial results for the obtained locally optimal networks, including, for example: (i) stability increases with growing efficiency and level of hierarchy; (ii) the same perturbation results in a larger change for more efficient states; (iii) networks with a lower level of hierarchy become more efficient after perturbation; (iv) due to the huge number of possible optimal states only a small fraction of them exhibit resilience and, finally, (v) ‘attacks’ targeting the nodes selectively (regarding their position in the hierarchy) can result in paradoxical outcomes.

  2. Making Statistical Data More Easily Accessible on the Web Results of the StatSearch Case Study

    CERN Document Server

    Rajman, M; Boynton, I M; Fridlund, B; Fyhrlund, A; Sundgren, B; Lundquist, P; Thelander, H; Wänerskär, M

    2005-01-01

    In this paper we present the results of the StatSearch case study that aimed at providing an enhanced access to statistical data available on the Web. In the scope of this case study we developed a prototype of an information access tool combining a query-based search engine with semi-automated navigation techniques exploiting the hierarchical structuring of the available data. This tool enables a better control of the information retrieval, improving the quality and ease of the access to statistical information. The central part of the presented StatSearch tool consists in the design of an algorithm for automated navigation through a tree-like hierarchical document structure. The algorithm relies on the computation of query related relevance score distributions over the available database to identify the most relevant clusters in the data structure. These most relevant clusters are then proposed to the user for navigation, or, alternatively, are the support for the automated navigation process. Several appro...

  3. On Hierarchical Extensions of Large-Scale 4-regular Grid Network Structures

    DEFF Research Database (Denmark)

    Pedersen, Jens Myrup; Patel, A.; Knudsen, Thomas Phillip

    2004-01-01

    dependencies between the number of nodes and the distances in the structures. The perfect square mesh is introduced for hierarchies, and it is shown that applying ordered hierarchies in this way results in logarithmic dependencies between the number of nodes and the distances, resulting in better scaling...... structures. For example, in a mesh of 391876 nodes the average distance is reduced from 417.33 to 17.32 by adding hierarchical lines. This is gained by increasing the number of lines by 4.20% compared to the non-hierarchical structure. A similar hierarchical extension of the torus structure also results...

  4. Clustering of commercial fish sauce products based on an e-panel technique

    Directory of Open Access Journals (Sweden)

    Mitsutoshi Nakano

    2018-02-01

    Full Text Available Fish sauce is a brownish liquid seasoning with a characteristic flavor that is produced in Asian countries and limited areas of Europe. The types of fish and shellfish and fermentation process used in its production depend on the region from which it derives. Variations in ingredients and fermentation procedures yield end products with different smells, tastes, and colors. For this data article, we employed an electronic panel (e-panel technique including an electronic nose (e-nose, electronic tongue (e-tongue, and electronic eye (e-eye, in which smell, taste, and color are evaluated by sensors instead of the human nose, tongue, and eye to avoid subjective error. The presented data comprise clustering of 46 commercially available fish sauce products based separate e-nose, e-tongue, and e-eye test results. Sensory intensity data from the e-nose, e-tongue, and e-eye were separately classified by cluster analysis and are shown in dendrograms. The hierarchical cluster analysis indicates major three groups on e-nose and e-tongue data, and major four groups on e-eye data.

  5. Developing a Clustering-Based Empirical Bayes Analysis Method for Hotspot Identification

    Directory of Open Access Journals (Sweden)

    Yajie Zou

    2017-01-01

    Full Text Available Hotspot identification (HSID is a critical part of network-wide safety evaluations. Typical methods for ranking sites are often rooted in using the Empirical Bayes (EB method to estimate safety from both observed crash records and predicted crash frequency based on similar sites. The performance of the EB method is highly related to the selection of a reference group of sites (i.e., roadway segments or intersections similar to the target site from which safety performance functions (SPF used to predict crash frequency will be developed. As crash data often contain underlying heterogeneity that, in essence, can make them appear to be generated from distinct subpopulations, methods are needed to select similar sites in a principled manner. To overcome this possible heterogeneity problem, EB-based HSID methods that use common clustering methodologies (e.g., mixture models, K-means, and hierarchical clustering to select “similar” sites for building SPFs are developed. Performance of the clustering-based EB methods is then compared using real crash data. Here, HSID results, when computed on Texas undivided rural highway cash data, suggest that all three clustering-based EB analysis methods are preferred over the conventional statistical methods. Thus, properly classifying the road segments for heterogeneous crash data can further improve HSID accuracy.

  6. Hierarchical decision making for flood risk reduction

    DEFF Research Database (Denmark)

    Custer, Rocco; Nishijima, Kazuyoshi

    2013-01-01

    . In current practice, structures are often optimized individually without considering benefits of having a hierarchy of protection structures. It is here argued, that the joint consideration of hierarchically integrated protection structures is beneficial. A hierarchical decision model is utilized to analyze...... and compare the benefit of large upstream protection structures and local downstream protection structures in regard to epistemic uncertainty parameters. Results suggest that epistemic uncertainty influences the outcome of the decision model and that, depending on the magnitude of epistemic uncertainty...

  7. The Structure of the Young Star Cluster NGC 6231. II. Structure, Formation, and Fate

    Science.gov (United States)

    Kuhn, Michael A.; Getman, Konstantin V.; Feigelson, Eric D.; Sills, Alison; Gromadzki, Mariusz; Medina, Nicolás; Borissova, Jordanka; Kurtev, Radostin

    2017-12-01

    The young cluster NGC 6231 (stellar ages ˜2-7 Myr) is observed shortly after star formation activity has ceased. Using the catalog of 2148 probable cluster members obtained from Chandra, VVV, and optical surveys (Paper I), we examine the cluster’s spatial structure and dynamical state. The spatial distribution of stars is remarkably well fit by an isothermal sphere with moderate elongation, while other commonly used models like Plummer spheres, multivariate normal distributions, or power-law models are poor fits. The cluster has a core radius of 1.2 ± 0.1 pc and a central density of ˜200 stars pc-3. The distribution of stars is mildly mass segregated. However, there is no radial stratification of the stars by age. Although most of the stars belong to a single cluster, a small subcluster of stars is found superimposed on the main cluster, and there are clumpy non-isotropic distributions of stars outside ˜4 core radii. When the size, mass, and age of NGC 6231 are compared to other young star clusters and subclusters in nearby active star-forming regions, it lies at the high-mass end of the distribution but along the same trend line. This could result from similar formation processes, possibly hierarchical cluster assembly. We argue that NGC 6231 has expanded from its initial size but that it remains gravitationally bound.

  8. Hierarchical classification with a competitive evolutionary neural tree.

    Science.gov (United States)

    Adams, R G.; Butchart, K; Davey, N

    1999-04-01

    A new, dynamic, tree structured network, the Competitive Evolutionary Neural Tree (CENT) is introduced. The network is able to provide a hierarchical classification of unlabelled data sets. The main advantage that the CENT offers over other hierarchical competitive networks is its ability to self determine the number, and structure, of the competitive nodes in the network, without the need for externally set parameters. The network produces stable classificatory structures by halting its growth using locally calculated heuristics. The results of network simulations are presented over a range of data sets, including Anderson's IRIS data set. The CENT network demonstrates its ability to produce a representative hierarchical structure to classify a broad range of data sets.

  9. Deployment Strategy for Car-Sharing Depots by Clustering Urban Traffic Big Data Based on Affinity Propagation

    Directory of Open Access Journals (Sweden)

    Zhihan Liu

    2018-01-01

    Full Text Available Car sharing is a type of car rental service, by which consumers rent cars for short periods of time, often charged by hours. The analysis of urban traffic big data is full of importance and significance to determine locations of depots for car-sharing system. Taxi OD (Origin-Destination is a typical dataset of urban traffic. The volume of the data is extremely large so that traditional data processing applications do not work well. In this paper, an optimization method to determine the depot locations by clustering taxi OD points with AP (Affinity Propagation clustering algorithm has been presented. By analyzing the characteristics of AP clustering algorithm, AP clustering has been optimized hierarchically based on administrative region segmentation. Considering sparse similarity matrix of taxi OD points, the input parameters of AP clustering have been adapted. In the case study, we choose the OD pairs information from Beijing’s taxi GPS trajectory data. The number and locations of depots are determined by clustering the OD points based on the optimization AP clustering. We describe experimental results of our approach and compare it with standard K-means method using quantitative and stationarity index. Experiments on the real datasets show that the proposed method for determining car-sharing depots has a superior performance.

  10. Band structures of two dimensional solid/air hierarchical phononic crystals

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Y.L.; Tian, X.G. [State Key Laboratory for Mechanical Structure Strength and Vibration, Xi' an Jiaotong University, Xi' an 710049 (China); Chen, C.Q., E-mail: chencq@tsinghua.edu.cn [Department of Engineering Mechanics, AML and CNMM, Tsinghua University, Beijing 100084 (China)

    2012-06-15

    The hierarchical phononic crystals to be considered show a two-order 'hierarchical' feature, which consists of square array arranged macroscopic periodic unit cells with each unit cell itself including four sub-units. Propagation of acoustic wave in such two dimensional solid/air phononic crystals is investigated by the finite element method (FEM) with the Bloch theory. Their band structure, wave filtering property, and the physical mechanism responsible for the broadened band gap are explored. The corresponding ordinary phononic crystal without hierarchical feature is used for comparison. Obtained results show that the solid/air hierarchical phononic crystals possess tunable outstanding band gap features, which are favorable for applications such as sound insulation and vibration attenuation.

  11. Measuring age differences among globular clusters having similar metallicities - A new method and first results

    International Nuclear Information System (INIS)

    Vandenberg, D.A.; Bolte, M.; Stetson, P.B.

    1990-01-01

    A color-difference technique for estimating the relative ages of globular clusters with similar chemical compositions on the basis of their CM diagrams is described and demonstrated. The theoretical basis and implementation of the procedure are explained, and results for groups of globular clusters with m/H = about -2, -1.6, and -1.3, and for two special cases (Palomar 12 and NGC 5139) are presented in extensive tables and graphs and discussed in detail. It is found that the more metal-deficient globular clusters are nearly coeval (differences less than 0.5 Gyr), whereas the most metal-rich globular clusters exhibit significant age differences (about 2 Gyr). This result is shown to contradict Galactic evolution models postulating halo collapse in less than a few times 100 Myr. 77 refs

  12. Hierarchical cellular designs for load-bearing biocomposite beams and plates

    International Nuclear Information System (INIS)

    Burgueno, Rigoberto; Quagliata, Mario J.; Mohanty, Amar K.; Mehta, Geeta; Drzal, Lawrence T.; Misra, Manjusri

    2005-01-01

    Scrutiny into the composition of natural, or biological materials convincingly reveals that high material and structural efficiency can be attained, even with moderate-quality constituents, by hierarchical topologies, i.e., successively organized material levels or layers. The present study demonstrates that biologically inspired hierarchical designs can help improve the moderate properties of natural fiber polymer composites or biocomposites and allow them to compete with conventional materials for load-bearing applications. An overview of the mechanics concepts that allow hierarchical designs to achieve higher performance is presented, followed by observation and results from flexural tests on periodic and hierarchical cellular beams and plates made from industrial hemp fibers and unsaturated polyester resin biocomposites. The experimental data is shown to agree well with performance indices predicted by mechanics models. A procedure for the multi-scale integrated material/structural analysis of hierarchical cellular biocomposite components is presented and its advantages and limitations are discussed

  13. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  14. WESTERN CHARPATHIAN RURAL MOUNTAIN TOURISM MAPPING THROUGH CLUSTER METHODOLOGY

    Directory of Open Access Journals (Sweden)

    Elena TOMA

    2013-10-01

    Full Text Available Rural tourism from Western Carpathian Mountain was characterized in the last years by a low occupancy rate and a decline in tourist arrivals, due, beside of the direct effects of economic crises, to the remote location of mountain villages and to the low quality of infrastructure. For this reason we consider that the implementation of complex and integrated products based on tour thematic circuits represents a real opportunity to develop local rural tourism industry. The aim of this paper is to identify which is the best networking solution, based on clustering analysis. The Multidimensional Scaling Method and Hierarchical Cluster Method permitted us to demonstrate and identify the best way of clustering, and, in this way, the best route for a potential tour touristic circuit. Reported to the counties from which the villages take part, the identified cluster concentrate 57.7% of rural touristic accommodations and 65.0% of tourist arrivals, but it has an occupancy rate of only 5.9%. By implementing new complex touristic products we consider that can be assured a rise of this touristic dimension of the cluster and we propose more in depth studies regarding the profile of the potential customers.

  15. Endohedral gallide cluster superconductors and superconductivity in ReGa5.

    Science.gov (United States)

    Xie, Weiwei; Luo, Huixia; Phelan, Brendan F; Klimczuk, Tomasz; Cevallos, Francois Alexandre; Cava, Robert Joseph

    2015-12-22

    We present transition metal-embedded (T@Gan) endohedral Ga-clusters as a favorable structural motif for superconductivity and develop empirical, molecule-based, electron counting rules that govern the hierarchical architectures that the clusters assume in binary phases. Among the binary T@Gan endohedral cluster systems, Mo8Ga41, Mo6Ga31, Rh2Ga9, and Ir2Ga9 are all previously known superconductors. The well-known exotic superconductor PuCoGa5 and related phases are also members of this endohedral gallide cluster family. We show that electron-deficient compounds like Mo8Ga41 prefer architectures with vertex-sharing gallium clusters, whereas electron-rich compounds, like PdGa5, prefer edge-sharing cluster architectures. The superconducting transition temperatures are highest for the electron-poor, corner-sharing architectures. Based on this analysis, the previously unknown endohedral cluster compound ReGa5 is postulated to exist at an intermediate electron count and a mix of corner sharing and edge sharing cluster architectures. The empirical prediction is shown to be correct and leads to the discovery of superconductivity in ReGa5. The Fermi levels for endohedral gallide cluster compounds are located in deep pseudogaps in the electronic densities of states, an important factor in determining their chemical stability, while at the same time limiting their superconducting transition temperatures.

  16. Distribution of shallow very low frequency earthquakes in the eastern Nankai trough influenced by a subducted oceanic ridge: Results from cluster analysis applied to ocean bottom seismographs

    Science.gov (United States)

    To, A.; Obana, K.; Araki, E.

    2016-12-01

    The activity of very low frequency earthquakes (VLFEs) in the shallow accretionary prism of the eastern Nankai trough has been observed frequently in the past. In this study, we investigated the distribution of VLFEs that occurred in October 2015, which were recorded by an array of broadband ocean bottom seismometers (BBOBSs) of DONET1 network. The size of the network is much wider (>80 km) compared to previous BBOBS networks that were used for close-in observations of VLFEs; therefore the new dataset provides a broader overview of the VLFE distribution of this region. We first located the detected events using conventional methods such as the envelope correlation method. However, the results seemed to be largely scattered due to noise and the effect of 3D structures that could not be properly handled. Then, we introduced hierarchal clustering analysis, based on measured travel time patterns among stations obtained for each event. The analyses enabled the assessment of relative locations among events. Finally, the locations of event-clusters were estimated, instead of individual events, so that the obtained locations seemed less scattered. The obtained results indicate that the VLFE distribution is strongly influenced by a subducted ridge (Park et al., 2003) that exists beneath the northeastern side of the DONET1 network. Though the VLFEs are distributed from an area near the outer ridge toward the trench axis in the region with a smooth plate boundary, they are clustered at a shallow depth near the outer ridge in the region of the rough plate boundary. The VLFEs are clustered on the landward side of the peak of the subducted ridge; this could be explained by an elevated pore pressure in the region caused by the low-permeability oceanic ridge that may clog the up-dip pathway of the fluid along the decollement zone. The along-strike variation of the stress state, inferred from the VLFE distribution, should be an important factor in assessing the strain release

  17. Hierarchical Ag mesostructures for single particle SERS substrate

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Minwei, E-mail: xuminwei@xjtu.edu.cn; Zhang, Yin

    2017-01-30

    Highlights: • Hierarchical Ag mesostructures with the size of 250, 360 and 500 nm are synthesized via a seed-mediated approach. • The Ag mesostructures present the tailorable size and highly roughened surfaces. • The average enhancement factors for individual Ag mesostructures were estimated to be as high as 10{sup 6}. - Abstract: Hierarchical Ag mesostructures with highly rough surface morphology have been synthesized at room temperature through a simple seed-mediated approach. Electron microscopy characterizations indicate that the obtained Ag mesostructures exhibit a textured surface morphology with the flower-like architecture. Moreover, the particle size can be tailored easily in the range of 250–500 nm. For the growth process of the hierarchical Ag mesostructures, it is believed that the self-assembly mechanism is more reasonable rather than the epitaxial overgrowth of Ag seed. The oriented attachment of nanoparticles is revealed during the formation of Ag mesostructures. Single particle surface enhanced Raman spectra (sp-SERS) of crystal violet adsorbed on the hierarchical Ag mesostructures were measured. Results reveal that the hierarchical Ag mesostructures can be highly sensitive sp-SERS substrates with good reproducibility. The average enhancement factors for individual Ag mesostructures are estimated to be about 10{sup 6}.

  18. Quantitative Analysis and Comparison of Four Major Flavonol Glycosides in the Leaves of Toona sinensis (A. Juss.) Roemer (Chinese Toon) from Various Origins by High-Performance Liquid Chromatography-Diode Array Detector and Hierarchical Clustering Analysis

    Science.gov (United States)

    Sun, Xiaoxiang; Zhang, Liting; Cao, Yaqi; Gu, Qinying; Yang, Huan; Tam, James P.

    2016-01-01

    Background: Toona sinensis (A. Juss.) Roemer is an endemic species of Toona genus native to Asian area. Its dried leaves are applied in the treatment of many diseases; however, few investigations have been reported for the quantitative analysis and comparison of major bioactive flavonol glycosides in the leaves harvested from various origins. Objective: To quantitatively analyze four major flavonol glycosides including rutinoside, quercetin-3-O-β-D-glucoside, quercetin-3-O-α-L-rhamnoside, and kaempferol-3-O-α-L-rhamnoside in the leaves from different production sites and classify them according to the content of these glycosides. Materials and Methods: A high-performance liquid chromatography-diode array detector (HPLC-DAD) method for their simultaneous determination was developed and validated for linearity, precision, accuracy, stability, and repeatability. Moreover, the method established was then employed to explore the difference in the content of these four glycosides in raw materials. Finally, a hierarchical clustering analysis was performed to classify 11 voucher specimens. Results: The separation was performed on a Waters XBridge Shield RP18 column (150 mm × 4.6 mm, 3.5 μm) kept at 35°C, and acetonitrile and H2O containing 0.30% trifluoroacetic acid as mobile phase was driven at 1.0 mL/min during the analysis. Ten microliters of solution were injected and 254 nm was selected to monitor the separation. A strong linear relationship between the peak area and concentration of four analytes was observed. And, the method was also validated to be repeatable, stable, precise, and accurate. Conclusion: An efficient and reliable HPLC-DAD method was established and applied in the assays for the samples from 11 origins successfully. Moreover, the content of those flavonol glycosides varied much among different batches, and the flavonoids could be considered as biomarkers to control the quality of Chinese Toon. SUMMARY Four major flavonol glycosides in the leaves

  19. Merging history of three bimodal clusters

    Science.gov (United States)

    Maurogordato, S.; Sauvageot, J. L.; Bourdin, H.; Cappi, A.; Benoist, C.; Ferrari, C.; Mars, G.; Houairi, K.

    2011-01-01

    We present a combined X-ray and optical analysis of three bimodal galaxy clusters selected as merging candidates at z ~ 0.1. These targets are part of MUSIC (MUlti-Wavelength Sample of Interacting Clusters), which is a general project designed to study the physics of merging clusters by means of multi-wavelength observations. Observations include spectro-imaging with XMM-Newton EPIC camera, multi-object spectroscopy (260 new redshifts), and wide-field imaging at the ESO 3.6 m and 2.2 m telescopes. We build a global picture of these clusters using X-ray luminosity and temperature maps together with galaxy density and velocity distributions. Idealized numerical simulations were used to constrain the merging scenario for each system. We show that A2933 is very likely an equal-mass advanced pre-merger ~200 Myr before the core collapse, while A2440 and A2384 are post-merger systems (~450 Myr and ~1.5 Gyr after core collapse, respectively). In the case of A2384, we detect a spectacular filament of galaxies and gas spreading over more than 1 h-1 Mpc, which we infer to have been stripped during the previous collision. The analysis of the MUSIC sample allows us to outline some general properties of merging clusters: a strong luminosity segregation of galaxies in recent post-mergers; the existence of preferential axes - corresponding to the merging directions - along which the BCGs and structures on various scales are aligned; the concomitance, in most major merger cases, of secondary merging or accretion events, with groups infalling onto the main cluster, and in some cases the evidence of previous merging episodes in one of the main components. These results are in good agreement with the hierarchical scenario of structure formation, in which clusters are expected to form by successive merging events, and matter is accreted along large-scale filaments. Based on data obtained with the European Southern Observatory, Chile (programs 072.A-0595, 075.A-0264, and 079.A-0425

  20. Clustering recommendations to compute agent reputation

    Science.gov (United States)

    Bedi, Punam; Kaur, Harmeet

    2005-03-01

    Traditional centralized approaches to security are difficult to apply to multi-agent systems which are used nowadays in e-commerce applications. Developing a notion of trust that is based on the reputation of an agent can provide a softer notion of security that is sufficient for many multi-agent applications. Our paper proposes a mechanism for computing reputation of the trustee agent for use by the trustier agent. The trustier agent computes the reputation based on its own experience as well as the experience the peer agents have with the trustee agents. The trustier agents intentionally interact with the peer agents to get their experience information in the form of recommendations. We have also considered the case of unintentional encounters between the referee agents and the trustee agent, which can be directly between them or indirectly through a set of interacting agents. The clustering is done to filter off the noise in the recommendations in the form of outliers. The trustier agent clusters the recommendations received from referee agents on the basis of the distances between recommendations using the hierarchical agglomerative method. The dendogram hence obtained is cut at the required similarity level which restricts the maximum distance between any two recommendations within a cluster. The cluster with maximum number of elements denotes the views of the majority of recommenders. The center of this cluster represents the reputation of the trustee agent which can be computed using c-means algorithm.

  1. Early results from the Whisper instrument on Cluster: An overview

    DEFF Research Database (Denmark)

    Decreau, P.M.E.; Fergeau, P.; Krasnoselskikh, V.

    2001-01-01

    The Whisper instrument yields two data sets: (i) the electron density determined via the relaxation sounder, and (ii) the spectrum of natural plasma emissions in the frequency band 2-80 kHz. Both data sets allow for the three-dimensional exploration of the magnetosphere by the Cluster mission...... the drift velocity of density structures. Wave observations are also of crucial interest for studying small-scale structures, as demonstrated in an example in the fore-shock region. Early results from the Whisper instrument are very encouraging, and demonstrate that the four-point Cluster measurements...... largely overcomes the limited telemetry allocation. The natural emissions are usually related to the plasma frequency, as identified by the sounder, and the combination of an active sounding operation and a passive survey operation provides a time resolution for the total density determination of 2.2 s...

  2. Predicting healthcare outcomes in prematurely born infants using cluster analysis.

    Science.gov (United States)

    MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne

    2018-05-23

    Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥or <882 g respectively), had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001) CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.

  3. Female cluster headache in the United States of America: what are the gender differences? Results from the United States Cluster Headache Survey.

    Science.gov (United States)

    Rozen, Todd D; Fishman, Royce S

    2012-06-15

    To present results from the United States Cluster Headache Survey regarding gender differences in cluster headache demographics, clinical characteristics, diagnostic delay, triggers, treatment response and personal burden. Very few studies have looked at the gender differences in cluster headache presentation. The United States Cluster Headache Survey is the largest study of cluster headache sufferers ever completed in the United States and it is also the largest study of female cluster headache patients ever presented. The total survey consisted of 187 multiple choice questions which dealt with various issues related to cluster headache including: demographics, clinical characteristics, concomitant medical conditions, family history, triggers, smoking history, diagnosis, treatment response and personal burden. A group of questions were specifically targeted to female cluster headache patients. The survey was placed on a website from October to December 2008. For all survey responders the diagnosis of cluster headache needed to be made by a neurologist but there was no validation of the headache diagnosis by the authors. 1134 individuals completed the survey (816 male, 318 female). Key Points that define the differences between female and male cluster headache include: a. Age of onset: women develop cluster headache at an earlier age than men and are more likely to develop a second peak of cluster headache onset after 50 years of age. b. Family history: woman cluster headache sufferers are more likely to have a family history of both cluster headache and migraine and have an increased familial risk of Parkinson's disease. c. Comorbid conditions: female cluster headaches sufferers are significantly more likely to experience depression and have asthma than males. d. Aura issues: aura with cluster headache is equally common in both sexes, but aura duration is shorter in women. Women are much more likely to experience sensory, language and brainstem auras. e. Pain

  4. Self-assembly of Fe3O4 nanocrystal-clusters into cauliflower-like architectures: Synthesis and characterization

    International Nuclear Information System (INIS)

    Zhu Luping; Liao Guihong; Bing Naici; Wang Linlin; Xie Hongyong

    2011-01-01

    Large-scale cauliflower-like Fe 3 O 4 architectures consist of well-assembled magnetite nanocrystal clusters have been synthesized by a simple solvothermal process. The as-synthesized Fe 3 O 4 samples were characterized by XRD, XPS, FT-IR, SEM, TEM, etc. The results show that the samples exhibit cauliflower-like hierarchical microstructures. The influences of synthesis parameters on the morphology of the samples were experimentally investigated. Magnetic properties of the Fe 3 O 4 cauliflower-like hierarchical microstructures have been detected by VSM at room temperature, showing a relatively low saturation magnetization of 65 emu/g and an enhanced coercive force of 247 Oe. - Graphical Abstract: Cauliflower-like Fe 3 O 4 architectures consist of well-assembled magnetite nanocrystal clusters have been synthesized by a simple solvothermal process, using FeCl 3 .6H 2 O and EDA as the starting materials. Highlights: → Cauliflower-like Fe 3 O 4 architectures were successfully prepared by a simple solvothermal route. → The cauliflower-like Fe 3 O 4 architectures have a size in the range of 200-300 nm. → They show a low saturation magnetization of 65 emu/g and an enhanced coercive force of 247 Oe. → These Fe 3 O 4 architectures may have potential applications in catalysis and biological fields.

  5. Facile synthesis and photocatalytic activity of zinc oxide hierarchical microcrystals

    KAUST Repository

    Xu, Xinjiang

    2013-04-04

    ZnO microcrystals with hierarchical structure have been synthesized by a simple solvothermal approach. The microcrystals were studied by means of X-ray diffraction, transmission electron microscopy, and scanning electron microscopy. Research on the formation mechanism of the hierarchical microstructure shows that the coordination solvent and precursor concentration have considerable influence on the size and morphology of the microstructures. A possible formation mechanism of the hierarchical structure was suggested. Furthermore, the catalytic activity of the ZnO microcrystals was studied by treating low concentration Rhodamine B (RhB) solution under UV light, and research results show the hierarchical microstructures of ZnO display high catalytic activity in photocatalysis, the catalysis process follows first-order reaction kinetics, and the apparent rate constant k = 0.03195 min-1.

  6. Hierarchical modeling of genome-wide Short Tandem Repeat (STR) markers infers native American prehistory.

    Science.gov (United States)

    Lewis, Cecil M

    2010-02-01

    This study examines a genome-wide dataset of 678 Short Tandem Repeat loci characterized in 444 individuals representing 29 Native American populations as well as the Tundra Netsi and Yakut populations from Siberia. Using these data, the study tests four current hypotheses regarding the hierarchical distribution of neutral genetic variation in native South American populations: (1) the western region of South America harbors more variation than the eastern region of South America, (2) Central American and western South American populations cluster exclusively, (3) populations speaking the Chibchan-Paezan and Equatorial-Tucanoan language stock emerge as a group within an otherwise South American clade, (4) Chibchan-Paezan populations in Central America emerge together at the tips of the Chibchan-Paezan cluster. This study finds that hierarchical models with the best fit place Central American populations, and populations speaking the Chibchan-Paezan language stock, at a basal position or separated from the South American group, which is more consistent with a serial founder effect into South America than that previously described. Western (Andean) South America is found to harbor similar levels of variation as eastern (Equatorial-Tucanoan and Ge-Pano-Carib) South America, which is inconsistent with an initial west coast migration into South America. Moreover, in all relevant models, the estimates of genetic diversity within geographic regions suggest a major bottleneck or founder effect occurring within the North American subcontinent, before the peopling of Central and South America. 2009 Wiley-Liss, Inc.

  7. 2 x 2 Achievement Goals and Achievement Emotions: A Cluster Analysis of Students' Motivation

    Science.gov (United States)

    Jang, Leong Yeok; Liu, Woon Chia

    2012-01-01

    This study sought to better understand the adoption of multiple achievement goals at an intra-individual level, and its links to emotional well-being, learning, and academic achievement. Participants were 480 Secondary Two students (aged between 13 and 14 years) from two coeducational government schools. Hierarchical cluster analysis revealed the…

  8. The Hierarchical Perspective

    Directory of Open Access Journals (Sweden)

    Daniel Sofron

    2015-05-01

    Full Text Available This paper is focused on the hierarchical perspective, one of the methods for representing space that was used before the discovery of the Renaissance linear perspective. The hierarchical perspective has a more or less pronounced scientific character and its study offers us a clear image of the way the representatives of the cultures that developed it used to perceive the sensitive reality. This type of perspective is an original method of representing three-dimensional space on a flat surface, which characterises the art of Ancient Egypt and much of the art of the Middle Ages, being identified in the Eastern European Byzantine art, as well as in the Western European Pre-Romanesque and Romanesque art. At the same time, the hierarchical perspective is also present in naive painting and infantile drawing. Reminiscences of this method can be recognised also in the works of some precursors of the Italian Renaissance. The hierarchical perspective can be viewed as a subjective ranking criterion, according to which the elements are visually represented by taking into account their relevance within the image while perception is ignored. This paper aims to show how the main objective of the artists of those times was not to faithfully represent the objective reality, but rather to emphasize the essence of the world and its perennial aspects. This may represent a possible explanation for the refusal of perspective in the Egyptian, Romanesque and Byzantine painting, characterised by a marked two-dimensionality.

  9. Evolution of cluster X-ray luminosities and radii: Results from the 160 square degree rosat survey

    DEFF Research Database (Denmark)

    Vikhlinin, A.; McNamara, B.R.; Forman, W.

    1998-01-01

    We searched for cluster X-ray luminosity and radius evolution using our sample of 203 galaxy clusters detected in the 160 deg(2) survey with the ROSAT PSPC (Vikhlinin et al.). With such a large area survey, it is possible, for the first time with ROSAT, to test the evolution of luminous clusters, L......-X > 3 x 10(44) ergs s(-1) in the 0.5-2 keV band. We detect a factor of 3-4 deficit of such luminous clusters at z > 0.3 compared with the present. The evolution is much weaker or absent at modestly lower luminosities, (1-3) x 10(44) ergs s(-1). At still lower luminosities, we find no evolution from...... the analysis of the log N-log S relation. The results in the two upper L, bins are in agreement with the Einstein Extended Medium-Sensitivity Survey evolution result (Gioia et al.; Henry ct al.), which was obtained using a completely independent cluster sample. The low-L-X results are in agreement with other...

  10. Identifying influential nodes in large-scale directed networks: the role of clustering.

    Science.gov (United States)

    Chen, Duan-Bing; Gao, Hui; Lü, Linyuan; Zhou, Tao

    2013-01-01

    Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node's neighbors but do not directly make use of the interactions among it's neighbors. Local clustering is known to have negative impacts on the information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, only making use of local information, is much more efficient than global methods: It takes only 191 seconds for a network with about [Formula: see text] nodes, more than 15 times faster than PageRank.

  11. Identifying influential nodes in large-scale directed networks: the role of clustering.

    Directory of Open Access Journals (Sweden)

    Duan-Bing Chen

    Full Text Available Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node's neighbors but do not directly make use of the interactions among it's neighbors. Local clustering is known to have negative impacts on the information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, only making use of local information, is much more efficient than global methods: It takes only 191 seconds for a network with about [Formula: see text] nodes, more than 15 times faster than PageRank.

  12. Identifying Influential Nodes in Large-Scale Directed Networks: The Role of Clustering

    Science.gov (United States)

    Chen, Duan-Bing; Gao, Hui; Lü, Linyuan; Zhou, Tao

    2013-01-01

    Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node’s neighbors but do not directly make use of the interactions among it’s neighbors. Local clustering is known to have negative impacts on the information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors’ influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, only making use of local information, is much more efficient than global methods: It takes only 191 seconds for a network with about nodes, more than 15 times faster than PageRank. PMID:24204833

  13. Static and dynamic friction of hierarchical surfaces.

    Science.gov (United States)

    Costagliola, Gianluca; Bosia, Federico; Pugno, Nicola M

    2016-12-01

    Hierarchical structures are very common in nature, but only recently have they been systematically studied in materials science, in order to understand the specific effects they can have on the mechanical properties of various systems. Structural hierarchy provides a way to tune and optimize macroscopic mechanical properties starting from simple base constituents and new materials are nowadays designed exploiting this possibility. This can be true also in the field of tribology. In this paper we study the effect of hierarchical patterned surfaces on the static and dynamic friction coefficients of an elastic material. Our results are obtained by means of numerical simulations using a one-dimensional spring-block model, which has previously been used to investigate various aspects of friction. Despite the simplicity of the model, we highlight some possible mechanisms that explain how hierarchical structures can significantly modify the friction coefficients of a material, providing a means to achieve tunability.

  14. Deep hierarchical attention network for video description

    Science.gov (United States)

    Li, Shuohao; Tang, Min; Zhang, Jun

    2018-03-01

    Pairing video to natural language description remains a challenge in computer vision and machine translation. Inspired by image description, which uses an encoder-decoder model for reducing visual scene into a single sentence, we propose a deep hierarchical attention network for video description. The proposed model uses convolutional neural network (CNN) and bidirectional LSTM network as encoders while a hierarchical attention network is used as the decoder. Compared to encoder-decoder models used in video description, the bidirectional LSTM network can capture the temporal structure among video frames. Moreover, the hierarchical attention network has an advantage over single-layer attention network on global context modeling. To make a fair comparison with other methods, we evaluate the proposed architecture with different types of CNN structures and decoders. Experimental results on the standard datasets show that our model has a more superior performance than the state-of-the-art techniques.

  15. Kendall’s tau and agglomerative clustering for structure determination of hierarchical Archimedean copulas

    Czech Academy of Sciences Publication Activity Database

    Górecki, J.; Hofert, M.; Holeňa, Martin

    2017-01-01

    Roč. 5, č. 1 (2017), s. 75-87 ISSN 2300-2298 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : structure determination * agglomerative clustering * Kendall’s tau * Archimedean copula Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability

  16. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  17. Adaptive hierarchical multi-agent organizations

    NARCIS (Netherlands)

    Ghijsen, M.; Jansweijer, W.N.H.; Wielinga, B.J.; Babuška, R.; Groen, F.C.A.

    2010-01-01

    In this chapter, we discuss the design of adaptive hierarchical organizations for multi-agent systems (MAS). Hierarchical organizations have a number of advantages such as their ability to handle complex problems and their scalability to large organizations. By introducing adaptivity in the

  18. The Extended Northern ROSAT Galaxy Cluster Survey (NORAS II). I. Survey Construction and First Results

    International Nuclear Information System (INIS)

    Böhringer, Hans; Chon, Gayoung; Trümper, Joachim; Retzlaff, Jörg; Meisenheimer, Klaus; Schartel, Norbert

    2017-01-01

    As the largest, clearly defined building blocks of our universe, galaxy clusters are interesting astrophysical laboratories and important probes for cosmology. X-ray surveys for galaxy clusters provide one of the best ways to characterize the population of galaxy clusters. We provide a description of the construction of the NORAS II galaxy cluster survey based on X-ray data from the northern part of the ROSAT All-Sky Survey. NORAS II extends the NORAS survey down to a flux limit of 1.8 × 10 −12 erg s −1 cm −2 (0.1–2.4 keV), increasing the sample size by about a factor of two. The NORAS II cluster survey now reaches the same quality and depth as its counterpart, the southern REFLEX II survey, allowing us to combine the two complementary surveys. The paper provides information on the determination of the cluster X-ray parameters, the identification process of the X-ray sources, the statistics of the survey, and the construction of the survey selection function, which we provide in numerical format. Currently NORAS II contains 860 clusters with a median redshift of z  = 0.102. We provide a number of statistical functions, including the log N –log S and the X-ray luminosity function and compare these to the results from the complementary REFLEX II survey. Using the NORAS II sample to constrain the cosmological parameters, σ 8 and Ω m , yields results perfectly consistent with those of REFLEX II. Overall, the results show that the two hemisphere samples, NORAS II and REFLEX II, can be combined without problems into an all-sky sample, just excluding the zone of avoidance.

  19. The Extended Northern ROSAT Galaxy Cluster Survey (NORAS II). I. Survey Construction and First Results

    Energy Technology Data Exchange (ETDEWEB)

    Böhringer, Hans; Chon, Gayoung; Trümper, Joachim [Max-Planck-Institut für Extraterrestrische Physik, D-85748 Garching (Germany); Retzlaff, Jörg [ESO, D-85748 Garching (Germany); Meisenheimer, Klaus [Max-Planck-Institut für Astronomy, Königstuhl 17, D-69117 Heidelberg (Germany); Schartel, Norbert [ESAC, Camino Bajo del Castillo, Villanueva de la Cañada, E-28692 Madrid (Spain)

    2017-05-01

    As the largest, clearly defined building blocks of our universe, galaxy clusters are interesting astrophysical laboratories and important probes for cosmology. X-ray surveys for galaxy clusters provide one of the best ways to characterize the population of galaxy clusters. We provide a description of the construction of the NORAS II galaxy cluster survey based on X-ray data from the northern part of the ROSAT All-Sky Survey. NORAS II extends the NORAS survey down to a flux limit of 1.8 × 10{sup −12} erg s{sup −1} cm{sup −2} (0.1–2.4 keV), increasing the sample size by about a factor of two. The NORAS II cluster survey now reaches the same quality and depth as its counterpart, the southern REFLEX II survey, allowing us to combine the two complementary surveys. The paper provides information on the determination of the cluster X-ray parameters, the identification process of the X-ray sources, the statistics of the survey, and the construction of the survey selection function, which we provide in numerical format. Currently NORAS II contains 860 clusters with a median redshift of z  = 0.102. We provide a number of statistical functions, including the log N –log S and the X-ray luminosity function and compare these to the results from the complementary REFLEX II survey. Using the NORAS II sample to constrain the cosmological parameters, σ {sub 8} and Ω{sub m}, yields results perfectly consistent with those of REFLEX II. Overall, the results show that the two hemisphere samples, NORAS II and REFLEX II, can be combined without problems into an all-sky sample, just excluding the zone of avoidance.

  20. Identification of Counterfeit Alcoholic Beverages Using Cluster Analysis in Principal-Component Space

    Science.gov (United States)

    Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.

    2017-07-01

    A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.

  1. Metastable states in the hierarchical Dyson model drive parallel processing in the hierarchical Hopfield network

    International Nuclear Information System (INIS)

    Agliari, Elena; Barra, Adriano; Guerra, Francesco; Galluzzi, Andrea; Tantari, Daniele; Tavani, Flavia

    2015-01-01

    In this paper, we introduce and investigate the statistical mechanics of hierarchical neural networks. First, we approach these systems à la Mattis, by thinking of the Dyson model as a single-pattern hierarchical neural network. We also discuss the stability of different retrievable states as predicted by the related self-consistencies obtained both from a mean-field bound and from a bound that bypasses the mean-field limitation. The latter is worked out by properly reabsorbing the magnetization fluctuations related to higher levels of the hierarchy into effective fields for the lower levels. Remarkably, mixing Amit's ansatz technique for selecting candidate-retrievable states with the interpolation procedure for solving for the free energy of these states, we prove that, due to gauge symmetry, the Dyson model accomplishes both serial and parallel processing. We extend this scenario to multiple stored patterns by implementing the Hebb prescription for learning within the couplings. This results in Hopfield-like networks constrained on a hierarchical topology, for which, by restricting to the low-storage regime where the number of patterns grows at its most logarithmical with the amount of neurons, we prove the existence of the thermodynamic limit for the free energy, and we give an explicit expression of its mean-field bound and of its related improved bound. We studied the resulting self-consistencies for the Mattis magnetizations, which act as order parameters, are studied and the stability of solutions is analyzed to get a picture of the overall retrieval capabilities of the system according to both mean-field and non-mean-field scenarios. Our main finding is that embedding the Hebbian rule on a hierarchical topology allows the network to accomplish both serial and parallel processing. By tuning the level of fast noise affecting it or triggering the decay of the interactions with the distance among neurons, the system may switch from sequential retrieval to

  2. Analysis hierarchical model for discrete event systems

    Science.gov (United States)

    Ciortea, E. M.

    2015-11-01

    The This paper presents the hierarchical model based on discrete event network for robotic systems. Based on the hierarchical approach, Petri network is analysed as a network of the highest conceptual level and the lowest level of local control. For modelling and control of complex robotic systems using extended Petri nets. Such a system is structured, controlled and analysed in this paper by using Visual Object Net ++ package that is relatively simple and easy to use, and the results are shown as representations easy to interpret. The hierarchical structure of the robotic system is implemented on computers analysed using specialized programs. Implementation of hierarchical model discrete event systems, as a real-time operating system on a computer network connected via a serial bus is possible, where each computer is dedicated to local and Petri model of a subsystem global robotic system. Since Petri models are simplified to apply general computers, analysis, modelling, complex manufacturing systems control can be achieved using Petri nets. Discrete event systems is a pragmatic tool for modelling industrial systems. For system modelling using Petri nets because we have our system where discrete event. To highlight the auxiliary time Petri model using transport stream divided into hierarchical levels and sections are analysed successively. Proposed robotic system simulation using timed Petri, offers the opportunity to view the robotic time. Application of goods or robotic and transmission times obtained by measuring spot is obtained graphics showing the average time for transport activity, using the parameters sets of finished products. individually.

  3. Preliminary Cluster Size and Efficiencies results of CMS RPC at GIF++

    CERN Document Server

    Gonzalez Blanco Gonzalez, Genoveva

    2016-01-01

    A brief description and first preliminary results of the Efficiencies and Cluster Size measurements of the CMS Resistive Plate Chambers, will be presented inside the Gamma Irradiation Facility GIF++ at CERN. Preliminary studies that sets the base performance measurements of CMS RPC for starting aging studies.

  4. Hierarchical structure of stock price fluctuations in financial markets

    International Nuclear Information System (INIS)

    Gao, Ya-Chun; Cai, Shi-Min; Wang, Bing-Hong

    2012-01-01

    The financial market and turbulence have been broadly compared on account of the same quantitative methods and several common stylized facts they share. In this paper, the She–Leveque (SL) hierarchy, proposed to explain the anomalous scaling exponents deviating from Kolmogorov monofractal scaling of the velocity fluctuation in fluid turbulence, is applied to study and quantify the hierarchical structure of stock price fluctuations in financial markets. We therefore observed certain interesting results: (i) the hierarchical structure related to multifractal scaling generally presents in all the stock price fluctuations we investigated. (ii) The quantitatively statistical parameters that describe SL hierarchy are different between developed financial markets and emerging ones, distinctively. (iii) For the high-frequency stock price fluctuation, the hierarchical structure varies with different time periods. All these results provide a novel analogy in turbulence and financial market dynamics and an insight to deeply understand multifractality in financial markets. (paper)

  5. Low energy isomers of (H2O)25 from a hierarchical method based on Monte Carlo temperature basin paving and molecular tailoring approaches benchmarked by MP2 calculations

    International Nuclear Information System (INIS)

    Sahu, Nityananda; Gadre, Shridhar R.; Rakshit, Avijit; Bandyopadhyay, Pradipta; Miliordos, Evangelos; Xantheas, Sotiris S.

    2014-01-01

    We report new global minimum candidate structures for the (H 2 O) 25 cluster that are lower in energy than the ones reported previously and correspond to hydrogen bonded networks with 42 hydrogen bonds and an interior, fully coordinated water molecule. These were obtained as a result of a hierarchical approach based on initial Monte Carlo Temperature Basin Paving sampling of the cluster's Potential Energy Surface with the Effective Fragment Potential, subsequent geometry optimization using the Molecular Tailoring Approach with the fragments treated at the second order Møller-Plesset (MP2) perturbation (MTA-MP2) and final refinement of the entire cluster at the MP2 level of theory. The MTA-MP2 optimized cluster geometries, constructed from the fragments, were found to be within 2 O) 25 cluster. In addition, the grafting of the MTA-MP2 energies yields electronic energies that are within <0.3 kcal/mol from the MP2 energies of the entire cluster while preserving their energy rank order. Finally, the MTA-MP2 approach was found to reproduce the MP2 harmonic vibrational frequencies, constructed from the fragments, quite accurately when compared to the MP2 ones of the entire cluster in both the HOH bending and the OH stretching regions of the spectra

  6. Spitzer ’s View of the Candidate Cluster and Protocluster Catalog (CCPC)

    Energy Technology Data Exchange (ETDEWEB)

    Franck, J. R.; McGaugh, S. S. [Case Western Reserve University, 10900 Euclid Ave., Cleveland, OH 44106 (United States)

    2017-02-10

    The Candidate Cluster and Protocluster Catalog contains 218 galaxy overdensities composed of more than 2000 galaxies with spectroscopic redshifts spanning the first few Gyr after the Big Bang (2.0 ≤ z < 6.6). We use Spitzer archival data to track the underlying stellar mass of these overdense regions in various temporal cross sections by building rest-frame near-infrared luminosity functions (LFs) across the span of redshifts. This exercise maps the stellar growth of protocluster galaxies, as halos in the densest environments should be the most massive from hierarchical accretion. The characteristic apparent magnitude, m *( z ), is relatively flat from 2.0 ≤ z < 6.6, consistent with a passive evolution of an old stellar population. This trend maps smoothly to lower redshift results of cluster galaxies from other works. We find no difference in the LFs of galaxies in the field versus protoclusters at a given redshift apart from their density.

  7. Staff Distress Improves by Treating Pain in Nursing Home Patients With Dementia: Results From a Cluster-Randomized Controlled Trial.

    Science.gov (United States)

    Aasmul, Irene; Husebo, Bettina Sandgathe; Flo, Elisabeth

    2016-12-01

    Most people with dementia develop neuropsychiatric symptoms (NPSs), which are distressing for their carers. Untreated pain may increase the prevalence and severity of NPSs and thereby staff burden. We investigated the association between NPSs and the impact of individual pain treatment on distress in nursing home staff. Nursing home (NH) units were cluster-randomized to an intervention group (33 NH units; n = 175) or control group (27 NH units; n = 177). Patients in the intervention group received individual pain treatment for eight weeks, followed by a four-week washout period; control groups received care as usual. Staff informants (n = 138) used the Neuropsychiatric Inventory-NH version (including caregiver distress) as primary outcome to assess their own distress. Other outcomes were pain (Mobilization-Observation-Behavior-Intensity-Dementia-2 Pain Scale) and cognitive functioning (Mini-Mental State Examination). Using hierarchical regression analysis, all NPS items at baseline were associated with staff distress (P pain treatment reduced staff distress in the intervention group compared to control group especially in regard to agitation-related symptoms and apathy. Furthermore, our results indicated a multifactorial model of staff distress, in which enhanced knowledge and understanding of NPSs and pain in people with advanced dementia may play an important role. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Country clustering applied to the water and sanitation sector: a new tool with potential applications in research and policy.

    Science.gov (United States)

    Onda, Kyle; Crocker, Jonny; Kayser, Georgia Lyn; Bartram, Jamie

    2014-03-01

    The fields of global health and international development commonly cluster countries by geography and income to target resources and describe progress. For any given sector of interest, a range of relevant indicators can serve as a more appropriate basis for classification. We create a new typology of country clusters specific to the water and sanitation (WatSan) sector based on similarities across multiple WatSan-related indicators. After a literature review and consultation with experts in the WatSan sector, nine indicators were selected. Indicator selection was based on relevance to and suggested influence on national water and sanitation service delivery, and to maximize data availability across as many countries as possible. A hierarchical clustering method and a gap statistic analysis were used to group countries into a natural number of relevant clusters. Two stages of clustering resulted in five clusters, representing 156 countries or 6.75 billion people. The five clusters were not well explained by income or geography, and were distinct from existing country clusters used in international development. Analysis of these five clusters revealed that they were more compact and well separated than United Nations and World Bank country clusters. This analysis and resulting country typology suggest that previous geography- or income-based country groupings can be improved upon for applications in the WatSan sector by utilizing globally available WatSan-related indicators. Potential applications include guiding and discussing research, informing policy, improving resource targeting, describing sector progress, and identifying critical knowledge gaps in the WatSan sector. Copyright © 2013 Elsevier GmbH. All rights reserved.

  9. Comparison Of Keyword Based Clustering Of Web Documents By Using Openstack 4j And By Traditional Method

    Directory of Open Access Journals (Sweden)

    Shiza Anand

    2015-08-01

    Full Text Available As the number of hypertext documents are increasing continuously day by day on world wide web. Therefore clustering methods will be required to bind documents into the clusters repositories according to the similarity lying between the documents. Various clustering methods exist such as Hierarchical Based K-means Fuzzy Logic Based Centroid Based etc. These keyword based clustering methods takes much more amount of time for creating containers and putting documents in their respective containers. These traditional methods use File Handling techniques of different programming languages for creating repositories and transferring web documents into these containers. In contrast openstack4j SDK is a new technique for creating containers and shifting web documents into these containers according to the similarity in much more less amount of time as compared to the traditional methods. Another benefit of this technique is that this SDK understands and reads all types of files such as jpg html pdf doc etc. This paper compares the time required for clustering of documents by using openstack4j and by traditional methods and suggests various search engines to adopt this technique for clustering so that they give result to the user querries in less amount of time.

  10. TiO2 nanowire-templated hierarchical nanowire network as water-repelling coating

    Science.gov (United States)

    Hang, Tian; Chen, Hui-Jiuan; Xiao, Shuai; Yang, Chengduan; Chen, Meiwan; Tao, Jun; Shieh, Han-ping; Yang, Bo-ru; Liu, Chuan; Xie, Xi

    2017-12-01

    Extraordinary water-repelling properties of superhydrophobic surfaces make them novel candidates for a great variety of potential applications. A general approach to achieve superhydrophobicity requires low-energy coating on the surface and roughness on nano- and micrometre scale. However, typical construction of superhydrophobic surfaces with micro-nano structure through top-down fabrication is restricted by sophisticated fabrication techniques and limited choices of substrate materials. Micro-nanoscale topographies templated by conventional microparticles through surface coating may produce large variations in roughness and uncontrollable defects, resulting in poorly controlled surface morphology and wettability. In this work, micro-nanoscale hierarchical nanowire network was fabricated to construct self-cleaning coating using one-dimensional TiO2 nanowires as microscale templates. Hierarchical structure with homogeneous morphology was achieved by branching ZnO nanowires on the TiO2 nanowire backbones through hydrothermal reaction. The hierarchical nanowire network displayed homogeneous micro/nano-topography, in contrast to hierarchical structure templated by traditional microparticles. This hierarchical nanowire network film exhibited high repellency to both water and cell culture medium after functionalization with fluorinated organic molecules. The hierarchical structure templated by TiO2 nanowire coating significantly increased the surface superhydrophobicity compared to vertical ZnO nanowires with nanotopography alone. Our results demonstrated a promising strategy of using nanowires as microscale templates for the rational design of hierarchical coatings with desired superhydrophobicity that can also be applied to various substrate materials.

  11. Hierarchically nanostructured hydroxyapatite: hydrothermal synthesis, morphology control, growth mechanism, and biological activity

    Science.gov (United States)

    Ma, Ming-Guo

    2012-01-01

    Hierarchically nanosized hydroxyapatite (HA) with flower-like structure assembled from nanosheets consisting of nanorod building blocks was successfully synthesized by using CaCl2, NaH2PO4, and potassium sodium tartrate via a hydrothermal method at 200°C for 24 hours. The effects of heating time and heating temperature on the products were investigated. As a chelating ligand and template molecule, the potassium sodium tartrate plays a key role in the formation of hierarchically nanostructured HA. On the basis of experimental results, a possible mechanism based on soft-template and self-assembly was proposed for the formation and growth of the hierarchically nanostructured HA. Cytotoxicity experiments indicated that the hierarchically nanostructured HA had good biocompatibility. It was shown by in-vitro experiments that mesenchymal stem cells could attach to the hierarchically nanostructured HA after being cultured for 48 hours. Objective The purpose of this study was to develop facile and effective methods for the synthesis of novel hydroxyapatite (HA) with hierarchical nanostructures assembled from independent and discrete nanobuilding blocks. Methods A simple hydrothermal approach was applied to synthesize HA by using CaCl2, NaH2PO4, and potassium sodium tartrate at 200°C for 24 hours. The cell cytotoxicity of the hierarchically nanostructured HA was tested by MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) assay. Results HA displayed the flower-like structure assembled from nanosheets consisting of nanorod building blocks. The potassium sodium tartrate was used as a chelating ligand, inducing the formation and self-assembly of HA nanorods. The heating time and heating temperature influenced the aggregation and morphology of HA. The cell viability did not decrease with the increasing concentration of hierarchically nanostructured HA added. Conclusion A novel, simple and reliable hydrothermal route had been developed for the synthesis of

  12. Intensity-based hierarchical elastic registration using approximating splines.

    Science.gov (United States)

    Serifovic-Trbalic, Amira; Demirovic, Damir; Cattin, Philippe C

    2014-01-01

    We introduce a new hierarchical approach for elastic medical image registration using approximating splines. In order to obtain the dense deformation field, we employ Gaussian elastic body splines (GEBS) that incorporate anisotropic landmark errors and rotation information. Since the GEBS approach is based on a physical model in form of analytical solutions of the Navier equation, it can very well cope with the local as well as global deformations present in the images by varying the standard deviation of the Gaussian forces. The proposed GEBS approximating model is integrated into the elastic hierarchical image registration framework, which decomposes a nonrigid registration problem into numerous local rigid transformations. The approximating GEBS registration scheme incorporates anisotropic landmark errors as well as rotation information. The anisotropic landmark localization uncertainties can be estimated directly from the image data, and in this case, they represent the minimal stochastic localization error, i.e., the Cramér-Rao bound. The rotation information of each landmark obtained from the hierarchical procedure is transposed in an additional angular landmark, doubling the number of landmarks in the GEBS model. The modified hierarchical registration using the approximating GEBS model is applied to register 161 image pairs from a digital mammogram database. The obtained results are very encouraging, and the proposed approach significantly improved all registrations comparing the mean-square error in relation to approximating TPS with the rotation information. On artificially deformed breast images, the newly proposed method performed better than the state-of-the-art registration algorithm introduced by Rueckert et al. (IEEE Trans Med Imaging 18:712-721, 1999). The average error per breast tissue pixel was less than 2.23 pixels compared to 2.46 pixels for Rueckert's method. The proposed hierarchical elastic image registration approach incorporates the GEBS

  13. Hierarchical modularity in human brain functional networks

    Directory of Open Access Journals (Sweden)

    David Meunier

    2009-10-01

    Full Text Available The idea that complex systems have a hierarchical modular organization originates in the early 1960s and has recently attracted fresh support from quantitative studies of large scale, real-life networks. Here we investigate the hierarchical modular (or “modules-within-modules” decomposition of human brain functional networks, measured using functional magnetic resonance imaging (fMRI in 18 healthy volunteers under no-task or resting conditions. We used a customized template to extract networks with more than 1800 regional nodes, and we applied a fast algorithm to identify nested modular structure at several hierarchical levels. We used mutual information, 0 < I < 1, to estimate the similarity of community structure of networks in different subjects, and to identify the individual network that is most representative of the group. Results show that human brain functional networks have a hierarchical modular organization with a fair degree of similarity between subjects, I=0.63. The largest 5 modules at the highest level of the hierarchy were medial occipital, lateral occipital, central, parieto-frontal and fronto-temporal systems; occipital modules demonstrated less sub-modular organization than modules comprising regions of multimodal association cortex. Connector nodes and hubs, with a key role in inter-modular connectivity, were also concentrated in association cortical areas. We conclude that methods are available for hierarchical modular decomposition of large numbers of high resolution brain functional networks using computationally expedient algorithms. This could enable future investigations of Simon's original hypothesis that hierarchy or near-decomposability of physical symbol systems is a critical design feature for their fast adaptivity to changing environmental conditions.

  14. Magnetic ordering of CoCl2-GIC, a spin ceramic: hierarchical successive transitions and the intermediate glassy phase

    International Nuclear Information System (INIS)

    Suzuki, Masatsugu; Suzuki, Itsuko S; Matsuura, Motohiro

    2007-01-01

    Stage-2 CoCl 2 -graphite intercalation compound (GIC) is a spin ceramic which shows hierarchical successive transitions at T cu (= 8.9 K) and T cl (= 7.0 K) from the paramagnetic phase into an intra-cluster (two-dimensional ferromagnetic) order with inter-cluster disorder and then to an inter-cluster (three-dimensional antiferromagnetic like) order over the whole system. The nature of the inter-cluster disorder was suggested to be of spin glass by nonlinear magnetic response analyses around T cu and by studies on dynamical aspects of ordering between T cu and T cl . Here, we present a further extensive examination of a series of time dependence of zero-field cooled magnetization M ZFC after the ageing protocol below T cu . The time dependence of the relaxation rates S ZFC (t) = (1/H) dM ZFC (t)/dlnt dramatically changes from the curves of simple spin glass ageing effect below T cl to those of two peaks above T cl . The characteristic relaxation behaviour apparently indicates that there coexist two different kinds of glassy correlated region below T cu

  15. Topology-based hierarchical scheduling using deficit round robin

    DEFF Research Database (Denmark)

    Yu, Hao; Yan, Ying; Berger, Michael Stubert

    2009-01-01

    according to the topology. The mapping process could be completed through the network management plane or by manual configuration. Based on the knowledge of the network, the scheduler can manage the traffic on behalf of other less advanced nodes, avoid potential traffic congestion, and provide flow...... protection and isolation. Comparisons between hierarchical scheduling, flow-based scheduling, and class-based scheduling schemes have been carried out under a symmetric tree topology. Results have shown that the hierarchical scheduling scheme provides better flow protection and isolation from attack...

  16. GLOBULAR CLUSTER POPULATIONS: FIRST RESULTS FROM S{sup 4}G EARLY-TYPE GALAXIES

    Energy Technology Data Exchange (ETDEWEB)

    Zaritsky, Dennis [Steward Observatory, University of Arizona, 933 North Cherry Avenue, Tucson, AZ 85721 (United States); Aravena, Manuel [Núcleo de Astronomía, Facultad de Ingeniería, Universidad Diego Portales, Avenida Ejército 441, Santiago (Chile); Athanassoula, E.; Bosma, Albert [Aix Marseille Université, CNRS, LAM (Laboratoire d' Astrophysique de Marseille) UMR 7326, F-13388 Marseille (France); Comerón, Sébastien; Laine, Jarkko; Laurikainen, Eija; Salo, Heikki [Astronomy Division, Department of Physics, P.O. Box 3000, FI-90014 University of Oulu (Finland); Elmegreen, Bruce G. [IBM T. J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598 (United States); Erroz-Ferrer, Santiago; Knapen, Johan H. [Instituto de Astrofísica de Canarias, Vía Lácteas, E-38205 La Laguna (Spain); Gadotti, Dimitri A.; Muñoz-Mateos, Juan Carlos [European Southern Observatory, Casilla 19001, Santiago 19 (Chile); Hinz, Joannah L. [MMT Observatory, P.O. Box 210065, Tucson, AZ 85721 (United States); Ho, Luis C. [Kavli Institute for Astronomy and Astrophysics, Peking University, Beijing 100871 (China); Holwerda, Benne [Leiden Observatory, University of Leiden, Niels Bohrweg 4, NL-2333 Leiden (Netherlands); Sheth, Kartik, E-mail: dennis.zaritsky@gmail.com [National Radio Astronomy Observatory/NAASC, 520 Edgemont Road, Charlottesville, VA 22903 (United States)

    2015-02-01

    Using 3.6 μm images of 97 early-type galaxies, we develop and verify methodology to measure globular cluster populations from the S{sup 4}G survey images. We find that (1) the ratio, T {sub N}, of the number of clusters, N {sub CL}, to parent galaxy stellar mass, M {sub *}, rises weakly with M {sub *} for early-type galaxies with M {sub *} > 10{sup 10} M {sub ☉} when we calculate galaxy masses using a universal stellar initial mass function (IMF) but that the dependence of T {sub N} on M {sub *} is removed entirely once we correct for the recently uncovered systematic variation of IMF with M {sub *}; and (2) for M {sub *} < 10{sup 10} M {sub ☉}, there is no trend between N {sub CL} and M {sub *}, the scatter in T {sub N} is significantly larger (approaching two orders of magnitude), and there is evidence to support a previous, independent suggestion of two families of galaxies. The behavior of N {sub CL} in the lower-mass systems is more difficult to measure because these systems are inherently cluster-poor, but our results may add to previous evidence that large variations in cluster formation and destruction efficiencies are to be found among low-mass galaxies. The average fraction of stellar mass in clusters is ∼0.0014 for M {sub *} > 10{sup 10} M {sub ☉} and can be as large as ∼0.02 for less massive galaxies. These are the first results from the S{sup 4}G sample of galaxies and will be enhanced by the sample of early-type galaxies now being added to S{sup 4}G and complemented by the study of later-type galaxies within S{sup 4}G.

  17. Hierarchically nanostructured hydroxyapatite: hydrothermal synthesis, morphology control, growth mechanism, and biological activity

    Directory of Open Access Journals (Sweden)

    Ma MG

    2012-04-01

    Full Text Available Ming-Guo MaInstitute of Biomass Chemistry and Technology, College of Materials Science and Technology, Beijing Forestry University, Beijing, People's Republic of ChinaAbstract: Hierarchically nanosized hydroxyapatite (HA with flower-like structure assembled from nanosheets consisting of nanorod building blocks was successfully synthesized by using CaCl2, NaH2PO4, and potassium sodium tartrate via a hydrothermal method at 200°C for 24 hours. The effects of heating time and heating temperature on the products were investigated. As a chelating ligand and template molecule, the potassium sodium tartrate plays a key role in the formation of hierarchically nanostructured HA. On the basis of experimental results, a possible mechanism based on soft-template and self-assembly was proposed for the formation and growth of the hierarchically nanostructured HA. Cytotoxicity experiments indicated that the hierarchically nanostructured HA had good biocompatibility. It was shown by in-vitro experiments that mesenchymal stem cells could attach to the hierarchically nanostructured HA after being cultured for 48 hours.Objective: The purpose of this study was to develop facile and effective methods for the synthesis of novel hydroxyapatite (HA with hierarchical nanostructures assembled from independent and discrete nanobuilding blocks.Methods: A simple hydrothermal approach was applied to synthesize HA by using CaCl2, NaH2PO4, and potassium sodium tartrate at 200°C for 24 hours. The cell cytotoxicity of the hierarchically nanostructured HA was tested by MTT (3-(4,5-dimethylthiazol-2-yl-2,5-diphenyltetrazolium bromide assay.Results: HA displayed the flower-like structure assembled from nanosheets consisting of nanorod building blocks. The potassium sodium tartrate was used as a chelating ligand, inducing the formation and self-assembly of HA nanorods. The heating time and heating temperature influenced the aggregation and morphology of HA. The cell viability did

  18. Lung Cancer Signature Biomarkers: tissue specific semantic similarity based clustering of Digital Differential Display (DDD data

    Directory of Open Access Journals (Sweden)

    Srivastava Mousami

    2012-11-01

    Full Text Available Abstract Background The tissue-specific Unigene Sets derived from more than one million expressed sequence tags (ESTs in the NCBI, GenBank database offers a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used ‘Gene Ontology semantic similarity score’ to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal and disease (cancer sets. This semantic similarity score matrix based on hierarchical clustering represents in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping. Multiple bootstrapping also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Results Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95 identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panel 1–4. Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked metastasis diagnostic biomarkers (panel 1, chemotherapy/drug resistance biomarkers (panel 2, hypoxia regulated biomarkers (panel 3 and lung extra cellular matrix biomarkers (panel 4. Conclusions Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3, HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1

  19. Hierarchically Nanostructured Materials for Sustainable Environmental Applications

    Directory of Open Access Journals (Sweden)

    Zheng eRen

    2013-11-01

    Full Text Available This article presents a comprehensive overview of the hierarchical nanostructured materials with either geometry or composition complexity in environmental applications. The hierarchical nanostructures offer advantages of high surface area, synergistic interactions and multiple functionalities towards water remediation, environmental gas sensing and monitoring as well as catalytic gas treatment. Recent advances in synthetic strategies for various hierarchical morphologies such as hollow spheres and urchin-shaped architectures have been reviewed. In addition to the chemical synthesis, the physical mechanisms associated with the materials design and device fabrication have been discussed for each specific application. The development and application of hierarchical complex perovskite oxide nanostructures have also been introduced in photocatalytic water remediation, gas sensing and catalytic converter. Hierarchical nanostructures will open up many possibilities for materials design and device fabrication in environmental chemistry and technology.

  20. The VMC Survey. XXIX. Turbulence-controlled Hierarchical Star Formation in the Small Magellanic Cloud

    Science.gov (United States)

    Sun, Ning-Chen; de Grijs, Richard; Cioni, Maria-Rosa L.; Rubele, Stefano; Subramanian, Smitha; van Loon, Jacco Th.; Bekki, Kenji; Bell, Cameron P. M.; Ivanov, Valentin D.; Marconi, Marcella; Muraveva, Tatiana; Oliveira, Joana M.; Ripepi, Vincenzo

    2018-05-01

    In this paper we report a clustering analysis of upper main-sequence stars in the Small Magellanic Cloud, using data from the VMC survey (the VISTA near-infrared YJK s survey of the Magellanic system). Young stellar structures are identified as surface overdensities on a range of significance levels. They are found to be organized in a hierarchical pattern, such that larger structures at lower significance levels contain smaller ones at higher significance levels. They have very irregular morphologies, with a perimeter–area dimension of 1.44 ± 0.02 for their projected boundaries. They have a power-law mass–size relation, power-law size/mass distributions, and a log-normal surface density distribution. We derive a projected fractal dimension of 1.48 ± 0.03 from the mass–size relation, or of 1.4 ± 0.1 from the size distribution, reflecting significant lumpiness of the young stellar structures. These properties are remarkably similar to those of a turbulent interstellar medium, supporting a scenario of hierarchical star formation regulated by supersonic turbulence.

  1. Psychosocial Clusters and their Associations with Well-Being and Health: An Empirical Strategy for Identifying Psychosocial Predictors Most Relevant to Racially/Ethnically Diverse Women’s Health

    Science.gov (United States)

    Jabson, Jennifer M.; Bowen, Deborah; Weinberg, Janice; Kroenke, Candyce; Luo, Juhua; Messina, Catherine; Shumaker, Sally; Tindle, Hilary A.

    2016-01-01

    BACKGROUND Strategies for identifying the most relevant psychosocial predictors in studies of racial/ethnic minority women’s health are limited because they largely exclude cultural influences and they assume that psychosocial predictors are independent. This paper proposes and tests an empirical solution. METHODS Hierarchical cluster analysis, conducted with data from 140,652 Women’s Health Initiative participants, identified clusters among individual psychosocial predictors. Multivariable analyses tested associations between clusters and health outcomes. RESULTS A Social Cluster and a Stress Cluster were identified. The Social Cluster was positively associated with well-being and inversely associated with chronic disease index, and the Stress Cluster was inversely associated with well-being and positively associated with chronic disease index. As hypothesized, the magnitude of association between clusters and outcomes differed by race/ethnicity. CONCLUSIONS By identifying psychosocial clusters and their associations with health, we have taken an important step toward understanding how individual psychosocial predictors interrelate and how empirically formed Stress and Social clusters relate to health outcomes. This study has also demonstrated important insight about differences in associations between these psychosocial clusters and health among racial/ethnic minorities. These differences could signal the best pathways for intervention modification and tailoring. PMID:27279761

  2. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong; Wu, Tao

    2017-01-01

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced

  3. Planck early results. XII. Cluster Sunyaev-Zeldovich optical scaling relations

    DEFF Research Database (Denmark)

    Poutanen, T.; Natoli, P.; Polenta, G.

    2011-01-01

    We present the Sunyaev-Zeldovich (SZ) signal-to-richness scaling relation (Y500 - N200) for the MaxBCG cluster catalogue. Employing a multi-frequency matched filter on the Planck sky maps, we measure the SZ signal for each cluster by adapting the filter according to weak-lensing calibrated mass-r...

  4. Planck 2015 results: XXIV. Cosmology from Sunyaev-Zeldovich cluster counts

    DEFF Research Database (Denmark)

    Ade, P. A R; Aghanim, N.; Arnaud, M.

    2016-01-01

    We present cluster counts and corresponding cosmological constraints from the Planck full mission data set. Our catalogue consists of 439 clusters detected via their Sunyaev-Zeldovich (SZ) signal down to a signal-to-noise ratio of 6, and is more than a factor of 2 larger than the 2013 Planck clus...

  5. Heuristics for Hierarchical Partitioning with Application to Model Checking

    DEFF Research Database (Denmark)

    Möller, Michael Oliver; Alur, Rajeev

    2001-01-01

    Given a collection of connected components, it is often desired to cluster together parts of strong correspondence, yielding a hierarchical structure. We address the automation of this process and apply heuristics to battle the combinatorial and computational complexity. We define a cost function...... that captures the quality of a structure relative to the connections and favors shallow structures with a low degree of branching. Finding a structure with minimal cost is NP-complete. We present a greedy polynomial-time algorithm that approximates good solutions incrementally by local evaluation of a heuristic...... function. We argue for a heuristic function based on four criteria: the number of enclosed connections, the number of components, the number of touched connections and the depth of the structure. We report on an application in the context of formal verification, where our algorithm serves as a preprocessor...

  6. Hierarchically organized layout for visualization of biochemical pathways.

    Science.gov (United States)

    Tsay, Jyh-Jong; Wu, Bo-Liang; Jeng, Yu-Sen

    2010-01-01

    Many complex pathways are described as hierarchical structures in which a pathway is recursively partitioned into several sub-pathways, and organized hierarchically as a tree. The hierarchical structure provides a natural way to visualize the global structure of a complex pathway. However, none of the previous research on pathway visualization explores the hierarchical structures provided by many complex pathways. In this paper, we aim to develop algorithms that can take advantages of hierarchical structures, and give layouts that explore the global structures as well as local structures of pathways. We present a new hierarchically organized layout algorithm to produce layouts for hierarchically organized pathways. Our algorithm first decomposes a complex pathway into sub-pathway groups along the hierarchical organization, and then partition each sub-pathway group into basic components. It then applies conventional layout algorithms, such as hierarchical layout and force-directed layout, to compute the layout of each basic component. Finally, component layouts are joined to form a final layout of the pathway. Our main contribution is the development of algorithms for decomposing pathways and joining layouts. Experiment shows that our algorithm is able to give comprehensible visualization for pathways with hierarchies, cycles as well as complex structures. It clearly renders the global component structures as well as the local structure in each component. In addition, it runs very fast, and gives better visualization for many examples from previous related research. 2009 Elsevier B.V. All rights reserved.

  7. Hierarchical Cu precipitation in lamellated steel after multistage heat treatment

    Science.gov (United States)

    Liu, Qingdong; Gu, Jianfeng

    2017-09-01

    The hierarchical distribution of Cu-rich precipitates (CRPs) and related partitioning and segregation behaviours of solute atoms were investigated in a 1.54 Cu-3.51 Ni (wt.%) low-carbon high-strength low-alloy (HSLA) steel after multistage heat treatment by using the combination of electron backscatter diffraction (EBSD), transmission electron microscopy (TEM) and atom probe tomography (APT). Intercritical tempering at 725 °C of as-quenched lathlike martensitic structure leads to the coprecipitation of CRPs at the periphery of a carbide precipitate which is possibly in its paraequilibrium state due to distinct solute segregation at the interface. The alloyed carbide and CRPs provide constituent elements for each other and make the coprecipitation thermodynamically favourable. Meanwhile, austenite reversion occurs to form fresh secondary martensite (FSM) zone where is rich in Cu and pertinent Ni and Mn atoms, which gives rise to a different distributional morphology of CRPs with large size and high density. In addition, conventional tempering at 500 °C leads to the formation of nanoscale Cu-rich clusters in α-Fe matrix. As a consequence, three populations of CRPs are hierarchically formed around carbide precipitate, at FSM zone and in α-Fe matrix. The formation of different precipitated features can be turned by controlling diffusion pathways of related solute atoms and further to tailor mechanical properties via proper multistage heat treatments.

  8. Organization of excitable dynamics in hierarchical biological networks.

    Directory of Open Access Journals (Sweden)

    Mark Müller-Linow

    Full Text Available This study investigates the contributions of network topology features to the dynamic behavior of hierarchically organized excitable networks. Representatives of different types of hierarchical networks as well as two biological neural networks are explored with a three-state model of node activation for systematically varying levels of random background network stimulation. The results demonstrate that two principal topological aspects of hierarchical networks, node centrality and network modularity, correlate with the network activity patterns at different levels of spontaneous network activation. The approach also shows that the dynamic behavior of the cerebral cortical systems network in the cat is dominated by the network's modular organization, while the activation behavior of the cellular neuronal network of Caenorhabditis elegans is strongly influenced by hub nodes. These findings indicate the interaction of multiple topological features and dynamic states in the function of complex biological networks.

  9. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong

    2017-08-03

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced on a large-scale approach. The unique HNDCM holds great promise as components in separation and advanced carbon devices because they could offer unconventional fluidic transport phenomena on the nanoscale. Overall, the invention set forth herein covers a hierarchically structured, nitrogen-doped carbon membranes and methods of making and using such a membranes.

  10. Hierarchical Rhetorical Sentence Categorization for Scientific Papers

    Science.gov (United States)

    Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.

    2018-03-01

    Important information in scientific papers can be composed of rhetorical sentences that is structured from certain categories. To get this information, text categorization should be conducted. Actually, some works in this task have been completed by employing word frequency, semantic similarity words, hierarchical classification, and the others. Therefore, this paper aims to present the rhetorical sentence categorization from scientific paper by employing TF-IDF and Word2Vec to capture word frequency and semantic similarity words and employing hierarchical classification. Every experiment is tested in two classifiers, namely Naïve Bayes and SVM Linear. This paper shows that hierarchical classifier is better than flat classifier employing either TF-IDF or Word2Vec, although it increases only almost 2% from 27.82% when using flat classifier until 29.61% when using hierarchical classifier. It shows also different learning model for child-category can be built by hierarchical classifier.

  11. A generic algorithm for constructing hierarchical representations of geometric objects

    International Nuclear Information System (INIS)

    Xavier, P.G.

    1995-01-01

    For a number of years, robotics researchers have exploited hierarchical representations of geometrical objects and scenes in motion-planning, collision-avoidance, and simulation. However, few general techniques exist for automatically constructing them. We present a generic, bottom-up algorithm that uses a heuristic clustering technique to produced balanced, coherent hierarchies. Its worst-case running time is O(N 2 logN), but for non-pathological cases it is O(NlogN), where N is the number of input primitives. We have completed a preliminary C++ implementation for input collections of 3D convex polygons and 3D convex polyhedra and conducted simple experiments with scenes of up to 12,000 polygons, which take only a few minutes to process. We present examples using spheres and convex hulls as hierarchy primitives

  12. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

    Science.gov (United States)

    Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

    2015-10-01

    Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.

  13. Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

    Science.gov (United States)

    Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

    2016-01-01

    Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969

  14. Cardiometabolic risk clustering in spinal cord injury: results of exploratory factor analysis.

    Science.gov (United States)

    Libin, Alexander; Tinsley, Emily A; Nash, Mark S; Mendez, Armando J; Burns, Patricia; Elrod, Matt; Hamm, Larry F; Groah, Suzanne L

    2013-01-01

    Evidence suggests an elevated prevalence of cardiometabolic risks among persons with spinal cord injury (SCI); however, the unique clustering of risk factors in this population has not been fully explored. The purpose of this study was to describe unique clustering of cardiometabolic risk factors differentiated by level of injury. One hundred twenty-one subjects (mean 37 ± 12 years; range, 18-73) with chronic C5 to T12 motor complete SCI were studied. Assessments included medical histories, anthropometrics and blood pressure, and fasting serum lipids, glucose, insulin, and hemoglobin A1c (HbA1c). The most common cardiometabolic risk factors were overweight/obesity, high levels of low-density lipoprotein (LDL-C), and low levels of high-density lipoprotein (HDL-C). Risk clustering was found in 76.9% of the population. Exploratory principal component factor analysis using varimax rotation revealed a 3-factor model in persons with paraplegia (65.4% variance) and a 4-factor solution in persons with tetraplegia (73.3% variance). The differences between groups were emphasized by the varied composition of the extracted factors: Lipid Profile A (total cholesterol [TC] and LDL-C), Body Mass-Hypertension Profile (body mass index [BMI], systolic blood pressure [SBP], and fasting insulin [FI]); Glycemic Profile (fasting glucose and HbA1c), and Lipid Profile B (TG and HDL-C). BMI and SBP formed a separate factor only in persons with tetraplegia. Although the majority of the population with SCI has risk clustering, the composition of the risk clusters may be dependent on level of injury, based on a factor analysis group comparison. This is clinically plausible and relevant as tetraplegics tend to be hypo- to normotensive and more sedentary, resulting in lower HDL-C and a greater propensity toward impaired carbohydrate metabolism.

  15. IDENTIFYING REGIONAL CLUSTER MANAGEMENT POTENTIALS EMPIRICAL RESULTS FROM THREE NORTH RHINEWESTPHALIAN REGIONS

    OpenAIRE

    Rudiger Hamm; Christiane Goebel

    2010-01-01

    The development and support of clusters is an issue that became quite popular by players dealing with regional economic policy. But before a regional development agency can start to implement a cluster-oriented strategy there a two question that have to be answered: 1. What are the regional fields of competence (cluster potentials) that fulfill the requirements for a cluster-oriented regional development policy? 2. If you find such regional fields of competence, are the enterprises willing to...

  16. One-step synthesis of SnCo nanoconfined in hierarchical carbon nanostructures for lithium ion battery anode.

    Science.gov (United States)

    Qin, Jian; Liu, Dongye; Zhang, Xiang; Zhao, Naiqin; Shi, Chunsheng; Liu, En-Zuo; He, Fang; Ma, Liying; Li, Qunying; Li, Jiajun; He, Chunnian

    2017-10-26

    A new strategy for the one-step synthesis of a 0D SnCo nanoparticles-1D carbon nanotubes-3D hollow carbon submicrocube cluster (denoted as SnCo@CNT-3DC) hierarchical nanostructured material was developed via a simple chemical vapor deposition (CVD) process with the assistance of a water-soluble salt (NaCl). The adopted NaCl not only acted as a cubic template for inducing the formation of the 3D hollow carbon submicrocube cluster but also provides a substrate for the SnCo catalysts impregnation and CNT growth, ultimately leading to the successful construction of the unique 0D-1D-3D structured SnCo@CNT-3DC during the CVD of C 2 H 2 . When utilized as a lithium-ion battery anode, the SnCo@CNT-3DC composite electrode demonstrated an excellent rate performance and cycling stability for Li-ion storage. Specifically, an impressive reversible capacity of 826 mA h g -1 after 100 cycles at 0.1 A g -1 and a high rate capacity of 278 mA h g -1 even after 1000 cycles at 5 A g -1 were achieved. This remarkable electrochemical performance could be ascribed to the unique hierarchical nanostructure of SnCo@CNT-3DC, which guarantees a deep permeation of electrolytes and a shortened lithium salt diffusion pathway in the solid phase as well as numerous hyperchannels for electron transfer.

  17. Prevalence of Metabolic Syndrome and Its Components among Japanese Workers by Clustered Business Category.

    Science.gov (United States)

    Hidaka, Tomoo; Hayakawa, Takehito; Kakamu, Takeyasu; Kumagai, Tomohiro; Hiruta, Yuhei; Hata, Junko; Tsuji, Masayoshi; Fukushima, Tetsuhito

    2016-01-01

    The present study was a cross-sectional study conducted to reveal the prevalence of metabolic syndrome and its components and describe the features of such prevalence among Japanese workers by clustered business category using big data. The data of approximately 120,000 workers were obtained from a national representative insurance organization, and the study analyzed the health checkup and questionnaire results according to the field of business of each subject. Abnormalities found during the checkups such as excessive waist circumference, hypertension or glucose intolerance, and metabolic syndrome, were recorded. All subjects were classified by business field into 18 categories based on The North American Industry Classification System. Based on the criteria of the Japanese Committee for the Diagnostic Criteria of Metabolic Syndrome, the standardized prevalence ratio (SPR) of metabolic syndrome and its components by business category was calculated, and the 95% confidence interval of the SPR was computed. Hierarchical cluster analysis was then performed based on the SPR of metabolic syndrome components, and the 18 business categories were classified into three clusters for both males and females. The following business categories were at significantly high risk of metabolic syndrome: among males, Construction, Transportation, Professional Services, and Cooperative Association; and among females, Health Care and Cooperative Association. The results of the cluster analysis indicated one cluster for each gender with a higher prevalence of metabolic syndrome components; among males, a cluster consisting of Manufacturing, Transportation, Finance, and Cooperative Association, and among females, a cluster consisting of Mining, Transportation, Finance, Accommodation, and Cooperative Association. These findings reveal that, when providing health guidance and support regarding metabolic syndrome, consideration must be given to its components and the variety of its

  18. Prevalence of Metabolic Syndrome and Its Components among Japanese Workers by Clustered Business Category.

    Directory of Open Access Journals (Sweden)

    Tomoo Hidaka

    Full Text Available The present study was a cross-sectional study conducted to reveal the prevalence of metabolic syndrome and its components and describe the features of such prevalence among Japanese workers by clustered business category using big data. The data of approximately 120,000 workers were obtained from a national representative insurance organization, and the study analyzed the health checkup and questionnaire results according to the field of business of each subject. Abnormalities found during the checkups such as excessive waist circumference, hypertension or glucose intolerance, and metabolic syndrome, were recorded. All subjects were classified by business field into 18 categories based on The North American Industry Classification System. Based on the criteria of the Japanese Committee for the Diagnostic Criteria of Metabolic Syndrome, the standardized prevalence ratio (SPR of metabolic syndrome and its components by business category was calculated, and the 95% confidence interval of the SPR was computed. Hierarchical cluster analysis was then performed based on the SPR of metabolic syndrome components, and the 18 business categories were classified into three clusters for both males and females. The following business categories were at significantly high risk of metabolic syndrome: among males, Construction, Transportation, Professional Services, and Cooperative Association; and among females, Health Care and Cooperative Association. The results of the cluster analysis indicated one cluster for each gender with a higher prevalence of metabolic syndrome components; among males, a cluster consisting of Manufacturing, Transportation, Finance, and Cooperative Association, and among females, a cluster consisting of Mining, Transportation, Finance, Accommodation, and Cooperative Association. These findings reveal that, when providing health guidance and support regarding metabolic syndrome, consideration must be given to its components and the

  19. Ecosystem health pattern analysis of urban clusters based on emergy synthesis: Results and implication for management

    International Nuclear Information System (INIS)

    Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan

    2013-01-01

    The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis

  20. Blooming Trees: Substructures and Surrounding Groups of Galaxy Clusters

    Science.gov (United States)

    Yu, Heng; Diaferio, Antonaldo; Serra, Ana Laura; Baldi, Marco

    2018-06-01

    We develop the Blooming Tree Algorithm, a new technique that uses spectroscopic redshift data alone to identify the substructures and the surrounding groups of galaxy clusters, along with their member galaxies. Based on the estimated binding energy of galaxy pairs, the algorithm builds a binary tree that hierarchically arranges all of the galaxies in the field of view. The algorithm searches for buds, corresponding to gravitational potential minima on the binary tree branches; for each bud, the algorithm combines the number of galaxies, their velocity dispersion, and their average pairwise distance into a parameter that discriminates between the buds that do not correspond to any substructure or group, and thus eventually die, and the buds that correspond to substructures and groups, and thus bloom into the identified structures. We test our new algorithm with a sample of 300 mock redshift surveys of clusters in different dynamical states; the clusters are extracted from a large cosmological N-body simulation of a ΛCDM model. We limit our analysis to substructures and surrounding groups identified in the simulation with mass larger than 1013 h ‑1 M ⊙. With mock redshift surveys with 200 galaxies within 6 h ‑1 Mpc from the cluster center, the technique recovers 80% of the real substructures and 60% of the surrounding groups; in 57% of the identified structures, at least 60% of the member galaxies of the substructures and groups belong to the same real structure. These results improve by roughly a factor of two the performance of the best substructure identification algorithm currently available, the σ plateau algorithm, and suggest that our Blooming Tree Algorithm can be an invaluable tool for detecting substructures of galaxy clusters and investigating their complex dynamics.

  1. Zeolitic materials with hierarchical porous structures.

    Science.gov (United States)

    Lopez-Orozco, Sofia; Inayat, Amer; Schwab, Andreas; Selvam, Thangaraj; Schwieger, Wilhelm

    2011-06-17

    During the past several years, different kinds of hierarchical structured zeolitic materials have been synthesized due to their highly attractive properties, such as superior mass/heat transfer characteristics, lower restriction of the diffusion of reactants in the mesopores, and low pressure drop. Our contribution provides general information regarding types and preparation methods of hierarchical zeolitic materials and their relative advantages and disadvantages. Thereafter, recent advances in the preparation and characterization of hierarchical zeolitic structures within the crystallites by post-synthetic treatment methods, such as dealumination or desilication; and structured devices by in situ and ex situ zeolite coatings on open-cellular ceramic foams as (non-reactive as well as reactive) supports are highlighted. Specific advantages of using hierarchical zeolitic catalysts/structures in selected catalytic reactions, such as benzene to phenol (BTOP) and methanol to olefins (MTO) are presented. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Cluster analysis of obesity and asthma phenotypes.

    Directory of Open Access Journals (Sweden)

    E Rand Sutherland

    Full Text Available Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC. Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized an hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype.In a cohort of clinical trial participants (n = 250, minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα and induction of MAP kinase phosphatase-1 (MKP-1 expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m(2 and severity of asthma symptoms (AEQ score the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively. Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ and control (ACQ, exhaled nitric oxide concentration (F(ENO and airway hyperresponsiveness (methacholine PC(20 but were similar with regard to measures of lung function (FEV(1 (% and FEV(1/FVC, airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP. Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasoneObesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals

  3. Wetting and Dewetting Transitions on Submerged Superhydrophobic Surfaces with Hierarchical Structures.

    Science.gov (United States)

    Wu, Huaping; Yang, Zhe; Cao, Binbin; Zhang, Zheng; Zhu, Kai; Wu, Bingbing; Jiang, Shaofei; Chai, Guozhong

    2017-01-10

    The wetting transition on submersed superhydrophobic surfaces with hierarchical structures and the influence of trapped air on superhydrophobic stability are predicted based on the thermodynamics and mechanical analyses. The dewetting transition on the hierarchically structured surfaces is investigated, and two necessary thermodynamic conditions and a mechanical balance condition for dewetting transition are proposed. The corresponding thermodynamic phase diagram of reversible transition and the critical reversed pressure well explain the experimental results reported previously. Our theory provides a useful guideline for precise controlling of breaking down and recovering of superhydrophobicity by designing superhydrophobic surfaces with hierarchical structures under water.

  4. Band structures of two dimensional solid/air hierarchical phononic crystals

    International Nuclear Information System (INIS)

    Xu, Y.L.; Tian, X.G.; Chen, C.Q.

    2012-01-01

    The hierarchical phononic crystals to be considered show a two-order “hierarchical” feature, which consists of square array arranged macroscopic periodic unit cells with each unit cell itself including four sub-units. Propagation of acoustic wave in such two dimensional solid/air phononic crystals is investigated by the finite element method (FEM) with the Bloch theory. Their band structure, wave filtering property, and the physical mechanism responsible for the broadened band gap are explored. The corresponding ordinary phononic crystal without hierarchical feature is used for comparison. Obtained results show that the solid/air hierarchical phononic crystals possess tunable outstanding band gap features, which are favorable for applications such as sound insulation and vibration attenuation.

  5. Resolution of Singularities Introduced by Hierarchical Structure in Deep Neural Networks.

    Science.gov (United States)

    Nitta, Tohru

    2017-10-01

    We present a theoretical analysis of singular points of artificial deep neural networks, resulting in providing deep neural network models having no critical points introduced by a hierarchical structure. It is considered that such deep neural network models have good nature for gradient-based optimization. First, we show that there exist a large number of critical points introduced by a hierarchical structure in deep neural networks as straight lines, depending on the number of hidden layers and the number of hidden neurons. Second, we derive a sufficient condition for deep neural networks having no critical points introduced by a hierarchical structure, which can be applied to general deep neural networks. It is also shown that the existence of critical points introduced by a hierarchical structure is determined by the rank and the regularity of weight matrices for a specific class of deep neural networks. Finally, two kinds of implementation methods of the sufficient conditions to have no critical points are provided. One is a learning algorithm that can avoid critical points introduced by the hierarchical structure during learning (called avoidant learning algorithm). The other is a neural network that does not have some critical points introduced by the hierarchical structure as an inherent property (called avoidant neural network).

  6. Multiperiod Hierarchical Location Problem of Transit Hub in Urban Agglomeration Area

    Directory of Open Access Journals (Sweden)

    Ting-ting Li

    2017-01-01

    Full Text Available With the rapid urbanization in developing countries, urban agglomeration area (UAA forms. Also, transportation demand in UAA grows rapidly and presents hierarchical feature. Therefore, it is imperative to develop models for transit hubs to guide the development of UAA and better meet the time-varying and hierarchical transportation demand. In this paper, the multiperiod hierarchical location problem of transit hub in urban agglomeration area (THUAA is studied. A hierarchical service network of THUAA with a multiflow, nested, and noncoherent structure is described. Then a multiperiod hierarchical mathematical programming model is proposed, aiming at minimizing the total demand weighted travel time. Moreover, an improved adaptive clonal selection algorithm is presented to solve the model. Both the model and algorithm are verified by the application to a real-life problem of Beijing-Tianjin-Hebei Region in China. The results of different scenarios in the case show that urban population migration has a great impact on the THUAA location scheme. Sustained and appropriate urban population migration helps to reduce travel time for urban residents.

  7. Hierarchical Nanoceramics for Industrial Process Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Ruud, James, A.; Brosnan, Kristen, H.; Striker, Todd; Ramaswamy, Vidya; Aceto, Steven, C.; Gao, Yan; Willson, Patrick, D.; Manoharan, Mohan; Armstrong, Eric, N., Wachsman, Eric, D.; Kao, Chi-Chang

    2011-07-15

    This project developed a robust, tunable, hierarchical nanoceramics materials platform for industrial process sensors in harsh-environments. Control of material structure at multiple length scales from nano to macro increased the sensing response of the materials to combustion gases. These materials operated at relatively high temperatures, enabling detection close to the source of combustion. It is anticipated that these materials can form the basis for a new class of sensors enabling widespread use of efficient combustion processes with closed loop feedback control in the energy-intensive industries. The first phase of the project focused on materials selection and process development, leading to hierarchical nanoceramics that were evaluated for sensing performance. The second phase focused on optimizing the materials processes and microstructures, followed by validation of performance of a prototype sensor in a laboratory combustion environment. The objectives of this project were achieved by: (1) synthesizing and optimizing hierarchical nanostructures; (2) synthesizing and optimizing sensing nanomaterials; (3) integrating sensing functionality into hierarchical nanostructures; (4) demonstrating material performance in a sensing element; and (5) validating material performance in a simulated service environment. The project developed hierarchical nanoceramic electrodes for mixed potential zirconia gas sensors with increased surface area and demonstrated tailored electrocatalytic activity operable at high temperatures enabling detection of products of combustion such as NOx close to the source of combustion. Methods were developed for synthesis of hierarchical nanostructures with high, stable surface area, integrated catalytic functionality within the structures for gas sensing, and demonstrated materials performance in harsh lab and combustion gas environments.

  8. Formation of stable products from cluster-cluster collisions

    International Nuclear Information System (INIS)

    Alamanova, Denitsa; Grigoryan, Valeri G; Springborg, Michael

    2007-01-01

    The formation of stable products from copper cluster-cluster collisions is investigated by using classical molecular-dynamics simulations in combination with an embedded-atom potential. The dependence of the product clusters on impact energy, relative orientation of the clusters, and size of the clusters is studied. The structures and total energies of the product clusters are analysed and compared with those of the colliding clusters before impact. These results, together with the internal temperature, are used in obtaining an increased understanding of cluster fusion processes

  9. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...

  10. Improved Adhesion and Compliancy of Hierarchical Fibrillar Adhesives.

    Science.gov (United States)

    Li, Yasong; Gates, Byron D; Menon, Carlo

    2015-08-05

    The gecko relies on van der Waals forces to cling onto surfaces with a variety of topography and composition. The hierarchical fibrillar structures on their climbing feet, ranging from mesoscale to nanoscale, are hypothesized to be key elements for the animal to conquer both smooth and rough surfaces. An epoxy-based artificial hierarchical fibrillar adhesive was prepared to study the influence of the hierarchical structures on the properties of a dry adhesive. The presented experiments highlight the advantages of a hierarchical structure despite a reduction of overall density and aspect ratio of nanofibrils. In contrast to an adhesive containing only nanometer-size fibrils, the hierarchical fibrillar adhesives exhibited a higher adhesion force and better compliancy when tested on an identical substrate.

  11. Clustering of Whole-Brain White Matter Short Association Bundles Using HARDI Data

    Directory of Open Access Journals (Sweden)

    Claudio Román

    2017-12-01

    Full Text Available Human brain connectivity is extremely complex and variable across subjects. While long association and projection bundles are stable and have been deeply studied, short association bundles present higher intersubject variability, and few studies have been carried out to adequately describe the structure, shape, and reproducibility of these bundles. However, their analysis is crucial to understand brain function and better characterize the human connectome. In this study, we propose an automatic method to identify reproducible short association bundles of the superficial white matter, based on intersubject hierarchical clustering. The method is applied to the whole brain and finds representative clusters of similar fibers belonging to a group of subjects, according to a distance metric between fibers. We experimented with both affine and non-linear registrations and, due to better reproducibility, chose the results obtained from non-linear registration. Once the clusters are calculated, our method performs automatic labeling of the most stable connections based on individual cortical parcellations. We compare results between two independent groups of subjects from a HARDI database to generate reproducible connections for the creation of an atlas. To perform a better validation of the results, we used a bagging strategy that uses pairs of groups of 27 subjects from a database of 74 subjects. The result is an atlas with 44 bundles in the left hemisphere and 49 in the right hemisphere, of which 33 bundles are found in both hemispheres. Finally, we use the atlas to automatically segment 78 new subjects from a different HARDI database and to analyze stability and lateralization results.

  12. Application of hierarchical matrices for partial inverse

    KAUST Repository

    Litvinenko, Alexander

    2013-11-26

    In this work we combine hierarchical matrix techniques (Hackbusch, 1999) and domain decomposition methods to obtain fast and efficient algorithms for the solution of multiscale problems. This combination results in the hierarchical domain decomposition (HDD) method, which can be applied for solution multi-scale problems. Multiscale problems are problems that require the use of different length scales. Using only the finest scale is very expensive, if not impossible, in computational time and memory. Domain decomposition methods decompose the complete problem into smaller systems of equations corresponding to boundary value problems in subdomains. Then fast solvers can be applied to each subdomain. Subproblems in subdomains are independent, much smaller and require less computational resources as the initial problem.

  13. Maximum-likelihood model averaging to profile clustering of site types across discrete linear sequences.

    Directory of Open Access Journals (Sweden)

    Zhang Zhang

    2009-06-01

    Full Text Available A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.

  14. Shape Analysis of HII Regions - I. Statistical Clustering

    Science.gov (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation is linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordinance technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  15. Spatial Region Estimation for Autonomous CoT Clustering Using Hidden Markov Model

    Directory of Open Access Journals (Sweden)

    Joon‐young Jung

    2018-02-01

    Full Text Available This paper proposes a hierarchical dual filtering (HDF algorithm to estimate the spatial region between a Cloud of Things (CoT gateway and an Internet of Things (IoT device. The accuracy of the spatial region estimation is important for autonomous CoT clustering. We conduct spatial region estimation using a hidden Markov model (HMM with a raw Bluetooth received signal strength indicator (RSSI. However, the accuracy of the region estimation using the validation data is only 53.8%. To increase the accuracy of the spatial region estimation, the HDF algorithm removes the high‐frequency signals hierarchically, and alters the parameters according to whether the IoT device moves. The accuracy of spatial region estimation using a raw RSSI, Kalman filter, and HDF are compared to evaluate the effectiveness of the HDF algorithm. The success rate and root mean square error (RMSE of all regions are 0.538, 0.622, and 0.75, and 0.997, 0.812, and 0.5 when raw RSSI, a Kalman filter, and HDF are used, respectively. The HDF algorithm attains the best results in terms of the success rate and RMSE of spatial region estimation using HMM.

  16. Parameter-Invariant Hierarchical Exclusive Alphabet Design for 2-WRC with HDF Strategy

    Directory of Open Access Journals (Sweden)

    T. Uřičář

    2010-01-01

    Full Text Available Hierarchical eXclusive Code (HXC for the Hierarchical Decode and Forward (HDF strategy in the Wireless 2-Way Relay Channel (2-WRC has the achievable rate region extended beyond the classical MAC region. Although direct HXC design is in general highly complex, a layered approach to HXC design is a feasible solution. While the outer layer code of the layered HXC can be any state-of-the-art capacity approaching code, the inner layer must be designed in such a way that the exclusive property of hierarchical symbols (received at the relay will be provided. The simplest case of the inner HXC layer is a simple signal space channel symbol memoryless mapper called Hierarchical eXclusive Alphabet (HXA. The proper design of HXA is important, especially in the case of parametric channels, where channel parametrization (e.g. phase rotation can violate the exclusive property of hierarchical symbols (as seen by the relay, resulting in significant capacity degradation. In this paper we introduce an example of a geometrical approach to Parameter-Invariant HXA design, and we show that the corresponding hierarchical MAC capacity region extends beyond the classical MAC region, irrespective of the channel pametrization.

  17. Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.

    Science.gov (United States)

    Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo

    2018-07-01

    Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiology and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12-month occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Patterns of comorbidity in community-dwelling older people hospitalised for fall-related injury: A cluster analysis

    Directory of Open Access Journals (Sweden)

    Finch Caroline F

    2011-08-01

    Full Text Available Abstract Background Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. Methods We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. Results More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7. The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified. Conclusions The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients

  19. Hybrid Swarm Intelligence Energy Efficient Clustered Routing Algorithm for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Rajeev Kumar

    2016-01-01

    Full Text Available Currently, wireless sensor networks (WSNs are used in many applications, namely, environment monitoring, disaster management, industrial automation, and medical electronics. Sensor nodes carry many limitations like low battery life, small memory space, and limited computing capability. To create a wireless sensor network more energy efficient, swarm intelligence technique has been applied to resolve many optimization issues in WSNs. In many existing clustering techniques an artificial bee colony (ABC algorithm is utilized to collect information from the field periodically. Nevertheless, in the event based applications, an ant colony optimization (ACO is a good solution to enhance the network lifespan. In this paper, we combine both algorithms (i.e., ABC and ACO and propose a new hybrid ABCACO algorithm to solve a Nondeterministic Polynomial (NP hard and finite problem of WSNs. ABCACO algorithm is divided into three main parts: (i selection of optimal number of subregions and further subregion parts, (ii cluster head selection using ABC algorithm, and (iii efficient data transmission using ACO algorithm. We use a hierarchical clustering technique for data transmission; the data is transmitted from member nodes to the subcluster heads and then from subcluster heads to the elected cluster heads based on some threshold value. Cluster heads use an ACO algorithm to discover the best route for data transmission to the base station (BS. The proposed approach is very useful in designing the framework for forest fire detection and monitoring. The simulation results show that the ABCACO algorithm enhances the stability period by 60% and also improves the goodput by 31% against LEACH and WSNCABC, respectively.

  20. Ant Colony Optimization Approaches to Clustering of Lung Nodules from CT Images

    Directory of Open Access Journals (Sweden)

    Ravichandran C. Gopalakrishnan

    2014-01-01

    Full Text Available Lung cancer is becoming a threat to mankind. Applying machine learning algorithms for detection and segmentation of irregular shaped lung nodules remains a remarkable milestone in CT scan image analysis research. In this paper, we apply ACO algorithm for lung nodule detection. We have compared the performance against three other algorithms, namely, Otsu algorithm, watershed algorithm, and global region based segmentation. In addition, we suggest a novel approach which involves variations of ACO, namely, refined ACO, logical ACO, and variant ACO. Variant ACO shows better reduction in false positives. In addition we propose black circular neighborhood approach to detect nodule centers from the edge detected image. Genetic algorithm based clustering is performed to cluster the nodules based on intensity, shape, and size. The performance of the overall approach is compared with hierarchical clustering to establish the improvisation in the proposed approach.