WorldWideScience

Sample records for hierarchical agglomerative cluster

  1. Clinical fracture risk evaluated by hierarchical agglomerative clustering

    DEFF Research Database (Denmark)

    Kruse, C; Eiken, P; Vestergaard, P

    2017-01-01

    reimbursement, primary healthcare sector use and comorbidity of female subjects were combined. Standardized variable means, Euclidean distances and Ward's D2 method of hierarchical agglomerative clustering (HAC) were used to form the clustering object. The number of clusters K was selected with the lowest cluster...
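
As a rough illustration of the pipeline this record names (standardized variables, Euclidean distances, Ward's method), the sketch below uses only the Python standard library. It is not the authors' code: the function names and the toy data are hypothetical, and the merge rule is Ward's minimum-variance criterion computed from centroids, which is assumed here to give the same merge order as the D2 variant mentioned in the record.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def centroid(points):
    return [mean(c) for c in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq

def ward_clusters(rows, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    data = standardize(rows)
    clusters = [[i] for i in range(len(data))]  # start with one cluster per row
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: ward_cost([data[m] for m in clusters[p[0]]],
                                    [data[m] for m in clusters[p[1]]]),
        )
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return [sorted(c) for c in clusters]
```

Calling `ward_clusters(rows, k)` returns lists of row indices, one list per cluster. The naive search over all pairs is cubic in the number of rows, so it only suits small examples.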

  2. Kendall’s tau and agglomerative clustering for structure determination of hierarchical Archimedean copulas

    Directory of Open Access Journals (Sweden)

    Górecki J.

    2017-01-01

    Several successful approaches to structure determination of hierarchical Archimedean copulas (HACs) proposed in the literature rely on agglomerative clustering and Kendall’s correlation coefficient. However, no theoretical proof justifying such approaches has been presented. This work fills this gap and introduces a theorem showing that, given the matrix of the pairwise Kendall correlation coefficients corresponding to a HAC, its structure can be recovered by an agglomerative clustering technique.
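
The general idea of clustering variables from a matrix of pairwise Kendall coefficients can be sketched as follows. This is only an illustration, not the procedure proved in the paper: the average-similarity merge rule, the function names and the sample columns are assumptions, and the tau computation assumes data without ties.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs (no ties assumed)."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)

def merge_order(columns):
    """Agglomerate variables, always joining the two clusters with the highest
    average pairwise tau; returns the sequence of merges."""
    tau = {frozenset((a, b)): kendall_tau(columns[a], columns[b])
           for a, b in combinations(columns, 2)}

    def similarity(c1, c2):
        vals = [tau[frozenset((a, b))] for a in c1 for b in c2]
        return sum(vals) / len(vals)

    clusters = [(name,) for name in columns]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda p: similarity(*p))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges
```

The returned list of merges can be read as a nested structure: variables joined early are the most strongly dependent in the Kendall sense.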

  3. Radar Emission Sources Identification Based on Hierarchical Agglomerative Clustering for Large Data Sets

    Directory of Open Access Journals (Sweden)

    Janusz Dudczyk

    2016-01-01

    More advanced recognition methods, which may recognize particular copies of radars of the same type, are called identification. The identification process of radar devices is a more specialized task which requires methods based on the analysis of distinctive features. These features are distinguished from the signals coming from the identified devices. Such a process is called Specific Emitter Identification (SEI). The identification of radar emission sources with the use of classic techniques based on the statistical analysis of basic measurable parameters of a signal such as Radio Frequency, Amplitude, Pulse Width, or Pulse Repetition Interval is not sufficient for SEI problems. This paper presents the method of hierarchical data clustering which is used in the process of radar identification. The Hierarchical Agglomerative Clustering Algorithm (HACA), based on the Generalized Agglomerative Scheme (GAS) and implemented and used in the research method, is parameterized; therefore, it is possible to compare the results. The results of clustering are presented in dendrograms in this paper. The obtained results of grouping and identification based on HACA are compared with other SEI methods in order to assess the degree of their usefulness and effectiveness for systems of ESM/ELINT class.

  4. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the

  5. SDN‐Based Hierarchical Agglomerative Clustering Algorithm for Interference Mitigation in Ultra‐Dense Small Cell Networks

    Directory of Open Access Journals (Sweden)

    Guang Yang

    2018-04-01

    Ultra‐dense small cell networks (UD‐SCNs) have been identified as a promising scheme for next‐generation wireless networks capable of meeting the ever‐increasing demand for higher transmission rates and better quality of service. However, UD‐SCNs will inevitably suffer from severe interference among the small cell base stations, which will lower their spectral efficiency. In this paper, we propose a software‐defined networking (SDN)‐based hierarchical agglomerative clustering (SDN‐HAC) framework, which leverages SDN to centrally control all sub‐channels in the network, and decides on cluster merging using a similarity criterion based on a suitability function. We evaluate the proposed algorithm through simulation. The obtained results show that the proposed algorithm performs well and improves system payoff by 18.19% and 436.34% when compared with the traditional network architecture algorithms and non‐cooperative scenarios, respectively.

  6. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.

    2014-01-01

    For the improvement of qualitative and quantitative traits, the existence of variability is of prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed by correlation, agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for a future breeding program. Correlation analysis revealed a significant positive association between yield and yield components such as fruit diameter, single fruit weight and number of fruits per plant. Principal component (PC) analysis showed that the first three PCs, with eigenvalues higher than 1, contributed 81.72% of the total variability for different traits. PC-I showed positive factor loadings for all traits except number of fruits per plant. The contribution of single fruit weight and fruit diameter was highest in PC-I. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistics confirmed the highest distance between cluster-III and cluster-V, while maximum similarity was observed between cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programme. (author)

  7. Kendall’s tau and agglomerative clustering for structure determination of hierarchical Archimedean copulas

    Czech Academy of Sciences Publication Activity Database

    Górecki, J.; Hofert, M.; Holeňa, Martin

    2017-01-01

    Vol. 5, No. 1 (2017), pp. 75-87. ISSN 2300-2298. R&D Projects: GA ČR GA17-01251S. Institutional support: RVO:67985807. Keywords: structure determination * agglomerative clustering * Kendall’s tau * Archimedean copula. Subject RIV: IN - Informatics, Computer Science. OECD field: Statistics and probability

  8. Agglomerative clustering of growing squares

    NARCIS (Netherlands)

    Castermans, Thom; Speckmann, Bettina; Staals, Frank; Verbeek, Kevin; Bender, M.A.; Farach-Colton, M.; Mosteiro, M.A.

    2018-01-01

    We study an agglomerative clustering problem motivated by interactive glyphs in geo-visualization. Consider a set of disjoint square glyphs on an interactive map. When the user zooms out, the glyphs grow in size relative to the map, possibly with different speeds. When two glyphs intersect, we wish

  9. Unit commitment solution using agglomerative and divisive cluster algorithm : an effective new methodology

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, N.M.; Reddy, K.R. [G. Narayanamma Inst. of Technology and Science, Hyderabad (India). Dept. of Electrical Engineering]; Ramana, N.V. [JNTU College of Engineering, Jagityala (India). Dept. of Electrical Engineering]

    2008-07-01

    Thermal power plants consist of several generating units with different generating capacities, fuel costs per MWh generated, minimum up/down times, and start-up or shut-down costs. The Unit Commitment (UC) problem in power systems involves determining the start-up and shut-down schedules of thermal generating units to meet forecasted load over a future short term for a period of one to seven days. This paper presented a new approach for the most complex UC problem using agglomerative and divisive hierarchical clustering. Euclidean costs, which are a measure of differences in fuel cost and start-up costs of any two units, were first calculated. Then, depending on the value of the Euclidean costs, similar types of units were placed in a cluster. The proposed methodology has two individual algorithms. An agglomerative cluster algorithm is used while the load is increasing, and a divisive cluster algorithm is used when the load is decreasing. A search was conducted for an optimal solution for a minimal number of clusters and cluster data points. A standard ten-unit thermal unit power system was used to test and evaluate the performance of the method for a period of 24 hours. The new approach proved to be quite effective and satisfactory. 15 refs., 9 tabs., 5 figs.

  10. Application of agglomerative methods in cluster analysis of air pollution level data (APLIKASI METODE-METODE AGGLOMERATIVE DALAM ANALISIS KLASTER PADA DATA TINGKAT POLUSI UDARA)

    Directory of Open Access Journals (Sweden)

    Dewi Rachmatin

    2014-09-01

    Cluster analysis groups data according to information found in the data. Its objective is for objects within one group to be similar to each other and different from objects in other groups. Cluster analysis is divided into hierarchical and non-hierarchical methods. Hierarchical methods are in turn divided into agglomerative (concentration) and divisive (dispersion) methods. The agglomerative methods include the Single Linkage Method, Complete Linkage Method, Average Linkage Method, Ward’s Method, Centroid Method and Median Method. This article discusses these agglomerative methods as applied to air pollution level data. Each method yields a different number of clusters. Keywords: Cluster Analysis, Single Linkage Method, Complete Linkage Method, Average Linkage Method, Ward’s Method, Centroid Method, Median Method.
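
The observation that different agglomerative methods can yield different numbers of clusters is easy to reproduce on toy data. The sketch below is a minimal illustration under stated assumptions: it covers only three of the six methods listed (single, complete and average linkage), stops merging at a hypothetical distance threshold, and uses made-up points rather than air pollution data.

```python
from math import dist

# How the distance between two clusters is derived from point-to-point distances.
LINKAGES = {
    "single": min,                               # closest pair
    "complete": max,                             # farthest pair
    "average": lambda ds: sum(ds) / len(ds),     # mean over all pairs
}

def cluster_distance(a, b, points, linkage):
    ds = [dist(points[i], points[j]) for i in a for j in b]
    return LINKAGES[linkage](ds)

def agglomerate(points, linkage, threshold):
    """Merge the closest pair of clusters until the smallest
    inter-cluster distance exceeds the threshold."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        pairs = [(cluster_distance(clusters[i], clusters[j], points, linkage), i, j)
                 for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d > threshold:
            break
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters
```

For evenly spaced points on a line with one distant outlier, single linkage chains the evenly spaced points into one cluster, while complete and average linkage stop earlier at the same threshold.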

  11. The efficiency of average linkage hierarchical clustering algorithm associated multi-scale bootstrap resampling in identifying homogeneous precipitation catchments

    Science.gov (United States)

    Chuan, Zun Liang; Ismail, Noriszura; Shinyie, Wendy Ling; Lit Ken, Tan; Fam, Soo-Fen; Senawi, Azlyna; Yusoff, Wan Nur Syahidah Wan

    2018-04-01

    Due to the limited availability of historical precipitation records, agglomerative hierarchical clustering algorithms are widely used to extrapolate information from gauged to ungauged precipitation catchments, yielding a more reliable projection of extreme hydro-meteorological events such as extreme precipitation events. However, accurately identifying the optimum number of homogeneous precipitation catchments from the dendrogram produced by agglomerative hierarchical algorithms is very subjective. The main objective of this study is to propose an efficient regionalized algorithm to identify the homogeneous precipitation catchments for non-stationary precipitation time series. The homogeneous precipitation catchments are identified using an average linkage hierarchical clustering algorithm associated with multi-scale bootstrap resampling, with the uncentered correlation coefficient as the similarity measure. The regionalized homogeneous precipitation is consolidated using the K-sample Anderson Darling non-parametric test. The analysis result shows that the proposed regionalized algorithm performed better than the agglomerative hierarchical clustering algorithms proposed in previous studies.

  12. Improving CLOPE’s Profit Value and Stability with an Optimized Agglomerative Approach

    Directory of Open Access Journals (Sweden)

    Yefeng Li

    2015-06-01

    CLOPE (Clustering with sLOPE) is a simple and fast histogram-based clustering algorithm for categorical data. However, given the same data set with the same input parameter, the clustering results of this algorithm may differ if the transactions are input in a different sequence. In this paper, a hierarchical clustering framework is proposed as an extension of CLOPE to generate stable and satisfactory clustering results based on an optimized agglomerative merge process. The new clustering profit is defined as the merge criterion and the cluster graph structure is proposed to optimize the merge iteration process. Experiments on two datasets both demonstrate that the agglomerative approach achieves stable clustering results with a better profit value, but costs much more time due to the worse complexity.

  13. Application of agglomerative clustering for analyzing phylogenetically on bacterium of saliva

    Science.gov (United States)

    Bustamam, A.; Fitria, I.; Umam, K.

    2017-07-01

    Analyzing populations of Streptococcus bacteria is important because these species can cause dental caries, periodontal problems, halitosis (bad breath) and more. This paper discusses the phylogenetic relationships among Streptococcus bacteria in saliva using a phylogenetic tree built by agglomerative clustering methods. Streptococcus DNA sequences were obtained from GenBank, and features were then extracted from the sequences. The extracted features form a matrix, which is normalized using min-max normalization, and genetic distances are calculated using the Manhattan distance. The agglomerative clustering techniques considered are single linkage, complete linkage and average linkage. In this agglomerative algorithm, the number of groups starts at the number of individual species. The most similar species are grouped first, and merging continues as similarity decreases until a single group is formed. The result of the grouping is a phylogenetic tree whose branches join at a given distance level: the smaller the distance, the greater the similarity between species. The implementation uses R, an open source program.
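
The preprocessing described in this record (min-max normalization of a feature matrix followed by Manhattan distances) can be sketched in a few lines. The record states that the original work used R; the Python version below is only an illustration, and the function names and the small feature matrix are hypothetical.

```python
def min_max(rows):
    """Rescale each column of the feature matrix to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - l) / s for v, l, s in zip(r, lo, span)] for r in rows]

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(rows):
    """Pairwise Manhattan distances between min-max normalized rows."""
    scaled = min_max(rows)
    return [[manhattan(a, b) for b in scaled] for a in scaled]
```

The resulting symmetric matrix is the kind of input a single, complete or average linkage procedure would then consume.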

  14. Codebook Generation Using Partition and Agglomerative Clustering

    Directory of Open Access Journals (Sweden)

    CHANG, C.-T.

    2011-08-01

    In this paper, we present a codebook generation algorithm to produce a codebook with lower distortion. Our method combines a fast codebook generation algorithm (CGAUCD) with doubling technique and a fast agglomerative clustering algorithm (FACA) to generate a codebook with less computing time and lower distortion. Instead of using FACA directly to divide training vectors into M clusters, our proposed method first generates qM clusters from these training vectors, where q>1 is an integer, and then applies FACA to merge these qM clusters into M cells. This is because the computational complexity of CGAUCD with doubling technique is lower than that of FACA. These M cluster centers are used as the initial codebook for CGAUCD. Using three real images as the training set, our method can reduce the MSE and computing time of FPNN+CGAUCD, which is the best available method to our knowledge, by 0.19 to 0.38 and 74.6% to 84.3%, respectively.

  15. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited by computational complexity, noise resistance and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  16. A similarity based agglomerative clustering algorithm in networks

    Science.gov (United States)

    Liu, Zhiyuan; Wang, Xiujuan; Ma, Yinghong

    2018-04-01

    The detection of clusters is beneficial for understanding the organization and functions of networks. Clusters, or communities, are usually groups of nodes densely interconnected but sparsely linked with any other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The proposed method initially calculates similarities between each pair of nodes, and forms pre-partitions according to the principle that each node is in the same community as its most similar neighbor. After that, each partition is checked against a community criterion. Pre-partitions that do not satisfy the criterion are merged with the partitions that have the biggest attraction to them, until there are no further changes. To measure the attraction ability of a partition, we propose an attraction index that is based on the importance of the linked nodes in the network. Therefore, our proposed method can better exploit the nodes' properties and the network's structure. To test the performance of our algorithm, both synthetic and empirical networks of different scales are tested. Simulation results show that the proposed algorithm can obtain superior clustering results compared with six other widely used community detection algorithms.
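
The pre-partition step (each node joins the community of its most similar neighbor) can be sketched as below. The abstract does not say which similarity measure the authors use, so the Jaccard overlap of closed neighbourhoods is an assumption here, as are the function names and the example graph; the later community-criterion check and attraction index are not shown.

```python
def jaccard(adj, u, v):
    """Assumed similarity: overlap of the two nodes' closed neighbourhoods."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

def pre_partition(adj):
    """Put every node in the same group as its most similar neighbour."""
    parent = {u: u for u in adj}  # union-find forest over the nodes

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u in adj:
        if adj[u]:
            # sorted() makes tie-breaking deterministic
            best = max(sorted(adj[u]), key=lambda v: jaccard(adj, u, v))
            parent[find(u)] = find(best)

    groups = {}
    for u in adj:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())
```

On two triangles joined by a single bridge edge, the bridge endpoints are more similar to their own triangle than to each other, so the sketch returns the two triangles as pre-partitions.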

  17. Which, When, and How: Hierarchical Clustering with Human–Machine Cooperation

    Directory of Open Access Journals (Sweden)

    Huanyang Zheng

    2016-12-01

    Human–Machine Cooperations (HMCs) can balance the advantages and disadvantages of human computation (accurate but costly) and machine computation (cheap but inaccurate). This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors in its current clustering results. We are interested in the machine’s strategy on handling the question operations, in terms of three problems: (1) Which question should the machine ask? (2) When should the machine ask the question (early or late)? (3) How does the machine adjust the clustering result, if the machine’s mistake is found by the human? Based on the insights of these problems, an efficient algorithm is proposed with five implementation variations. Experiments on image clusterings show that the proposed algorithm can improve the clustering accuracy with few question operations.

  18. Multi-documents summarization based on clustering of learning object using hierarchical clustering

    Science.gov (United States)

    Mustamiin, M.; Budi, I.; Santoso, H. B.

    2018-03-01

    The Open Educational Resources (OER) is a portal of teaching, learning and research resources that is available in the public domain and freely accessible. Learning contents or Learning Objects (LO) are granular and can be reused for constructing new learning materials. LO ontology-based searching techniques can be used to search for LO in the Indonesia OER. In this research, LO from search results are used as material to create new learning materials according to the topic searched by users. Summarization based on grouping LO with Hierarchical Agglomerative Clustering (HAC), using the dependency context of the user’s query, achieves an average F-Measure of 0.487, while summarization with K-Means achieves an average F-Measure of only 0.336.

  19. AGGLOMERATIVE CLUSTERING OF SOUND RECORD SPEECH SEGMENTS BASED ON BAYESIAN INFORMATION CRITERION

    Directory of Open Access Journals (Sweden)

    O. Yu. Kydashev

    2013-01-01

    This paper presents a detailed description of the implementation of an agglomerative clustering system for speech segments based on the Bayesian information criterion. Results of numerical experiments with different acoustic features, as well as with full and diagonal covariance matrices, are given. An error rate (DER) of 6.4% for audio records of radio «Svoboda» was achieved with the designed system.
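
A common way to use the Bayesian information criterion as a merge test between two speech segments is the ΔBIC score: model the two segments either with one Gaussian or with two, and penalize the extra parameters. The sketch below is a generic illustration with diagonal covariances, not the system described in the record; the function names, the penalty weight and the toy feature frames are assumptions.

```python
from math import log
from statistics import pvariance

def log_det_diag(frames):
    """Log-determinant of the diagonal covariance estimated from the frames.
    Every feature must vary within the segment, otherwise log(0) is undefined."""
    return sum(log(pvariance(col)) for col in zip(*frames))

def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Negative values favour one shared Gaussian, i.e. merging the segments."""
    merged = seg_a + seg_b
    n, na, nb = len(merged), len(seg_a), len(seg_b)
    d = len(merged[0])
    n_params = 2 * d  # d means plus d variances for a diagonal Gaussian
    penalty = 0.5 * penalty_weight * n_params * log(n)
    gain = 0.5 * (n * log_det_diag(merged)
                  - na * log_det_diag(seg_a)
                  - nb * log_det_diag(seg_b))
    return gain - penalty
```

In an agglomerative loop, the pair of segments with the lowest ΔBIC would be merged first, and merging would stop once no pair scores below zero.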

  20. URL Mining Using Agglomerative Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Chinmay R. Deshmukh

    2015-02-01

    The tremendous growth of the web world incorporates application of data mining techniques to the web logs. Data mining and the World Wide Web encompass an important and active area of research. Web log mining is the analysis of web log files with web page sequences. Web mining is broadly classified as web content mining, web usage mining and web structure mining. Web usage mining is a technique to discover usage patterns from Web data in order to understand and better serve the needs of Web-based applications. URL mining refers to a subclass of Web mining that helps us to investigate the details of a Uniform Resource Locator. URL mining can be advantageous in the fields of security and protection. The paper introduces a technique for mining a collection of user transactions with an Internet search engine to discover clusters of similar queries and similar URLs. The information we exploit is clickthrough data: each record consists of a user's query to a search engine along with the URL which the user selected from among the candidates offered by the search engine. By viewing this dataset as a bipartite graph, with the vertices on one side corresponding to queries and on the other side to URLs, one can apply an agglomerative clustering algorithm to the graph's vertices to identify related queries and URLs.

  1. Agglomerative concentric hypersphere clustering applied to structural damage detection

    Science.gov (United States)

    Silva, Moisés; Santos, Adam; Santos, Reginaldo; Figueiredo, Eloi; Sales, Claudomiro; Costa, João C. W. A.

    2017-08-01

    The present paper proposes a novel cluster-based method, named as agglomerative concentric hypersphere (ACH), to detect structural damage in engineering structures. Continuous structural monitoring systems often require unsupervised approaches to automatically infer the health condition of a structure. However, when a structure is under linear and nonlinear effects caused by environmental and operational variability, data normalization procedures are also required to overcome these effects. The proposed approach aims, through a straightforward clustering procedure, to discover automatically the optimal number of clusters, representing the main state conditions of a structural system. Three initialization procedures are introduced to evaluate the impact of deterministic and stochastic initializations on the performance of this approach. The ACH is compared to state-of-the-art approaches, based on Gaussian mixture models and Mahalanobis squared distance, on standard data sets from a post-tensioned bridge located in Switzerland: the Z-24 Bridge. The proposed approach demonstrates more efficiency in modeling the normal condition of the structure and its corresponding main clusters. Furthermore, it reveals a better classification performance than the alternative ones in terms of false-positive and false-negative indications of damage, demonstrating a promising applicability in real-world structural health monitoring scenarios.

  2. A Comparison of Two Approaches to Beta-Flexible Clustering.

    Science.gov (United States)

    Belbin, Lee; And Others

    1992-01-01

    A method for hierarchical agglomerative polythetic (multivariate) clustering, based on unweighted pair group using arithmetic averages (UPGMA) is compared with the original beta-flexible technique, a weighted average method. Reasons the flexible UPGMA strategy is recommended are discussed, focusing on the ability to recover cluster structure over…

  3. Clustering User Behavior in Scientific Collections

    OpenAIRE

    Blixhavn, Øystein Hoel

    2014-01-01

    This master thesis looks at how clustering techniques can be applied to a collection of scientific documents. Approximately one year of server logs from the CERN Document Server (CDS) are analyzed and preprocessed. Based on the findings of this analysis, and a review of the current state of the art, three different clustering methods are selected for further work: Simple k-Means, Hierarchical Agglomerative Clustering (HAC) and Graph Partitioning. In addition, a custom, agglomerative clustering algor...

  4. Neutrosophic Hierarchical Clustering Algoritms

    Directory of Open Access Journals (Sweden)

    Rıdvan Şahin

    2014-03-01

    Interval neutrosophic set (INS) is a generalization of interval valued intuitionistic fuzzy set (IVIFS), in which the membership and non-membership values of elements consist of fuzzy ranges, while single valued neutrosophic set (SVNS) is regarded as an extension of intuitionistic fuzzy set (IFS). In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.

  5. Convex Clustering: An Attractive Alternative to Hierarchical Clustering

    Science.gov (United States)

    Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

    2015-01-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340

  6. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

  7. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    Science.gov (United States)

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  8. Short-Term Wind Power Forecasting Based on Clustering Pre-Calculated CFD Method

    Directory of Open Access Journals (Sweden)

    Yimei Wang

    2018-04-01

    Full Text Available To meet the increasing wind power forecasting (WPF) demands of newly built wind farms without historical data, physical WPF methods are widely used. The computational fluid dynamics (CFD) pre-calculated flow fields (CPFF)-based WPF is a promising physical approach, which can balance well the competing demands of computational efficiency and accuracy. To enhance its adaptability for wind farms in complex terrain, a WPF method combining wind turbine clustering with CPFF is first proposed, where the wind turbines in the wind farm are clustered and a forecast is made for each cluster. K-means, hierarchical agglomerative and spectral analysis methods are used to establish the wind turbine clustering models. The Silhouette Coefficient, Calinski-Harabasz index and within-between index are proposed as criteria to evaluate the effectiveness of the established clustering models. Based on different clustering methods and schemes, various clustering databases are built for clustering pre-calculated CFD (CPCC)-based short-term WPF. For the wind farm case studied, clustering evaluation criteria show that hierarchical agglomerative clustering has reasonable results, spectral clustering is better and K-means gives the best performance. The WPF results produced by different clustering databases also prove the effectiveness of the three evaluation criteria in turn. The newly developed CPCC model has a much higher WPF accuracy than the CPFF model without using clustering techniques, both on temporal and spatial scales. The research provides support for both the development and improvement of short-term physical WPF systems.
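One of the evaluation criteria named above, the Silhouette Coefficient, can be illustrated with a minimal standard-library sketch. The function name `silhouette`, the toy points and both labelings are assumptions made for illustration; they are not taken from the study, and the other two indices are not shown.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient of a labelled point set (needs >= 2 clusters)."""
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the point's own cluster
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = silhouette(points, [0, 0, 1, 1])  # labels follow the visible groups
bad = silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
```

A labeling that follows the visible groups scores close to 1, while one that cuts across them scores below 0, which is why the coefficient can rank competing clustering schemes.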

  9. Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space.

    Science.gov (United States)

    Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

    2008-07-01

    UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without explicitly requiring all dissimilarities in memory. The algorithms are general and are applicable to any dataset. We present a data-dependent characterization of hardness and clustering efficiency. The presented concepts are applicable to any agglomerative clustering formulation. We apply our algorithm to the entire collection of protein sequences, to automatically build a comprehensive evolutionary-driven hierarchy of proteins from sequence alone. The newly created tree captures protein families better than state-of-the-art large-scale methods such as CluSTr, ProtoNet4 or single-linkage clustering. We demonstrate that leveraging the entire mass embodied in all sequence similarities allows to significantly improve on current protein family clusterings which are unable to directly tackle the sheer mass of this data. Furthermore, we argue that non-metric constraints are an inherent complexity of the sequence space and should not be overlooked. The robustness of UPGMA allows significant improvement, especially for multidomain proteins, and for large or divergent families. A comprehensive tree built from all UniProt sequence similarities, together with navigation and classification tools will be made available as part of the ProtoNet service. A C++ implementation of the algorithm is available on request.
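For orientation, plain UPGMA can be sketched as below. This is a naive illustration that assumes the full dissimilarity matrix fits in memory, which is exactly the requirement the abstract calls prohibitive; it is not the MC-UPGMA algorithm, and the function name `upgma` and the toy matrix are hypothetical.

```python
from itertools import combinations

def upgma(d):
    """Naive UPGMA over a symmetric dissimilarity matrix d (list of lists).

    Returns the merge history as (members_a, members_b, height) tuples.
    """
    clusters = [(i,) for i in range(len(d))]
    merges = []

    def avg(a, b):
        # Average linkage: mean of all pairwise dissimilarities between a and b.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((a, b, avg(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges

# Hypothetical 4-item dissimilarity matrix, held entirely in memory.
d = [
    [0.0, 2.0, 6.0, 10.0],
    [2.0, 0.0, 5.0, 9.0],
    [6.0, 5.0, 0.0, 4.0],
    [10.0, 9.0, 4.0, 0.0],
]
history = upgma(d)
```

Every merge step reads dissimilarities between arbitrary pairs of items, so the memory footprint grows quadratically with the number of items.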

  10. bcl::Cluster : A method for clustering biological molecules coupled with visualization in the Pymol Molecular Graphics System.

    Science.gov (United States)

    Alexander, Nathan; Woetzel, Nils; Meiler, Jens

    2011-02-01

    Clustering algorithms are used as data analysis tools in a wide variety of applications in Biology. Clustering has become especially important in protein structure prediction and virtual high throughput screening methods. In protein structure prediction, clustering is used to structure the conformational space of thousands of protein models. In virtual high throughput screening, databases with millions of drug-like molecules are organized by structural similarity, e.g. common scaffolds. The tree-like dendrogram structure obtained from hierarchical clustering can provide a qualitative overview of the results, which is important for focusing detailed analysis. However, in practice it is difficult to relate specific components of the dendrogram directly back to the objects of which it is comprised and to display all desired information within the two dimensions of the dendrogram. The current work presents a hierarchical agglomerative clustering method termed bcl::Cluster. bcl::Cluster utilizes the Pymol Molecular Graphics System to graphically depict dendrograms in three dimensions. This allows simultaneous display of relevant biological molecules as well as additional information about the clusters and the members comprising them.

  11. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa

    2004-01-01

    Full Text Available This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.

  12. Hierarchical clustering using correlation metric and spatial continuity constraint

    Science.gov (United States)

    Stork, Christopher L.; Brewer, Luke N.

    2012-10-02

    Large data sets are analyzed by hierarchical clustering using correlation as a similarity measure. This provides results that are superior to those obtained using a Euclidean distance similarity measure. A spatial continuity constraint may be applied in hierarchical clustering analysis of images.
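A small sketch shows how a correlation-based measure and Euclidean distance can disagree. Turning the Pearson correlation r into a dissimilarity as 1 - r is a common convention assumed here, not something the record specifies; the vectors and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    # Assumed conversion from similarity to dissimilarity: 0 for r = 1, 2 for r = -1.
    return 1.0 - pearson(x, y)

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]  # same shape as a, ten times the magnitude
c = [4.0, 3.0, 2.0, 1.0]      # same magnitude as a, opposite trend
```

Here the correlation distance places `b` next to `a` and `c` as far away as possible, whereas Euclidean distance ranks `c` as the closer neighbour, so the two measures would lead a hierarchical clustering to different first merges.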

  13. DOCUMENT REPRESENTATION FOR CLUSTERING OF SCIENTIFIC ABSTRACTS

    Directory of Open Access Journals (Sweden)

    S. V. Popova

    2014-01-01

    Full Text Available The key issue of the present paper is clustering of narrow-domain short texts, such as scientific abstracts. The work is based on the observations made when improving the performance of a key phrase extraction algorithm. An extended stop-words list was used that was built automatically for the purposes of key phrase extraction and made possible a considerable quality enhancement of the phrases extracted from scientific publications. A description of the stop-words list creation procedure is given. The main objective is to investigate the possibilities to increase the performance and/or speed of clustering by the above-mentioned list of stop-words as well as information about lexeme parts of speech. In the latter case a vocabulary is applied for the document representation, which contains not all the words that occurred in the collection, but only nouns and adjectives or their sequences encountered in the documents. Two base clustering algorithms are applied: k-means and hierarchical clustering (average agglomerative method). The results show that the use of an extended stop-words list and adjective-noun document representation makes it possible to improve the performance and speed of k-means clustering. In a similar case for the average agglomerative method a decline in performance quality may be observed. It is shown that the use of adjective-noun sequences for document representation lowers the clustering quality for both algorithms and can be justified only when a considerable reduction of feature space dimensionality is necessary.

  14. Cluster Based Hierarchical Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, Md. Golam; Kabir, M. Hasnat; Rahim, Muhammad Sajjadur; Ullah, Shaikh Enayet

    2012-01-01

    The efficient use of the energy source in a sensor node is the most desirable criterion for prolonging the lifetime of a wireless sensor network. In this paper, we propose a two-layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP). We introduce a new concept called the head-set, which consists of one active cluster head and some other associate cluster heads within a cluster. The head-set members are responsible for control and management of the network. Results show that t...

  15. Environmental Gradient Analysis, Ordination, and Classification in Environmental Impact Assessments.

    Science.gov (United States)

    1987-09-01

    agglomerative clustering algorithms for mainframe computers: (1) the unweighted pair-group method that uses arithmetic averages (UPGMA), (2) the...hierarchical agglomerative unweighted pair-group method using arithmetic averages (UPGMA), which is also called average linkage clustering. This method was...dendrograms produced by weighted clustering (93). Sneath and Sokal (94), Romesburg (84), and Seber (90) also strongly recommend the UPGMA. A dendrogram

  16. Peringkasan Tweet Berdasarkan Trending Topic Twitter Dengan Pembobotan TF-IDF dan Single Linkage AngglomerativeHierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Annisa Annisa

    2016-10-01

    Full Text Available Trending topic is a feature provided by Twitter that reports what is being widely discussed by users at a particular time. A trending topic takes the form of a hashtag and can be selected by clicking. However, the number of tweets for each trending topic can be very large, so it is difficult to read all of the content. To make a topic easier to read, a small number of tweets can be selected as the main idea of the topic. In this study, we applied Agglomerative Single Linkage Hierarchical Clustering after first calculating the TF-IDF value for each word. We used 100 trending topics, where each topic consists of 50 tweets in Indonesian. For testing, we provided 30 trending topics which consist of 2 to 9 sub-topics. The result is that each trending topic can be summarized into a shorter text containing 2 to 9 tweets. We were able to summarize 1 trending topic exactly as it was summarized by a human expert. However, the rest of the topics corresponded only partially with the human expert.

  17. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  18. Similarity maps and hierarchical clustering for annotating FT-IR spectral images.

    Science.gov (United States)

    Zhong, Qiaoyong; Yang, Chen; Großerüschkamp, Frederik; Kallenbach-Thieltges, Angela; Serocka, Peter; Gerwert, Klaus; Mosig, Axel

    2013-11-20

    Unsupervised segmentation of multi-spectral images plays an important role in annotating infrared microscopic images and is an essential step in label-free spectral histopathology. In this context, diverse clustering approaches have been utilized and evaluated in order to achieve segmentations of Fourier Transform Infrared (FT-IR) microscopic images that agree with histopathological characterization. We introduce so-called interactive similarity maps as an alternative annotation strategy for annotating infrared microscopic images. We demonstrate that segmentations obtained from interactive similarity maps lead to similarly accurate segmentations as segmentations obtained from conventionally used hierarchical clustering approaches. In order to perform this comparison on quantitative grounds, we provide a scheme that allows to identify non-horizontal cuts in dendrograms. This yields a validation scheme for hierarchical clustering approaches commonly used in infrared microscopy. We demonstrate that interactive similarity maps may identify more accurate segmentations than hierarchical clustering based approaches, and thus are a viable and due to their interactive nature attractive alternative to hierarchical clustering. Our validation scheme furthermore shows that performance of hierarchical two-means is comparable to the traditionally used Ward's clustering. As the former is much more efficient in time and memory, our results suggest another less resource demanding alternative for annotating large spectral images.

  19. Merging K-means with hierarchical clustering for identifying general-shaped groups.

    Science.gov (United States)

    Peterson, Anna D; Ghosh, Arka P; Maitra, Ranjan

    2018-01-01

    Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and K -means clustering are two approaches but have different strengths and weaknesses. For instance, hierarchical clustering identifies groups in a tree-like structure but suffers from computational complexity in large datasets while K -means clustering is efficient but designed to identify homogeneous spherically-shaped clusters. We present a hybrid non-parametric clustering approach that amalgamates the two methods to identify general-shaped clusters and that can be applied to larger datasets. Specifically, we first partition the dataset into spherical groups using K -means. We next merge these groups using hierarchical methods with a data-driven distance measure as a stopping criterion. Our proposal has the potential to reveal groups with general shapes and structure in a dataset. We demonstrate good performance on several simulated and real datasets.
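The two-stage idea above can be sketched as follows. This illustration makes loud assumptions: the initial centers are hand-picked, the merging uses single linkage, and a fixed `threshold` stands in for the authors' data-driven stopping criterion, which the abstract does not specify. All names and data are hypothetical.

```python
import math

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations from the given initial centers; returns the groups."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return groups

def merge_groups(groups, threshold):
    """Single-linkage merging until the closest pair of groups exceeds threshold."""
    groups = [list(g) for g in groups if g]

    def gap(a, b):
        return min(math.dist(p, q) for p in a for q in b)

    while len(groups) > 1:
        i, j = min(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda ij: gap(groups[ij[0]], groups[ij[1]]),
        )
        if gap(groups[i], groups[j]) > threshold:
            break
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

# Hypothetical data: one elongated (non-spherical) group and one compact blob.
line = [(float(x), 0.0) for x in range(10)]
blob = [(4.0, 10.0), (5.0, 10.0), (4.0, 11.0), (5.0, 11.0)]
pieces = kmeans(line + blob, [(1.0, 0.0), (5.0, 0.0), (8.0, 0.0), (4.5, 10.5)])
merged = merge_groups(pieces, threshold=1.5)
```

K-means alone splits the elongated group into three roughly spherical pieces; the merging stage joins them again while leaving the distant blob separate.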

  20. Robust Pseudo-Hierarchical Support Vector Clustering

    DEFF Research Database (Denmark)

    Hansen, Michael Sass; Sjöstrand, Karl; Olafsdóttir, Hildur

    2007-01-01

    Support vector clustering (SVC) has proven an efficient algorithm for clustering of noisy and high-dimensional data sets, with applications within many fields of research. An inherent problem, however, has been setting the parameters of the SVC algorithm. Using the recent emergence of a method...... for calculating the entire regularization path of the support vector domain description, we propose a fast method for robust pseudo-hierarchical support vector clustering (HSVC). The method is demonstrated to work well on generated data, as well as for detecting ischemic segments from multidimensional myocardial...

  1. Hierarchical modeling of cluster size in wildlife surveys

    Science.gov (United States)

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
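The direction of the cluster-size bias can be seen in a short worked equation. It assumes, purely for illustration, that the probability of detecting a cluster is proportional to its size s; the abstract states only that detectability increases with size. Here \pi(s) is the population distribution of cluster sizes with mean \mu and variance \sigma^2, and \pi^{*}(s) is the distribution of sizes among detected clusters.

```latex
\[
\pi^{*}(s) = \frac{s\,\pi(s)}{\mu},
\qquad
\mathbb{E}^{*}[s] = \sum_{s} s\,\pi^{*}(s)
= \frac{\mathbb{E}[s^{2}]}{\mu}
= \mu + \frac{\sigma^{2}}{\mu} \;\ge\; \mu .
\]
```

For example, if clusters of size 1 and 3 are equally common, then \mu = 2 and \sigma^2 = 1, so the expected size among detected clusters is 2.5. Multiplying the number of detected clusters by that inflated mean is what overstates abundance.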

  2. Energy Efficient Hierarchical Clustering Approaches in Wireless Sensor Networks: A Survey

    Directory of Open Access Journals (Sweden)

    Bilal Jan

    2017-01-01

    Full Text Available Wireless sensor networks (WSN) are one of the significant technologies due to their diverse applications such as health care monitoring, smart phones, military, disaster management, and other surveillance systems. Sensor nodes are usually deployed in large numbers and work independently in unattended harsh environments. Due to constrained resources, typically the scarce battery power, these wireless nodes are grouped into clusters for energy-efficient communication. In clustering, hierarchical schemes have attracted great interest for minimizing energy consumption. Hierarchical schemes are generally categorized as cluster-based and grid-based approaches. In cluster-based approaches, nodes are grouped into clusters, where a resourceful sensor node is nominated as a cluster head (CH), while in the grid-based approach the network is divided into confined virtual grids, a division usually performed by the base station. This paper highlights and discusses the design challenges for cluster-based schemes, the important cluster formation parameters, and classification of hierarchical clustering protocols. Moreover, existing cluster-based and grid-based techniques are evaluated by considering certain parameters to help users in selecting an appropriate technique. Furthermore, a detailed summary of these protocols is presented with their advantages, disadvantages, and applicability in particular cases.

  3. Clustering-based classification of road traffic accidents using hierarchical clustering and artificial neural networks.

    Science.gov (United States)

    Taamneh, Madhar; Taamneh, Salah; Alkheder, Sharaf

    2017-09-01

    Artificial neural networks (ANNs) have been widely used in predicting the severity of road traffic crashes. All available information about previously occurred accidents is typically used for building a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, which leads, in most cases, to less accurate predictors. Hierarchical clustering is a well-known clustering method that seeks to group data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents that occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. In order to reduce the amount of variation in data, hierarchical clustering was applied on the data set to organize it into six different forms, each with a different number of clusters (i.e., clusters from 1 to 6). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas only 66% of the accidents were used to build the second model, and the remaining 34% were used to test it (percentage split). Finally, the weighted average accuracy was computed for each type of model in each form of data. The results show that when testing the models using the training set, clustering prior to classification achieves (11%-16%) more accuracy than without using clustering, while the percentage split achieves (2%-5%) more accuracy. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy if both types of models are taken into account.

  4. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  5. A Hierarchical Clustering Methodology for the Estimation of Toxicity

    Science.gov (United States)

    A Quantitative Structure Activity Relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural sim...

  6. Exploitation of Clustering Techniques in Transactional Healthcare Data

    Directory of Open Access Journals (Sweden)

    Naeem Ahmad Mahoto

    2014-03-01

    Full Text Available Healthcare service centres equipped with electronic health systems have improved their resources as well as treatment processes. The dynamic nature of healthcare data of each individual makes it complex and difficult for physicians to manually mediate them; therefore, automatic techniques are essential to manage the quality and standardization of treatment procedures. Exploratory data analysis, pattern analysis and grouping of data are managed using clustering techniques, which work as an unsupervised classification. A number of healthcare applications have been developed that use several data mining techniques for classification, clustering and extracting useful information from healthcare data. The challenging issue in this domain is to select an adequate data mining algorithm for optimal results. This paper exploits three different clustering algorithms: DBSCAN (Density-Based Clustering), agglomerative hierarchical and k-means in real transactional healthcare data of diabetic patients (taken as a case study) to analyse their performance in large and dispersed healthcare data. The best solution of cluster sets among the exploited algorithms is evaluated using clustering quality indexes and is selected to identify the possible subgroups of patients having similar treatment patterns.

  7. Hierarchical Control for Multiple DC Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    This paper presents a distributed hierarchical control framework to ensure reliable operation of dc Microgrid (MG) clusters. In this hierarchy, primary control is used to regulate the common bus voltage inside each MG locally. An adaptive droop method is proposed for this level which determines...

  8. Hierarchical video summarization based on context clustering

    Science.gov (United States)

    Tseng, Belle L.; Smith, John R.

    2003-11-01

    A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.

  9. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

    Science.gov (United States)

    Yau, Christopher; Holmes, Chris

    2011-07-01

    We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.

  10. Recurrent daily rainfall patterns over South Africa and associated dynamics during the core of the austral summer

    CSIR Research Space (South Africa)

    Cretat, J

    2010-12-01

    Full Text Available field instead of atmospheric processes and dynamics. An original agglomerative hierarchical clustering approach is used to classify daily rainfall patterns recorded at 5352 stations from DJF 1971 to DJF 1999. Five clusters are retained for analysis...

  11. Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering

    Science.gov (United States)

    Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

    2012-01-01

    Background: Gravitation field algorithm (GFA) is a new optimization algorithm which is based on an imitation of natural phenomena. GFA can do well both for searching global minimum and multi-minima in computational biology. But GFA needs to be improved for increasing efficiency, and modified for applying to some discrete data problems in system biology. Method: An improved GFA called IGFA was proposed in this paper. Two parts were improved in IGFA. The first one is the rule of random division, which is a reasonable strategy and makes running time shorter. The other one is rotation factor, which can improve the accuracy of IGFA. And to apply IGFA to the hierarchical clustering, the initial part and the movement operator were modified. Results: Two kinds of experiments were used to test IGFA. And IGFA was applied to hierarchical clustering. The global minimum experiment was used with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). Multi-minima experiment was used with IGFA and GFA. The two experiments results were compared with each other and proved the efficiency of IGFA. IGFA is better than GFA both in accuracy and running time. For the hierarchical clustering, IGFA is used to optimize the smallest distance of genes pairs, and the results were compared with GA and SA, singular-linkage clustering, UPGMA. The efficiency of IGFA is proved. PMID:23173043

  12. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    Science.gov (United States)

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  13. Hierarchical clusters of phytoplankton variables in dammed water bodies

    Science.gov (United States)

    Silva, Eliana Costa e.; Lopes, Isabel Cristina; Correia, Aldina; Gonçalves, A. Manuela

    2017-06-01

    In this paper a dataset containing biological variables of the water column of several Portuguese reservoirs is analyzed. Hierarchical cluster analysis is used to obtain clusters of phytoplankton variables of the phylum Cyanophyta, with the objective of validating the classification of Portuguese reservoirs previously presented in [1], in which they were divided into three clusters: (1) Interior Tagus and Aguieira; (2) Douro; and (3) Other rivers. Three new clusters of Cyanophyta variables were found. Kruskal-Wallis and Mann-Whitney tests are used to compare the newly obtained Cyanophyta clusters with the previous reservoir clusters, in order to validate the classification of the water quality of reservoirs. The amount of Cyanophyta algae present in the reservoirs from the three clusters is significantly different, which validates the previous classification.

  14. Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data.

    Science.gov (United States)

    Tian, Ting; McLachlan, Geoffrey J; Dieters, Mark J; Basford, Kaye E

    2015-01-01

    It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean matching. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher accuracy of estimation performance than those using non-Bayesian analysis but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the overall best performance.

  15. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    Science.gov (United States)

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogenous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogenous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and were comprised of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.

  16. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years, were included. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included significantly more high-tension glaucoma patients and used a significantly greater number of antiglaucoma eye drops than cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  17. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Hierarchical clustering of HPV genotype patterns in the ASCUS-LSIL triage study

    Science.gov (United States)

    Wentzensen, Nicolas; Wilson, Lauren E.; Wheeler, Cosette M.; Carreon, Joseph D.; Gravitt, Patti E.; Schiffman, Mark; Castle, Philip E.

    2010-01-01

    Anogenital cancers are associated with about 13 carcinogenic HPV types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common, which complicates the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the ASCUS-LSIL triage study using unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered using complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: Cluster 1 included all CIN3 histology with abnormal cytology; Cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion (HSIL) cytology; Cluster 3 included older women with normal or low grade histology/cytology and low viral load; Cluster 4 included younger women with low grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: Group 1 included only HPV16; Group 2 included nine carcinogenic types plus non-carcinogenic HPV53 and HPV66; and Group 3 included non-carcinogenic types plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and HSIL. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease using unsupervised hierarchical clustering can address complex genotype distributions on a population level. PMID:20959485

  19. Star Cluster Structure from Hierarchical Star Formation

    Science.gov (United States)

    Grudic, Michael; Hopkins, Philip; Murray, Norman; Lamberts, Astrid; Guszejnov, David; Schmitz, Denise; Boylan-Kolchin, Michael

    2018-01-01

    Young massive star clusters (YMCs) spanning 10^4-10^8 M⊙ in mass generally have similar radial surface density profiles, with an outer power-law index typically between -2 and -3. This similarity suggests that they are shaped by scale-free physics at formation. Recent multi-physics MHD simulations of YMC formation have also produced populations of YMCs with this type of surface density profile, allowing us to narrow down the physics necessary to form a YMC with properties as observed. We show that the shallow density profiles of YMCs are a natural result of phase-space mixing that occurs as they assemble from the clumpy, hierarchically-clustered configuration imprinted by the star formation process. We develop physical intuition for this process via analytic arguments and collisionless N-body experiments, elucidating the connection between star formation physics and star cluster structure. This has implications for the early-time structure and evolution of proto-globular clusters, and prospects for simulating their formation in the FIRE cosmological zoom-in simulations.

  20. Segmenting Student Markets with a Student Satisfaction and Priorities Survey.

    Science.gov (United States)

    Borden, Victor M. H.

    1995-01-01

    A market segmentation analysis of 872 university students compared 2 hierarchical clustering procedures for deriving market segments: 1 using matching-type measures and an agglomerative clustering algorithm, and 1 using the chi-square based automatic interaction detection. Results and implications for planning, evaluating, and improving academic…

  1. The structure of nearby clusters of galaxies: Hierarchical clustering and an application to the Leo region

    CERN Document Server

    Materne, J

    1978-01-01

    A new method of classifying groups of galaxies, called hierarchical clustering, is presented as a tool for the investigation of nearby groups of galaxies. The method is free from model assumptions about the groups. The scaling of the different coordinates is necessary, and the level from which one accepts the groups as real has to be determined. Hierarchical clustering is applied to an unbiased sample of galaxies in the Leo region. Five distinct groups result which have reasonable physical properties, such as low crossing times and conservative mass-to-light ratios, and which follow a radial velocity-luminosity relation. Only 4 out of 39 galaxies were adopted as field galaxies. (27 refs).

  2. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Adamo, A.; Messa, M. [Dept. of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Kim, H. [Gemini Observatory, La Serena (Chile); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Hts., NY (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Dale, D. A. [Dept. of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Durham University, Durham (United Kingdom); Grebel, E. K.; Shabani, F. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120 Heidelberg (Germany); Johnson, K. E. [Dept. of Astronomy, University of Virginia, Charlottesville, VA (United States); Kahre, L. [Dept. of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kennicutt, R. C. [Institute of Astronomy, University of Cambridge, Cambridge (United Kingdom); Pellerin, A. [Dept. of Physics and Astronomy, State University of New York at Geneseo, Geneseo NY (United States); Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Smith, L. J. [European Space Agency/Space Telescope Science Institute, Baltimore, MD (United States); Thilker, D., E-mail: kgrasha@astro.umass.edu [Dept. of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD (United States)

    2017-05-10

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  3. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Science.gov (United States)

    Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Messa, M.; Pellerin, A.; Ryon, J. E.; Smith, L. J.; Shabani, F.; Thilker, D.; Ubeda, L.

    2017-05-01

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  4. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    International Nuclear Information System (INIS)

    Grasha, K.; Calzetti, D.; Adamo, A.; Messa, M.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Shabani, F.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Pellerin, A.; Ryon, J. E.; Ubeda, L.; Smith, L. J.; Thilker, D.

    2017-01-01

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the spatial distribution of the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct and more strongly correlated from compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  5. Graph coarsening and clustering on the GPU

    NARCIS (Netherlands)

    Fagginger Auer, B.O.; Bisseling, R.H.

    2013-01-01

    Agglomerative clustering is an effective greedy way to generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained shared-memory parallel graph

  6. Cognitive Clusters in Specific Learning Disorder.

    Science.gov (United States)

    Poletti, Michele; Carretta, Elisa; Bonvicini, Laura; Giorgi-Rossi, Paolo

    The heterogeneity among children with learning disabilities still represents a barrier and a challenge in their conceptualization. Although a dimensional approach has been gaining support, the categorical approach is still the most adopted, as in the recent fifth edition of the Diagnostic and Statistical Manual of Mental Disorders. The introduction of the single overarching diagnostic category of specific learning disorder (SLD) could underemphasize interindividual clinical differences regarding intracategory cognitive functioning and learning proficiency, according to current models of multiple cognitive deficits at the basis of neurodevelopmental disorders. The characterization of specific cognitive profiles associated with an already manifest SLD could help identify possible early cognitive markers of SLD risk and distinct trajectories of atypical cognitive development leading to SLD. In this perspective, we applied a cluster analysis to identify groups of children with a Diagnostic and Statistical Manual-based diagnosis of SLD with similar cognitive profiles and to describe the association between clusters and SLD subtypes. A sample of 205 children with a diagnosis of SLD were enrolled. Cluster analyses (agglomerative hierarchical and nonhierarchical iterative clustering technique) were used successively on 10 core subtests of the Wechsler Intelligence Scale for Children-Fourth Edition. The 4-cluster solution was adopted, and external validation found differences in terms of SLD subtype frequencies and learning proficiency among clusters. Clinical implications of these findings are discussed, tracing directions for further studies.

  7. Prediction of Solvent Physical Properties using the Hierarchical Clustering Method

    Science.gov (United States)

    Recently a QSAR (Quantitative Structure Activity Relationship) method, the hierarchical clustering method, was developed to estimate acute toxicity values for large, diverse datasets. This methodology has now been applied to estimate solvent physical properties including sur...

  8. Determination of genetic structure of germplasm collections: are traditional hierarchical clustering methods appropriate for molecular marker data?

    NARCIS (Netherlands)

    Odong, T.L.; Heerwaarden, van J.; Jansen, J.; Hintum, van T.J.L.; Eeuwijk, van F.A.

    2011-01-01

    Despite the availability of newer approaches, traditional hierarchical clustering remains very popular in genetic diversity studies in plants. However, little is known about its suitability for molecular marker data. We studied the performance of traditional hierarchical clustering techniques using

  9. Predicting healthcare outcomes in prematurely born infants using cluster analysis.

    Science.gov (United States)

    MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne

    2018-05-23

    Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight (BW) and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥ or < 882 g respectively) had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001). Conclusions: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.

  10. Clustering of the Self-Organizing Map based Approach in Induction Machine Rotor Faults Diagnostics

    Directory of Open Access Journals (Sweden)

    Ahmed TOUMI

    2009-12-01

    Full Text Available Self-Organizing Maps (SOM) is an excellent method of analyzing multidimensional data. The SOM based classification is attractive, due to its unsupervised learning and topology preserving properties. In this paper, the performance of the self-organizing methods is investigated in induction motor rotor fault detection and severity evaluation. The SOM is based on motor current signature analysis (MCSA). The agglomerative hierarchical algorithm using Ward's method is applied to automatically divide the map into interesting interpretable groups of map units that correspond to clusters in the input data. The results obtained with this approach make it possible to detect a rotor bar fault directly from the visualization results. The system is also able to estimate the extent of rotor faults.

  11. Strong influence of variable treatment on the performance of numerically defined ecological regions.

    Science.gov (United States)

    Snelder, Ton; Lehmann, Anthony; Lamouroux, Nicolas; Leathwick, John; Allenbach, Karin

    2009-10-01

    Numerical clustering has frequently been used to define hierarchically organized ecological regionalizations, but there has been little robust evaluation of their performance (i.e., the degree to which regions discriminate areas with similar ecological character). In this study we investigated the effect of the weighting and treatment of input variables on the performance of regionalizations defined by agglomerative clustering across a range of hierarchical levels. For this purpose, we developed three ecological regionalizations of Switzerland of increasing complexity using agglomerative clustering. Environmental data for our analysis were drawn from a 400 m grid and consisted of estimates of 11 environmental variables for each grid cell describing climate, topography and lithology. Regionalization 1 was defined from the environmental variables which were given equal weights. We used the same variables in Regionalization 2 but weighted and transformed them on the basis of a dissimilarity model that was fitted to land cover composition data derived for a random sample of cells from interpretation of aerial photographs. Regionalization 3 was a further two-stage development of Regionalization 2 where specific classifications, also weighted and transformed using dissimilarity models, were applied to 25 small scale "sub-domains" defined by Regionalization 2. Performance was assessed in terms of the discrimination of land cover composition for an independent set of sites using classification strength (CS), which measured the similarity of land cover composition within classes and the dissimilarity between classes. Regionalization 2 performed significantly better than Regionalization 1, but the largest gains in performance, compared to Regionalization 1, occurred at coarse hierarchical levels (i.e., CS did not increase significantly beyond the 25-region level). Regionalization 3 performed better than Regionalization 2 beyond the 25-region level and CS values continued to

  12. Technique for fast and efficient hierarchical clustering

    Science.gov (United States)

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.
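
    The three steps named in this abstract (compress the variables, find nearest neighbor pairs, arrange the samples into a hierarchy) can be sketched roughly as follows. This is only an illustrative sketch, not the patented technique: the block-averaging compression, the Euclidean distance, the single-linkage merge rule, and the names compress, nearest_neighbors and build_hierarchy are all assumptions made here, and the sample values are invented.

```python
from itertools import combinations
from math import dist


def compress(samples, block=2):
    """Reduce the number of variables by averaging consecutive blocks.

    Assumed stand-in: the abstract does not say how compression is done.
    """
    return [
        [sum(row[i:i + block]) / len(row[i:i + block])
         for i in range(0, len(row), block)]
        for row in samples
    ]


def nearest_neighbors(samples):
    """For each sample index, return the index of its nearest other sample."""
    return [
        min((j for j in range(len(samples)) if j != i),
            key=lambda j: dist(samples[i], samples[j]))
        for i in range(len(samples))
    ]


def build_hierarchy(samples):
    """Record merges made by repeatedly joining the two closest clusters.

    Cluster distance is single linkage (smallest pairwise distance),
    which is an assumption of this sketch.
    """
    clusters = [(i,) for i in range(len(samples))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(dist(samples[i], samples[j])
                                 for i in pair[0] for j in pair[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges


# Invented data: four samples with four variables each.
data = [
    [0.0, 0.2, 5.0, 5.2],
    [0.1, 0.1, 5.1, 5.1],
    [9.0, 9.2, 1.0, 1.2],
    [9.1, 9.1, 1.1, 1.1],
]
small = compress(data)          # two variables per sample after compression
nn = nearest_neighbors(small)   # nearest neighbor index per sample
merges = build_hierarchy(small) # list of (cluster, cluster) merges, bottom-up
```

    The merge list plays the role of the hierarchy that the abstract says is rendered to a display; drawing it as a dendrogram is left out of the sketch.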

  13. The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

    KAUST Repository

    Euán, Carolina; Ombao, Hernando; Ortega, Joaquín

    2018-01-01

    We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms

  14. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  15. Hierarchical Adaptive Means (HAM) clustering for hardware-efficient, unsupervised and real-time spike sorting.

    Science.gov (United States)

    Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G

    2014-09-30

    This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    Science.gov (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
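
    The Euclidean, Manhattan and Minkowski distances named in this abstract are related: the Minkowski distance of order p reduces to Manhattan for p = 1 and to Euclidean for p = 2. The short sketch below shows how a dissimilarity matrix could be built under each measure before any linkage algorithm is applied. The function names and the rainfall values are made up for illustration and are not taken from the study.

```python
def minkowski(u, v, p):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def dissimilarity_matrix(rows, p):
    """Pairwise Minkowski distances between all rows (one row per location)."""
    return [[minkowski(r, s, p) for s in rows] for r in rows]


# Invented values, one row per hypothetical location; not the study's data.
rainfall = [[3.0, 4.0], [0.0, 0.0], [3.0, 0.0]]

manhattan = dissimilarity_matrix(rainfall, p=1)  # order 1
euclidean = dissimilarity_matrix(rainfall, p=2)  # order 2
```

    Because the two matrices can rank pairs of locations differently, the same linkage algorithm may produce different dendrograms under different measures, which is the kind of comparison the abstract describes.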

  17. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    This study was carried out to assess the physicochemical quality of the river Varuna in Varanasi, India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of the relationship between physicochemical parameters. Hierarchical cluster analysis was also performed to determine the sources of pollution in the river Varuna. The results showed quite high values of DO, nitrate, BOD, COD and total alkalinity, above the BIS permissible limit. The results of correlation analysis identified pH, electrical conductivity, total alkalinity and nitrate as the key water parameters, which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of the total of 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for obtaining better information about river water quality. International Journal of Environment Vol. 5 (1) 2016, pp: 32-44

  18. Clustering recommendations to compute agent reputation

    Science.gov (United States)

    Bedi, Punam; Kaur, Harmeet

    2005-03-01

    Traditional centralized approaches to security are difficult to apply to multi-agent systems which are used nowadays in e-commerce applications. Developing a notion of trust that is based on the reputation of an agent can provide a softer notion of security that is sufficient for many multi-agent applications. Our paper proposes a mechanism for computing reputation of the trustee agent for use by the trustier agent. The trustier agent computes the reputation based on its own experience as well as the experience the peer agents have with the trustee agents. The trustier agents intentionally interact with the peer agents to get their experience information in the form of recommendations. We have also considered the case of unintentional encounters between the referee agents and the trustee agent, which can be directly between them or indirectly through a set of interacting agents. The clustering is done to filter out the noise in the recommendations in the form of outliers. The trustier agent clusters the recommendations received from referee agents on the basis of the distances between recommendations using the hierarchical agglomerative method. The dendrogram thus obtained is cut at the required similarity level, which restricts the maximum distance between any two recommendations within a cluster. The cluster with the maximum number of elements denotes the views of the majority of recommenders. The center of this cluster represents the reputation of the trustee agent, which can be computed using the c-means algorithm.
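
    The outlier-filtering step described in this abstract can be illustrated with a minimal sketch. It assumes that recommendations are scalar scores, uses complete linkage (so that cutting at a threshold bounds the largest distance inside a cluster), and takes the arithmetic mean of the majority cluster instead of running c-means. The function names, the threshold and the sample scores are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

def complete_linkage_clusters(values, max_dist):
    """Merge clusters bottom-up while the merged cluster's diameter stays <= max_dist."""
    clusters = [[v] for v in values]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        if best[0] > max_dist:
            break  # equivalent to cutting the dendrogram at height max_dist
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

def reputation(recommendations, max_dist):
    """Mean of the largest cluster, i.e. the majority view with outliers dropped."""
    clusters = complete_linkage_clusters(recommendations, max_dist)
    majority = max(clusters, key=len)
    return sum(majority) / len(majority)

# Hypothetical recommendations: four agree, two are outliers.
recs = [0.80, 0.82, 0.78, 0.85, 0.10, 0.15]
print(reputation(recs, max_dist=0.1))
```

    With these sample scores the two low recommendations end up in their own cluster and do not influence the result.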

  19. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repetition of computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  20. Hierarchical Network Design

    DEFF Research Database (Denmark)

    Thomadsen, Tommy

    2005-01-01

    Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...... with changing and increasing demands. Two-layer networks consist of one backbone network, which interconnects cluster networks. The clusters consist of nodes and links, which connect the nodes. One node in each cluster is a hub node, and the backbone interconnects the hub nodes of each cluster and thus...... the clusters. The design of hierarchical networks involves clustering of nodes, hub selection, and network design, i.e. selection of links and routing of flows. Hierarchical networks have been in use for decades, but integrated design of these networks has only been considered for very special types of networks...

  1. The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

    KAUST Repository

    Euán, Carolina

    2018-04-12

    We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The extent of similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. At each step of the algorithm, every time two clusters merge, a new spectral density is estimated using the whole information present in both clusters, which is representative of all the series in the new cluster. The method is implemented in an R package HSMClust. We present two applications of the HSM method, one to data coming from wave-height measurements in oceanography and the other to electroencephalogram (EEG) data.
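
    The similarity measure named in this abstract, the total variation distance between normalized spectral densities, can be sketched as follows. This is only an illustration under stated assumptions: it uses a raw periodogram as the spectral estimate (the abstract does not say which estimator is used), it is written in Python rather than R, and it does not reproduce the HSMClust package. All function names and the test signals are hypothetical.

```python
import cmath
import math

def normalized_periodogram(x):
    """Raw periodogram on the positive Fourier frequencies, scaled to sum to 1."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    power = []
    for k in range(1, n // 2 + 1):
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        power.append(abs(coef) ** 2)
    total = sum(power)
    return [p / total for p in power]

def total_variation(p, q):
    """Total variation distance between two discrete distributions on one grid."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical signals: two share a frequency (different phase), one does not.
n = 64
slow = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
slow_shifted = [math.sin(2 * math.pi * 4 * t / n + 1.0) for t in range(n)]
fast = [math.sin(2 * math.pi * 12 * t / n) for t in range(n)]

same = total_variation(normalized_periodogram(slow), normalized_periodogram(slow_shifted))
diff = total_variation(normalized_periodogram(slow), normalized_periodogram(fast))
print(same, diff)
```

    The distance lies between 0 and 1: series with the same oscillation give a value near 0 regardless of phase, while series with disjoint frequency content give a value near 1.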

  2. Novel density-based and hierarchical density-based clustering algorithms for uncertain data.

    Science.gov (United States)

    Zhang, Xianchao; Liu, Han; Zhang, Xiaotong

    2017-09-01

    Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we firstly propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN from the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulties for clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing

  3. Coronal Mass Ejection Data Clustering and Visualization of Decision Trees

    Science.gov (United States)

    Ma, Ruizhe; Angryk, Rafal A.; Riley, Pete; Filali Boubrahimi, Soukaina

    2018-05-01

    Coronal mass ejections (CMEs) can be categorized as either “magnetic clouds” (MCs) or non-MCs. Features such as a large magnetic field, low plasma-beta, and low proton temperature suggest that a CME event is also an MC event; however, so far there is neither a definitive method nor an automatic process to distinguish the two. Human labeling is time-consuming, and results can fluctuate owing to the imprecise definition of such events. In this study, we approach the problem of MC and non-MC distinction from a time series data analysis perspective and show how clustering can shed some light on this problem. Although many algorithms exist for traditional data clustering in the Euclidean space, they are not well suited for time series data. Problems such as inadequate distance measure, inaccurate cluster center description, and lack of intuitive cluster representations need to be addressed for effective time series clustering. Our data analysis in this work is twofold: clustering and visualization. For clustering we compared the results from the popular hierarchical agglomerative clustering technique to a distance density clustering heuristic we developed previously for time series data clustering. In both cases, dynamic time warping will be used for similarity measure. For classification as well as visualization, we use decision trees to aggregate single-dimensional clustering results to form a multidimensional time series decision tree, with averaged time series to present each decision. In this study, we achieved modest accuracy and, more importantly, an intuitive interpretation of how different parameters contribute to an MC event.
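
    Dynamic time warping, which this abstract names as the similarity measure, can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration with an absolute-difference cost and no warping-window constraint; it is not the authors' implementation, and the sample series are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a[i-1] repeats against b
                                    cost[i][j - 1],      # b[j-1] repeats against a
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]

x = [0, 1, 2, 1, 0]
y = [0, 0, 1, 2, 1, 0]  # same shape as x, delayed by one sample
z = [2, 1, 0, 1, 2]     # inverted shape

print(dtw_distance(x, y))
print(dtw_distance(x, z))
```

    Because warping absorbs the one-sample delay, `x` and `y` are at distance zero even though they differ in length, which is why such a measure suits time series better than a pointwise Euclidean distance.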

  4. A survey of text clustering techniques used for web mining

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2005-12-01

    This paper contains an overview of basic formulations and approaches to clustering. Then it presents two important clustering paradigms: a bottom-up agglomerative technique, which collects similar documents into larger and larger groups, and a top-down partitioning technique, which divides a corpus into topic-oriented partitions.

  5. Hierarchical Star Formation in Turbulent Media: Evidence from Young Star Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Heights, NY (United States); Adamo, A.; Messa, M. [Department of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Aloisi, A.; Bright, S. N.; Lee, J. C.; Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Cook, D. O. [California Institute of Technology, 1200 East California Boulevard, Pasadena, CA (United States); Dale, D. A. [Department of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Department of Physics, Durham University, Durham (United Kingdom); Gallagher III, J. S. [Department of Astronomy, University of Wisconsin–Madison, Madison, WI (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Grebel, E. K. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120, Heidelberg (Germany); Kahre, L. [Department of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kim, H. [Gemini Observatory, La Serena (Chile); Krumholz, M. R., E-mail: kgrasha@astro.umass.edu [Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611 (Australia)

    2017-06-10

    We present an analysis of the positions and ages of young star clusters in eight local galaxies to investigate the connection between the age difference and separation of cluster pairs. We find that star clusters do not form uniformly but instead are distributed so that the age difference increases with the cluster pair separation to the 0.25–0.6 power, and that the maximum size over which star formation is physically correlated ranges from ∼200 pc to ∼1 kpc. The observed trends between age difference and separation suggest that cluster formation is hierarchical both in space and time: clusters that are close to each other are more similar in age than clusters born further apart. The temporal correlations between stellar aggregates have slopes that are consistent with predictions of turbulence acting as the primary driver of star formation. The velocity associated with the maximum size is proportional to the galaxy’s shear, suggesting that the galactic environment influences the maximum size of the star-forming structures.

  6. Eigenspaces of networks reveal the overlapping and hierarchical community structure more precisely

    International Nuclear Information System (INIS)

    Ma, Xiaoke; Gao, Lin; Yong, Xuerong

    2010-01-01

    Identifying community structure is fundamental for revealing the structure–functionality relationship in complex networks, and spectral algorithms have been shown to be powerful for this purpose. In a traditional spectral algorithm, each vertex of a network is embedded into a spectral space by making use of the eigenvectors of the adjacency matrix or Laplacian matrix of the graph. In this paper, a novel spectral approach for revealing the overlapping and hierarchical community structure of complex networks is proposed by not only using the eigenvalues and eigenvectors but also the properties of eigenspaces of the networks involved. This gives us a better characterization of community. We first show that the communicability between a pair of vertices can be rewritten in terms of eigenspaces of a network. An agglomerative clustering algorithm is then presented to discover the hierarchical communities using the communicability matrix. Finally, these overlapping vertices are discovered with the corresponding eigenspaces, based on the fact that the vertices more densely connected amongst one another are more likely to be linked through short cycles. Compared with the traditional spectral algorithms, our algorithm can identify both the overlapping and hierarchical community without increasing the time complexity O(n^3), where n is the size of the network. Furthermore, our algorithm can also distinguish the overlapping vertices from bridges. The method is tested by applying it to some computer-generated and real-world networks. The experimental results indicate that our algorithm can reveal community structure more precisely than the traditional spectral approaches.
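
    The communicability matrix mentioned in this abstract can be illustrated with a small sketch. It assumes the common definition of communicability as the matrix exponential of the adjacency matrix, i.e. a sum over walks of length k weighted by 1/k!; the abstract itself does not state the definition, and the eigenspace formulation of the paper is not reproduced here. The helper names and the example graph are hypothetical.

```python
def mat_mul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def communicability(adj, terms=30):
    """Truncated power series sum_k A^k / k!, assumed here to define communicability."""
    n = len(adj)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 / 0!
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, adj)
        term = [[v / k for v in row] for row in term]  # term is now A^k / k!
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result

# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

G = communicability(adj)
print(G[0][1], G[0][4])
```

    Vertices inside the same triangle obtain a larger communicability than vertices in different triangles, which is the kind of signal an agglomerative procedure on this matrix could exploit.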

  7. Inferring hierarchical clustering structures by deterministic annealing

    International Nuclear Information System (INIS)

    Hofmann, T.; Buhmann, J.M.

    1996-01-01

    The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree evaluation and we derive a non-greedy technique for tree growing. Applying the principles of maximum entropy and minimum cross entropy, a deterministic annealing algorithm is derived in a mean-field approximation. This technique allows us to canonically superimpose tree structures and to fit parameters to averaged or 'fuzzified' trees.

  8. Intensity-based hierarchical clustering in CT-scans: application to interactive segmentation in cardiology

    Science.gov (United States)

    Hadida, Jonathan; Desrosiers, Christian; Duong, Luc

    2011-03-01

    The segmentation of anatomical structures in Computed Tomography Angiography (CTA) is a pre-operative task useful in image-guided surgery. Even though very robust and precise methods have been developed to help achieve a reliable segmentation (level sets, active contours, etc.), it remains very time consuming both in terms of manual interactions and in terms of computation time. The goal of this study is to present a fast method to find coarse anatomical structures in CTA with few parameters, based on hierarchical clustering. The algorithm is organized as follows: first, a fast non-parametric histogram clustering method is proposed to compute a piecewise constant mask. A second step then indexes all the space-connected regions in the piecewise constant mask. Finally, a hierarchical clustering is performed to build a graph representing the connections between the various regions in the piecewise constant mask. This step builds up structural knowledge about the image. Several interactive features for segmentation are presented, for instance association or disassociation of anatomical structures. A comparison with the Mean-Shift algorithm is presented.

  9. The outbreak of SARS mirrored by bibliometric mapping: Combining bibliographic coupling with the complete link cluster method

    Directory of Open Access Journals (Sweden)

    Bo Jarneving

    2007-01-01

    In this study a novel method of science mapping is presented which combines bibliographic coupling, as a measure of document-document similarity, with an agglomerative hierarchical cluster method. The focus in this study is on the mapping of so-called ‘core documents’, a concept first presented in 1995 by Glänzel and Czerwon. The term ‘core document’ denotes documents that have a central position in the research front in terms of many and strong bibliographic coupling links. The identification and mapping of core documents usually requires a large multidisciplinary research setting, and in this study the 2003 volume of the Science Citation Index was applied. From this database, a sub-set of core documents reporting on the outbreak of SARS in 2002 was chosen for the demonstration of the application of this mapping method. It was demonstrated that the method, in this case, successfully identified interpretable research themes and that iterative clustering on two subsequent levels of cluster agglomeration may provide useful and current information.

  10. A data-driven approach to estimating the number of clusters in hierarchical clustering [version 1; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Antoine E. Zambelli

    2016-12-01

    DNA microarray and gene expression problems often require a researcher to perform clustering on their data in a bid to better understand its structure. In cases where the number of clusters is not known, one can resort to hierarchical clustering methods. However, there currently exist very few automated algorithms for determining the true number of clusters in the data. We propose two new methods (mode and maximum difference) for estimating the number of clusters in a hierarchical clustering framework to create a fully automated process with no human intervention. These methods are compared to the established elbow and gap statistic algorithms using simulated datasets and the Biobase Gene ExpressionSet. We also explore a data mixing procedure inspired by cross-validation techniques. We find that the overall performance of the maximum difference method is comparable to or greater than that of the gap statistic in multi-cluster scenarios, and that it achieves that performance at a fraction of the computational cost. This method also responds well to our mixing procedure, which opens the door to future research. We conclude that both the mode and maximum difference methods warrant further study related to their mixing and cross-validation potential. We particularly recommend the use of the maximum difference method in multi-cluster scenarios given its accuracy and execution times, and present it as an alternative to existing algorithms.

  11. Parkinson's Disease Subtypes Identified from Cluster Analysis of Motor and Non-motor Symptoms.

    Science.gov (United States)

    Mu, Jesse; Chaudhuri, Kallol R; Bielza, Concha; de Pedro-Cuesta, Jesus; Larrañaga, Pedro; Martinez-Martin, Pablo

    2017-01-01

    Parkinson's disease is now considered a complex, multi-peptide, central, and peripheral nervous system disorder with considerable clinical heterogeneity. Non-motor symptoms play a key role in the trajectory of Parkinson's disease, from prodromal premotor to end stages. To understand the clinical heterogeneity of Parkinson's disease, this study used cluster analysis to search for subtypes from a large, multi-center, international, and well-characterized cohort of Parkinson's disease patients across all motor stages, using a combination of cardinal motor features (bradykinesia, rigidity, tremor, axial signs) and, for the first time, specific validated rater-based non-motor symptom scales. Two independent international cohort studies were used: (a) the validation study of the Non-Motor Symptoms Scale (n = 411) and (b) baseline data from the global Non-Motor International Longitudinal Study (n = 540). k-means cluster analyses were performed on the non-motor and motor domains (domains clustering) and the 30 individual non-motor symptoms alone (symptoms clustering), and hierarchical agglomerative clustering was performed to group symptoms together. Four clusters are identified from the domains clustering supporting previous studies: mild, non-motor dominant, motor-dominant, and severe. In addition, six new smaller clusters are identified from the symptoms clustering, each characterized by clinically-relevant non-motor symptoms. The clusters identified in this study present statistical confirmation of the increasingly important role of non-motor symptoms (NMS) in Parkinson's disease heterogeneity and take steps toward subtype-specific treatment packages.

  12. Prediction of in vitro and in vivo oestrogen receptor activity using hierarchical clustering

    Science.gov (United States)

    In this study, hierarchical clustering classification models were developed to predict in vitro and in vivo oestrogen receptor (ER) activity. Classification models were developed for binding, agonist, and antagonist in vitro ER activity and for mouse in vivo uterotrophic ER bindi...

  13. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Communication base stations generate massive amounts of data every day, and these base station logs are valuable for mining business circles. This paper uses data mining technology and a hierarchical clustering algorithm to group base stations into business circles based on their recorded data. By analyzing the data of different business circles through feature extraction and comparing the characteristics of each business circle category, operators can choose a suitable area for commercial marketing.

  14. Evolutionary-Hierarchical Bases of the Formation of Cluster Model of Innovation Economic Development

    Directory of Open Access Journals (Sweden)

    Yuliya Vladimirovna Dubrovskaya

    2016-10-01

    The functioning of a modern economic system is based on the interaction of objects of different hierarchical levels. Thus, the problem of studying innovation processes while taking into account the mutual influence of the activities of these economic actors becomes important. The paper dwells on the evolutionary basis for the formation of models of innovation development on the basis of micro- and macroeconomic analysis. Most of the concepts recognize that, despite a large number of diverse models, the coordination of the relations between economic agents is of crucial importance for successful innovation development. According to the results of the evolutionary-hierarchical analysis, the authors reveal key phases of the development of forms of cooperation between business, science and government in the domestic economy. This has become the starting point for characterizing the interaction in the cluster models of innovation development of the economy. Considerable expectations for improvement of the national innovative system are connected with the development of cluster and network structures. The main objective of government authorities is the formation of mechanisms and institutions that will foster cooperation between members of the clusters. The article explains that clusters cannot become factors in the growth of the national economy without being an effective tool for interaction between the actors of the regional innovative systems.

  15. Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space

    OpenAIRE

    Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

    2008-01-01

    Motivation: UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. Application: We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without ex...

  16. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data-driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
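
    Why the choice of distance matters for correlated variables can be seen in a minimal two-variable sketch. It assumes the standard Mahalanobis distance with the sample covariance matrix, inverted by hand for the 2x2 case; the measurements are invented for illustration and are not data from the paper, and the function names are hypothetical.

```python
import math

def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of two-variable observations."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(p, q, cov):
    """sqrt(d^T S^-1 d) for a 2x2 covariance matrix S given as (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det  # entries of S^-1
    dx, dy = p[0] - q[0], p[1] - q[1]
    return math.sqrt(dx * dx * ixx + 2 * dx * dy * ixy + dy * dy * iyy)

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Invented, strongly correlated measurements of two variables.
samples = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
cov = covariance_2d(samples)

along = ((2.0, 2.0), (3.0, 3.0))   # displacement follows the correlation
across = ((2.0, 3.0), (3.0, 2.0))  # displacement runs against it

print(euclidean(*along), mahalanobis_2d(*along, cov))
print(euclidean(*across), mahalanobis_2d(*across, cov))
```

    Both pairs are equally far apart in Euclidean terms, but the Mahalanobis distance treats the pair that contradicts the correlation as much farther apart, so a hierarchical clustering built on it would group samples differently.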

  17. Monitoring the sensory quality of canned white asparagus through cluster analysis.

    Science.gov (United States)

    Arana, Inés; Ibañez, Francisco C; Torre, Paloma

    2016-05-01

    White asparagus is one of the 30 vegetables most consumed in the world. This paper unifies the stages of its sensory quality control. The aims of this work were to describe the sensory properties of canned white asparagus and their quality control and to evaluate the applicability of agglomerative hierarchical clustering (AHC) for classifying and monitoring the sensory quality of manufacturers. Sixteen sensory descriptors and their evaluation technique were defined. The sensory profile of canned white asparagus was high flavor characteristic, little acidity and bitterness, medium firmness and very light fibrosity, among other characteristics. The dendrogram established groups of manufacturers that had similar scores in the same set of descriptors, and each cluster grouped the manufacturers that had a similar quality profile. The sensory profile of canned white asparagus was clearly defined through the intensity evaluation of 16 descriptors, and the sensory quality report provided to the manufacturers is detailed and easy to interpret. AHC grouped the manufacturers according to the highest quality scores in certain descriptors and is a useful tool because it is very visual. © 2015 Society of Chemical Industry.

  18. A two-stage approach to estimate spatial and spatio-temporal disease risks in the presence of local discontinuities and clusters.

    Science.gov (United States)

    Adin, A; Lee, D; Goicoa, T; Ugarte, María Dolores

    2018-01-01

    Disease risk maps for areal unit data are often estimated from Poisson mixed models with local spatial smoothing, for example by incorporating random effects with a conditional autoregressive prior distribution. However, one of the limitations is that local discontinuities in the spatial pattern are not usually modelled, leading to over-smoothing of the risk maps and a masking of clusters of hot/coldspot areas. In this paper, we propose a novel two-stage approach to estimate and map disease risk in the presence of such local discontinuities and clusters. We propose approaches in both spatial and spatio-temporal domains, where for the latter the clusters can either be fixed or allowed to vary over time. In the first stage, we apply an agglomerative hierarchical clustering algorithm to training data to provide sets of potential clusters, and in the second stage, a two-level spatial or spatio-temporal model is applied to each potential cluster configuration. The superiority of the proposed approach with regard to a previous proposal is shown by simulation, and the methodology is applied to two important public health problems in Spain, namely stomach cancer mortality across Spain and brain cancer incidence in the Navarre and Basque Country regions of Spain.

  19. Compulsive buying disorder clustering based on sex, age, onset and personality traits.

    Science.gov (United States)

    Granero, Roser; Fernández-Aranda, Fernando; Baño, Marta; Steward, Trevor; Mestre-Bach, Gemma; Del Pino-Gutiérrez, Amparo; Moragas, Laura; Mallorquí-Bagué, Núria; Aymamí, Neus; Goméz-Peña, Mónica; Tárrega, Salomé; Menchón, José M; Jiménez-Murcia, Susana

    2016-07-01

    In spite of the revived interest in compulsive buying disorder (CBD), its classification into the contemporary nosologic systems continues to be debated, and few studies have addressed heterogeneity in the clinical phenotype through methodologies based on a person-centered approach. To identify empirical clusters of CBD employing personality traits, as well as patients' sex, age and the age of CBD onset as indicators. An agglomerative hierarchical clustering method defining a combination of the Schwarz Bayesian Information Criterion and log-likelihood was used. Three clusters were identified in a sample of n=110 patients attending a specialized CBD unit: a) "male compulsive buyers" reported the highest prevalence of comorbid gambling disorder and the lowest levels of reward dependence; b) "female low-dysfunctional" mainly included employed women, with the highest level of education, the oldest age of onset, the lowest scores in harm avoidance and the highest levels of persistence, self-directedness and cooperativeness; and c) "female highly-dysfunctional" with the youngest age of onset, the highest levels of comorbid psychopathology and harm avoidance, and the lowest score in self-directedness. Sociodemographic characteristics and personality traits can be used to determine CBD clusters which represent different clinical subtypes. These subtypes should be considered when developing assessment instruments, preventive programs and treatment interventions. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. An Energy Efficient Cooperative Hierarchical MIMO Clustering Scheme for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2011-12-01

    Full Text Available In this work, we present an energy efficient hierarchical cooperative clustering scheme for wireless sensor networks. Communication cost is a crucial factor in depleting the energy of sensor nodes. In the proposed scheme, nodes cooperate to form clusters at each level of network hierarchy ensuring maximal coverage and minimal energy expenditure with relatively uniform distribution of load within the network. Performance is enhanced by cooperative multiple-input multiple-output (MIMO) communication ensuring energy efficiency for WSN deployments over large geographical areas. We test our scheme using TOSSIM and compare the proposed scheme with the cooperative multiple-input multiple-output (CMIMO) clustering scheme and the traditional multihop Single-Input-Single-Output (SISO) routing approach. Performance is evaluated on the basis of number of clusters, number of hops, energy consumption and network lifetime. Experimental results show significant energy conservation and increase in network lifetime as compared to existing schemes.

  1. Hierarchical clustering analysis of blood plasma lipidomics profiles from mono- and dizygotic twin families

    NARCIS (Netherlands)

    Draisma, H.H.; Reijmers, T.H.; Meulman, J.J.; Greef, J. van der; Hankemeier, T.; Boomsma, D.I.

    2013-01-01

    Twin and family studies are typically used to elucidate the relative contribution of genetic and environmental variation to phenotypic variation. Here, we apply a quantitative genetic method based on hierarchical clustering, to blood plasma lipidomics data obtained in a healthy cohort consisting of

  2. Data Clustering

    Science.gov (United States)

    Wagstaff, Kiri L.

    2012-03-01

    particular application involves considerations of the kind of data being analyzed, algorithm runtime efficiency, and how much prior knowledge is available about the problem domain, which can dictate the nature of clusters sought. Fundamentally, the clustering method and its representations of clusters carries with it a definition of what a cluster is, and it is important that this be aligned with the analysis goals for the problem at hand. In this chapter, I emphasize this point by identifying for each algorithm the cluster representation as a model, m_j , even for algorithms that are not typically thought of as creating a “model.” This chapter surveys a basic collection of clustering methods useful to any practitioner who is interested in applying clustering to a new data set. The algorithms include k-means (Section 25.2), EM (Section 25.3), agglomerative (Section 25.4), and spectral (Section 25.5) clustering, with side mentions of variants such as kernel k-means and divisive clustering. The chapter also discusses each algorithm’s strengths and limitations and provides pointers to additional in-depth reading for each subject. Section 25.6 discusses methods for incorporating domain knowledge into the clustering process. This chapter concludes with a brief survey of interesting applications of clustering methods to astronomy data (Section 25.7). The chapter begins with k-means because it is both generally accessible and so widely used that understanding it can be considered a necessary prerequisite for further work in the field. EM can be viewed as a more sophisticated version of k-means that uses a generative model for each cluster and probabilistic item assignments. Agglomerative clustering is the most basic form of hierarchical clustering and provides a basis for further exploration of algorithms in that vein. Spectral clustering permits a departure from feature-vector-based clustering and can operate on data sets instead represented as affinity, or similarity

  3. A clustering approach applied to time-lapse ERT interpretation - Case study of Lascaux cave

    Science.gov (United States)

    Xu, Shan; Sirieix, Colette; Riss, Joëlle; Malaurent, Philippe

    2017-09-01

    The Lascaux cave, located in southwest France, is one of the most important prehistoric caves in the world that shows Paleolithic paintings. This study aims to characterize the structure of the weathered epikarst setting located above the cave using Time-Lapse Electrical Resistivity Tomography (ERT) combined with local hydrogeological and climatic environmental data. Twenty ERT profiles were carried out for two years and helped us to record the seasonal and spatial variations of the electrical resistivity of the hydraulic upstream area of the Lascaux cave. The 20 interpreted resistivity models were merged into a single synthetic model using a multidimensional statistical method (Hierarchical Agglomerative Clustering). The individual blocks from the synthetic model associated with a similar resistivity variability were gathered into 7 clusters. We combined the resistivity temporal variations with climatic and hydrogeological data to propose a geo-electrical model that relates to a conceptual geological model. We provide a geological interpretation for each cluster regarding epikarst features. The superficial clusters (no. 1 & 2) are linked to effective rainfall and trees, and probably correspond to a fractured limestone. Another two clusters (no. 6 & 7) are linked to detrital formations (sand and clay respectively). Cluster 3 may correspond to a marly limestone that forms a non-permeable horizon. Finally, the electrical behavior of the last two clusters (no. 4 & 5) is correlated with the variation of flow rate; they may be a privileged feed zone of the flow in the cave.

  4. On the Disruption of Star Clusters in a Hierarchical Interstellar Medium

    Science.gov (United States)

    Elmegreen, Bruce G.; Hunter, Deidre A.

    2010-03-01

    The distribution of the number of clusters as a function of mass M and age T suggests that clusters get eroded or dispersed in a regular way over time, such that the cluster number decreases inversely as an approximate power law with T within each fixed interval of M. This power law is inconsistent with standard dispersal mechanisms such as cluster evaporation and cloud collisions. In the conventional interpretation, it requires the unlikely situation where diverse mechanisms stitch together over time in a way that is independent of environment or M. Here, we consider another model in which the large-scale distribution of gas in each star-forming region plays an important role. We note that star clusters form with positional and temporal correlations in giant cloud complexes, and suggest that these complexes dominate the tidal force and collisional influence on a cluster during its first several hundred million years. Because the cloud complex density decreases regularly with position from the cluster birth site, the harassment and collision rates between the cluster and the cloud pieces decrease regularly with age as the cluster drifts. This decrease is typically a power law of the form required to explain the mass-age distribution. We reproduce this distribution for a variety of cases, including rapid disruption, slow erosion, combinations of these two, cluster-cloud collisions, cluster disruption by hierarchical disassembly, and partial cluster disruption. We also consider apparent cluster mass loss by fading below the surface brightness limit of a survey. In all cases, the observed log M-log T diagram can be reproduced under reasonable assumptions.

  5. ON THE DISRUPTION OF STAR CLUSTERS IN A HIERARCHICAL INTERSTELLAR MEDIUM

    International Nuclear Information System (INIS)

    Elmegreen, Bruce G.; Hunter, Deidre A.

    2010-01-01

    The distribution of the number of clusters as a function of mass M and age T suggests that clusters get eroded or dispersed in a regular way over time, such that the cluster number decreases inversely as an approximate power law with T within each fixed interval of M. This power law is inconsistent with standard dispersal mechanisms such as cluster evaporation and cloud collisions. In the conventional interpretation, it requires the unlikely situation where diverse mechanisms stitch together over time in a way that is independent of environment or M. Here, we consider another model in which the large-scale distribution of gas in each star-forming region plays an important role. We note that star clusters form with positional and temporal correlations in giant cloud complexes, and suggest that these complexes dominate the tidal force and collisional influence on a cluster during its first several hundred million years. Because the cloud complex density decreases regularly with position from the cluster birth site, the harassment and collision rates between the cluster and the cloud pieces decrease regularly with age as the cluster drifts. This decrease is typically a power law of the form required to explain the mass-age distribution. We reproduce this distribution for a variety of cases, including rapid disruption, slow erosion, combinations of these two, cluster-cloud collisions, cluster disruption by hierarchical disassembly, and partial cluster disruption. We also consider apparent cluster mass loss by fading below the surface brightness limit of a survey. In all cases, the observed log M-log T diagram can be reproduced under reasonable assumptions.

  6. Regional Planning and Development Under the Maximization of Urban Agglomerative Economies

    Institute of Scientific and Technical Information of China (English)

    朱英明

    2005-01-01

    Starting from the meaning and types of urban agglomerative economies, and analysing the characteristics and causes of urban agglomerative economies, this paper argues that regional planning and development should attach importance to urban agglomerative economies and follow the law of the maximization of regional urban agglomerative economies. It also sets out countermeasures and advice to facilitate regional planning and development based on this principle.

  7. The Hierarchical Clustering of Tax Burden in the EU27

    Directory of Open Access Journals (Sweden)

    Simkova Nikola

    2015-09-01

    Full Text Available The issue of taxation has become more important due to its significant share of government revenue. There are several ways of expressing the tax burden of countries. This paper describes the traditional approach, the share of tax revenue in GDP, which is applied to total taxation and to capital taxation as the part of tax systems affecting investment decisions. The implicit tax rate on capital created by Eurostat also offers a possible explanation of the tax burden on capital, so its components are analysed in detail. This study uses an econometric method called hierarchical clustering. The data on which the clustering is based comprise countries in the EU27 for the period from 1995 to 2012. The aim of this paper is to reveal clusters of countries in the EU27 with similar tax burden or tax changes. The findings suggest that mainly newly acceding countries (2004 and 2007) are in a group of countries with a low tax burden which tried to encourage investors by favourable tax rates. On the other hand, there are mostly countries from the original EU15. Some clusters may be explained by similar historical development, geographic and demographic characteristics.

  8. MUSE: An Efficient and Accurate Verifiable Privacy-Preserving Multikeyword Text Search over Encrypted Cloud Data

    Directory of Open Access Journals (Sweden)

    Zhu Xiangyang

    2017-01-01

    Full Text Available With the development of cloud computing, services outsourcing in clouds has become a popular business model. However, because data storage and computing are completely outsourced to the cloud service provider, the sensitive data of data owners is exposed, which could lead to serious privacy disclosure. In addition, some unexpected events, such as software bugs and hardware failure, could cause incomplete or incorrect results to be returned from clouds. In this paper, we propose an efficient and accurate verifiable privacy-preserving multikeyword text search over encrypted cloud data based on hierarchical agglomerative clustering, which is named MUSE. In order to improve the efficiency of text searching, we propose a novel index structure, HAC-tree, which is based on a hierarchical agglomerative clustering method and tends to gather the high-relevance documents in clusters. Based on the HAC-tree, a noncandidate pruning depth-first search algorithm is proposed, which can filter the unqualified subtrees and thus accelerate the search process. The secure inner product algorithm is used to encrypt the HAC-tree index and the query vector. Meanwhile, a completeness verification algorithm is given to verify search results. Experiment results demonstrate that the proposed method outperforms the existing works, DMRS and MRSE-HCI, in efficiency and accuracy, respectively.

  9. ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time.

    Science.gov (United States)

    Cai, Yunpeng; Sun, Yijun

    2011-08-01

    Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree constructed using a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits a quasilinear time and space complexity comparable to greedy heuristic clustering algorithms, while achieving a similar accuracy to the standard hierarchical clustering algorithm.

  10. Hierarchical Clustering of Large Databases and Classification of Antibiotics at High Noise Levels

    Directory of Open Access Journals (Sweden)

    Alexander V. Yarkov

    2008-12-01

    Full Text Available A new algorithm for divisive hierarchical clustering of chemical compounds based on 2D structural fragments is suggested. The algorithm is deterministic: given a random ordering of the input, it will always give the same clustering, and it can process a database of up to 2 million records on a standard PC. The algorithm was used for classification of 1,183 antibiotics mixed with 999,994 random chemical structures. The similarity threshold at which the best separation of active and non-active compounds took place was estimated as 0.6. 85.7% of the antibiotics were successfully classified at this threshold with 0.4% of inaccurate compounds. A .sdf file was created with the probe molecules for clustering of external databases.

  11. Hierarchical clustering into groups of human brain regions according to elemental composition

    International Nuclear Information System (INIS)

    Stedman, J.D.; Spyrou, N.M.

    1998-01-01

    Thirteen brain regions were dissected from both hemispheres of fifteen 'normal' ageing subjects (8 females, 7 males) of mean age 79±7 years. Elemental compositions were determined by simultaneous application of particle induced X-ray emission (PIXE) and Rutherford backscattering (RBS) analyses using a 2 MeV, 4 nA proton beam scanned over 4 mm² of the sample surface. Elemental concentrations were found to be dependent upon the brain region and hemisphere studied. Hierarchical cluster analysis was applied to group the brain regions according to the sample concentrations of eight elements. The resulting dendrogram is presented and its clusters related to the sample compositions of grey and white matter. (author)

  12. Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters.

    Science.gov (United States)

    Hensman, James; Lawrence, Neil D; Rattray, Magnus

    2013-08-20

    Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.

  13. CHIMERA: Top-down model for hierarchical, overlapping and directed cluster structures in directed and weighted complex networks

    Science.gov (United States)

    Franke, R.

    2016-11-01

    In many networks discovered in biology, medicine, neuroscience and other disciplines special properties like a certain degree distribution and hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy and interactions on a mesoscale. It is not trivial choosing an appropriate detection algorithm because there are multiple network, cluster and algorithmic properties to be considered. Edges can be weighted and/or directed, clusters overlap or build a hierarchy in several ways. Algorithms differ not only in runtime, memory requirements but also in allowed network and cluster properties. They are based on a specific definition of what a cluster is, too. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. But what is left is a precise sandbox model to build hierarchical, overlapping and directed clusters for undirected or directed, binary or weighted complex random networks on basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis) which will be introduced and described here for the first time.

  14. Modules Identification in Gene Positive Networks of Hepatocellular Carcinoma Using Pearson Agglomerative Method and Pearson Cohesion Coupling Modularity

    Directory of Open Access Journals (Sweden)

    Jinyu Hu

    2012-01-01

    Full Text Available In this study, a gene positive network is proposed based on a weighted undirected graph, where the weight represents the positive correlation of the genes. A Pearson agglomerative clustering algorithm is employed to build a clustering tree, where dotted lines cut the tree from bottom to top leading to a number of subsets of the modules. In order to achieve better module partitions, the Pearson correlation coefficient modularity is addressed to seek optimal module decomposition by selecting an optimal threshold value. For the liver cancer gene network under study, we obtain a strong threshold value at 0.67302, and a very strong correlation threshold at 0.80086. On the basis of these threshold values, fourteen strong modules and thirteen very strong modules are obtained respectively. A certain degree of correspondence between the two types of modules is addressed as well. Finally, the biological significance of the two types of modules is analyzed and explained, which shows that these modules are closely related to the proliferation and metastasis of liver cancer. This discovery of the new modules may provide new clues and ideas for liver cancer treatment.
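
    The idea of cutting a clustering tree at a correlation threshold can be illustrated with a minimal sketch. It assumes single linkage, under which cutting the tree at a threshold is equivalent to taking the connected components of the graph whose edges are gene pairs with correlation at or above that threshold; the abstract does not state which linkage the Pearson agglomerative method uses. The correlation matrix and the function name `modules_at_threshold` are hypothetical; only the two threshold values come from the abstract.

```python
def modules_at_threshold(corr, threshold):
    """Group item indices into modules: two items share a module when they
    are linked by a chain of pairs whose correlation is >= threshold
    (the single-linkage assumption made for this sketch)."""
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical positive-correlation matrix for five genes (illustration only).
corr = [
    [1.00, 0.85, 0.70, 0.10, 0.05],
    [0.85, 1.00, 0.60, 0.20, 0.10],
    [0.70, 0.60, 1.00, 0.15, 0.12],
    [0.10, 0.20, 0.15, 1.00, 0.90],
    [0.05, 0.10, 0.12, 0.90, 1.00],
]
strong = modules_at_threshold(corr, 0.67302)       # "strong" threshold
very_strong = modules_at_threshold(corr, 0.80086)  # "very strong" threshold
print(strong, very_strong)
```

    In this toy matrix the stricter threshold splits one of the strong modules, which is the kind of correspondence between the two module types that the abstract mentions.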

  15. Internet of Things-Based Arduino Intelligent Monitoring and Cluster Analysis of Seasonal Variation in Physicochemical Parameters of Jungnangcheon, an Urban Stream

    Directory of Open Access Journals (Sweden)

    Byungwan Jo

    2017-03-01

    Full Text Available In the present case study, the use of an advanced, efficient and low-cost technique for monitoring an urban stream was reported. Physicochemical parameters (PcPs) of Jungnangcheon stream (Seoul, South Korea) were assessed using an Internet of Things (IoT) platform. Temperature, dissolved oxygen (DO), and pH parameters were monitored for the three summer months and the first fall month at a fixed location. Analysis was performed using clustering techniques (CTs), such as K-means clustering, agglomerative hierarchical clustering (AHC), and density-based spatial clustering of applications with noise (DBSCAN). An IoT-based Arduino sensor module (ASM) network with a 99.99% efficient communication platform was developed to allow collection of stream data with user-friendly software and hardware and facilitated data analysis by interested individuals using their smartphones. Clustering was used to formulate relationships among physicochemical parameters. K-means clustering was used to identify natural clusters using the silhouette coefficient based on cluster compactness and looseness. AHC grouped all data into two clusters as well as temperature, DO and pH into four, eight, and four clusters, respectively. DBSCAN analysis was also performed to evaluate yearly variations in physicochemical parameters. Noise points (NOISE) of temperature in 2016 were border points (ƥ), whereas in 2014 and 2015 they remained core points (ɋ), indicating a trend toward increasing stream temperature. We found the stream parameters were within the permissible limits set by the Water Quality Standards for River Water, South Korea.
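
    The silhouette coefficient mentioned above can be sketched in a few lines. This is the standard textbook definition, not code from the study: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b). The sample readings and labels are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (singletons score 0)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: score 0 by convention
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical (temperature, dissolved oxygen) readings, illustration only.
readings = [(20.0, 8.0), (21.0, 8.2), (20.5, 7.9),
            (27.0, 5.0), (28.0, 5.3), (27.5, 4.8)]
good = silhouette_score(readings, [0, 0, 0, 1, 1, 1])   # compact, separated
mixed = silhouette_score(readings, [0, 1, 0, 1, 0, 1])  # clusters interleaved
print(good > mixed)
```

    A labelling that keeps each cluster compact and well separated scores close to 1, while an interleaved labelling scores much lower, which is how the coefficient can guide the choice of the number of clusters.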

  16. Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”

    Directory of Open Access Journals (Sweden)

    Refat Aljumily

    2015-09-01

    Full Text Available A few literary scholars have long claimed that Shakespeare did not write some of his best plays (history plays and tragedies) and have proposed at one time or another various suspect authorship candidates. Most modern-day scholars of Shakespeare have rejected this claim, arguing that there is strong evidence that Shakespeare wrote the plays and poems, namely that his name appears on them as the author. This has led to a long-running scholarly debate. Stylometry is a fast-growing field often used to attribute authorship to anonymous or disputed texts. Stylometric attempts to resolve this literary puzzle have raised interesting questions over the past few years. The following paper contributes to “the Shakespeare authorship question” by using a mathematically-based methodology to examine the hypothesis that Shakespeare wrote all the disputed plays traditionally attributed to him. More specifically, the mathematically based methodology used here is based on Mean Proximity, as a linear hierarchical clustering method, and on Principal Components Analysis, as a non-hierarchical linear clustering method. It is also based, for the first time in the domain, on Self-Organizing Map U-Matrix and Voronoi Map, as non-linear clustering methods to cover the possibility that our data contains significant non-linearities. The Vector Space Model (VSM) is used to convert texts into vectors in a high-dimensional space, the aim of which is to compare the degrees of similarity within and between limited samples of text (the disputed plays): the various works and plays assumed to have been written by Shakespeare and by possible authors, notably Sir Francis Bacon, Christopher Marlowe, John Fletcher, and Thomas Kyd, where “similarity” is defined in terms of a correlation/distance coefficient measure based on the frequency-of-usage profiles of function words, word bi-grams, and character triple-grams. The claim that Shakespeare authored all the disputed

  17. Definition of run-off-road crash clusters-For safety benefit estimation and driver assistance development.

    Science.gov (United States)

    Nilsson, Daniel; Lindman, Magdalena; Victor, Trent; Dozza, Marco

    2018-04-01

    Single-vehicle run-off-road crashes are a major traffic safety concern, as they are associated with a high proportion of fatal outcomes. In addressing run-off-road crashes, the development and evaluation of advanced driver assistance systems requires test scenarios that are representative of the variability found in real-world crashes. We apply hierarchical agglomerative cluster analysis to define similarities in a set of crash data variables; the resulting clusters can then be used as the basis for test scenario development. Out of 13 clusters, nine test scenarios are derived, corresponding to crashes characterised by: drivers drifting off the road in daytime and night-time, high speed departures, high-angle departures on narrow roads, highways, snowy roads, loss-of-control on wet roadways, sharp curves, and high speeds on roads with severe road surface conditions. In addition, each cluster was analysed with respect to crash variables related to the crash cause and reason for the unintended lane departure. The study shows that cluster analysis of representative data provides a statistically based method to identify relevant properties for run-off-road test scenarios. This was done to support development of vehicle-based run-off-road countermeasures and driver behaviour models used in virtual testing. Future studies should use driver behaviour from naturalistic driving data to further define how test-scenarios and behavioural causation mechanisms should be included. Copyright © 2018 Elsevier Ltd. All rights reserved.

  18. The use of hierarchical clustering for the design of optimized monitoring networks

    Science.gov (United States)

    Soares, Joana; Makar, Paul Andrew; Aklilu, Yayne; Akingunola, Ayodeji

    2018-05-01

    Associativity analysis is a powerful tool to deal with large-scale datasets by clustering the data on the basis of (dis)similarity and can be used to assess the efficacy and design of air quality monitoring networks. We describe here our use of Kolmogorov-Zurbenko filtering and hierarchical clustering of NO2 and SO2 passive and continuous monitoring data to analyse and optimize air quality networks for these species in the province of Alberta, Canada. The methodology applied in this study assesses dissimilarity between monitoring station time series based on two metrics: 1 - R, R being the Pearson correlation coefficient, and the Euclidean distance; we find that both should be used in evaluating monitoring site similarity. We have combined the analytic power of hierarchical clustering with the spatial information provided by deterministic air quality model results, using the gridded time series of model output as potential station locations, as a proxy for assessing monitoring network design and for network optimization. We demonstrate that clustering results depend on the air contaminant analysed, reflecting the difference in the respective emission sources of SO2 and NO2 in the region under study. Our work shows that much of the signal identifying the sources of NO2 and SO2 emissions resides in shorter timescales (hourly to daily) due to short-term variation of concentrations and that longer-term averages in data collection may lose the information needed to identify local sources. However, the methodology identifies stations mainly influenced by seasonality, if larger timescales (weekly to monthly) are considered. We have performed the first dissimilarity analysis based on gridded air quality model output and have shown that the methodology is capable of generating maps of subregions within which a single station will represent the entire subregion, to a given level of dissimilarity. We have also shown that our approach is capable of identifying different
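
    The two dissimilarity metrics named in this abstract, 1 - R and the Euclidean distance, can be written out as a short sketch. The formulas are the standard definitions; the three station time series and the function names are hypothetical and chosen only to show how the two metrics can disagree.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient R of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def one_minus_r(a, b):
    """Correlation dissimilarity: 0 for identical shape, 2 for opposite."""
    return 1.0 - pearson_r(a, b)


def euclidean(a, b):
    """Euclidean distance: sensitive to differences in magnitude."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical concentration series at three stations (illustration only).
station_a = [1.0, 2.0, 3.0, 4.0, 5.0]
station_b = [11.0, 12.0, 13.0, 14.0, 15.0]  # same shape as A, higher level
station_c = [5.0, 4.0, 3.0, 2.0, 1.0]       # similar level, opposite trend

print(one_minus_r(station_a, station_b), euclidean(station_a, station_b))
print(one_minus_r(station_a, station_c), euclidean(station_a, station_c))
```

    By 1 - R, station A is closest to B, since they vary together; by Euclidean distance, A is closer to C, since their magnitudes are similar. This is one way to see why the abstract recommends using both metrics when judging station similarity.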

  19. An Algorithm for Inspecting Self Check-in Airline Luggage Based on Hierarchical Clustering and Cube-fitting

    Directory of Open Access Journals (Sweden)

    Gao Qingji

    2014-04-01

    Full Text Available Airport passengers are required to put only one piece of baggage at a time on the self-service check-in so that the baggage can be detected and identified successfully. In order to automatically obtain the number of pieces of baggage that have been put on the conveyor belt, dual laser rangefinders are used in this paper to scan the outer contour of the luggage. An algorithm based on hierarchical clustering and cube-fitting is proposed to inspect the number and dimensions of airline luggage. Firstly, the point cloud is projected onto the vertical direction. By one-dimensional clustering analysis, the number and height of the pieces of luggage are quickly computed. Secondly, the method of nearest hierarchical clustering is applied to divide the point cloud if the pieces cannot be distinguished in the first step. It handles difficult cases such as crossing or overlapping pieces of baggage well. Finally, the point cloud is projected onto the horizontal plane. By rotating the point cloud about its centre, its minimum bounding rectangle (MBR) is obtained. The length and width of the luggage are obtained from the MBR. Many experiments in different cases have been carried out to verify the effectiveness of the algorithm.
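
    The last step, finding a minimum bounding rectangle by rotating the projected points about their centre, can be sketched as follows. The abstract does not say how the rotation angle is chosen, so this sketch assumes a brute-force search over angles in fixed steps; the function name, the step size and the sample footprint are all hypothetical.

```python
from math import cos, sin, radians


def min_bounding_rectangle(points, step_deg=1.0):
    """Rotate the points about their centroid through [0, 90) degrees and
    keep the angle whose axis-aligned bounding box has the smallest area.

    Returns (length, width, angle_deg) with length >= width.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    best = None
    for i in range(int(round(90.0 / step_deg))):
        angle = i * step_deg
        c, s = cos(radians(angle)), sin(radians(angle))
        # Rotate every point by -angle around the centroid.
        xs = [(x - cx) * c + (y - cy) * s for x, y in points]
        ys = [-(x - cx) * s + (y - cy) * c for x, y in points]
        w, h = max(xs) - min(xs), max(ys) - min(ys)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h), angle)
    return best[1], best[2], best[3]


# Hypothetical luggage footprint: a 4 x 2 rectangle turned by 30 degrees.
t = radians(30.0)
footprint = [(x * cos(t) - y * sin(t), x * sin(t) + y * cos(t))
             for x, y in [(-2, -1), (2, -1), (2, 1), (-2, 1)]]
length, width, angle = min_bounding_rectangle(footprint)
print(round(length, 3), round(width, 3), angle)
```

    A fixed angular step keeps the sketch short; its accuracy is limited by the step size, so a real system would likely use a finer search or an exact method based on the convex hull.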

  20. Biomolecule-Assisted Hydrothermal Synthesis and Self-Assembly of Bi2Te3 Nanostring-Cluster Hierarchical Structure

    DEFF Research Database (Denmark)

    Mi, Jianli; Lock, Nina; Sun, Ting

    2010-01-01

    A simple biomolecule-assisted hydrothermal approach has been developed for the fabrication of Bi2Te3 thermoelectric nanomaterials. The product has a nanostring-cluster hierarchical structure which is composed of ordered and aligned platelet-like crystals. The platelets are 100 nm in diameter...

  1. ON THE QUESTION OF THE CONSTRUCTION OF COGNITIVE MAPS FOR DATA MINING

    Directory of Open Access Journals (Sweden)

    Zhilov R. A.

    2016-11-01

    Full Text Available A method of constructing optimal cognitive maps consists in optimizing the input data and the dimension of the cognitive map's data structure. The optimization problem arises when there are large amounts of input data. The dimension of the data is optimized by clustering the input data, with a hierarchical agglomerative method used as the clustering method. Cluster analysis allows the data set to be divided into a finite number of homogeneous groups. The structure of the cognitive map is optimized by automatically tuning the weights of the concepts' influence on each other using machine learning methods, in particular the training of a neural network.

  2. A supplier selection using a hybrid grey based hierarchical clustering and artificial bee colony

    Directory of Open Access Journals (Sweden)

    Farshad Faezy Razi

    2014-06-01

    Full Text Available Selecting one or a combination of the most suitable potential providers, together with the outsourcing problem, is among the most important strategies in logistics and supply chain management. In this paper, the selection of an optimal combination of suppliers in inventory and supply chain management is studied and analyzed via a multiple attribute decision making approach, data mining and evolutionary optimization algorithms. For supplier selection in the supply chain, hierarchical clustering first clusters the suppliers according to the studied indexes. Then, according to its cluster, each supplier is evaluated through Grey Relational Analysis. Then the combination of suppliers’ Pareto optimal rank and costs is obtained using the Artificial Bee Colony meta-heuristic algorithm. A case study is conducted to better describe the new algorithm for selecting multiple suppliers.

  3. A study of hierarchical clustering of galaxies in an expanding universe

    Science.gov (United States)

    Porter, D. H.

    The nonlinear hierarchical clustering of galaxies in an Einstein-deSitter (Omega = 1), initially white noise mass fluctuations (n = 0) model universe is investigated and shown to be in contradiction with previous results. The model is done in terms of an 11,000-body numerical simulation. The independent statistics of 0.72 million particles are used to simulate the boundary conditions. A new method for integrating the Newtonian N-body gravity equations, which has controllable accuracy, incorporates a recursive center of mass reduction, and regularizes two-body encounters, is used to do the simulation. The coordinate system used here is well suited for the investigation of galaxy clustering, incorporating the independent positions and velocities of an arbitrary number of particles into a logarithmic hierarchy of center of mass nodes. The boundary for the simulation is created by using this hierarchy to map the independent statistics of 0.72 million particles into just 4,000 particles. This method for simulating the boundary conditions also has controllable accuracy.

  4. Methods of Complex Data Processing from Technical Means of Monitoring

    Directory of Open Access Journals (Sweden)

    Serhii Tymchuk

    2017-03-01

    Full Text Available The problem of processing the information from different types of monitoring equipment was examined. The use of generalized methods of information processing, based on the techniques of clustering combined territorial information sources for monitoring and the use of framing model of knowledge base for identification of monitoring objects was proposed as a possible solution of the problem. Clustering methods were formed on the basis of Lance-Williams hierarchical agglomerative procedure using the Ward metrics. Frame model of knowledge base was built using the tools of object-oriented modeling.
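
    For readers unfamiliar with the Lance-Williams procedure mentioned above, here is a minimal pure-Python sketch of agglomerative clustering that updates inter-cluster dissimilarities with the Lance-Williams recurrence using Ward coefficients. It assumes squared Euclidean distances as the initial dissimilarities; the function name and the toy points are hypothetical and this is not the authors' implementation.

```python
def ward_linkage(points):
    """Agglomerative clustering via the Lance-Williams update with Ward coefficients.

    Initial dissimilarities are squared Euclidean distances. Returns a list of
    merges (cluster_i, cluster_j, dissimilarity); merged clusters receive new
    ids n, n + 1, ... in order of creation.
    """
    n = len(points)
    sizes = {i: 1 for i in range(n)}
    dist = {}
    for i in range(n):
        for j in range(i + 1, n):
            dist[(i, j)] = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))

    def d(a, b):
        return dist[(a, b)] if a < b else dist[(b, a)]

    merges = []
    next_id = n
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        n_i, n_j = sizes[i], sizes[j]
        updated = {}
        for k, n_k in sizes.items():
            if k == i or k == j:
                continue
            total = n_i + n_j + n_k
            # Lance-Williams recurrence with Ward's coefficients:
            # alpha_i = (n_i + n_k) / total, alpha_j = (n_j + n_k) / total,
            # beta = -n_k / total, gamma = 0.
            updated[k] = ((n_i + n_k) * d(k, i) + (n_j + n_k) * d(k, j) - n_k * d_ij) / total
        # Drop every entry that involves the two merged clusters.
        dist = {pair: v for pair, v in dist.items() if i not in pair and j not in pair}
        del sizes[i], sizes[j]
        for k, value in updated.items():
            dist[(k, next_id)] = value  # k < next_id always holds
        sizes[next_id] = n_i + n_j
        merges.append((i, j, d_ij))
        next_id += 1
    return merges


# Hypothetical one-dimensional observations forming two obvious groups.
print(ward_linkage([[0.0], [1.0], [10.0], [11.0]]))
```

    Other linkage rules (single, complete, average) fit the same loop by changing only the coefficients in the update line.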

  5. Automatic quantification of iris color

    DEFF Research Database (Denmark)

    Christoffersen, S.; Harder, Stine; Andersen, J. D.

    2012-01-01

    regions. The result is a blue-brown ratio for each eye. Furthermore, an image clustering approach has been used with promising results. The approach is based on using a sparse dictionary of feature vectors learned from a training set of iris regions. The feature vectors contain both local structural...... information and colour information. For each iris an explanatory histogram is built, containing information about the weighted occurrence of each visual word. A hierarchical agglomerative clustering of the entire set of photos is performed using the distance between the explanatory histograms. The approach...

  6. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    Directory of Open Access Journals (Sweden)

    Lee Yun-Shien

    2008-03-01

    Full Text Available Abstract Background The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purpose) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, it also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

  7. A hierarchical clustering scheme approach to assessment of IP-network traffic using detrended fluctuation analysis

    Science.gov (United States)

    Takuma, Takehisa; Masugi, Masao

    2009-03-01

    This paper presents an approach to the assessment of IP-network traffic in terms of the time variation of self-similarity. To get a comprehensive view in analyzing the degree of long-range dependence (LRD) of IP-network traffic, we use a hierarchical clustering scheme, which provides a way to classify high-dimensional data with a tree-like structure. Also, in the LRD-based analysis, we employ detrended fluctuation analysis (DFA), which is applicable to the analysis of long-range power-law correlations or LRD in non-stationary time-series signals. Based on sequential measurements of IP-network traffic at two locations, this paper derives corresponding values for the LRD-related parameter α that reflects the degree of LRD of measured data. In performing the hierarchical clustering scheme, we use three parameters: the α value, average throughput, and the proportion of network traffic that exceeds 80% of network bandwidth for each measured data set. We visually confirm that the traffic data can be classified in accordance with the network traffic properties, so that the combined depiction of the LRD and other factors can give us an effective assessment of network conditions at different times.

  8. Implementation of hierarchical clustering using k-mer sparse matrix to analyze MERS-CoV genetic relationship

    Science.gov (United States)

    Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.

    2017-07-01

    Hierarchical clustering is one of the effective methods for creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is the k-mer method. Generally, k-mer is more efficient than some other distance matrix calculation techniques. The k-mer method starts by creating the k-mer sparse matrix, followed by creating the k-mer singular value vectors. The last step is computing the distances among the vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using the k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV comes from Egypt. Moreover, we found that the MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factors.
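
    The first and last steps of the k-mer pipeline described above can be sketched in a few lines of Python. This is a simplified illustration: it counts k-mers and compares the raw count vectors with the Euclidean distance, and it deliberately omits the singular value decomposition step used in the paper. The sequences, the choice k = 2 and the function names are hypothetical.

```python
import math
from itertools import product


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Count every possible k-mer over the alphabet in seq (fixed lexicographic order)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing other symbols, e.g. N
            counts[word] += 1
    return [counts[w] for w in kmers]


def distance_matrix(seqs, k=2):
    """Pairwise Euclidean distances between the k-mer count vectors."""
    vectors = [kmer_vector(s, k) for s in seqs]
    n = len(vectors)
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy sequences, far shorter than real genomes.
toy_sequences = ["ACGT", "ACGT", "TTTT"]
for row in distance_matrix(toy_sequences):
    print(row)
```

    The resulting matrix is the kind of input a hierarchical clustering routine would take to build the tree.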

  9. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Science.gov (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard

    2015-01-01

    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected, especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean Distance as the similarity measure to determine the clusters. Contingency tables, χ² tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, p [...] people living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.

  10. Efficient clustering aggregation based on data fragments.

    Science.gov (United States)

    Wu, Ou; Hu, Weiming; Maybank, Stephen J; Zhu, Mingliang; Li, Bing

    2012-06-01

    Clustering aggregation, known as clustering ensembles, has emerged as a powerful technique for combining different clustering results to obtain a single better clustering. Existing clustering aggregation algorithms are applied directly to data points, in what is referred to as the point-based approach. The algorithms are inefficient if the number of data points is large. We define an efficient approach for clustering aggregation based on data fragments. In this fragment-based approach, a data fragment is any subset of the data that is not split by any of the clustering results. To establish the theoretical bases of the proposed approach, we prove that clustering aggregation can be performed directly on data fragments under two widely used goodness measures for clustering aggregation taken from the literature. Three new clustering aggregation algorithms are described. The experimental results obtained using several public data sets show that the new algorithms have lower computational complexity than three well-known existing point-based clustering aggregation algorithms (Agglomerative, Furthest, and LocalSearch); nevertheless, the new algorithms do not sacrifice the accuracy.
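
    The notion of a data fragment can be made concrete with a short Python sketch. It groups points that receive identical labels in every input clustering, which yields the largest subsets that no clustering splits. Treating fragments as these maximal groups is an assumption made for illustration, and the function name and toy labelings are hypothetical.

```python
from collections import defaultdict


def data_fragments(clusterings):
    """Group point indices that are never separated by any input clustering.

    clusterings is a list of label lists, one per clustering result, all of
    the same length. Two points land in the same fragment exactly when they
    share a label in every clustering.
    """
    n = len(clusterings[0])
    groups = defaultdict(list)
    for idx in range(n):
        signature = tuple(labels[idx] for labels in clusterings)
        groups[signature].append(idx)
    return list(groups.values())


# Two hypothetical clusterings of six points.
first = [0, 0, 0, 1, 1, 1]
second = [0, 0, 1, 1, 2, 2]
print(data_fragments([first, second]))
```

    Six points collapse to four fragments here, and an aggregation algorithm could then operate on the fragments instead of the individual points.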

  11. K-means clustering for optimal partitioning and dynamic load balancing of parallel hierarchical N-body simulations

    International Nuclear Information System (INIS)

    Marzouk, Youssef M.; Ghoniem, Ahmed F.

    2005-01-01

    A number of complex physical problems can be approached through N-body simulation, from fluid flow at high Reynolds number to gravitational astrophysics and molecular dynamics. In all these applications, direct summation is prohibitively expensive for large N and thus hierarchical methods are employed for fast summation. This work introduces new algorithms, based on k-means clustering, for partitioning parallel hierarchical N-body interactions. We demonstrate that the number of particle-cluster interactions and the order at which they are performed are directly affected by partition geometry. Weighted k-means partitions minimize the sum of clusters' second moments and create well-localized domains, and thus reduce the computational cost of N-body approximations by enabling the use of lower-order approximations and fewer cells. We also introduce compatible techniques for dynamic load balancing, including adaptive scaling of cluster volumes and adaptive redistribution of cluster centroids. We demonstrate the performance of these algorithms by constructing a parallel treecode for vortex particle simulations, based on the serial variable-order Cartesian code developed by Lindsay and Krasny [Journal of Computational Physics 172 (2) (2001) 879-907]. The method is applied to vortex simulations of a transverse jet. Results show outstanding parallel efficiencies even at high concurrencies, with velocity evaluation errors maintained at or below their serial values; on a realistic distribution of 1.2 million vortex particles, we observe a parallel efficiency of 98% on 1024 processors. Excellent load balance is achieved even in the face of several obstacles, such as an irregular, time-evolving particle distribution containing a range of length scales and the continual introduction of new vortex particles throughout the domain. Moreover, results suggest that k-means yields a more efficient partition of the domain than a global oct-tree
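
    As a rough illustration of weighted k-means partitioning, the sketch below runs Lloyd-style iterations in which each centroid is the weighted mean of its assigned points. The weights, initial centroids and point coordinates are hypothetical, and the paper's load-balancing extensions (adaptive scaling of cluster volumes, centroid redistribution) are not reproduced.

```python
def weighted_kmeans(points, weights, centroids, iterations=10):
    """Lloyd-style weighted k-means with caller-supplied initial centroids.

    Each point is assigned to its nearest centroid (squared Euclidean
    distance); each centroid then moves to the weighted mean of its points.
    Returns (centroids, labels).
    """
    dim = len(points[0])
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in centroids]
        totals = [0.0] * len(centroids)
        labels = []
        for p, w in zip(points, weights):
            nearest = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            labels.append(nearest)
            totals[nearest] += w
            for axis in range(dim):
                sums[nearest][axis] += w * p[axis]
        # Empty clusters keep their previous centroid.
        centroids = [
            [s / totals[j] for s in sums[j]] if totals[j] > 0 else centroids[j]
            for j in range(len(centroids))
        ]
    return centroids, labels


# Hypothetical particles; the weight could stand for per-particle cost.
particles = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
particle_weights = [1.0, 3.0, 1.0, 1.0]
print(weighted_kmeans(particles, particle_weights, [(0.0, 0.0), (12.0, 0.0)]))
```

    The heavier particle pulls the first centroid towards itself, which is how weights shift the partition geometry.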

  12. A comparison of hierarchical cluster analysis and league table rankings as methods for analysis and presentation of district health system performance data in Uganda.

    Science.gov (United States)

    Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart

    2016-03-01

    In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis was carried out using SPSS version 20, and hierarchical cluster analysis was performed using Ward's method. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include: perceived unfairness, as it did not take into consideration district peculiarities; and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the beginning point for identifying factors behind the observed performance of districts. Although league table rankings emphasize summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind observed performance of the different clusters. Other countries, especially low-income countries that share many similarities with Uganda, can learn from these experiences. © The Author 2015

  13. Multiscale mining of fMRI data with hierarchical structured sparsity

    International Nuclear Information System (INIS)

    Jenatton, R.; Obozinski, G.; Bach, F.; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Eger, Evelyne

    2012-01-01

    Reverse inference, or 'brain reading', is a recent paradigm for analyzing functional magnetic resonance imaging (fMRI) data, based on pattern recognition and statistical learning. By predicting some cognitive variables related to brain activation maps, this approach aims at decoding brain activity. Reverse inference takes into account the multivariate information between voxels and is currently the only way to assess how precisely some cognitive information is encoded by the activity of neural populations within the whole brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, since there are far more features than samples, i.e., more voxels than fMRI volumes. To address this problem, different methods have been proposed, such as, among others, univariate feature selection, feature agglomeration and regularization techniques. In this paper, we consider a sparse hierarchical structured regularization. Specifically, the penalization we use is constructed from a tree that is obtained by spatially-constrained agglomerative clustering. This approach encodes the spatial structure of the data at different scales into the regularization, which makes the overall prediction procedure more robust to inter-subject variability. The regularization used induces the selection of spatially coherent predictive brain regions simultaneously at different scales. We test our algorithm on real data acquired to study the mental representation of objects, and we show that the proposed algorithm not only delineates meaningful brain regions but yields as well better prediction accuracy than reference methods. (authors)

  14. Prioritizing the risk of plant pests by clustering methods; self-organising maps, k-means and hierarchical clustering

    Directory of Open Access Journals (Sweden)

    Susan Worner

    2013-09-01

    -means, hierarchical clustering and the incorporation of the SOM analysis into criteria based approaches to assess pest risk.

  15. Characterizing the course of back pain after osteoporotic vertebral fracture: a hierarchical cluster analysis of a prospective cohort study.

    Science.gov (United States)

    Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki

    2017-09-23

    This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS scores for 128 patients were used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significantly higher baseline pain intensity, a higher degree of angular instability, and a higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.

  16. A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Xiaowei Li

    2017-01-01

    Full Text Available A large number of studies demonstrated that major depressive disorder (MDD) is characterized by alterations in brain functional connections, which are also identifiable during the brain’s “resting-state.” But, in the present study, the approach of constructing functional connectivity is often biased by the choice of the threshold. Besides, more attention was paid to the number and length of links in brain networks, and the clustering partitioning of nodes was unclear. Therefore, minimum spanning tree (MST) analysis and the hierarchical clustering were first used for the depression disease in this study. Resting-state electroencephalogram (EEG) sources were assessed from 15 healthy and 23 major depressive subjects. Then the coherence, MST, and the hierarchical clustering were obtained. In the theta band, coherence analysis showed that the EEG coherence of the MDD patients was significantly higher than that of the healthy controls, especially in the left temporal region. The MST results indicated the higher leaf fraction in the depressed group. Compared with the normal group, the major depressive patients lost clustering in frontal regions. Our findings suggested that there was a stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions for MDD controls.
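
    To make the MST measures concrete, the following Python sketch builds a minimum spanning tree from a coherence matrix with Prim's algorithm and computes the leaf fraction. Two choices are assumptions made for illustration: edge weights are taken as 1 - coherence, and the leaf fraction is normalised by N - 1 (the largest number of leaves a tree on N nodes can have); other conventions exist. The coherence values are hypothetical.

```python
def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense symmetric weight matrix; returns a list of edges."""
    n = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for u in in_tree:
            for v in range(n):
                if v not in in_tree and (best is None or weights[u][v] < best[0]):
                    best = (weights[u][v], u, v)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges


def leaf_fraction(edges, n):
    """Number of degree-1 nodes divided by n - 1 (assumed normalisation)."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(1 for deg in degree if deg == 1) / (n - 1)


# Hypothetical coherence between four channels; channel 0 acts as a hub.
coherence = [
    [1.0, 0.9, 0.8, 0.7],
    [0.9, 1.0, 0.1, 0.2],
    [0.8, 0.1, 1.0, 0.3],
    [0.7, 0.2, 0.3, 1.0],
]
# Strong coherence should give a short edge, hence weight = 1 - coherence.
weights = [[1.0 - c for c in row] for row in coherence]
tree = minimum_spanning_tree(weights)
print(sorted(tree), leaf_fraction(tree, len(coherence)))
```

    Because the tree always has N - 1 edges, no connectivity threshold has to be chosen, which is the bias the abstract refers to.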

  17. Combining Unsupervised and Supervised Statistical Learning Methods for Currency Exchange Rate Forecasting

    OpenAIRE

    Vasiljeva, Polina

    2016-01-01

    In this thesis we revisit the challenging problem of forecasting currency exchange rates. We combine machine learning methods such as agglomerative hierarchical clustering and random forest to construct a two-step approach for predicting movements in currency exchange prices of the Swedish krona and the US dollar. We use a data set with over 200 predictors comprised of different financial and macro-economic time series and their transformations. We perform forecasting for one week ahead with d...

  18. Relating climate change signals and physiographic catchment properties to clustered hydrological response types

    Directory of Open Access Journals (Sweden)

    N. Köplin

    2012-07-01

    Full Text Available We propose an approach to reduce a comprehensive set of 186 mesoscale catchments in Switzerland to fewer response types to climate change and to name sensitive regions as well as catchment characteristics that govern hydrological change. We classified the hydrological responses of our study catchments through an agglomerative-hierarchical cluster analysis, and we related the dominant explanatory variables, i.e. the determining catchment properties and climate change signals, to the catchments' hydrological responses by means of redundancy analysis. All clusters except for one exhibit clearly decreasing summer runoff and increasing winter runoff. This seasonal shift was observed for the near future period (2025–2046) but is particularly obvious in the far future period (2074–2095). Within a certain elevation range (between 1000 and 2500 m a.s.l.), the hydrological change is basically a function of elevation, because the latter governs the dominant hydro-climatological processes associated with temperature, e.g. the ratio of liquid to solid precipitation and snow melt processes. For catchments below the stated range, hydrological change is mainly a function of precipitation change, which is not as pronounced as the temperature signal is. Future impact studies in Switzerland can be conducted on a reduced sample of catchments representing the sensitive regions or covering a range of altitudes.

  19. Patterns of comorbidity in community-dwelling older people hospitalised for fall-related injury: A cluster analysis

    Directory of Open Access Journals (Sweden)

    Finch Caroline F

    2011-08-01

    Full Text Available Abstract Background Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. Methods We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity) we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. Results More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7). The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified.
Conclusions The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients

  20. Using hierarchical clustering methods to classify motor activities of COPD patients from wearable sensor data

    Directory of Open Access Journals (Sweden)

    Reilly John J

    2005-06-01

    Full Text Available Abstract Background Advances in miniature sensor technology have led to the development of wearable systems that allow one to monitor motor activities in the field. A variety of classifiers have been proposed in the past, but little has been done toward developing systematic approaches to assess the feasibility of discriminating the motor tasks of interest and to guide the choice of the classifier architecture. Methods A technique is introduced to address this problem according to a hierarchical framework and its use is demonstrated for the application of detecting motor activities in patients with chronic obstructive pulmonary disease (COPD) undergoing pulmonary rehabilitation. Accelerometers were used to collect data for 10 different classes of activity. Features were extracted to capture essential properties of the data set and reduce the dimensionality of the problem at hand. Cluster measures were utilized to find natural groupings in the data set and then construct a hierarchy of the relationships between clusters to guide the process of merging clusters that are too similar to distinguish reliably. It provides a means to assess whether the benefits of merging for performance of a classifier outweigh the loss of resolution incurred through merging. Results Analysis of the COPD data set demonstrated that motor tasks related to ambulation can be reliably discriminated from tasks performed in a seated position with the legs in motion or stationary using two features derived from one accelerometer. Classifying motor tasks within the category of activities related to ambulation requires more advanced techniques. While in certain cases all the tasks could be accurately classified, in others merging clusters associated with different motor tasks was necessary. When merging clusters, it was found that the proposed method could lead to more than 12% improvement in classifier accuracy while retaining resolution of 4 tasks. Conclusion Hierarchical

  1. On hierarchical solutions to the BBGKY hierarchy

    Science.gov (United States)

    Hamilton, A. J. S.

    1988-01-01

    It is thought that the gravitational clustering of galaxies in the universe may approach a scale-invariant, hierarchical form in the small separation, large-clustering regime. Past attempts to solve the Born-Bogoliubov-Green-Kirkwood-Yvon (BBGKY) hierarchy in this regime have assumed a certain separable hierarchical form for the higher order correlation functions of galaxies in phase space. It is shown here that such separable solutions to the BBGKY equations must satisfy the condition that the clustered component of the solution has cluster-cluster correlations equal to galaxy-galaxy correlations to all orders. The solutions also admit the presence of an arbitrary unclustered component, which plays no dynamical role in the large-clustering regime. These results are a particular property of the specific separable model assumed for the correlation functions in phase space, not an intrinsic property of spatially hierarchical solutions to the BBGKY hierarchy. The observed distribution of galaxies does not satisfy the required conditions. The disagreement between theory and observation may be traced, at least in part, to initial conditions which, if Gaussian, already have cluster correlations greater than galaxy correlations.

  2. Hierarchical Cluster Analysis of Semicircular Canal and Otolith Deficits in Bilateral Vestibulopathy

    Directory of Open Access Journals (Sweden)

    Alexander A. Tarnutzer

    2018-04-01

    Full Text Available Background Gait imbalance and oscillopsia are frequent complaints of bilateral vestibular loss (BVL). Video-head-impulse testing (vHIT) of all six semicircular canals (SCCs) has demonstrated varying involvement of the different canals. Sparing of anterior-canal function has been linked to aminoglycoside-related vestibulopathy and Menière’s disease. We hypothesized that utricular and saccular impairment [assessed by vestibular-evoked myogenic potentials (VEMPs)] may be disease-specific also, possibly facilitating the differential diagnosis. Methods We searched our vHIT database (n = 3,271) for patients with bilaterally impaired SCC function who also received ocular VEMPs (oVEMPs) and cervical VEMPs (cVEMPs) and identified 101 patients. oVEMP/cVEMP latencies above the 95th percentile and peak-to-peak amplitudes below the 5th percentile of normal were considered abnormal. Frequency of impairment of vestibular end organs (horizontal/anterior/posterior SCC, utriculus/sacculus) was analyzed with hierarchical cluster analysis and correlated with the underlying etiology. Results Rates of utricular and saccular loss of function were similar (87.1 vs. 78.2%, p = 0.136, Fisher’s exact test). oVEMP abnormalities were found more frequently in aminoglycoside-related BVL compared with Menière’s disease (91.7 vs. 54.6%, p = 0.039). Hierarchical cluster analysis indicated distinct patterns of vestibular end-organ impairment, showing that the results for the same end-organs on both sides are more similar than to other end-organs. Relative sparing of anterior-canal function was reflected in late merging with the other end-organs, emphasizing their distinct state. An anatomically corresponding pattern of SCC/otolith hypofunction was present in 60.4% (oVEMPs vs. horizontal SCCs), 34.7% (oVEMPs vs. anterior SCCs), and 48.5% (cVEMPs vs. posterior SCCs) of cases. Average (±1 SD) number of damaged sensors was 6.8 ± 2.2 out of 10

  3. Efficient visible light photocatalytic NO{sub x} removal with cationic Ag clusters-grafted (BiO){sub 2}CO{sub 3} hierarchical superstructures

    Energy Technology Data Exchange (ETDEWEB)

    Feng, Xin [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Wendong [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Deng, Hua [State Key Joint Laboratory of Environment Simulation and Pollution Control, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085 (China); Ni, Zilin [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Dong, Fan, E-mail: dfctbu@126.com [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Yuxin, E-mail: zhangyuxin@cqu.edu.cn [College of Materials Science and Engineering, National Key Laboratory of Fundamental Science of Micro/Nano-Devices and System Technology, Chongqing University, Chongqing 400044 (China)

    2017-01-15

    Graphical abstract: The cationic Ag clusters-grafted (BiO){sub 2}CO{sub 3} hierarchical superstructures exhibit highly enhanced visible light photocatalytic air purification through an interfacial charge transfer process induced by Ag clusters. - Highlights: • Microstructural optimization and surface cluster-grafting were combined for the first time. • Cationic Ag clusters were grafted on the surface of (BiO){sub 2}CO{sub 3} superstructures. • The Ag clusters-grafted BHS displayed enhanced visible light photocatalysis. • Direct interfacial charge transfer (IFCT) from BHS to Ag clusters was proposed. • The charge transfer process and the dominant reactive species were revealed. - Abstract: A facile method was developed to graft cationic Ag clusters on the surface of (BiO){sub 2}CO{sub 3} hierarchical superstructures (BHS) to improve their visible light activity. Significantly, the resultant Ag clusters-grafted BHS displayed a highly enhanced visible light photocatalytic performance for NOx removal due to the direct interfacial charge transfer (IFCT) from BHS to Ag clusters. The chemical and coordination state of the cationic Ag clusters was determined with extended X-ray absorption fine structure (EXAFS), and a theoretical structure model was proposed for these unique Ag clusters. The charge transfer process and the dominant reactive species (·OH) were revealed on the basis of electron spin resonance (ESR) trapping. A new photocatalysis mechanism of Ag clusters-grafted BHS under visible light involving the IFCT process was uncovered. In addition, the cationic Ag clusters-grafted BHS also demonstrated high photochemical and structural stability under repeated photocatalysis runs. The perspective of enhancing photocatalysis through a combination of microstructural optimization and IFCT could provide a new avenue for developing efficient visible light photocatalysts.

  4. APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

    DEFF Research Database (Denmark)

    Zong, Yu; Xu, Guandong; Jin, Pin

    2011-01-01

    algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone...... as the initial tag clustering result and then assign the rest tags into the corresponding clusters based on the similarity. Experimental results on three real world datasets namely MedWorm, MovieLens and Dmoz demonstrate the effectiveness and the superiority of the proposed method against the traditional...... Agglomerative Clustering on tagging data, which possess the inherent drawbacks, such as the sensitivity of initialization. In this paper, we instead make use of the approximate backbone of tag clustering results to find out better tag clusters. In particular, we propose an APProximate backbonE-based Clustering...

  5. Hierarchical ordering with partial pairwise hierarchical relationships on the macaque brain data sets.

    Directory of Open Access Journals (Sweden)

    Woosang Lim

    Full Text Available Hierarchical organizations of information processing in the brain networks have been known to exist and widely studied. To find proper hierarchical structures in the macaque brain, the traditional methods need the entire pairwise hierarchical relationships between cortical areas. In this paper, we present a new method that discovers hierarchical structures of macaque brain networks by using partial information of pairwise hierarchical relationships. Our method uses a graph-based manifold learning to exploit inherent relationship, and computes pseudo distances of hierarchical levels for every pair of cortical areas. Then, we compute hierarchy levels of all cortical areas by minimizing the sum of squared hierarchical distance errors with the hierarchical information of few cortical areas. We evaluate our method on the macaque brain data sets whose true hierarchical levels are known as the FV91 model. The experimental results show that hierarchy levels computed by our method are similar to the FV91 model, and its errors are much smaller than the errors of hierarchical clustering approaches.

  6. Hierarchical video summarization

    Science.gov (United States)

    Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

    1998-12-01

    We address the problem of key-frame summarization of video in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video in increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with a temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.

  7. A Clustering Routing Protocol for Mobile Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Jinke Huang

    2016-01-01

    Full Text Available The dynamic topology of a mobile ad hoc network poses a real challenge in the design of a hierarchical routing protocol, which combines proactive with reactive routing protocols and takes advantage of both. As an essential technique of hierarchical routing protocols, clustering of nodes provides an efficient method of establishing a hierarchical structure in mobile ad hoc networks. In this paper, we design a novel clustering algorithm and a corresponding hierarchical routing protocol for large-scale mobile ad hoc networks. Each cluster is composed of a cluster head, several cluster gateway nodes, several cluster guest nodes, and other cluster members. The proposed routing protocol uses a proactive protocol between nodes within individual clusters and a reactive protocol between clusters. Simulation results show that the proposed clustering algorithm and hierarchical routing protocol provide superior performance with several advantages over existing clustering algorithms and routing protocols, respectively.

  8. Permutation Tests of Hierarchical Cluster Analyses of Carrion Communities and Their Potential Use in Forensic Entomology.

    Science.gov (United States)

    van der Ham, Joris L

    2016-05-19

    Forensic entomologists can use carrion communities' ecological succession data to estimate the postmortem interval (PMI). Permutation tests of hierarchical cluster analyses of these data provide a conceptual method to estimate part of the PMI, the post-colonization interval (post-CI). This multivariate approach produces a baseline of statistically distinct clusters that reflect changes in the carrion community composition during the decomposition process. Carrion community samples of unknown post-CIs are compared with these baseline clusters to estimate the post-CI. In this short communication, I use data from previously published studies to demonstrate the conceptual feasibility of this multivariate approach. Analyses of these data produce series of significantly distinct clusters, which represent carrion communities during 1- to 20-day periods of the decomposition process. For 33 carrion community samples, collected over an 11-day period, this approach correctly estimated the post-CI within an average range of 3.1 days. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  9. Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

    Science.gov (United States)

    Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

    2014-11-01

    Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  10. Hierarchical Control for Multiple DC-Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    DC microgrids (MGs) have gained research interest during the recent years because of many potential advantages as compared to the ac system. To ensure reliable operation of a low-voltage dc MG as well as its intelligent operation with the other DC MGs, a hierarchical control is proposed...

  11. The Case for a Hierarchical Cosmology

    Science.gov (United States)

    Vaucouleurs, G. de

    1970-01-01

    The development of modern theoretical cosmology is presented and some questionable assumptions of orthodox cosmology are pointed out. Suggests that recent observations indicate that hierarchical clustering is a basic factor in cosmology. The implications of hierarchical models of the universe are considered. Bibliography. (LC)

  12. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  13. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to identify similar entities from the characteristics they possess. Several algorithms can be used for this analysis. This paper therefore discusses these algorithms, including similarity measures and hierarchical clustering, which covers the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
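
    The three linkage rules named in the record above differ only in how the distance between two clusters is derived from the pairwise point distances. The following is a minimal sketch in plain Python, not taken from the paper: the function names, the Euclidean metric and the toy points are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def linkage_distance(a, b, points, method):
    """Distance between clusters a and b, given as lists of point indices."""
    pair_dists = [dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pair_dists)
    if method == "complete":    # farthest pair of members
        return max(pair_dists)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(pair_dists) / len(pair_dists)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, n_clusters, method="single"):
    """Start from singletons and merge the two closest clusters
    until only n_clusters remain (naive, for small inputs only)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage_distance(
                clusters[ab[0]], clusters[ab[1]], points, method
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: two tight pairs and one isolated point.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
for method in ("single", "complete", "average"):
    print(method, agglomerate(toy_points, 3, method))
```

    On well-separated data such as this the three rules agree; they tend to diverge on elongated or noisy clusters, where single linkage is known to chain points together.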

  14. Clustering, Hierarchical Organization, and the Topography of Abstract and Concrete Nouns

    Directory of Open Access Journals (Sweden)

    Joshua Troche

    2014-04-01

    Full Text Available The empirical study of language has historically relied heavily upon concrete word stimuli. By definition, concrete words evoke salient perceptual associations that fit well within feature-based, sensorimotor models of word meaning. In contrast, many theorists argue that abstract words are disembodied in that their meaning is mediated through language. We investigated word meaning as distributed in multidimensional space using hierarchical cluster analysis. Participants (N=365) rated target words (n=400 English nouns) across 12 cognitive dimensions (e.g., polarity, ease of teaching, emotional valence). Factor reduction revealed three latent factors, corresponding roughly to perceptual salience, affective association, and magnitude. We plotted the original 400 words for the three latent factors. Abstract and concrete words showed overlap in their topography but also differentiated themselves in semantic space. This topographic approach to word meaning offers a unique perspective to word concreteness.

  15. Clustering, hierarchical organization, and the topography of abstract and concrete nouns.

    Science.gov (United States)

    Troche, Joshua; Crutch, Sebastian; Reilly, Jamie

    2014-01-01

    The empirical study of language has historically relied heavily upon concrete word stimuli. By definition, concrete words evoke salient perceptual associations that fit well within feature-based, sensorimotor models of word meaning. In contrast, many theorists argue that abstract words are "disembodied" in that their meaning is mediated through language. We investigated word meaning as distributed in multidimensional space using hierarchical cluster analysis. Participants (N = 365) rated target words (n = 400 English nouns) across 12 cognitive dimensions (e.g., polarity, ease of teaching, emotional valence). Factor reduction revealed three latent factors, corresponding roughly to perceptual salience, affective association, and magnitude. We plotted the original 400 words for the three latent factors. Abstract and concrete words showed overlap in their topography but also differentiated themselves in semantic space. This topographic approach to word meaning offers a unique perspective to word concreteness.

  16. Performance analysis of clustering techniques over microarray data: A case study

    Science.gov (United States)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in statistical data analysis, and cluster analysis plays a vital role in dealing with such large-scale data. Many clustering techniques exist, each with a different approach, but it is difficult to predict which approach suits a particular dataset. To deal with this problem, a grading approach is applied across many clustering techniques to identify a stable technique. Because the grading depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study, the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted on five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test.

  17. Optimal wavelength band clustering for multispectral iris recognition.

    Science.gov (United States)

    Gong, Yazhuo; Zhang, David; Shi, Pengfei; Yan, Jingqi

    2012-07-01

    This work explores the possibility of clustering spectral wavelengths based on the maximum dissimilarity of iris textures. The eventual goal is to determine how many bands of spectral wavelengths will be enough for iris multispectral fusion and to find these bands that will provide higher performance of iris multispectral recognition. A multispectral acquisition system was first designed for imaging the iris at narrow spectral bands in the range of 420 to 940 nm. Next, a set of 60 human iris images that correspond to the right and left eyes of 30 different subjects were acquired for an analysis. Finally, we determined that 3 clusters were enough to represent the 10 feature bands of spectral wavelengths using the agglomerative clustering based on two-dimensional principal component analysis. The experimental results suggest (1) the number, center, and composition of clusters of spectral wavelengths and (2) the higher performance of iris multispectral recognition based on a three wavelengths-bands fusion.

  18. Hierarchical Controlled Remote State Preparation by Using a Four-Qubit Cluster State

    Science.gov (United States)

    Ma, Peng-Cheng; Chen, Gui-Bin; Li, Xiao-Wei; Zhan, You-Bang

    2018-02-01

    We propose a scheme for hierarchical controlled remote preparation of an arbitrary single-qubit state via a four-qubit cluster state as the quantum channel. In this scheme, a sender wishes to help three agents to remotely prepare a quantum state, respectively. The three agents are divided into two grades: one agent is in the upper grade and the other two agents are in the lower grade. In this process of remote state preparation, the agent of the upper grade needs the assistance of only one of the other two agents to recover the sender's original state, while an agent of the lower grade needs the collaboration of both other agents. In other words, the agents of the two grades have different authorities to reconstruct the sender's original state.

  19. Hierarchical Controlled Remote State Preparation by Using a Four-Qubit Cluster State

    Science.gov (United States)

    Ma, Peng-Cheng; Chen, Gui-Bin; Li, Xiao-Wei; Zhan, You-Bang

    2018-06-01

    We propose a scheme for hierarchical controlled remote preparation of an arbitrary single-qubit state via a four-qubit cluster state as the quantum channel. In this scheme, a sender wishes to help three agents to remotely prepare a quantum state, respectively. The three agents are divided into two grades: one agent is in the upper grade and the other two agents are in the lower grade. In this process of remote state preparation, the agent of the upper grade needs the assistance of only one of the other two agents to recover the sender's original state, while an agent of the lower grade needs the collaboration of both other agents. In other words, the agents of the two grades have different authorities to reconstruct the sender's original state.

  20. A 350 ka record of climate change from Lake El'gygytgyn, Far East Russian Arctic: refining the pattern of climate modes by means of cluster analysis

    Directory of Open Access Journals (Sweden)

    U. Frank

    2013-07-01

    Full Text Available Rock magnetic, biochemical and inorganic records of the sediment cores PG1351 and Lz1024 from Lake El'gygytgyn, Chukotka peninsula, Far East Russian Arctic, were subject to a hierarchical agglomerative cluster analysis in order to refine and extend the pattern of climate modes as defined by Melles et al. (2007). Cluster analysis of the data obtained from both cores yielded similar results, differentiating clearly between the four climate modes warm, peak warm, cold and dry, and cold and moist. In addition, two transitional phases were identified, representing the early stages of a cold phase and slightly colder conditions during a warm phase. The statistical approach can thus be used to resolve gradual changes in the sedimentary units as an indicator of available oxygen in the hypolimnion in greater detail. Based upon cluster analyses on core Lz1024, the published succession of climate modes in core PG1351, covering the last 250 ka, was modified and extended back to 350 ka. Comparison to the marine oxygen isotope (δ18O) stack LR04 (Lisiecki and Raymo, 2005) and the summer insolation at 67.5° N, with the extended Lake El'gygytgyn parameter records of magnetic susceptibility (κLF), total organic carbon content (TOC) and the chemical index of alteration (CIA; Minyuk et al., 2007), revealed that all stages back to marine isotope stage (MIS) 10 and most of the substages are clearly reflected in the pattern derived from the cluster analysis.

  1. Hydrothermal synthesis and photoluminescent properties of hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower-like clusters

    Science.gov (United States)

    Amurisana, Bao.; Zhiqiang, Song.; Haschaolu, O.; Yi, Chen; Tegus, O.

    2018-02-01

    3D hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower clusters were successfully prepared on a glass slide substrate by a simple, economical hydrothermal process with the assistance of disodium ethylenediaminetetraacetic acid (Na2H2L, where L4- = (CH2COO)2N(CH2)2N(CH2COO)24-). In this process, Na2H2L was used as both a chelating agent and a structure-director. The hierarchical flower clusters have an average diameter of 7-12 μm and are composed of well-aligned microrods. The influence of the molar ratio of Na2H2L/Gd3+ and reaction time on the morphology was systematically studied. A possible crystal growth and formation mechanism of hierarchical flower clusters is proposed based on the evolution of morphology as a function of reaction time. The self-assembled GdPO4·H2O:Ln3+ superstructures exhibit strong orange-red (Eu3+, 5D0 → 7F1), green (Tb3+, 5D4 → 7F5) and near ultraviolet emissions (Ce3+, 5d → 7F5/2) under ultraviolet excitation, respectively. This study may provide a new channel for building hierarchically superstructured oxide micro/nanomaterials with optical and new properties.

  2. Micromechanics of hierarchical materials

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon, Jr.

    2012-01-01

    A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites...... with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them......, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...

  3. Discovering hierarchical structure in normal relational data

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten

    2014-01-01

    -parametric generative model for hierarchical clustering of similarity based on multifurcating Gibbs fragmentation trees. This allows us to infer and display the posterior distribution of hierarchical structures that comply with the data. We demonstrate the utility of our method on synthetic data and data of functional...

  4. DATA CLASSIFICATION WITH NEURAL CLASSIFIER USING RADIAL BASIS FUNCTION WITH DATA REDUCTION USING HIERARCHICAL CLUSTERING

    Directory of Open Access Journals (Sweden)

    M. Safish Mary

    2012-04-01

    Full Text Available Classification of large amounts of data is a time-consuming process but crucial for analysis and decision making. Radial Basis Function networks are widely used for classification and regression analysis. In this paper, we study the performance of RBF neural networks in classifying car sales based on demand, using a kernel density estimation algorithm that produces classification accuracy comparable to that of support vector machines. We propose a new instance-based data selection method in which redundant instances are removed with the help of a threshold, thus improving the time complexity with improved classification accuracy. The instance-based selection of the data set helps reduce the number of clusters formed, thereby reducing the number of centers considered for building the RBF network. Further, the efficiency of the training is improved by applying a hierarchical clustering technique to reduce the number of clusters formed at every step. The paper explains the algorithm used for classification and for conditioning the data. It also explains the complexities involved in classification of sales data for analysis and decision-making.

  5. Discrimination of Black Ball-point Pen Inks by High Performance Liquid Chromatography (HPLC)

    International Nuclear Information System (INIS)

    Mohamed Izzharif Abdul Halim; Norashikin Saim; Rozita Osman; Halila Jasmani; Nurul Nadhirah Zainal Abidin

    2013-01-01

    In this study, thirteen types of black ball-point pen inks of three major brands were analyzed using high performance liquid chromatography (HPLC). Separation of the ink components was achieved using a Bondapak C-18 column with gradient elution using water, ethanol and ethyl acetate. The chromatographic data obtained at a wavelength of 254.8 nm were analyzed using agglomerative hierarchical clustering (AHC) and principal component analysis (PCA). AHC was able to group the inks into three clusters. This result was supported by PCA, whereby distinct separation of the three different brands was achieved. Therefore, HPLC in combination with chemometric methods may be a valuable tool for the analysis of black ball-point pen inks for forensic purposes. (author)

  6. Examination of Clustering in Eutectic Microstructure

    Directory of Open Access Journals (Sweden)

    Bortnyik K.

    2017-06-01

    Full Text Available Eutectic microstructures are complex and hard to describe with a few numbers. Eutectics are built up of eutectic cells, and within the cells the phases are clustered. With the growth of big databases, data mining has also developed, producing many methods for handling large datasets and extracting information from them. One typical method is clustering, which finds the groups in a dataset. In this article, a partitioning and a hierarchical clustering method are applied to eutectic structures to find the clusters. In the case of an AlMn alloy, the K-means algorithm works well and finds the eutectic cells. In the case of ductile cast iron, hierarchical clustering works better. By combining partitioning and hierarchical clustering with image transformation, an effective method is developed for clustering the objects in the microstructures.

  7. A Negative Selection Algorithm Based on Hierarchical Clustering of Self Set and its Application in Anomaly Detection

    Directory of Open Access Journals (Sweden)

    Wen Chen

    2011-08-01

    Full Text Available A negative selection algorithm based on hierarchical clustering of the self set (HC-RNSA) is introduced in this paper. Several strategies are applied to improve the algorithm's performance. First, the self data set is replaced by the self cluster centers, which are compared with the detector candidates at each cluster level. As the number of self clusters is much smaller than the self set size, the detector generation efficiency is improved. Second, during the detector generation process, the detector candidates are restricted to the lower coverage space to reduce detector redundancy. The article analyzes the problem that the distances between antigens converge to a constant value in high-dimensional space; accordingly, the Principal Component Analysis (PCA) method is used to reduce the data dimension, and a fractional distance function is employed to enhance the distinctiveness between the self and non-self antigens. The detector generation procedure is terminated when the expected non-self coverage is reached. The theoretical analysis and experimental results demonstrate that the detection rate of HC-RNSA is higher than that of the traditional negative selection algorithms while the false alarm rate and time cost are reduced.
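
    The record above mentions a fractional distance function without giving its form. A common reading, assumed here rather than confirmed by the paper, is a Minkowski-style distance with exponent 0 < p < 1. The sketch below uses hypothetical names and random data, and compares how far apart the nearest and farthest neighbours are, relative to the nearest, under each distance.

```python
import random


def fractional_distance(x, y, p=0.5):
    """Minkowski-style distance with a fractional exponent 0 < p < 1.
    Assumed form; the paper's exact definition may differ."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean_distance(x, y):
    """Ordinary L2 distance, for comparison."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def relative_contrast(query, points, distance):
    """(farthest - nearest) / nearest; small values mean the distances
    from the query have become hard to tell apart."""
    d = [distance(query, pt) for pt in points]
    return (max(d) - min(d)) / min(d)


# Hypothetical data: 200 uniform random points in 50 dimensions.
rng = random.Random(0)
dim = 50
points = [[rng.random() for _ in range(dim)] for _ in range(200)]
query = [rng.random() for _ in range(dim)]

print("euclidean :", relative_contrast(query, points, euclidean_distance))
print("fractional:", relative_contrast(query, points, fractional_distance))
```

    No particular output is claimed here; the script only makes the comparison easy to rerun with other dimensions or values of p.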

  8. Interactive visual exploration and analysis of origin-destination data

    Science.gov (United States)

    Ding, Linfang; Meng, Liqiu; Yang, Jian; Krisp, Jukka M.

    2018-05-01

    In this paper, we propose a visual analytics approach for the exploration of spatiotemporal interaction patterns of massive origin-destination data. Firstly, we visually query the movement database for data at certain time windows. Secondly, we conduct interactive clustering to allow the users to select input variables/features (e.g., origins, destinations, distance, and duration) and to adjust clustering parameters (e.g. distance threshold). The agglomerative hierarchical clustering method is applied for the multivariate clustering of the origin-destination data. Thirdly, we design a parallel coordinates plot for visualizing the precomputed clusters and for further exploration of interesting clusters. Finally, we propose a gradient line rendering technique to show the spatial and directional distribution of origin-destination clusters on a map view. We implement the visual analytics approach in a web-based interactive environment and apply it to real-world floating car data from Shanghai. The experiment results show the origin/destination hotspots and their spatial interaction patterns. They also demonstrate the effectiveness of our proposed approach.

  9. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Full Text Available Background: The effects of climate variations on bacillary dysentery incidence have recently gained more attention. However, multicollinearity among meteorological factors affects the accuracy of their estimated correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method combining ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: Among the weather indicators, temperatures, precipitation, evaporation and relative humidity showed a positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.

  10. Genome-wide decoding of hierarchical modular structure of transcriptional regulation by cis-element and expression clustering.

    Science.gov (United States)

    Leyfer, Dmitriy; Weng, Zhiping

    2005-09-01

    A holistic approach to the study of cellular processes is identifying both gene-expression changes and regulatory elements promoting such changes. Cellular regulatory processes can be viewed as transcriptional modules (TMs), groups of coexpressed genes regulated by groups of transcription factors (TFs). We set out to devise a method that would identify TMs while avoiding arbitrary thresholds on TM sizes and number. Assuming that gene expression is determined by TFs that bind to the gene's promoter, clustering of genes based on TF binding sites (cis-elements) should create gene groups similar to those obtained by gene expression clustering. Intersections between the expression and cis-element-based gene clusters reveal TMs. Statistical significance assigned to each TM allows identification of regulatory units of any size. Our method correctly identifies the number and sizes of TMs on simulated datasets. We demonstrate that yeast experimental TMs are biologically relevant by comparing them with MIPS and GO categories. Our modules are in statistically significant agreement with TMs from other research groups. This work suggests that there is no preferential division of biological processes into regulatory units; each degree of partitioning exhibits a slice of biological network revealing hierarchical modular organization of transcriptional regulation.

  11. Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models

    KAUST Repository

    Chermak, Edrisse; De Donato, Renato; Lensink, Marc F.; Petta, Andrea; Serra, Luigi; Scarano, Vittorio; Cavallo, Luigi; Oliva, Romina

    2016-01-01

    Correctly scoring protein-protein docking models to single out native-like ones is an open challenge. It is also an object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), the community-wide blind docking experiment. We introduced in the field the first pure consensus method, CONSRANK, which ranks models based on their ability to match the most conserved contacts in the ensemble they belong to. In CAPRI, scorers are asked to evaluate a set of available models and select the top ten ones, based on their own scoring approach. Scorers' performance is ranked based on the number of targets/interfaces for which they could provide at least one correct solution. In such terms, blind testing in CAPRI Round 30 (a joint prediction round with CASP11) has shown that critical cases for CONSRANK are represented by targets showing multiple interfaces or for which only a very small number of correct solutions are available. To address these challenging cases, CONSRANK has now been modified to include a contact-based clustering of the models as a preliminary step of the scoring process. We used an agglomerative hierarchical clustering based on the number of common inter-residue contacts within the models. Two criteria, with different thresholds, were explored in the cluster generation, setting either the number of common contacts or of total clusters. For each clustering approach, after selecting the top (most populated) ten clusters, CONSRANK was run on these clusters and the top-ranked model for each cluster was selected, in the limit of 10 models per target. We have applied our modified scoring approach, Clust-CONSRANK, to SCORE_SET, a set of CAPRI scoring models made recently available by CAPRI assessors, and to the subset of homodimeric targets in CAPRI Round 30 for which CONSRANK failed to include a correct solution within the ten selected models. Results show that, for the challenging cases, the clustering step typically enriches the ten top ranked

  12. Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models

    KAUST Repository

    Chermak, Edrisse

    2016-11-15

    Correctly scoring protein-protein docking models to single out native-like ones is an open challenge. It is also an object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), the community-wide blind docking experiment. We introduced in the field the first pure consensus method, CONSRANK, which ranks models based on their ability to match the most conserved contacts in the ensemble they belong to. In CAPRI, scorers are asked to evaluate a set of available models and select the top ten ones, based on their own scoring approach. Scorers' performance is ranked based on the number of targets/interfaces for which they could provide at least one correct solution. In such terms, blind testing in CAPRI Round 30 (a joint prediction round with CASP11) has shown that critical cases for CONSRANK are represented by targets showing multiple interfaces or for which only a very small number of correct solutions are available. To address these challenging cases, CONSRANK has now been modified to include a contact-based clustering of the models as a preliminary step of the scoring process. We used an agglomerative hierarchical clustering based on the number of common inter-residue contacts within the models. Two criteria, with different thresholds, were explored in the cluster generation, setting either the number of common contacts or of total clusters. For each clustering approach, after selecting the top (most populated) ten clusters, CONSRANK was run on these clusters and the top-ranked model for each cluster was selected, in the limit of 10 models per target. We have applied our modified scoring approach, Clust-CONSRANK, to SCORE_SET, a set of CAPRI scoring models made recently available by CAPRI assessors, and to the subset of homodimeric targets in CAPRI Round 30 for which CONSRANK failed to include a correct solution within the ten selected models. Results show that, for the challenging cases, the clustering step typically enriches the ten top ranked

  13. A Performance-Prediction Model for PIC Applications on Clusters of Symmetric MultiProcessors: Validation with Hierarchical HPF+OpenMP Implementation

    Directory of Open Access Journals (Sweden)

    Sergio Briguglio

    2003-01-01

    Full Text Available A performance-prediction model is presented, which describes different hierarchical workload decomposition strategies for particle in cell (PIC) codes on Clusters of Symmetric MultiProcessors. The devised workload decomposition is hierarchically structured: a higher-level decomposition among the computational nodes, and a lower-level one among the processors of each computational node. Several decomposition strategies are evaluated by means of the prediction model, with respect to the memory occupancy, the parallelization efficiency and the required programming effort. Such strategies have been implemented by integrating the high-level languages High Performance Fortran (at the inter-node stage) and OpenMP (at the intra-node one). The details of these implementations are presented, and the experimental values of parallelization efficiency are compared with the predicted results.

  14. "Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping out and Hierarchical Cluster Analysis"

    Directory of Open Access Journals (Sweden)

    Alex J. Bowers

    2010-05-01

    Full Text Available School personnel currently lack an effective method to pattern and visually interpret disaggregated achievement data collected on students as a means to help inform decision making. This study, through the examination of longitudinal K-12 teacher assigned grading histories for entire cohorts of students from a school district (n=188), demonstrates a novel application of hierarchical cluster analysis and pattern visualization in which all data points collected on every student in a cohort can be patterned, visualized and interpreted to aid in data driven decision making by teachers and administrators. Additionally, as a proof-of-concept study, overall schooling outcomes, such as student dropout or taking a college entrance exam, are identified from the data patterns and compared to past methods of dropout identification as one example of the usefulness of the method. Hierarchical cluster analysis correctly identified over 80% of the students who dropped out using the entire student grade history patterns from either K-12 or K-8.

  15. THE USE OF CLUSTER ANALYSIS IN THE RESEARCH ON SHOPPING PREFERENCES REGARDING REGIONAL PRODUCTS FROM LUBELSKIE VOIVODESHIP

    Directory of Open Access Journals (Sweden)

    Jan Czeczelewski

    2017-03-01

    Full Text Available An increasing awareness of consumers is reflected in a growing demand for products which are manufactured in a particular way, with unique ingredients, or which are of a particular origin. The analysis of consumers’ preferences makes it possible to define factors which determine the purchase of regional products. The aim of the work was to identify factors which determine the purchase of regional products from Lubelskie Voivodeship on the basis of cluster analysis using Ward’s hierarchical agglomerative clustering method. The research was carried out in 2016 and included 383 individuals. Statistical analysis of results was conducted on the basis of frequency analysis and cluster analysis. According to the respondents, the most frequently purchased regional products included bakery products (47%), dairy products (35.3%), meat (33.3%), and alcoholic beverages (29.4%). Over 53% of the respondents claimed that the prices of regional products are too high, every third person (29.6%) concluded that they are reasonable, while slightly over 3% of the respondents said they are low. Television and the Internet as well as close relatives and friends appeared to be the best forms of reaching the client with information concerning regional products when bringing them out on the market. However, the most common places where regional products were purchased were food fairs and festivals. Every second respondent purchased regional products at least once a month. Additionally, it was revealed that the consumers’ income was not a decisive factor when purchasing regional products. Despite financial stability, individuals who could be defined as “rich” in Polish conditions purchased regional products relatively rarely.

  16. Topology of the correlation networks among major currencies using hierarchical structure methods

    Science.gov (United States)

    Keskin, Mustafa; Deviren, Bayram; Kocakaplan, Yusuf

    2011-02-01

    We studied the topology of correlation networks among 34 major currencies using the concept of a minimal spanning tree and hierarchical tree for the full years of 2007-2008 when major economic turbulence occurred. We used the USD (US Dollar) and the TL (Turkish Lira) as numeraires in which the USD was the major currency and the TL was the minor currency. We derived a hierarchical organization and constructed minimal spanning trees (MSTs) and hierarchical trees (HTs) for the full years of 2007, 2008 and for the 2007-2008 period. We performed a technique to associate a value of reliability to the links of MSTs and HTs by using bootstrap replicas of data. We also used the average linkage cluster analysis for obtaining the hierarchical trees in the case of the TL as the numeraire. These trees are useful tools for understanding and detecting the global structure, taxonomy and hierarchy in financial data. We illustrated how the minimal spanning trees and their related hierarchical trees developed over a period of time. From these trees we identified different clusters of currencies according to their proximity and economic ties. The clustered structure of the currencies and the key currency in each cluster were obtained and we found that the clusters matched nicely with the geographical regions of corresponding countries in the world such as Asia or Europe. As expected the key currencies were generally those showing major economic activity.
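
The construction of a minimal spanning tree from a correlation matrix can be sketched as follows. The abstract does not state the distance used, so the mapping d = sqrt(2(1 - rho)), which is common in correlation-network studies, is an assumption here, as are the currency labels and correlation values, which are invented for illustration. The tree is built with Kruskal's algorithm.

```python
import math


def correlation_distance(rho):
    """Map a correlation coefficient to a distance (assumed form)."""
    return math.sqrt(2.0 * (1.0 - rho))


def minimal_spanning_tree(labels, corr):
    """Kruskal's algorithm on the complete graph of correlation distances."""
    n = len(labels)
    edges = sorted(
        (correlation_distance(corr[i][j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            tree.append((labels[i], labels[j], d))
    return tree


# Hypothetical correlations between four currencies (made-up numbers).
labels = ["EUR", "CHF", "JPY", "KRW"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
mst = minimal_spanning_tree(labels, corr)
```

The tree keeps only n - 1 of the n(n - 1)/2 possible links, retaining the strongest correlations that connect all currencies. The order in which Kruskal's algorithm merges components corresponds to a single-linkage hierarchical tree.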

  17. Hierarchical Sets: Analyzing Pangenome Structure through Scalable Set Visualizations

    DEFF Research Database (Denmark)

    Pedersen, Thomas Lin

    2017-01-01

    of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https...

  18. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  19. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies that want to remain competitive. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models that combine four different data mining techniques for churn prediction: backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and the Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data into two groups, churners and nonchurners, and also to filter out unrepresentative data or outliers. Then, the clustered data are used as inputs to the second technique, which assigns customers to the churner and nonchurner groups. Finally, the correctly classified data are used to create the Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model performs significantly better than the two other hierarchical models.

  20. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  1. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    Science.gov (United States)

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process has not been efficiently addressed. To monitor higher rates of expression levels between genes, a hierarchical clustering model is proposed, in which the biological association between genes is measured simultaneously using an improved Pearson's correlation proximity measure (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.

  2. Tune Your Brown Clustering, Please

    DEFF Research Database (Denmark)

    Derczynski, Leon; Chester, Sean; Bøgh, Kenneth Sejdenfaden

    2015-01-01

    Brown clustering, an unsupervised hierarchical clustering technique based on ngram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly...

  3. A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.

    Science.gov (United States)

    Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing

    2017-01-01

    An important objective of wireless sensor networks is to prolong the network life cycle, and topology control is of great significance for extending it. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, first, a fuzzy clustering algorithm is used for the initial clustering of sensor nodes according to geographical location, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and distance factors of the wireless sensor network. Finally, the cluster head nodes in the hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method reduces the mortality rate of nodes and extends the network life cycle.

  4. A new application of hierarchical cluster analysis to investigate organic peaks in bulk mass spectra obtained with an Aerodyne Aerosol Mass Spectrometer

    Science.gov (United States)

    Middlebrook, A. M.; Marcolli, C.; Canagaratna, M. R.; Worsnop, D. R.; Bahreini, R.; de Gouw, J. A.; Warneke, C.; Goldan, P. D.; Kuster, W. C.; Williams, E. J.; Lerner, B. M.; Roberts, J. M.; Meagher, J. F.; Fehsenfeld, F. C.; Marchewka, M. L.; Bertman, S. B.

    2006-12-01

    We applied hierarchical cluster analysis to an Aerodyne aerosol mass spectrometer (AMS) bulk mass spectral dataset collected aboard the NOAA research vessel Ronald H. Brown during the 2002 New England Air Quality Study off the east coast of the United States. Emphasizing the organic peaks, the cluster analysis yielded a series of categories that are distinguishable with respect to their mass spectra and their occurrence as a function of time. The differences between the categories mainly arise from relative intensity changes rather than from the presence or absence of specific peaks. The most frequent category exhibits a strong signal at m/z 44 and represents oxidized organic matter probably originating from both anthropogenic as well as biogenic sources. On the basis of spectral and trace gas correlations, the second most common category with strong signals at m/z 29, 43, and 44 contains contributions from isoprene oxidation products. The third through the fifth most common categories have peak patterns characteristic of monoterpene oxidation products and were most frequently observed when air masses from monoterpene rich regions were sampled. Taken together, the second through the fifth most common categories represent on average 17% of the total organic mass that stems likely from biogenic sources during the ship's cruise. These numbers have to be viewed as lower limits since the most common category was attributed to anthropogenic sources for this calculation. The cluster analysis was also very effective in identifying a few contaminated mass spectra that were not removed during pre-processing. This study demonstrates that hierarchical clustering is a useful tool to analyze the complex patterns of the organic peaks in bulk aerosol mass spectra from a field study.

  5. Segmentation methodology for automated classification and differentiation of soft tissues in multiband images of high-resolution ultrasonic transmission tomography.

    Science.gov (United States)

    Jeong, Jeong-Won; Shin, Dae C; Do, Synho; Marmarelis, Vasilis Z

    2006-08-01

    This paper presents a novel segmentation methodology for automated classification and differentiation of soft tissues using multiband data obtained with the newly developed system of high-resolution ultrasonic transmission tomography (HUTT) for imaging biological organs. This methodology extends and combines two existing approaches: the L-level set active contour (AC) segmentation approach and the agglomerative hierarchical kappa-means approach for unsupervised clustering (UC). To prevent the trapping of the current iterative minimization AC algorithm in a local minimum, we introduce a multiresolution approach that applies the level set functions at successively increasing resolutions of the image data. The resulting AC clusters are subsequently rearranged by the UC algorithm that seeks the optimal set of clusters yielding the minimum within-cluster distances in the feature space. The presented results from Monte Carlo simulations and experimental animal-tissue data demonstrate that the proposed methodology outperforms other existing methods without depending on heuristic parameters and provides a reliable means for soft tissue differentiation in HUTT images.

  6. Market Attractiveness Classification of European Union Countries for Establishing Logistics Centres

    Directory of Open Access Journals (Sweden)

    Schüller David

    2016-10-01

    Full Text Available At present, enterprises are forced to serve their customers as quickly as possible if they want to succeed on turbulent global markets. Enterprises are looking for regions with high-quality infrastructure where they can establish new logistics centres that enable enterprises to serve their customers quickly. This paper focuses on the segmentation of the European Union market for enterprises that are willing to set up logistics centres in order to be able to distribute products fluently and more quickly to their customers in Europe. An agglomerative hierarchical clustering algorithm was used and Ward’s criterion applied for the purposes of market segmentation. A Logistic Performance Index and the indicator Dealing with Construction Permits were used as two relevant dimensions reflecting the market attractiveness of identified clusters. Based on the given statistical output, fundamental marketing concepts were formulated for each cluster composed of EU countries with similar characteristics.
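
Ward's criterion merges, at each step, the pair of clusters whose union gives the smallest increase in the within-cluster sum of squares. A minimal sketch of that merge cost is given below. The two-dimensional points stand in for hypothetical (Logistic Performance Index, construction-permit indicator) pairs and are invented for illustration; the function names are likewise assumptions.

```python
def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    n = len(cluster)
    dims = len(cluster[0])
    return [sum(p[k] for p in cluster) / n for k in range(dims)]


def sse(cluster):
    """Within-cluster sum of squared distances to the centroid."""
    c = centroid(cluster)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in cluster)


def ward_cost(a, b):
    """Increase in total within-cluster sum of squares if a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared_gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_gap


# Hypothetical, standardized country scores (made-up values).
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(4.0, 0.0)]
cost = ward_cost(cluster_a, cluster_b)
```

The closed form used in `ward_cost` equals `sse(a + b) - sse(a) - sse(b)`, so each candidate merge can be scored from centroids and cluster sizes alone. In practice the two indicators would usually be standardized first so that neither dominates the distance.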

  7. Brightest Cluster Galaxies in REXCESS Clusters

    Science.gov (United States)

    Haarsma, Deborah B.; Leisman, L.; Bruch, S.; Donahue, M.

    2009-01-01

    Most galaxy clusters contain a Brightest Cluster Galaxy (BCG) which is larger than the other cluster ellipticals and has a more extended profile. In the hierarchical model, the BCG forms through many galaxy mergers in the crowded center of the cluster, and thus its properties give insight into the assembly of the cluster as a whole. In this project, we are working with the Representative XMM-Newton Cluster Structure Survey (REXCESS) team (Boehringer et al 2007) to study BCGs in 33 X-ray luminous galaxy clusters, 0.055 < z < 0.183. We are imaging the BCGs in R band at the Southern Observatory for Astrophysical Research (SOAR) in Chile. In this poster, we discuss our methods and give preliminary measurements of the BCG magnitudes, morphology, and stellar mass. We compare these BCG properties with the properties of their host clusters, particularly of the X-ray emitting gas.

  8. Cluster evolution

    International Nuclear Information System (INIS)

    Schaeffer, R.

    1987-01-01

    The galaxy and cluster luminosity functions are constructed from a model of the mass distribution based on hierarchical clustering at an epoch where the matter distribution is non-linear. These luminosity functions are seen to reproduce the present distribution of objects as can be inferred from the observations. They can be used to deduce the redshift dependence of the cluster distribution and to extrapolate the observations towards the past. The predicted evolution of the cluster distribution is quite strong, although somewhat less rapid than predicted by the linear theory.

  9. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  10. Cluster analysis of the clinical histories of cattle affected with bovine anaemia associated with Theileria orientalis Ikeda type infection.

    Science.gov (United States)

    Lawrence, K E; Forsyth, S F; Vaatstra, B L; McFadden, Amj; Pulford, D J; Govindaraju, K; Pomroy, W E

    2017-11-01

    AIM: To determine the most commonly used words in the clinical histories of animals naturally infected with Theileria orientalis Ikeda type; whether these words differed between cases categorised by age, farm type or haematocrit (HCT); and whether there was any clustering of the common words in relation to these categories. METHODS: Clinical histories were transcribed for 605 cases of bovine anaemia associated with T. orientalis (TABA) that were submitted to laboratories with blood samples which tested positive for T. orientalis Ikeda type infection by PCR analysis, between October 2012 and November 2014. χ² tests were used to determine whether the proportion of submissions for each word was similar across the categories of HCT (normal, moderate anaemia or severe anaemia), farm type (dairy or beef) and age (young or old). Correspondence analysis (CA) was carried out on a contingency table of the frequency of the 28 most commonly used history words, cross-tabulated by age categories (young, old or unknown). Agglomerative hierarchical clustering, using Ward's method, was then performed on the coordinates from the correspondence analysis. RESULTS: The six most commonly used history words were jaundice (204/605), lethargic (162/605), pale mucous membranes (161/605), cow (151/605), anaemia (147/605), and off milk (115/605). The proportion of cases with some history words differed between categories of age, farm type and HCT. The cluster analysis indicated that the recorded history words were grouped in two main clusters. The first included the words weight loss, tachycardia, pale mucous membranes, anaemia, lethargic and thin, and was associated with adult (p... cluster included the words deaths, ill-thrift, calves, calf and diarrhoea, and was associated with young (p... Cluster analysis of words recorded in clinical histories submitted with blood samples from cases of TABA indicates that two potentially different disease syndromes were associated with T. orientalis Ikeda type
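
The χ² comparison of word proportions across categories can be illustrated with Pearson's statistic for a contingency table. This is a sketch only: the counts below are invented and not taken from the study, the function name is hypothetical, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table.

    `table` is a list of rows of observed counts, for example
    rows = word present / word absent, columns = age categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Made-up counts: a history word present/absent in young vs old cases.
counts = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(counts)
```

The statistic would then be compared with a χ² distribution with `dof` degrees of freedom to obtain a p-value. That step needs a distribution function from a statistics library and is left out of this sketch.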

  11. Relation between financial market structure and the real economy: comparison between clustering methods.

    Science.gov (United States)

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

    2015-01-01

    We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns, comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree, and we compare it with other methods, including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].
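
    As a rough illustration of linkage clustering on correlations between return series, the sketch below converts Pearson correlations to the distance d = sqrt(2(1 - rho)) and merges with single linkage. The distance transform, the choice of single linkage, the ticker names and the return values are all assumptions made for the example; the abstract does not specify them, and this is not the Directed Bubble Hierarchical Tree.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    # d = sqrt(2 * (1 - rho)): 0 for perfectly correlated, 2 for anti-correlated.
    return math.sqrt(max(0.0, 2.0 * (1.0 - pearson(x, y))))


def single_linkage(series, k):
    """Agglomerate named series until k clusters remain (single linkage)."""
    names = list(series)
    dist = {
        frozenset(pair): correlation_distance(series[pair[0]], series[pair[1]])
        for pair in combinations(names, 2)
    }
    clusters = [{name} for name in names]
    while len(clusters) > k:
        # Cluster-to-cluster distance is the smallest member-to-member distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda ab: min(dist[frozenset((u, v))] for u in ab[0] for v in ab[1]),
        )
        clusters = [c for c in clusters if c is not a and c is not b] + [a | b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical daily returns for four made-up tickers.
returns = {
    "BANK_A": [0.010, -0.020, 0.030, -0.010, 0.020],
    "BANK_B": [0.012, -0.018, 0.028, -0.012, 0.021],
    "OIL_A": [0.020, 0.010, -0.010, 0.030, -0.020],
    "OIL_B": [0.018, 0.012, -0.008, 0.027, -0.022],
}
sectors = single_linkage(returns, 2)
print(sectors)
```

    Comparing the recovered groups with a known sector labelling, as the abstract describes, would then amount to scoring `sectors` against that benchmark partition.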

  12. Relation between financial market structure and the real economy: comparison between clustering methods.

    Directory of Open Access Journals (Sweden)

    Nicoló Musmeci

    Full Text Available We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns, comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree, and we compare it with other methods, including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].

  13. Application of hierarchical clustering method to classify of space-time rainfall patterns

    Science.gov (United States)

    Yu, Hwa-Lung; Chang, Tu-Je

    2010-05-01

    Understanding local precipitation patterns is essential to water resources management and flood mitigation. Precipitation patterns can vary in space and time depending upon factors at different spatial scales, such as local topological changes and macroscopic atmospheric circulation. The spatiotemporal variation of precipitation in Taiwan is significant due to its complex terrain and its location in the subtropical western Pacific, at the boundary between the Pacific Ocean and the Asian continent, where climatic processes interact in complex ways. This study characterizes local-scale precipitation patterns by classifying the historical space-time precipitation records. We applied the hierarchical ascending clustering method to analyze the precipitation records from 1960 to 2008 at the six rainfall stations located in the Lan-yang catchment in the northeast of the island. Our results identify four primary space-time precipitation types, which may result from distinct driving forces arising from changes in atmospheric variables and topology at different space-time scales. This study also presents an important application of statistical downscaling to combine large-scale upper-air circulation with local space-time precipitation patterns.

  14. CBHRP: A Cluster Based Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, M. G.; Kabir, M. Hasnat; Rahim, M. Sajjadur; Ullah, Sk. Enayet

    2012-01-01

    A new two-layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP) is proposed in this paper. It is an extension of the LEACH routing protocol. We introduce the cluster head-set idea for cluster-based routing, where several clusters are formed with the deployed sensors to collect information from the target field. On a rotation basis, a head-set member receives data from the neighbor nodes and transmits the aggregated results to the distant base station. This protocol ...

  15. NOVEL CONTEXT-AWARE CLUSTERING WITH HIERARCHICAL ADDRESSING (CCHA) FOR THE INTERNET OF THINGS (IoT)

    DEFF Research Database (Denmark)

    Mahalle, Parikshit N.; Prasad, Neeli R.; Prasad, Ramjee

    2013-01-01

    As computing technology becomes more tightly coupled into the dynamic and mobile world of the Internet of Things (IoT), security mechanisms become more stringent, less flexible and intrusive. The scalability issue in the IoT makes Identity Management (IdM) of ubiquitous things more challenging. Forming ad-hoc networks and interaction between these nomadic devices to provide seamless service extend the need for new identities for the things, addressing and IdM in the IoT. New identities and an identifier format to alleviate the performance issue are introduced in this paper. This paper presents a novel Context-aware Clustering with Hierarchical Addressing (CCHA) scheme for the things with a new identifier format. Simulation results show that CCHA achieves better performance with less energy expenditure, less end-to-end delay and more throughput. Results also show that CCHA significantly reduces the failure probability...

  16. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application, called voice classification, which has an important role in grouping unlabelled voice samples, has however not been widely studied in research. Lately, voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warp transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other empirically collected from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.

  17. Modular networks with hierarchical organization

    Indian Academy of Sciences (India)

    Several networks occurring in real life have modular structures that are arranged in a hierarchical fashion. In this paper, we have proposed a model for such networks, using a stochastic generation method. Using this model we show that the scaling relation between the clustering and degree of the nodes is not a necessary ...

  18. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  19. BULGELESS GIANT GALAXIES CHALLENGE OUR PICTURE OF GALAXY FORMATION BY HIERARCHICAL CLUSTERING

    International Nuclear Information System (INIS)

    Kormendy, John; Cornell, Mark E.; Drory, Niv; Bender, Ralf

    2010-01-01

    To better understand the prevalence of bulgeless galaxies in the nearby field, we dissect giant Sc-Scd galaxies with Hubble Space Telescope (HST) photometry and Hobby-Eberly Telescope (HET) spectroscopy. We use the HET High Resolution Spectrograph (resolution R ≡ λ/FWHM ≅ 15,000) to measure stellar velocity dispersions in the nuclear star clusters and (pseudo)bulges of the pure-disk galaxies M 33, M 101, NGC 3338, NGC 3810, NGC 6503, and NGC 6946. The dispersions range from 20 ± 1 km s⁻¹ in the nucleus of M 33 to 78 ± 2 km s⁻¹ in the pseudobulge of NGC 3338. We use HST archive images to measure the brightness profiles of the nuclei and (pseudo)bulges in M 101, NGC 6503, and NGC 6946 and hence to estimate their masses. The results imply small mass-to-light ratios consistent with young stellar populations. These observations lead to two conclusions. (1) Upper limits on the masses of any supermassive black holes are M . ∼ 6 M sun in M 101 and M . ∼ 6 M sun in NGC 6503. (2) We show that the above galaxies contain only tiny pseudobulges that make up ∼ […] circ > 150 km s⁻¹, including M 101, NGC 6946, IC 342, and our Galaxy, show no evidence for a classical bulge. Four may contain small classical bulges that contribute 5%-12% of the light of the galaxy. Only four of the 19 giant galaxies are ellipticals or have classical bulges that contribute ∼1/3 of the galaxy light. We conclude that pure-disk galaxies are far from rare. It is hard to understand how bulgeless galaxies could form as the quiescent tail of a distribution of merger histories. Recognition of pseudobulges makes the biggest problem with cold dark matter galaxy formation more acute: How can hierarchical clustering make so many giant, pure-disk galaxies with no evidence for merger-built bulges? Finally, we emphasize that this problem is a strong function of environment: the Virgo cluster is not a puzzle, because more than 2/3 of its stellar mass is in merger remnants.

  20. BioCluster: Tool for Identification and Clustering of Enterobacteriaceae Based on Biochemical Data

    Directory of Open Access Journals (Sweden)

    Ahmed Abdullah

    2015-06-01

    Full Text Available Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and the Improved Hierarchical Clustering (IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in the results of 1–47 biochemical tests within the Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.

  1. Clustering Dycom

    KAUST Repository

    Minku, Leandro L.

    2017-10-06

    Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom's predictive performance. Aims: This paper investigates whether clustering methods can be used to help find good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom's CC subsets.

  2. Evaluation of Hierarchical Clustering Algorithms for Document Datasets

    National Research Council Canada - National Science Library

    Zhao, Ying; Karypis, George

    2002-01-01

    Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters...

  3. Comparative Study of Fatty Acids Profile in Eleven Wild Mushrooms of Boletacea and Russulaceae Families.

    Science.gov (United States)

    Dimitrijevic, Marija V; Mitic, Violeta D; Jovanovic, Olga P; Stankov Jovanovic, Vesna P; Nikolic, Jelena S; Petrovic, Goran M; Stojanovic, Gordana S

    2018-01-01

    Eleven species of wild mushrooms which belong to Boletaceae and Russulaceae families were examined by gas chromatography (GC) and gas chromatography-mass spectrometry (GC/MS) analysis for the presence of fatty acids. As far as we know, the fatty acid profiles of B. purpureus and B. rhodoxanthus were described for the first time. Twenty-six fatty acids were determined. Linoleic (19.5 - 72%), oleic (0.11 - 64%), palmitic (5.9 - 22%) and stearic acids (0.81 - 57%) were present in the highest contents. In all samples, unsaturated fatty acids dominate. Agglomerative hierarchical clustering was used to display the correlation between the fatty acids and their relationships with the mushroom species. Based on the fatty acids profile in the samples, the mushrooms can be divided into two families: Boletaceae and Russulaceae families, using cluster analysis. © 2018 Wiley-VHCA AG, Zurich, Switzerland.

  4. Competitive cluster growth in complex networks.

    Science.gov (United States)

    Moreira, André A; Paula, Demétrius R; Costa Filho, Raimundo N; Andrade, José S

    2006-06-01

    In this work we propose an idealized model for competitive cluster growth in complex networks. Each cluster can be thought of as a fraction of a community that shares some common opinion. Our results show that the cluster size distribution depends on the particular choice for the topology of the network of contacts among the agents. As an application, we show that the cluster size distributions obtained when the growth process is performed on hierarchical networks, e.g., the Apollonian network, have a scaling form similar to what has been observed for the distribution of the number of votes in an electoral process. We suggest that this similarity may be due to the fact that social networks involved in the electoral process may also possess an underlying hierarchical structure.

  5. Performance of clustering techniques for solving multi depot vehicle routing problem

    Directory of Open Access Journals (Sweden)

    Eliana M. Toro-Ocampo

    2016-01-01

    Full Text Available The vehicle routing problem considering multiple depots (MDVRP) is classified as NP-hard. MDVRP determines simultaneously the routes of a set of vehicles and aims to meet a set of clients with a known demand. The objective function of the problem is to minimize the total distance traveled by the routes, given that all customers must be served considering capacity constraints in depots and vehicles. This paper presents a hybrid methodology that combines agglomerative clustering techniques to generate initial solutions with an iterated local search (ILS) algorithm to solve the problem. Although previous studies have proposed clustering methods as strategies to generate initial solutions, in this work the search is intensified on the information generated after applying the clustering technique. In addition, an extensive analysis of the performance of the techniques and their effect on the final solution is performed. The proposed methodology is feasible and effective for solving the problem with regard to the quality of the answers and the computational times obtained on the evaluated literature instances.

  6. Chemical Fingerprint and Quantitative Analysis for the Quality Evaluation of Platycladi cacumen by Ultra-performance Liquid Chromatography Coupled with Hierarchical Cluster Analysis.

    Science.gov (United States)

    Shan, Mingqiu; Li, Sam Fong Yau; Yu, Sheng; Qian, Yan; Guo, Shuchen; Zhang, Li; Ding, Anwei

    2018-01-01

    Platycladi cacumen (dried twigs and leaves of Platycladus orientalis (L.) Franco) is a frequently utilized Chinese medicinal herb. To evaluate the quality of the phytomedicine, an ultra-performance liquid chromatographic method with diode array detection was established for chemical fingerprinting and quantitative analysis. In this study, 27 batches of P. cacumen from different regions were collected for analysis. A chemical fingerprint with 20 common peaks was obtained using Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2004A). Among these 20 components, seven flavonoids (myricitrin, isoquercitrin, quercitrin, afzelin, cupressuflavone, amentoflavone and hinokiflavone) were identified and determined simultaneously. In the method validation, the seven analytes showed good regressions (R ≥ 0.9995) within linear ranges and good recoveries from 96.4% to 103.3%. Furthermore, with the contents of these seven flavonoids, hierarchical clustering analysis was applied to distinguish the 27 batches into five groups. The chemometric results showed that these groups were almost consistent with geographical positions and climatic conditions of the production regions. Integrating fingerprint analysis, simultaneous determination and hierarchical clustering analysis, the established method is rapid, sensitive, accurate and readily applicable, and also provides a significant foundation for efficient quality control of P. cacumen. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  7. Clustering of resting state networks.

    Directory of Open Access Journals (Sweden)

    Megan H Lee

    Full Text Available The goal of the study was to demonstrate a hierarchical structure of resting state activity in the healthy brain using a data-driven clustering algorithm. The fuzzy-c-means clustering algorithm was applied to resting state fMRI data in cortical and subcortical gray matter from two groups acquired separately, one of 17 healthy individuals and the second of 21 healthy individuals. Different numbers of clusters and different starting conditions were used. A cluster dispersion measure determined the optimal numbers of clusters. An inner product metric provided a measure of similarity between different clusters. The two cluster result found the task-negative and task-positive systems. The cluster dispersion measure was minimized with seven and eleven clusters. Each of the clusters in the seven and eleven cluster result was associated with either the task-negative or task-positive system. Applying the algorithm to find seven clusters recovered previously described resting state networks, including the default mode network, frontoparietal control network, ventral and dorsal attention networks, somatomotor, visual, and language networks. The language and ventral attention networks had significant subcortical involvement. This parcellation was consistently found in a large majority of algorithm runs under different conditions and was robust to different methods of initialization. The clustering of resting state activity using different optimal numbers of clusters identified resting state networks comparable to previously obtained results. This work reinforces the observation that resting state networks are hierarchically organized.

  8. iHAT: interactive Hierarchical Aggregation Table for Genetic Association Data

    Directory of Open Access Journals (Sweden)

    Heinrich Julian

    2012-05-01

    Full Text Available Abstract In the search for single-nucleotide polymorphisms which influence the observable phenotype, genome wide association studies have become an important technique for the identification of associations between genotype and phenotype of a diverse set of sequence-based data. We present a methodology for the visual assessment of single-nucleotide polymorphisms using interactive hierarchical aggregation techniques combined with methods known from traditional sequence browsers and cluster heatmaps. Our tool, the interactive Hierarchical Aggregation Table (iHAT), facilitates the visualization of multiple sequence alignments, associated metadata, and hierarchical clusterings. Different color maps and aggregation strategies as well as filtering options support the user in finding correlations between sequences and metadata. Similar to other visualizations such as parallel coordinates or heatmaps, iHAT relies on the human pattern-recognition ability for spotting patterns that might indicate correlation or anticorrelation. We demonstrate iHAT using artificial and real-world datasets for DNA and protein association studies as well as expression Quantitative Trait Locus data.

  9. An Effective Approach for Clustering InhA Molecular Dynamics Trajectory Using Substrate-Binding Cavity Features.

    Directory of Open Access Journals (Sweden)

    Renata De Paris

    Full Text Available Protein receptor conformations, obtained from molecular dynamics (MD) simulations, have become a promising treatment of its explicit flexibility in molecular docking experiments applied to drug discovery and development. However, incorporating the entire ensemble of MD conformations in docking experiments to screen large candidate compound libraries is currently an unfeasible task. Clustering algorithms have been widely used as a means to reduce such ensembles to a manageable size. Most studies investigate different algorithms using pairwise Root-Mean Square Deviation (RMSD) values for all, or part of, the MD conformations. Nevertheless, RMSD alone may not be the most appropriate gauge to cluster conformations when the target receptor has a plastic active site, since RMSD values are influenced by changes that occur on other parts of the structure. Hence, we have applied two partitioning methods (k-means and k-medoids) and four agglomerative hierarchical methods (Complete linkage, Ward's, Unweighted Pair Group Method and Weighted Pair Group Method) to analyze and compare the quality of partitions between a data set composed of properties from an enzyme receptor substrate-binding cavity and two data sets created using different RMSD approaches. Ensembles of representative MD conformations were generated by selecting a medoid of each group from all partitions analyzed. We investigated the performance of our new method for evaluating binding conformation of drug candidates to the InhA enzyme, which were performed by cross-docking experiments between a 20 ns MD trajectory and 20 different ligands. Statistical analyses showed that the novel ensemble, which is represented by only 0.48% of the MD conformations, was able to reproduce 75% of all dynamic behaviors within the binding cavity for the docking experiments performed. 
Moreover, this new approach not only outperforms the other two RMSD-clustering solutions, but it also shows to be a promising strategy to
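
    The step of selecting a medoid of each group can be illustrated with a short sketch. It assumes a precomputed symmetric distance matrix and a list of cluster labels; the function name `medoids`, the 5 x 5 matrix and the labels are hypothetical and are not taken from the study.

```python
def medoids(dist, labels):
    """Return, per cluster label, the member with the smallest total distance
    to the other members of the same cluster (ties go to the lowest index)."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    return {
        lab: min(members, key=lambda i: sum(dist[i][j] for j in members))
        for lab, members in groups.items()
    }


# Hypothetical pairwise distances between five conformations.
dist = [
    [0.0, 1.0, 2.0, 5.0, 6.0],
    [1.0, 0.0, 1.2, 5.5, 6.1],
    [2.0, 1.2, 0.0, 5.2, 5.9],
    [5.0, 5.5, 5.2, 0.0, 0.8],
    [6.0, 6.1, 5.9, 0.8, 0.0],
]
labels = [0, 0, 0, 1, 1]
representatives = medoids(dist, labels)
print(representatives)
```

    Unlike a centroid, a medoid is always an actual member of the cluster, which is why it can serve directly as a representative conformation.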

  10. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time.

    Science.gov (United States)

    Cai, Yunpeng; Zheng, Wei; Yao, Jin; Yang, Yujie; Mai, Volker; Mao, Qi; Sun, Yijun

    2017-04-01

    The rapid development of sequencing technology has led to an explosive accumulation of genomic sequence data. Clustering is often the first step to perform in sequence analysis, and hierarchical clustering is one of the most commonly used approaches for this purpose. However, it is currently computationally expensive to perform hierarchical clustering of extremely large sequence datasets due to its quadratic time and space complexities. In this paper we developed a new algorithm called ESPRIT-Forest for parallel hierarchical clustering of sequences. The algorithm achieves subquadratic time and space complexity and maintains a high clustering accuracy comparable to the standard method. The basic idea is to organize sequences into a pseudo-metric based partitioning tree for sub-linear time searching of nearest neighbors, and then use a new multiple-pair merging criterion to construct clusters in parallel using multiple threads. The new algorithm was tested on the human microbiome project (HMP) dataset, currently one of the largest published microbial 16S rRNA sequence datasets. Our experiment demonstrated that with the power of parallel computing it is now computationally feasible to perform hierarchical clustering analysis of tens of millions of sequences. The software is available at http://www.acsu.buffalo.edu/~yijunsun/lab/ESPRIT-Forest.html.

  11. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data, grouping the separate observations by their degree of similarity. The review provides a definition of the basic concepts of cluster analysis, and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples are given of the use of these algorithms in biomedical research.

  12. Hierarchical MAS based control strategy for microgrid

    Energy Technology Data Exchange (ETDEWEB)

    Xiao, Z.; Li, T.; Huang, M.; Shi, J.; Yang, J.; Yu, J. [School of Information Science and Engineering, Yunnan University, Kunming 650091 (China); Xiao, Z. [School of Electrical and Electronic Engineering, Nanyang Technological University, Western Catchment Area, 639798 (Singapore); Wu, W. [Communication Branch of Yunnan Power Grid Corporation, Kunming, Yunnan 650217 (China)

    2010-09-15

    Microgrids have become a hot topic driven by the dual pressures of environmental protection concerns and the energy crisis. In this paper, a challenge for the distributed control of a modern electric grid incorporating clusters of residential microgrids is elaborated and a hierarchical multi-agent system (MAS) is proposed as a solution. The issues of how to realize the hierarchical MAS and how to improve coordination and control strategies are discussed. Based on MATLAB and ZEUS platforms, bilateral switching between grid-connected mode and island mode is performed under control of the proposed MAS to enhance and support its effectiveness. (authors)

  13. Hierarchical cluster analysis of technical replicates to identify interferents in untargeted mass spectrometry metabolomics.

    Science.gov (United States)

    Caesar, Lindsay K; Kvalheim, Olav M; Cech, Nadja B

    2018-08-27

    Mass spectral data sets often contain experimental artefacts, and data filtering prior to statistical analysis is crucial to extract reliable information. This is particularly true in untargeted metabolomics analyses, where the analyte(s) of interest are not known a priori. It is often assumed that chemical interferents (i.e. solvent contaminants such as plasticizers) are consistent across samples, and can be removed by background subtraction from blank injections. On the contrary, it is shown here that chemical contaminants may vary in abundance across each injection, potentially leading to their misidentification as relevant sample components. With this metabolomics study, we demonstrate the effectiveness of hierarchical cluster analysis (HCA) of replicate injections (technical replicates) as a methodology to identify chemical interferents and reduce their contaminating contribution to metabolomics models. Pools of metabolites with varying complexity were prepared from the botanical Angelica keiskei Koidzumi and spiked with known metabolites. Each set of pools was analyzed in triplicate and at multiple concentrations using ultraperformance liquid chromatography coupled to mass spectrometry (UPLC-MS). Before filtering, HCA failed to cluster replicates in the data sets. To identify contaminant peaks, we developed a filtering process that evaluated the relative peak area variance of each variable within triplicate injections. These interferent peaks were found across all samples, but did not show consistent peak area from injection to injection, even when evaluating the same chemical sample. This filtering process identified 128 ions that appear to originate from the UPLC-MS system. Data sets collected for a high number of pools with comparatively simple chemical composition were highly influenced by these chemical interferents, as were samples that were analyzed at a low concentration. 
When chemical interferent masses were removed, technical replicates clustered in

  14. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

    Science.gov (United States)

    Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

    2014-01-01

    Meta-analysis of previous studies evaluating associations between the content of the elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples came from 45 Greenlandic Inuit, median age 60 years, and from 71 Danes, median age 61 years. Cirrhotic liver samples came from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis was performed, creating a dual dendrogram: one clustering element contents according to calculated similarities, and one clustering elements according to correlation coefficients between the element contents, both using Euclidean distance and Ward's procedure. One dendrogram separated subjects into 7 clusters showing no differences in ethnicity, gender or age. The analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram grouped the elements into four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse the content of elements separately, one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.
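
    As an illustration of the dual clustering idea in this record, the sketch below runs a naive Ward agglomeration (Euclidean data) once over the rows and once over the columns of a small matrix. The data values and the function name ward_merges are hypothetical, and clustering the transposed matrix is a simplification: the abstract clusters elements by correlation coefficients, which is not reproduced here.

```python
from itertools import combinations

def ward_merges(rows):
    """Naive Ward agglomeration on Euclidean data.

    Returns the merge history as (members_a, members_b, cost) tuples.
    """
    # cluster id -> (member indices, centroid)
    clusters = {i: ([i], [float(x) for x in r]) for i, r in enumerate(rows)}

    def cost(a, b):
        # Ward's criterion: increase in within-cluster sum of squares.
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(ma) * len(mb) / (len(ma) + len(mb)) * sq

    merges, next_id = [], len(rows)
    while len(clusters) > 1:
        # Pick the pair whose merge increases the error sum the least.
        a, b = min(combinations(clusters, 2), key=lambda ab: cost(*ab))
        c = cost(a, b)
        (ma, ca), (mb, cb) = clusters.pop(a), clusters.pop(b)
        n = len(ma) + len(mb)
        centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters[next_id] = (ma + mb, centroid)
        merges.append((sorted(ma), sorted(mb), c))
        next_id += 1
    return merges

# Hypothetical element contents: rows are subjects, columns are elements.
data = [[1.0, 1.1, 5.0],
        [1.2, 1.0, 5.2],
        [9.0, 9.1, 0.5]]
subject_merges = ward_merges(data)               # dendrogram over subjects
element_merges = ward_merges(list(zip(*data)))   # dendrogram over elements
```

    Each merge history can be read as one of the two dendrograms; the search over all pairs at every step is only practical for small inputs.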

  15. Examination of the regional distribution of minor and trace elements in normal human brain by PIXE and chemometric techniques

    International Nuclear Information System (INIS)

    Maenhaut, W.; Hebbrecht, G.; Reuck, J. de

    1993-01-01

    Particle-induced X-ray emission (PIXE) was used to measure two minor and six trace elements, i.e. K, Ca, Mn, Fe, Cu, Zn, Se, and Rb, in up to 50 different structures (regions) of brains from Belgian individuals without neurological disorders. The data matrix with the mean dry-weight elemental concentrations and mean wet-to-dry weight ratio (means over 18 brains) for the various structures was subjected to two chemometric techniques, i.e., VARIMAX rotated absolute principal component analysis (APCA) and hierarchical cluster analysis. Three components were identified by APCA: Components 1 and 3 represented aqueous fractions of the brain (respectively the intracellular and extracellular fluid), whereas component 2 apparently represented the solid brain fraction. The elements K, Cu, Zn, Se, and Rb were predominantly attributed to component 1, Ca to component 3, and Fe to component 2. In the hierarchical cluster analysis seven different agglomerative cluster strategies were compared. The dendrograms obtained from the furthest neighbor and Ward's error sum strategy were virtually identical, and they consisted of two large clusters with 30 and 16 structures, respectively. The first cluster included all gray matter structures, while the second comprised all white matter. Furthermore, structures involved in the same physiological function or morphologically similar regions often conglomerated in one subcluster. This strongly suggests that there is some relationship between the trace element profile of a brain structure and its function. (orig.)

  16. Hierarchical sets: analyzing pangenome structure through scalable set visualizations

    Science.gov (United States)

    2017-01-01

    Abstract Motivation: The increase in available microbial genome sequences has resulted in an increase in the size of the pangenomes being analyzed. Current pangenome visualizations are not intended for the pangenome sizes possible today, and new approaches are necessary in order to convert the increase in available information into an increase in knowledge. As the pangenome data structure is essentially a collection of sets, we explore the potential for scalable set visualization as a tool for pangenome analysis. Results: We present a new hierarchical clustering algorithm based on set arithmetic that optimizes the intersection sizes along the branches. The intersection and union sizes along the hierarchy are visualized using a composite dendrogram and icicle plot, which, in a pangenome context, shows the evolution of pangenome and core size along the evolutionary hierarchy. Outlying elements, i.e. elements whose presence pattern does not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data, this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that are not represented by the hierarchy. We illustrate the utility of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. Availability and Implementation: The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project.org/web/packages/hierarchicalSets). Contact: thomasp85@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28130242

  17. Revisiting the variation of clustering coefficient of biological networks suggests new modular structure.

    Science.gov (United States)

    Hao, Dapeng; Ren, Cong; Li, Chuanxing

    2012-05-01

    A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of a network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling). Although several studies have suggested other possible origins of this signature, it is still widely used to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further, systematic investigation of this signature for different types of biological networks is necessary. We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity actually reflects a spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the reason the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidence that repulsion between hubs is the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression networks, the clustering coefficient shows no dependence on degree. Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to the "deterministic model" of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.
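
    The signature discussed in this record is the average clustering coefficient as a function of node degree k. A minimal sketch of that measurement follows, on a hypothetical toy graph in which one hub touches two triangles; the function name clustering_by_degree and the edge list are illustrative and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

def clustering_by_degree(edges):
    """Average local clustering coefficient C(k) for each degree k >= 2."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # the coefficient is undefined for degree < 2
        # Count links that exist among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        by_degree[k].append(2 * links / (k * (k - 1)))
    return {k: sum(v) / len(v) for k, v in sorted(by_degree.items())}

# Hypothetical graph: hub 0 joined to nodes 1..4, plus edges 1-2 and 3-4.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4)]
c_of_k = clustering_by_degree(edges)
```

    In this toy graph the degree-2 nodes have coefficient 1 and the degree-4 hub has coefficient 1/3, so C(k) falls with k even though the graph is just a hub with spokes.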

  18. Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

    DEFF Research Database (Denmark)

    Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte

    2006-01-01

    Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods...... analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset...
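
    The co-occurrence matrix mentioned in this record can be sketched as follows: for every pair of items, count the fraction of clustering runs that place them in the same cluster. The label lists below are hypothetical stand-ins for repeated K-means or mixture-model runs, and the function name is illustrative.

```python
def co_occurrence(runs):
    """Fraction of runs in which items i and j share a cluster label."""
    n = len(runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(runs) for c in row] for row in counts]

# Hypothetical labels from three runs over four items. Label values are
# arbitrary per run; only co-membership matters.
runs = [[0, 0, 1, 1],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
matrix = co_occurrence(runs)
```

    One minus this matrix can then serve as a dissimilarity for a final hierarchical clustering, which is one common way to turn recurring patterns into a consensus.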

  19. Parallel Implementation of the Recursive Approximation of an Unsupervised Hierarchical Segmentation Algorithm. Chapter 5

    Science.gov (United States)

    Tilton, James C.; Plaza, Antonio J. (Editor); Chang, Chein-I. (Editor)

    2008-01-01

    The hierarchical image segmentation algorithm (referred to as HSEG) is a hybrid of hierarchical step-wise optimization (HSWO) and constrained spectral clustering that produces a hierarchical set of image segmentations. HSWO is an iterative approach to region growing segmentation in which the optimal image segmentation is found at N_R regions, given a segmentation at N_R+1 regions. HSEG's addition of constrained spectral clustering makes it a computationally intensive algorithm for all but the smallest of images. To counteract this, a computationally efficient recursive approximation of HSEG (called RHSEG) has been devised. Further improvements in processing speed are obtained through a parallel implementation of RHSEG. This chapter describes this parallel implementation and demonstrates its computational efficiency on a Landsat Thematic Mapper test scene.
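
    The step-wise idea behind HSWO (go from a segmentation with one more region to the best segmentation with one fewer) can be sketched on a one-dimensional signal. The merge cost used here, the increase in squared error when two adjacent regions are merged, is an assumption chosen for illustration; the record does not state the dissimilarity criterion, and the function names are hypothetical.

```python
def hswo_step(regions):
    """Merge the adjacent pair of (mean, size) regions with the least cost."""
    def cost(a, b):
        (ma, na), (mb, nb) = a, b
        # Assumed criterion: increase in squared error caused by the merge.
        return na * nb / (na + nb) * (ma - mb) ** 2

    i = min(range(len(regions) - 1),
            key=lambda i: cost(regions[i], regions[i + 1]))
    (ma, na), (mb, nb) = regions[i], regions[i + 1]
    merged = ((ma * na + mb * nb) / (na + nb), na + nb)
    return regions[:i] + [merged] + regions[i + 2:]

def hswo_hierarchy(pixels):
    """Return every segmentation level, from one region per pixel down to one."""
    regions = [(float(p), 1) for p in pixels]
    levels = [regions]
    while len(regions) > 1:
        regions = hswo_step(regions)
        levels.append(regions)
    return levels

# Hypothetical 1D "image" with three intensity plateaus.
levels = hswo_hierarchy([1, 1, 5, 5, 9])
```

    Each entry of levels is one segmentation in the hierarchy; a real image version would track two-dimensional region adjacency rather than left and right neighbours.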

  20. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2016-06-01

    Locating and analysing sites for new stores or outlets is a common issue facing retailers and franchisers. The aim is to ensure that newly opened stores are at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise, multi-level buildings, a three-dimensional (3D) method is required to locate and identify surrounding information, such as the level at which a franchise unit will be located or whether the unit is at the best level for visibility. One of the analyses commonly used for retrieving surrounding information is Nearest Neighbour (NN) analysis, which takes a point location and identifies its surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information becomes more complex and its efficiency more crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach substantially improved response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.
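
    A rough sketch of how clustering can prune a 3D nearest neighbour query is given below. It is not the tree structure of the paper: points are simply grouped by grid cell (an assumption made for brevity), each group keeps a bounding box, and groups whose box is farther away than the best match found so far are skipped. All names and coordinates are hypothetical.

```python
import math
from collections import defaultdict

def build_clusters(points, cell=10.0):
    """Group 3D points by grid cell and store each group's bounding box."""
    buckets = defaultdict(list)
    for p in points:
        buckets[tuple(int(c // cell) for c in p)].append(p)
    clusters = []
    for pts in buckets.values():
        lo = tuple(min(p[d] for p in pts) for d in range(3))
        hi = tuple(max(p[d] for p in pts) for d in range(3))
        clusters.append((lo, hi, pts))
    return clusters

def box_distance(q, lo, hi):
    """Smallest possible distance from q to any point inside the box."""
    return math.sqrt(sum(max(lo[d] - q[d], 0.0, q[d] - hi[d]) ** 2
                         for d in range(3)))

def nearest(q, clusters):
    """Visit clusters nearest-box first; stop once no box can beat the best."""
    best, best_d = None, math.inf
    ordered = sorted(clusters, key=lambda c: box_distance(q, c[0], c[1]))
    for lo, hi, pts in ordered:
        if box_distance(q, lo, hi) >= best_d:
            break  # every remaining box is at least this far away
        for p in pts:
            d = math.dist(q, p)
            if d < best_d:
                best, best_d = p, d
    return best, best_d

# Hypothetical unit locations (x, y, z), with z standing for building level.
points = [(1, 1, 1), (2, 2, 2), (50, 50, 50), (51, 50, 49)]
clusters = build_clusters(points)
hit, dist = nearest((48, 50, 50), clusters)
```

    A hierarchical version would nest such boxes inside larger boxes so that whole subtrees can be skipped with one distance test.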

  1. Hierarchical silica particles by dynamic multicomponent assembly

    DEFF Research Database (Denmark)

    Wu, Z. W.; Hu, Q. Y.; Pang, J. B.

    2005-01-01

    Abstract: Aerosol-assisted assembly of mesoporous silica particles with hierarchically controllable pore structure has been prepared using cetyltrimethylammonium bromide (CTAB) and poly(propylene oxide) (PPO, H[OCH(CH3)CH2]nOH) as co-templates. Addition of the hydrophobic PPO significantly...... influences the delicate hydrophilic-hydrophobic balance in the well-studied CTAB-silicate co-assembling system, resulting in various mesostructures (such as hexagonal, lamellar, and hierarchical structure). The co-assembly of CTAB, silicate clusters, and a low-molecular-weight PPO (average Mn 425) results...... in a uniform lamellar structure, while the use of a high-molecular-weight PPO (average Mn 2000), which is more hydrophobic, leads to the formation of hierarchical pore structure that contains meso-meso or meso-macro pore structure. The role of PPO additives on the mesostructure evolution in the CTAB...

  2. Musical genres: beating to the rhythms of different drums

    Science.gov (United States)

    Correa, Debora C.; Saito, Jose H.; Costa, Luciano da F.

    2010-05-01

    Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multivariate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.
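
    The Markov model of rhythmic events mentioned in this record can be illustrated with a first-order transition table estimated from a sequence of note durations. The event names and the sequence are hypothetical, and the paper's actual feature extraction from the network representation is not reproduced.

```python
from collections import Counter, defaultdict

def transition_probabilities(events):
    """First-order Markov transition probabilities between successive events."""
    counts = defaultdict(Counter)
    for current, following in zip(events, events[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

# Hypothetical rhythmic notation events taken from one percussion track.
events = ["quarter", "eighth", "eighth", "quarter", "quarter", "eighth"]
probs = transition_probabilities(events)
```

    Flattening such a table into a vector gives one possible rhythm feature that PCA or LDA could then reduce.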

  3. Musical genres: beating to the rhythms of different drums

    Energy Technology Data Exchange (ETDEWEB)

    Correa, Debora C; Costa, Luciano da F [Instituto de Fisica de Sao Carlos - Universidade de Sao Paulo, Av. Trabalhador Sao Carlense 400, Caixa Postal 369, CEP 13560-970, Sao Carlos, Sao Paulo (Brazil); Saito, Jose H, E-mail: deboracorrea@ursa.ifsc.usp.b, E-mail: luciano@ursa.ifsc.usp.b [Departamento de Computacao-Universidade Federal de Sao Carlos, Rodovia Washington Luis, km 235, SP-310, CEP 13565-905, Sao Carlos, Sao Paulo (Brazil)

    2010-05-15

    Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multivariate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.

  4. Musical genres: beating to the rhythms of different drums

    International Nuclear Information System (INIS)

    Correa, Debora C; Costa, Luciano da F; Saito, Jose H

    2010-01-01

    Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multivariate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.

  5. A local adaptive algorithm for emerging scale-free hierarchical networks

    International Nuclear Information System (INIS)

    Gomez Portillo, I J; Gleiser, P M

    2010-01-01

    In this work we study a growing network model with chaotic dynamical units that evolves using a local adaptive rewiring algorithm. Using numerical simulations we show that the model allows for the emergence of hierarchical networks. First, we show that the networks that emerge with the algorithm present a wide degree distribution that can be fitted by a power law function, and thus are scale-free networks. Using the LaNet-vi visualization tool we present a graphical representation that reveals a central core formed only by hubs, and also show the presence of a preferential attachment mechanism. In order to present a quantitative analysis of the hierarchical structure we analyze the clustering coefficient. In particular, we show that as the network grows the clustering becomes independent of system size, and also presents a power law decay as a function of the degree. Finally, we compare our results with a similar version of the model that has continuous non-linear phase oscillators as dynamical units. The results show that local interactions play a fundamental role in the emergence of hierarchical networks.

  6. Ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    International Nuclear Information System (INIS)

    Wu, Shengming; Xia, Tian; Wang, Jingping; Lu, Feifei; Xu, Chunbo; Zhang, Xianfa; Huo, Lihua; Zhao, Hui

    2017-01-01

    Graphical abstract: Ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment. When tested as anode materials for LIBs, UMCN-HCs achieve high reversible capacity, good long cycling life, and rate capability. - Highlights: • UMCN-HCs show high capacity, excellent stability, and good rate capability. • UMCN-HCs retain a capacity of 1067 mAh g^-1 after 100 cycles at 100 mA g^-1. • UMCN-HCs deliver a capacity of 507 mAh g^-1 after 500 cycles at 2 A g^-1. - Abstract: Herein, ultrathin mesoporous Co_3O_4 nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment at 600 °C in air. The products consist of cluster-like Co_3O_4 microarchitectures, which are assembled from numerous ultrathin mesoporous Co_3O_4 nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g^-1 at a current density of 100 mA g^-1 after 100 cycles. Even at 2 A g^-1, a stable capacity as high as 507 mAh g^-1 can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous character makes lithium-ion transfer easier at the interface between the active electrode and the electrolyte.

  7. Revisiting the variation of clustering coefficient of biological networks suggests new modular structure

    Directory of Open Access Journals (Sweden)

    Hao Dapeng

    2012-05-01

    Full Text Available Abstract Background: A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of a network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling). Although several studies have suggested other possible origins of this signature, it is still widely used to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further, systematic investigation of this signature for different types of biological networks is necessary. Results: We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity actually reflects a spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the reason the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidence that repulsion between hubs is the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression networks, the clustering coefficient shows no dependence on degree. Conclusions: Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to the "deterministic model" of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.

  8. Performance quantification of clustering algorithms for false positive removal in fMRI by ROC curves

    Directory of Open Access Journals (Sweden)

    André Salles Cunha Peres

    Full Text Available Abstract Introduction: Functional magnetic resonance imaging (fMRI) is a non-invasive technique that allows the detection of specific cerebral functions in humans based on hemodynamic changes. The contrast changes are about 5%, making visual inspection impossible. Thus, statistical strategies are applied to infer which brain region is engaged in a task. However, traditional methods such as the general linear model and cross-correlation use voxel-wise calculation, introducing many false positives. In this work we therefore tested post-processing clustering algorithms to reduce the false positives. Methods: Three clustering algorithms (hierarchical clustering, k-means and self-organizing maps) were tested and compared for false-positive removal in the post-processing of cross-correlation analyses. Results: Hierarchical clustering showed the best performance in removing false positives in fMRI, being 2.3 times more accurate than k-means and 1.9 times more accurate than self-organizing maps. Conclusion: Hierarchical clustering showed the best performance in false-positive removal because it uses the inconsistency coefficient threshold, while k-means and self-organizing maps use an a priori number of clusters (number of centroids and of neurons); thus, hierarchical clustering avoids clustering scattered voxels, as the inconsistency coefficient threshold allows only voxels that are at a minimum distance from some cluster to be clustered.

  9. A Simple Hierarchical Pooling Data Structure for Loop Closure

    Science.gov (United States)

    2016-10-16

    performance empirically on the KITTI [9], Oxford [6] and TUM RGB-D [29] datasets, as well as demonstrate extensions to general image retrieval on the...of a BoW where each word is an element of a dictionary of descriptors obtained off-line by hierarchical k-means clustering, with each word weighted by...to the inverse document frequency. This standard pipeline, with different clustering procedures to generate the dictionary and different features

  10. Phenotypes of asthma in low-income children and adolescents: cluster analysis

    Directory of Open Access Journals (Sweden)

    Anna Lucia Barros Cabral

    Full Text Available ABSTRACT Objective: Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. Methods: We included 306 children and adolescents (6-18 years of age) with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a two-step agglomerative test and log-likelihood distance measure. Results: Three clusters were defined for our population. Cluster 1 (n = 94) included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87) included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108) included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Conclusions: Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.

  11. Hierarchical structure of the European countries based on debts as a percentage of GDP during the 2000-2011 period

    Science.gov (United States)

    Kantar, Ersin; Deviren, Bayram; Keskin, Mustafa

    2014-11-01

    We investigate hierarchical structures of the European countries by using debt as a percentage of Gross Domestic Product (GDP) of the countries as they change over a certain period of time. We obtain the topological properties among the countries based on debt as a percentage of GDP over the period 2000-2011 by using hierarchical structure methods (minimal spanning tree (MST) and hierarchical tree (HT)). This period is also divided into two sub-periods related to the 2004 enlargement of the European Union, namely 2000-2004 and 2005-2011, in order to test various time windows and observe the temporal evolution. The bootstrap technique is applied to assess the statistical reliability of the links of the MSTs and HTs. The clustering linkage procedure is also used to observe the cluster structure more clearly. From the structural topologies of these trees, we identify different clusters of countries according to their level of debt. Our results show that, with the debt crisis, the least and the most affected Eurozone economies form clusters with each other in the MSTs and hierarchical trees.

  12. Quantum annealing for combinatorial clustering

    Science.gov (United States)

    Kumar, Vaibhaw; Bass, Gideon; Tomlin, Casey; Dulny, Joseph

    2018-02-01

    Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver "qbsolv." The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.
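
    To make the "straightforward approach" in this record concrete, the sketch below enumerates every assignment of N points to K clusters and keeps the one with the smallest sum of within-cluster pairwise distances. The points are hypothetical; the search visits K^N assignments, which is why it stops being feasible almost immediately.

```python
import math
from itertools import product

def within_cluster_cost(points, labels):
    """Sum of pairwise distances between points that share a label."""
    return sum(
        math.dist(p, q)
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i < j and labels[i] == labels[j]
    )

def brute_force_clustering(points, k):
    """Exhaustively search all k**len(points) label assignments."""
    best = min(
        product(range(k), repeat=len(points)),
        key=lambda labels: within_cluster_cost(points, labels),
    )
    return best, within_cluster_cost(points, best)

# Hypothetical data: two tight pairs far apart from each other.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels, cost = brute_force_clustering(points, k=2)
n_assignments = 2 ** len(points)
```

    With 4 points and 2 clusters there are only 16 assignments, but 40 points would already require about 10^12 evaluations, which motivates the heuristic and annealing-based searches the abstract describes.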

  13. Statistical measures of galaxy clustering

    International Nuclear Information System (INIS)

    Porter, D.H.

    1988-01-01

    Consideration is given to the large-scale distribution of galaxies and ways in which this distribution may be statistically measured. Galaxy clustering is hierarchical in nature, so that the positions of clusters of galaxies are themselves spatially clustered. A simple identification of groups of galaxies would be an inadequate description of the true richness of galaxy clustering. Current observations of the large-scale structure of the universe and modern theories of cosmology may be studied with a statistical description of the spatial and velocity distributions of galaxies. 8 refs

  14. Cities and regions in Britain through hierarchical percolation

    Science.gov (United States)

    Arcaute, Elsa; Molinero, Carlos; Hatna, Erez; Murcio, Roberto; Vargas-Ruiz, Camilo; Masucci, A. Paolo; Batty, Michael

    2016-04-01

    Urban systems present hierarchical structures at many different scales. These are observed as administrative regional delimitations which are the outcome of complex geographical, political and historical processes which leave almost indelible footprints on infrastructure such as the street network. In this work, we uncover a set of hierarchies in Britain at different scales using percolation theory on the street network and on its intersections which are the primary points of interaction and urban agglomeration. At the larger scales, the observed hierarchical structures can be interpreted as regional fractures of Britain, observed in various forms, from natural boundaries, such as National Parks, to regional divisions based on social class and wealth such as the well-known North-South divide. At smaller scales, cities are generated through recursive percolations on each of the emerging regional clusters. We examine the evolution of the morphology of the system as a whole, by measuring the fractal dimension of the clusters at each distance threshold in the percolation. We observe that this reaches a maximum plateau at a specific distance. The clusters defined at this distance threshold are in excellent correspondence with the boundaries of cities recovered from satellite images, and from previous methods using population density.
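
    The percolation step in this record can be sketched with a union-find structure: street segments no longer than a distance threshold join their two intersections, and the connected components are the clusters at that threshold. The toy network, the segment lengths and the function name are hypothetical.

```python
def percolation_clusters(nodes, streets, threshold):
    """Clusters of intersections joined by segments of length <= threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, length in streets:
        if length <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Largest cluster first; ties broken by member names for stable output.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))

# Hypothetical intersections and segment lengths in metres.
nodes = ["a", "b", "c", "d", "e"]
streets = [("a", "b", 50), ("b", "c", 80), ("c", "d", 400), ("d", "e", 60)]
hierarchy = {t: percolation_clusters(nodes, streets, t) for t in (10, 100, 500)}
```

    Sweeping the threshold from small to large yields nested clusters, which is the hierarchy the abstract reads as cities at short distances and regions at long ones.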

  15. Complexity of major UK companies between 2006 and 2010: Hierarchical structure method approach

    Science.gov (United States)

    Ulusoy, Tolga; Keskin, Mustafa; Shirvani, Ayoub; Deviren, Bayram; Kantar, Ersin; Çaǧrı Dönmez, Cem

    2012-11-01

    This study reports on topology of the top 40 UK companies that have been analysed for predictive verification of markets for the period 2006-2010, applying the concept of minimal spanning tree and hierarchical tree (HT) analysis. Construction of the minimal spanning tree (MST) and the hierarchical tree (HT) is confined to a brief description of the methodology and a definition of the correlation function between a pair of companies based on the London Stock Exchange (LSE) index in order to quantify synchronization between the companies. A derivation of hierarchical organization and the construction of minimal-spanning and hierarchical trees for the 2006-2008 and 2008-2010 periods have been used and the results validate the predictive verification of applied semantics. The trees are known as useful tools to perceive and detect the global structure, taxonomy and hierarchy in financial data. From these trees, two different clusters of companies in 2006 were detected. They also show three clusters in 2008 and two between 2008 and 2010, according to their proximity. The clusters match each other as regards their common production activities or their strong interrelationship. The key companies are generally given by major economic activities as expected. This work gives a comparative approach between MST and HT methods from statistical physics and information theory with analysis of financial markets that may give new valuable and useful information of the financial market dynamics.
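
    A minimal sketch of how a minimal spanning tree can be built from pairwise correlations is given below. The abstract does not state the distance definition, so the mapping d = sqrt(2(1 - rho)) used here is an assumption (it is a common choice in this literature); the ticker names and correlation values are invented, and Kruskal's algorithm is used simply because it is short.

```python
import math


def correlation_distance(rho):
    # Assumed mapping from a correlation coefficient to a metric distance.
    return math.sqrt(2.0 * (1.0 - rho))


def minimal_spanning_tree(names, corr):
    """Kruskal's algorithm on the complete graph of correlation distances.

    `corr[i][j]` is the correlation between names[i] and names[j].
    Returns the MST as a list of (name_i, name_j, distance) edges.
    """
    n = len(names)
    edges = sorted(
        (correlation_distance(corr[i][j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it belongs to the MST
            parent[ri] = rj
            tree.append((names[i], names[j], dist))
    return tree


# Hypothetical correlations for four made-up tickers.
names = ["AAA", "BBB", "CCC", "DDD"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
tree = minimal_spanning_tree(names, corr)
for edge in tree:
    print(edge)
```

    Reading the MST edges in order of increasing distance gives the merge order of a single-linkage hierarchy, which is one way a hierarchical tree can be derived from the same data.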

  16. Tractography segmentation using a hierarchical Dirichlet processes mixture model.

    Science.gov (United States)

    Wang, Xiaogang; Grimson, W Eric L; Westin, Carl-Fredrik

    2011-01-01

    In this paper, we propose a new nonparametric Bayesian framework to cluster white matter fiber tracts into bundles using a hierarchical Dirichlet processes mixture (HDPM) model. The number of clusters is automatically learned driven by data with a Dirichlet process (DP) prior instead of being manually specified. After the models of bundles have been learned from training data without supervision, they can be used as priors to cluster/classify fibers of new subjects for comparison across subjects. When clustering fibers of new subjects, new clusters can be created for structures not observed in the training data. Our approach does not require computing pairwise distances between fibers and can cluster a huge set of fibers across multiple subjects. We present results on several data sets, the largest of which has more than 120,000 fibers. Copyright © 2010 Elsevier Inc. All rights reserved.

  17. A new hierarchical method to find community structure in networks

    Science.gov (United States)

    Saoud, Bilal; Moussaoui, Abdelouahab

    2018-04-01

    Community structure is very important for understanding a network that represents a given context. Many community detection methods have been proposed, including hierarchical methods. In our study, we propose a new hierarchical method for community detection in networks based on a genetic algorithm. In this method we use a genetic algorithm to split a network into two networks so as to maximize the modularity. Each new network represents a cluster (community). We then repeat the splitting process until each cluster contains a single node. We use the modularity function to measure the strength of the community structure found by our method, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our method is highly effective at discovering community structure in both computer-generated and real-world network data.
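
    The modularity score used to rate a split can be computed as in the sketch below. It uses the standard Newman modularity for an undirected, unweighted graph; the genetic-algorithm search itself is not reproduced, and the toy graph and the function name `modularity` are assumptions for illustration.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    `edges` is a list of (u, v) pairs; `community` maps node -> community label.
    Q = sum over communities of (internal_edges / m - (total_degree / (2 m))**2).
    """
    m = len(edges)
    internal = {}  # edges with both endpoints in the community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Toy graph: two triangles joined by one bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {n: "A" for n in range(6)}

print(modularity(toy_edges, split))
print(modularity(toy_edges, merged))
```

    A bisection step would keep the candidate split with the highest score; placing every node in one community always scores zero, so any useful split must score above that.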

  18. Energy Aware Cluster Based Routing Scheme For Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    Roy Sohini

    2015-09-01

    Wireless Sensor Networks (WSNs) have emerged as an important supplement to modern wireless communication systems due to their wide range of applications. Recent research is handling the various challenges of sensor networks more gracefully. However, energy efficiency remains a matter of concern for researchers. Meeting the many security needs, timely data delivery and quick response, efficient route selection, multi-path routing, etc. can only be achieved at the cost of energy. Hierarchical routing is more useful in this regard. The proposed algorithm, Energy Aware Cluster Based Routing Scheme (EACBRS), aims at conserving energy with the help of hierarchical routing by calculating the optimum number of cluster heads for the network, selecting an energy-efficient route to the sink and offering congestion control. Simulation results show that EACBRS performs better than existing hierarchical routing algorithms such as the Distributed Energy-Efficient Clustering (DEEC) algorithm for heterogeneous wireless sensor networks and the Energy Efficient Heterogeneous Clustered scheme for Wireless Sensor Networks (EEHC).

  19. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  20. 3D NEAREST NEIGHBOUR SEARCH USING A CLUSTERED HIERARCHICAL TREE STRUCTURE

    Directory of Open Access Journals (Sweden)

    A. Suhaibah

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify the surrounding information, such as at which level the franchise unit will be located or whether the franchise unit is located at the best level for visibility purposes. One of the analyses commonly used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information become more complex and their efficiency more crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  1. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback

  2. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    Science.gov (United States)

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
HC-PLSR is a promising approach for

  3. Building a System of Comparative-Spatial Assessment of the Level of Development of Ukraine and the EU Countries

    Directory of Open Access Journals (Sweden)

    Bril Mykhailo S.

    2018-02-01

    The article proposes an approach to the formation of a system of comparative-spatial assessment of the level of socio-economic development of the State, on the basis of which the multidimensional statistical analysis of Ukraine and the EU countries was accomplished. On the basis of the hierarchical agglomerative and iterative methods of the spatial cluster analysis, groups of countries are allocated by homogeneous characteristics of the socio-economic development. A comparison of the results of spatial and dynamic clustering confirms the stability of the composition of the allocated groups and their quality characteristics. The proposed complex of economic and mathematical models for determining the level of socio-economic development of the State and the EU countries on the basis of assessment and analysis of the main macro-indicators and their relationship in the perspective will improve the quality of managerial decisions as to ensuring the socio-economic development of the State.

  4. Coherence-based Time Series Clustering for Brain Connectivity Visualization

    KAUST Repository

    Euan, Carolina

    2017-11-19

    We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by

  5. Coherence-based Time Series Clustering for Brain Connectivity Visualization

    KAUST Repository

    Euan, Carolina; Sun, Ying; Ombao, Hernando

    2017-01-01

    We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by

  6. Programming Hierarchical Self-Assembly of Patchy Particles into Colloidal Crystals via Colloidal Molecules.

    Science.gov (United States)

    Morphew, Daniel; Shaw, James; Avins, Christopher; Chakrabarti, Dwaipayan

    2018-03-27

    Colloidal self-assembly is a promising bottom-up route to a wide variety of three-dimensional structures, from clusters to crystals. Programming hierarchical self-assembly of colloidal building blocks, which can give rise to structures ordered at multiple levels to rival biological complexity, poses a multiscale design problem. Here we explore a generic design principle that exploits a hierarchy of interaction strengths and employ this design principle in computer simulations to demonstrate the hierarchical self-assembly of triblock patchy colloidal particles into two distinct colloidal crystals. We obtain cubic diamond and body-centered cubic crystals via distinct clusters of uniform size and shape, namely, tetrahedra and octahedra, respectively. Such a conceptual design framework has the potential to reliably encode hierarchical self-assembly of colloidal particles into a high level of sophistication. Moreover, the design framework underpins a bottom-up route to cubic diamond colloidal crystals, which have remained elusive despite being much sought after for their attractive photonic applications.

  7. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    Science.gov (United States)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton et al. described an approach for producing hierarchical segmentations (called HSEG) and gave a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g. Horowitz and T. Pavlidis, [3]). In addition, HSEG optionally interjects, between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG's computational requirements. To counteract this, a computationally efficient recursive, divide-and-conquer implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG's recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic

  8. Dynamic Hierarchical Energy-Efficient Method Based on Combinatorial Optimization for Wireless Sensor Networks.

    Science.gov (United States)

    Chang, Yuchao; Tang, Hongying; Cheng, Yongbo; Zhao, Qin; Li, Baoqing; Yuan, Xiaobing

    2017-07-19

    Routing protocols based on topology control are significantly important for improving network longevity in wireless sensor networks (WSNs). Traditionally, some WSN routing protocols distribute network traffic load unevenly across sensor nodes, which is not optimal for improving network longevity. Unlike conventional WSN routing protocols, we propose a dynamic hierarchical protocol based on combinatorial optimization (DHCO) to balance energy consumption of sensor nodes and to improve WSN longevity. For each sensor node, the DHCO algorithm obtains the optimal route by establishing a feasible routing set instead of selecting the cluster head or the next hop node. The process of obtaining the optimal route can be formulated as a combinatorial optimization problem. Specifically, the DHCO algorithm is carried out by the following procedures. It employs a hierarchy-based connection mechanism to construct a hierarchical network structure in which each sensor node is assigned to a special hierarchical subset; it utilizes combinatorial optimization theory to establish the feasible routing set for each sensor node, and takes advantage of the maximum-minimum criterion to obtain their optimal routes to the base station. Results of simulation experiments show the effectiveness and superiority of the DHCO algorithm in comparison with state-of-the-art WSN routing algorithms, including low-energy adaptive clustering hierarchy (LEACH), hybrid energy-efficient distributed clustering (HEED), genetic protocol-based self-organizing network clustering (GASONeC), and double cost function-based routing (DCFR) algorithms.

  9. Chimera states in networks of logistic maps with hierarchical connectivities

    Science.gov (United States)

    zur Bonsen, Alexander; Omelchenko, Iryna; Zakharova, Anna; Schöll, Eckehard

    2018-04-01

    Chimera states are complex spatiotemporal patterns consisting of coexisting domains of coherence and incoherence. We study networks of nonlocally coupled logistic maps and analyze systematically how the dilution of the network links influences the appearance of chimera patterns. The network connectivities are constructed using an iterative Cantor algorithm to generate fractal (hierarchical) connectivities. Increasing the hierarchical level of iteration, we compare the resulting spatiotemporal patterns. We demonstrate that a high clustering coefficient and symmetry of the base pattern promotes chimera states, and asymmetric connectivities result in complex nested chimera patterns.
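
    The iterative Cantor construction mentioned in this abstract can be sketched as below. The abstract does not spell out the rule, so the substitution used here (every 1 is replaced by the base pattern, every 0 by a block of zeros of the same length) is an assumption, as are the function name `cantor_pattern` and the base pattern [1, 0, 1].

```python
def cantor_pattern(base, levels):
    """Expand a binary base pattern into a hierarchical (fractal) link pattern.

    At each level every 1 is replaced by `base` and every 0 by a block of
    zeros of the same length, so the result has len(base) ** levels entries.
    """
    pattern = [1]
    for _ in range(levels):
        expanded = []
        for bit in pattern:
            expanded.extend(base if bit else [0] * len(base))
        pattern = expanded
    return pattern


# Classic Cantor-set base pattern, expanded to hierarchical level 2.
links = cantor_pattern([1, 0, 1], 2)
print(links)
```

    In a ring network, such a pattern could mark which neighbours of a node are coupled to it (1) and which links are removed (0); higher levels dilute the links further while keeping the self-similar layout.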

  10. Ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Shengming [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xia, Tian, E-mail: xiatian@hlju.edu.cn [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Wang, Jingping [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Lu, Feifei [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xu, Chunbo [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Zhang, Xianfa; Huo, Lihua [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Zhao, Hui, E-mail: zhaohui98@yahoo.com [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China)

    2017-06-01

    Graphical abstract: Ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment. When tested as anode materials for LIBs, UMCN-HCs achieve high reversible capacity, good long cycling life, and rate capability. - Highlights: • UMCN-HCs show high capacity, excellent stability, and good rate capability. • UMCN-HCs retain a capacity of 1067 mAh g⁻¹ after 100 cycles at 100 mA g⁻¹. • UMCN-HCs deliver a capacity of 507 mAh g⁻¹ after 500 cycles at 2 A g⁻¹. - Abstract: Herein, ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment at 600 °C in air. The products consist of cluster-like Co₃O₄ microarchitectures, which are assembled from numerous ultrathin mesoporous Co₃O₄ nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g⁻¹ at a current density of 100 mA g⁻¹ after 100 cycles. Even at 2 A g⁻¹, a stable capacity as high as 507 mAh g⁻¹ can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous characteristic makes lithium-ion transfer easier at the interface between the active electrode and the electrolyte.

  11. Energy Aware Clustering Algorithms for Wireless Sensor Networks

    Science.gov (United States)

    Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian

    2011-09-01

    The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire networks is mainly considered in the design. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and also compare these clustering algorithms based on metrics such as clustering distribution, cluster's load balancing, Cluster Head's (CH) selection strategy, CH's role rotation, node mobility, clusters overlapping, intra-cluster communications, reliability, security and location awareness.

  12. MAP-Based Underdetermined Blind Source Separation of Convolutive Mixtures by Hierarchical Clustering and ℓ1-Norm Minimization

    Directory of Open Access Journals (Sweden)

    Kellermann Walter

    2007-01-01

    We address the problem of underdetermined BSS. While most previous approaches are designed for instantaneous mixtures, we propose a time-frequency-domain algorithm for convolutive mixtures. We adopt a two-step method based on a general maximum a posteriori (MAP) approach. In the first step, we estimate the mixing matrix based on hierarchical clustering, assuming that the source signals are sufficiently sparse. The algorithm works directly on the complex-valued data in the time-frequency domain and shows better convergence than algorithms based on self-organizing maps. The assumption of Laplacian priors for the source signals in the second step leads to an algorithm for estimating the source signals. It involves the ℓ1-norm minimization of complex numbers because of the use of the time-frequency-domain approach. We compare a combinatorial approach initially designed for real numbers with a second-order cone programming (SOCP) approach designed for complex numbers. We found that although the former approach is not theoretically justified for complex numbers, its results are comparable to, or even better than, the SOCP solution. The advantage is a lower computational cost for problems with low input/output dimensions.

  13. Hierarchical structure in the distribution of galaxies

    International Nuclear Information System (INIS)

    Schulman, L.S.; Seiden, P.E. (Technion - Israel Institute of Technology, Haifa; IBM Thomas J. Watson Research Center, Yorktown Heights, NY)

    1986-01-01

    The distribution of galaxies has a hierarchical structure with power-law correlations. This is usually thought to arise from gravity alone acting on an originally uniform distribution. If, however, the original process of galaxy formation occurs through the stimulated birth of one galaxy due to a nearby recently formed galaxy, and if this process occurs near its percolation threshold, then a hierarchical structure with power-law correlations arises at the time of galaxy formation. If subsequent gravitational evolution within an expanding cosmology is such as to retain power-law correlations, the initial r^{-1} dropoff can steepen to the observed r^{-1.8}. The distribution of galaxies obtained by this process produces clustering and voids, as observed. 23 references

  14. A CLUSTER IN THE MAKING: ALMA REVEALS THE INITIAL CONDITIONS FOR HIGH-MASS CLUSTER FORMATION

    International Nuclear Information System (INIS)

    Rathborne, J. M.; Contreras, Y.; Longmore, S. N.; Bastian, N.; Jackson, J. M.; Alves, J. F.; Bally, J.; Foster, J. B.; Garay, G.; Kruijssen, J. M. D.; Testi, L.; Walsh, A. J.

    2015-01-01

    G0.253+0.016 is a molecular clump that appears to be on the verge of forming a high-mass cluster: its extremely low dust temperature, high mass, and high density, combined with its lack of prevalent star formation, make it an excellent candidate for an Arches-like cluster in a very early stage of formation. Here we present new Atacama Large Millimeter/Sub-millimeter Array observations of its small-scale (∼0.07 pc) 3 mm dust continuum and molecular line emission from 17 different species that probe a range of distinct physical and chemical conditions. The data reveal a complex network of emission features with a complicated velocity structure: there is emission on all spatial scales, the morphology of which ranges from small, compact regions to extended, filamentary structures that are seen in both emission and absorption. The dust column density is well traced by molecules with higher excitation energies and critical densities, consistent with a clump that has a denser interior. A statistical analysis supports the idea that turbulence shapes the observed gas structure within G0.253+0.016. We find a clear break in the turbulent power spectrum derived from the optically thin dust continuum emission at a spatial scale of ∼0.1 pc, which may correspond to the spatial scale at which gravity has overcome the thermal pressure. We suggest that G0.253+0.016 is on the verge of forming a cluster from hierarchical, filamentary structures that arise from a highly turbulent medium. Although the stellar distribution within high-mass Arches-like clusters is compact, centrally condensed, and smooth, the observed gas distribution within G0.253+0.016 is extended, with no high-mass central concentration, and has a complex, hierarchical structure. If this clump gives rise to a high-mass cluster and its stars are formed from this initially hierarchical gas structure, then the resulting cluster must evolve into a centrally condensed structure via a dynamical process

  15. Nonredundant sparse feature extraction using autoencoders with receptive fields clustering.

    Science.gov (United States)

    Ayinde, Babajide O; Zurada, Jacek M

    2017-09-01

    This paper proposes new techniques for data representation in the context of deep learning using agglomerative clustering. Existing autoencoder-based data representation techniques tend to produce a number of encoding and decoding receptive fields of layered autoencoders that are duplicative, thereby leading to extraction of similar features, thus resulting in filtering redundancy. We propose a way to address this problem and show that such redundancy can be eliminated. This yields smaller networks and produces unique receptive fields that extract distinct features. It is also shown that autoencoders with nonnegativity constraints on weights are capable of extracting fewer redundant features than conventional sparse autoencoders. The concept is illustrated using conventional sparse autoencoder and nonnegativity-constrained autoencoders with MNIST digits recognition, NORB normalized-uniform object data and Yale face dataset. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Cluster Physics with Merging Galaxy Clusters

    Directory of Open Access Journals (Sweden)

    Sandor M. Molnar

    2016-02-01

    Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard LCDM model, where the total density is dominated by the cosmological constant ($\Lambda$) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.

  17. A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering

    Science.gov (United States)

    Chahine, Firas Safwan

    2012-01-01

    Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…

  18. Efficient Record Linkage Algorithms Using Complete Linkage Clustering.

    Science.gov (United States)

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Datasets held by different agencies often contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. Many of the available algorithms for record linkage are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times.
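
    The three ingredients named in this abstract (duplicate elimination by sorting, blocking, and complete linkage clustering) can be illustrated with a minimal sketch. It is not the authors' algorithm: the string records, the edit-distance measure, the `threshold` value, and the first-character `block_key` are all assumptions chosen for illustration.

```python
from itertools import groupby


def edit_distance(a, b):
    """Levenshtein distance computed with the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def complete_linkage(records, threshold):
    """Merge the closest clusters while their farthest members stay within threshold."""
    clusters = [[r] for r in records]
    while len(clusters) > 1:
        # Complete linkage: cluster distance is the largest pairwise distance.
        dist, i, j = min(
            (max(edit_distance(x, y) for x in clusters[i] for y in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] += clusters.pop(j)
    return clusters


def link_records(records, threshold=2, block_key=lambda r: r[0]):
    """Deduplicate by sorting, split into blocks, then cluster inside each block."""
    unique = [rec for rec, _ in groupby(sorted(records))]  # exact duplicates collapse
    linked = []
    for _, block in groupby(sorted(unique, key=block_key), key=block_key):
        linked.extend(complete_linkage(list(block), threshold))
    return linked


if __name__ == "__main__":
    names = ["jon smith", "john smith", "john smith", "jane doe", "jane do", "mary major"]
    print(link_records(names))
```

    Blocking keeps the quadratic clustering step confined to small groups of records, which is why it matters at the data volumes the abstract mentions; a poorly chosen block key, however, would separate true matches before clustering ever sees them.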

  19. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    We show that the use of non-hierarchical analysis allows us to interpret the reasoning of students solving different mathematical problems using Algebra, and to separate them into different groups, that can be recognised and characterised by common traits in their answers, without any prior knowledge on the part of the ...

  20. Open source clustering software.

    Science.gov (United States)

    de Hoon, M J L; Imoto, S; Nolan, J; Miyano, S

    2004-06-12

    We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.

  1. Hierarchical organization in the temporal structure of infant-direct speech and song.

    Science.gov (United States)

    Falk, Simone; Kello, Christopher T

    2017-06-01

    Caregivers alter the temporal structure of their utterances when talking and singing to infants compared with adult communication. The present study tested whether temporal variability in infant-directed registers serves to emphasize the hierarchical temporal structure of speech. Fifteen German-speaking mothers sang a play song and told a story to their 6-month-old infants, or to an adult. Recordings were analyzed using a recently developed method that determines the degree of nested clustering of temporal events in speech. Events were defined as peaks in the amplitude envelope, and clusters of various sizes related to periods of acoustic speech energy at varying timescales. Infant-directed speech and song clearly showed greater event clustering compared with adult-directed registers, at multiple timescales of hundreds of milliseconds to tens of seconds. We discuss the relation of this newly discovered acoustic property to temporal variability in linguistic units and its potential implications for parent-infant communication and infants learning the hierarchical structures of speech and language. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Using Cluster Analysis to Compartmentalize a Large Managed Wetland Based on Physical, Biological, and Climatic Geospatial Attributes.

    Science.gov (United States)

    Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael

    2018-04-27

    Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.
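
    To make the difference between the linkage criteria concrete, here is a small sketch of agglomerative clustering on one-dimensional points with a pluggable linkage rule. It covers single, complete, and average linkage only (Ward's method is omitted), and the function names and sample points are illustrative assumptions, not anything taken from the study.

```python
from itertools import combinations


def cluster_distance(a, b, linkage):
    """Distance between two clusters of 1-D points under a linkage rule."""
    pairwise = [abs(x - y) for x in a for y in b]
    if linkage == "single":      # closest pair of members
        return min(pairwise)
    if linkage == "complete":    # farthest pair of members
        return max(pairwise)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, k, linkage):
    """Repeatedly merge the closest pair of clusters until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], linkage),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    pts = [0.0, 1.0, 3.5, 6.4, 10.5]
    for rule in ("single", "complete", "average"):
        print(rule, agglomerate(pts, 3, rule))
```

    On these points, single linkage chains 3.5 onto the leftmost group, whereas complete and average linkage pair it with 6.4. This is the kind of sensitivity that motivates testing several linkages on the same data.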

  3. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    Science.gov (United States)

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ² analysis (PD and DH, P cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Hierarchical and Complex System Entropy Clustering Analysis Based Validation for Traditional Chinese Medicine Syndrome Patterns of Chronic Atrophic Gastritis.

    Science.gov (United States)

    Zhang, Yin; Liu, Yue; Li, Yannan; Zhao, Xia; Zhuo, Lin; Zhou, Ajian; Zhang, Li; Su, Zeqi; Chen, Cen; Du, Shiyu; Liu, Daming; Ding, Xia

    2018-03-22

    Chronic atrophic gastritis (CAG) is the precancerous stage of gastric carcinoma. Traditional Chinese Medicine (TCM) has been widely used in treating CAG. This study aimed to reveal core pathogenesis of CAG by validating the TCM syndrome patterns and provide evidence for optimization of treatment strategies. This is a cross-sectional study conducted in 4 hospitals in China. Hierarchical clustering analysis (HCA) and complex system entropy clustering analysis (CSECA) were performed, respectively, to achieve syndrome pattern validation. Based on HCA, 15 common factors were assigned to 6 syndrome patterns: liver depression and spleen deficiency and blood stasis in the stomach collateral, internal harassment of phlegm-heat and blood stasis in the stomach collateral, phlegm-turbidity internal obstruction, spleen yang deficiency, internal harassment of phlegm-heat and spleen deficiency, and spleen qi deficiency. By CSECA, 22 common factors were assigned to 7 syndrome patterns: qi deficiency, qi stagnation, blood stasis, phlegm turbidity, heat, yang deficiency, and yin deficiency. Combination of qi deficiency, qi stagnation, blood stasis, phlegm turbidity, heat, yang deficiency, and yin deficiency may play a crucial role in CAG pathogenesis. In accord with this, treatment strategies by TCM herbal prescriptions should be targeted to regulating qi, activating blood, resolving turbidity, clearing heat, removing toxin, nourishing yin, and warming yang. Further explorations are needed to verify and expand the current conclusions.

  5. Unsupervised classification of multivariate geostatistical data: Two algorithms

    Science.gov (United States)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
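
    The idea of allowing merges only between spatially neighbouring clusters can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the neighbour graph is passed in as a plain edge list, the dissimilarity is the difference of cluster means of a single variable, and all names are hypothetical.

```python
def constrained_agglomerate(values, edges, k):
    """Agglomerative clustering where only graph-adjacent clusters may merge.

    values: one attribute value per sample.
    edges:  pairs (i, j) of samples that are neighbours in coordinate space.
    k:      target number of clusters (may not be reached if the graph is disconnected).
    """
    clusters = {i: [i] for i in range(len(values))}
    links = {frozenset(e) for e in edges}

    def mean(members):
        return sum(values[i] for i in members) / len(members)

    while len(clusters) > k and links:
        # Pick the adjacent pair of clusters with the most similar means;
        # the pair itself breaks ties so the result is deterministic.
        a, b = min(
            (tuple(sorted(link)) for link in links),
            key=lambda ab: (abs(mean(clusters[ab[0]]) - mean(clusters[ab[1]])), ab),
        )
        clusters[a] += clusters.pop(b)
        # Re-point links of the absorbed cluster b to a and drop the self-link.
        links = {frozenset(a if x == b else x for x in link) for link in links}
        links = {link for link in links if len(link) == 2}
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    # Six samples along a transect; each is a neighbour of the next one only.
    values = [1.0, 1.3, 5.0, 5.1, 1.1, 0.9]
    chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(constrained_agglomerate(values, chain, 3))
```

    Samples 0, 1, 4 and 5 have similar values, yet they end up in two separate clusters because no edge connects the two ends of the transect. This is the spatial coherence that unconstrained clustering of the values alone would not guarantee.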

  6. Intercalating graphene with clusters of Fe3O4 nanocrystals for electrochemical supercapacitors

    Science.gov (United States)

    Ke, Qingqing; Tang, Chunhua; Liu, Yanqiong; Liu, Huajun; Wang, John

    2014-04-01

    A hierarchical nanostructure consisting of graphene sheets intercalated by clusters of Fe3O4 nanocrystals is developed for high-performance supercapacitor electrodes. Here we show that the negatively charged graphene oxide (GO) and positively charged Fe3O4 clusters enable a strong electrostatic interaction, generating a hierarchical 3D nanostructure, which gives rise to the intercalated composites through a rational hydrothermal process. The electrocapacitive behavior of the resultant composites is systematically investigated by cyclic voltammetry and galvanostatic charge-discharge techniques, where a positive synergistic effect between graphene and Fe3O4 clusters is identified. A maximum specific capacitance of 169 F g⁻¹ is achieved in the Fe3O4 clusters decorated with effectively reduced graphene oxide (Fe3O4-rGO-12h), which is much higher than those of rGO (101 F g⁻¹) and Fe3O4 (68 F g⁻¹) at the current density of 1 A g⁻¹. Moreover, this intercalated hierarchical nanostructure demonstrates good capacitance retention, retaining over 88% of the initial capacity after 1000 cycles.

  7. A Multidimensional and Multimembership Clustering Method for Social Networks and Its Application in Customer Relationship Management

    Directory of Open Access Journals (Sweden)

    Peixin Zhao

    2013-01-01

    Full Text Available Community detection in social networks plays an important role in cluster analysis. Many traditional techniques for one-dimensional problems have been proven inadequate for high-dimensional or mixed type datasets due to the data sparseness and attribute redundancy. In this paper we propose a graph-based clustering method for multidimensional datasets. This novel method has two distinguishing features: a nonbinary hierarchical tree and multimembership clusters. The nonbinary hierarchical tree clearly highlights meaningful clusters, while the multimembership feature may provide more useful service strategies. Experimental results on customer relationship management confirm the effectiveness of the new method.

  8. The Hierarchical Trend Model for property valuation and local price indices

    NARCIS (Netherlands)

    Francke, M.K.; Vos, G.A.

    2002-01-01

    This paper presents a hierarchical trend model (HTM) for selling prices of houses, addressing three main problems: the spatial and temporal dependence of selling prices and the dependency of price index changes on housing quality. In this model the general price trend, cluster-level price trends,

  9. Rotasi Varimax dan Median Hirarki Cluster Pada Program Raskin di Kabupaten Lombok Barat

    Directory of Open Access Journals (Sweden)

    Desy Komalasari

    2015-11-01

    Full Text Available The rice-granting program for poor households (Raskin) is one of the West Lombok regency government's programs for village poverty. The effectiveness of the program relates to 14 criteria for the poor households receiving Raskin (RTS-PM). The 14 criteria were grouped into several factors using varimax rotation factor analysis, while the RTS-PM were grouped using hierarchical median cluster analysis. Four factors were obtained from the analysis. The first factor was the house existence, the second factor was financial ability, the third factor was the house's existing facilities, and the fourth factor was the education of the household head and the purchasing power for clothing. The clustering results using hierarchical median cluster analysis formed 3 clusters. The first cluster contains the RTS-PM grouped into the first factor; the second cluster contains the RTS-PM grouped into the second and third factors; and the third cluster contains the RTS-PM grouped into the fourth factor.

  10. Rotasi Varimax dan Median Hirarki Cluster Pada Program Raskin di Kabupaten Lombok Barat

    Directory of Open Access Journals (Sweden)

    Desy Komalasari

    2015-06-01

    Full Text Available The rice-granting program for poor households (Raskin) is one of the West Lombok regency government's programs for village poverty. The effectiveness of the program relates to 14 criteria for the poor households receiving Raskin (RTS-PM). The 14 criteria were grouped into several factors using varimax rotation factor analysis, while the RTS-PM were grouped using hierarchical median cluster analysis. Four factors were obtained from the analysis. The first factor was the house existence, the second factor was financial ability, the third factor was the house's existing facilities, and the fourth factor was the education of the household head and the purchasing power for clothing. The clustering results using hierarchical median cluster analysis formed 3 clusters. The first cluster contains the RTS-PM grouped into the first factor; the second cluster contains the RTS-PM grouped into the second and third factors; and the third cluster contains the RTS-PM grouped into the fourth factor.

  11. Seismotectonic Implications Of Clustered Regional GPS Velocities In The San Francisco Bay Region, California

    Science.gov (United States)

    Graymer, R. W.; Simpson, R.

    2012-12-01

    We have used a hierarchical agglomerative clustering algorithm with Euclidean distance and centroid linkage, applied to continuous GPS observations for the Bay region available from the U.S. Geological Survey website. This analysis reveals 4 robust, spatially coherent clusters that coincide with 4 first-order structural blocks separated by 3 major fault systems: San Andreas (SA), Southern/Central Calaveras-Hayward-Rodgers Creek-Maacama (HAY), and Northern Calaveras-Concord-Green Valley-Berryessa-Bartlett Springs (NCAL). Because observations seaward of the San Gregorio (SG) fault are few in number, the cluster to the west of SA may actually contain 2 major structural blocks not adequately resolved: the Pacific plate to the west of the northern SA and a Peninsula block between the Peninsula SA and the SG fault. The average inter-block velocities are 11, 10, and 9 mm/yr across SA, HAY, and NCAL respectively. There appears to be a significant component of fault-normal compression across NCAL, whereas SA and HAY faults appear to be, on regional average, purely strike-slip. The velocities for the Sierra Nevada - Great Valley (SNGV) block to the east of NCAL are impressive in their similarity. The cluster of these velocities in a velocity plot forms a tighter grouping compared with the groupings for the other cluster blocks, suggesting a more rigid behavior for this block than the others. We note that for 4 clusters, none of the 3 cluster boundaries illuminate geologic structures other than north-northwest trending dominantly strike-slip faults, so plate motion is not accommodated by large-scale fault-parallel compression or extension in the region or by significant plastic deformation, at least over the time span of the GPS observations. Complexities of interseismic deformation of the upper crust do not allow simple application of inter-block velocities as long-term slip rates on bounding faults. However, 2D dislocation models using inter-block velocities and typical

  12. Extension of mixture-of-experts networks for binary classification of hierarchical data.

    Science.gov (United States)

    Ng, Shu-Kay; McLachlan, Geoffrey J

    2007-09-01

    For many applied problems in the context of medically relevant artificial intelligence, the data collected exhibit a hierarchical or clustered structure. Ignoring the interdependence between hierarchical data can result in misleading classification. In this paper, we extend the mechanism for mixture-of-experts (ME) networks for binary classification of hierarchical data. Another extension is to quantify cluster-specific information on data hierarchy by random effects via the generalized linear mixed-effects model (GLMM). The extension of ME networks is implemented by allowing for correlation in the hierarchical data in both the gating and expert networks via the GLMM. The proposed model is illustrated using a real thyroid disease data set. In our study, we consider 7652 thyroid diagnosis records from 1984 to early 1987 with complete information on 20 attribute values. We obtain 10 independent random splits of the data into a training set and a test set in the proportions 85% and 15%. The test sets are used to assess the generalization performance of the proposed model, based on the percentage of misclassifications. For comparison, the results obtained from the ME network with independence assumption are also included. With the thyroid disease data, the misclassification rate on test sets for the extended ME network is 8.9%, compared to 13.9% for the ME network. In addition, based on model selection methods described in Section 2, a network with two experts is selected. These two expert networks can be considered as modeling two groups of patients with high and low incidence rates. Significant variation among the predicted cluster-specific random effects is detected in the patient group with low incidence rate. It is shown that the extended ME network outperforms the ME network for binary classification of hierarchical data. With the thyroid disease data, useful information on the relative log odds of patients with diagnosed conditions at different periods can be

  13. Using hierarchical clustering of secreted protein families to classify and rank candidate effectors of rust fungi.

    Directory of Open Access Journals (Sweden)

    Diane G O Saunders

    Full Text Available Rust fungi are obligate biotrophic pathogens that cause considerable damage on crop plants. Puccinia graminis f. sp. tritici, the causal agent of wheat stem rust, and Melampsora larici-populina, the poplar leaf rust pathogen, have strong deleterious impacts on wheat and poplar wood production, respectively. Filamentous pathogens such as rust fungi secrete molecules called disease effectors that act as modulators of host cell physiology and can suppress or trigger host immunity. Current knowledge on effectors from other filamentous plant pathogens can be exploited for the characterisation of effectors in the genome of recently sequenced rust fungi. We designed a comprehensive in silico analysis pipeline to identify the putative effector repertoire from the genome of two plant pathogenic rust fungi. The pipeline is based on the observation that known effector proteins from filamentous pathogens have at least one of the following properties: (i) contain a secretion signal, (ii) are encoded by in planta induced genes, (iii) have similarity to haustorial proteins, (iv) are small and cysteine rich, (v) contain a known effector motif or a nuclear localization signal, (vi) are encoded by genes with long intergenic regions, (vii) contain internal repeats, and (viii) do not contain PFAM domains, except those associated with pathogenicity. We used Markov clustering and hierarchical clustering to classify protein families of rust pathogens and rank them according to their likelihood of being effectors. Using this approach, we identified eight families of candidate effectors that we consider of high value for functional characterization. This study revealed a diverse set of candidate effectors, including families of haustorial expressed secreted proteins and small cysteine-rich proteins. This comprehensive classification of candidate effectors from these devastating rust pathogens is an initial step towards probing plant germplasm for novel resistance components.

  14. Intraclass Correlation Coefficients in Hierarchical Designs: Evaluation Using Latent Variable Modeling

    Science.gov (United States)

    Raykov, Tenko

    2011-01-01

    Interval estimation of intraclass correlation coefficients in hierarchical designs is discussed within a latent variable modeling framework. A method accomplishing this aim is outlined, which is applicable in two-level studies where participants (or generally lower-order units) are clustered within higher-order units. The procedure can also be…
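
    For orientation, in a two-level random-intercept model the intraclass correlation coefficient is commonly written as the share of total variance attributable to the higher-order units. The notation below ($\sigma_u^2$ for the between-unit variance, $\sigma_e^2$ for the within-unit variance) is an illustrative assumption and is not taken from the record above:

```latex
\rho_{\mathrm{ICC}} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
```

    A value near 0 indicates that participants within the same higher-order unit are no more alike than participants in different units, while a value near 1 indicates that almost all variation lies between units.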

  15. A method for identifying hierarchical sub-networks / modules and weighting network links based on their similarity in sub-network / module affiliation

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2016-06-01

    Full Text Available Some networks, including biological networks, consist of hierarchical sub-networks / modules. Building on my previous study, the present study proposes a method for both identifying hierarchical sub-networks / modules and weighting network links. It is based on cluster analysis that uses between-node similarity in sets of adjacent nodes. Two matrices, linkWeightMat and linkClusterIDs, are obtained by using the algorithm. Two links with both the same weight in linkWeightMat and the same cluster ID in linkClusterIDs belong to the same sub-network / module. Two links with the same weight in linkWeightMat but different cluster IDs in linkClusterIDs belong to two sub-networks / modules at the same hierarchical level. However, a link with a unique cluster ID in linkClusterIDs does not belong to any sub-network / module. A sub-network / module with greater weight is a more strongly connected sub-network / module. Matlab codes of the algorithm are presented.

  16. Coastal erosion and accretion in Pak Phanang, Thailand by GIS analysis of maps and satellite imagery

    Directory of Open Access Journals (Sweden)

    Sayedur Rahman Chowdhury

    2013-12-01

    Full Text Available Coastal erosion and accretion in Pak Phanang of southern Thailand between 1973 and 2003 were measured using multi-temporal topographic maps and Landsat satellite imagery. Within a GIS environment, landward and seaward movements of the shoreline were estimated by a transect-based analysis, and amounts of land accretion and erosion were estimated by parcel-based geoprocessing. The whole longitudinal extent of the 58 kilometer coast was classified based on the erosion and accretion trends during this period using an agglomerative hierarchical clustering approach. Erosion and accretion were found to vary over time and space, and periodic reversal of status was also noticed in many places. Estimates of erosion were evaluated against field-survey based data, and found reasonably accurate where the rates were relatively great. Smoothing of shoreline datasets was found desirable as its impacts on the estimates remained within tolerable limits.

  17. Clustering for data mining a data recovery approach

    CERN Document Server

    Mirkin, Boris

    2005-01-01

    Often considered more as an art than a science, the field of clustering has been dominated by learning through examples and by techniques chosen almost through trial-and-error. Even the most popular clustering methods--K-Means for partitioning the data set and Ward's method for hierarchical clustering--have lacked the theoretical attention that would establish a firm relationship between the two methods and relevant interpretation aids. Rather than the traditional set of ad hoc techniques, Clustering for Data Mining: A Data Recovery Approach presents a theory that not only closes gaps in K-Mean

  18. Data clustering theory, algorithms, and applications

    CERN Document Server

    Gan, Guojun; Wu, Jianhong

    2007-01-01

    Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results. The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB® programming languages.

  19. Institutional, Individual, and Socio-Cultural Domains of Partnerships: A Typology of USDA Forest Service Recreation Partners

    Science.gov (United States)

    Seekamp, Erin; Cerveny, Lee K.; McCreary, Allie

    2011-09-01

    Federal land management agencies, such as the USDA Forest Service, have expanded the role of recreation partners reflecting constrained growth in appropriations and broader societal trends towards civic environmental governance. Partnerships with individual volunteers, service groups, commercial outfitters, and other government agencies provide the USDA Forest Service with the resources necessary to complete projects and meet goals under fiscal constraints. Existing partnership typologies typically focus on collaborative or strategic alliances and highlight organizational dimensions (e.g., structure and process) defined by researchers. This paper presents a partner typology constructed from USDA Forest Service partnership practitioners' conceptualizations of 35 common partner types. Multidimensional scaling of data from unconstrained pile sorts identified 3 distinct cultural dimensions of recreation partners (specifically, partnership character, partner impact, and partner motivations) that represent institutional, individual, and socio-cultural cognitive domains. A hierarchical agglomerative cluster analysis provides further insight into the various domains of agency personnel's conceptualizations. While three dimensions with high reliability (RSQ = 0.83) and corresponding hierarchical clusters illustrate commonality between agency personnel's partnership suppositions, this study also reveals variance in personnel's familiarity and affinity for specific partnership types. This real-world perspective on partner types highlights that agency practitioners not only make strategic choices when selecting and cultivating partnerships to accomplish critical tasks, but also elect to work with partners for the primary purpose of providing public service and fostering land stewardship.

  20. Worldwide clustering of the corruption perception

    Science.gov (United States)

    Paulus, Michal; Kristoufek, Ladislav

    2015-06-01

    We inspect a possible clustering structure of the corruption perception among 134 countries. Using the average linkage clustering, we uncover a well-defined hierarchy in the relationships among countries. Four main clusters are identified and they suggest that countries worldwide can be quite well separated according to their perception of corruption. Moreover, we find a strong connection between corruption levels and a stage of development inside the clusters. The ranking of countries according to their corruption perfectly copies the ranking according to the economic performance measured by the gross domestic product per capita of the member states. To the best of our knowledge, this study is the first one to present an application of hierarchical and clustering methods to the specific case of corruption.

  1. Avoiding Boundary Estimates in Hierarchical Linear Models through Weakly Informative Priors

    Science.gov (United States)

    Chung, Yeojin; Rabe-Hesketh, Sophia; Gelman, Andrew; Dorie, Vincent; Liu, Jinchen

    2012-01-01

    Hierarchical or multilevel linear models are widely used for longitudinal or cross-sectional data on students nested in classes and schools, and are particularly important for estimating treatment effects in cluster-randomized trials, multi-site trials, and meta-analyses. The models can allow for variation in treatment effects, as well as…

  2. Conceptual hierarchical modeling to describe wetland plant community organization

    Science.gov (United States)

    Little, A.M.; Guntenspergen, G.R.; Allen, T.F.H.

    2010-01-01

    Using multivariate analysis, we created a hierarchical modeling process that describes how differently-scaled environmental factors interact to affect wetland-scale plant community organization in a system of small, isolated wetlands on Mount Desert Island, Maine. We followed the procedure: 1) delineate wetland groups using cluster analysis, 2) identify differently scaled environmental gradients using non-metric multidimensional scaling, 3) order gradient hierarchical levels according to spatiotemporal scale of fluctuation, and 4) assemble hierarchical model using group relationships with ordination axes and post-hoc tests of environmental differences. Using this process, we determined 1) large wetland size and poor surface water chemistry led to the development of shrub fen wetland vegetation, 2) Sphagnum and water chemistry differences affected fen vs. marsh / sedge meadows status within small wetlands, and 3) small-scale hydrologic differences explained transitions between forested vs. non-forested and marsh vs. sedge meadow vegetation. This hierarchical modeling process can help explain how upper level contextual processes constrain biotic community response to lower-level environmental changes. It creates models with more nuanced spatiotemporal complexity than classification and regression tree procedures. Using this process, wetland scientists will be able to generate more generalizable theories of plant community organization, and useful management models. © Society of Wetland Scientists 2009.

  3. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    International Nuclear Information System (INIS)

    Ellefsen, Karl J.; Smith, David B.

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.
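
    The repeated two-way partitioning described above can be sketched as a recursion. This sketch swaps in a simple one-dimensional 2-means split for the Bayesian finite mixture model with Hamiltonian Monte Carlo used by the authors, so it shows only the shape of the hierarchy-building procedure, not their method. The function names and sample values are hypothetical.

```python
def two_way_split(values):
    """Split 1-D values into two groups with a simple 2-means iteration.

    Stand-in for the two-component mixture model (an assumption of this sketch).
    """
    lo, hi = min(values), max(values)
    for _ in range(100):
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(left) / len(left)
        new_hi = sum(right) / len(right) if right else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return left, right

def build_hierarchy(values, depth):
    """Recursively bisect the samples, returning a nested dict of clusters."""
    node = {"samples": sorted(values)}
    if depth > 0 and len(set(values)) > 1:
        left, right = two_way_split(values)
        node["children"] = [build_hierarchy(left, depth - 1),
                            build_hierarchy(right, depth - 1)]
    return node

# Hypothetical element concentrations for seven field samples (assumption).
tree = build_hierarchy([1, 2, 3, 10, 11, 30, 31], depth=2)
print(tree["children"][0]["samples"], tree["children"][1]["samples"])
```

    In the procedure the abstract describes, each split is additionally checked against independent geologic knowledge before it is accepted, a manual step this sketch does not attempt to model.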

  4. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    Apple

    Keywords: algebraic thinking; cluster analysis; mathematics education; quantitative analysis. Introduction. Extensive ..... C1, C2 and C3 represent the three centroids of the three clusters formed. .... 6ALd. All these strategies are algebraic and 'high- ... 1995), of the didactical aspects related to teaching .... Brazil, 18-23 July.

  5. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
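
    The cluster-transition percentages reported above follow directly from the counts and the follow-up sample size (N=379). A short check, using only the figures given in the abstract (the function name is hypothetical):

```python
def transition_shares(counts):
    """Convert transition counts into percentages of the follow-up sample."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# Counts as reported in the abstract.
reported = {"same cluster": 251, "unhealthier cluster": 45, "healthier cluster": 83}
shares = transition_shares(reported)
print(sum(reported.values()), shares)
```

    The three counts sum to the follow-up sample size, so roughly one participant in three changed cluster within six months, which is the basis for the abstract's statement that cluster membership is unstable.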

  6. Robustness of Multiple Clustering Algorithms on Hyperspectral Images

    National Research Council Canada - National Science Library

    Williams, Jason P

    2007-01-01

    .... Various clustering algorithms were employed, including a hierarchical method, ISODATA, K-means, and X-means, and were used on a simple two dimensional dataset in order to discover potential problems with the algorithms...

  7. Hierarchical structures of correlations networks among Turkey’s exports and imports by currencies

    Science.gov (United States)

    Kocakaplan, Yusuf; Deviren, Bayram; Keskin, Mustafa

    2012-12-01

    We have examined the hierarchical structures of correlations networks among Turkey’s exports and imports by currencies for the 1996-2010 period, using the concept of a minimal spanning tree (MST) and hierarchical tree (HT) which depend on the concept of ultrametricity. These trees are useful tools for understanding and detecting the global structure, taxonomy and hierarchy in financial markets. We derived a hierarchical organization and built the MSTs and HTs for the 1996-2001 and 2002-2010 periods. The two sub-periods were studied separately because the Euro (EUR) came into use in 2001 and some countries have conducted their exports and imports with Turkey in EUR since 2002, and also in order to test various time-windows and observe temporal evolution. We have carried out bootstrap analysis to associate a value of the statistical reliability to the links of the MSTs and HTs. We have also used the average linkage cluster analysis (ALCA) to observe the cluster structure more clearly. Moreover, we have obtained the bidimensional minimal spanning tree (BMST) due to economic trade being a bidimensional problem. From the structural topologies of these trees, we have identified different clusters of currencies according to their proximity and economic ties. Our results show that some currencies are more important within the network, due to a tighter connection with other currencies. We have also found that the obtained currencies play a key role for Turkey’s exports and imports and have important implications for the design of portfolio and investment strategies.
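
    A minimal spanning tree of the kind described can be sketched with Kruskal's algorithm. The correlation values below are invented for illustration, and the distance d = sqrt(2(1 - rho)) is a common choice in the correlation-network literature that is assumed here; the abstract does not state the metric the authors used.

```python
import math

def correlation_distance(rho):
    # Assumed metric: maps correlation 1 to distance 0 and -1 to distance 2.
    return math.sqrt(2.0 * (1.0 - rho))

def minimal_spanning_tree(labels, corr):
    """Kruskal's algorithm on the complete graph of correlation distances."""
    n = len(labels)
    edges = sorted((correlation_distance(corr[i][j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it cannot form a cycle
            parent[ri] = rj
            tree.append((labels[i], labels[j], round(d, 3)))
    return tree

# Hypothetical correlation matrix for four currencies (assumption).
labels = ["USD", "EUR", "GBP", "CHF"]
corr = [
    [1.0, 0.8, 0.5, 0.3],
    [0.8, 1.0, 0.6, 0.7],
    [0.5, 0.6, 1.0, 0.2],
    [0.3, 0.7, 0.2, 1.0],
]
mst = minimal_spanning_tree(labels, corr)
print(mst)
```

    The largest edge on the MST path between two nodes gives an ultrametric distance between them, which is how a hierarchical tree is usually derived from an MST.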

  8. Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots.

    Science.gov (United States)

    Hagiwara, Yoshinobu; Inoue, Masakazu; Kobayashi, Hiroyoshi; Taniguchi, Tadahiro

    2018-01-01

    In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information e.g., vision, position and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., "I am in my home" and "I am in front of the table," a hierarchical structure of spatial concepts is necessary in order for human support robots to communicate smoothly with users. The proposed method enables a robot to form hierarchical spatial concepts by categorizing multimodal information using hierarchical multimodal latent Dirichlet allocation (hMLDA). Object recognition results using convolutional neural network (CNN), hierarchical k-means clustering result of self-position estimated by Monte Carlo localization (MCL), and a set of location names are used, respectively, as features in vision, position, and word information. Experiments in forming hierarchical spatial concepts and evaluating how the proposed method can predict unobserved location names and position categories are performed using a robot in the real world. Results verify that, relative to comparable baseline methods, the proposed method enables a robot to predict location names and position categories closer to predictions made by humans. As an application example of the proposed method in a home environment, a demonstration in which a human support robot moves to an instructed place based on human speech instructions is achieved based on the formed hierarchical spatial concept.

  9. Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots

    Directory of Open Access Journals (Sweden)

    Yoshinobu Hagiwara

    2018-03-01

    Full Text Available In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information e.g., vision, position and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., “I am in my home” and “I am in front of the table,” a hierarchical structure of spatial concepts is necessary in order for human support robots to communicate smoothly with users. The proposed method enables a robot to form hierarchical spatial concepts by categorizing multimodal information using hierarchical multimodal latent Dirichlet allocation (hMLDA). Object recognition results using convolutional neural network (CNN), hierarchical k-means clustering result of self-position estimated by Monte Carlo localization (MCL), and a set of location names are used, respectively, as features in vision, position, and word information. Experiments in forming hierarchical spatial concepts and evaluating how the proposed method can predict unobserved location names and position categories are performed using a robot in the real world. Results verify that, relative to comparable baseline methods, the proposed method enables a robot to predict location names and position categories closer to predictions made by humans. As an application example of the proposed method in a home environment, a demonstration in which a human support robot moves to an instructed place based on human speech instructions is achieved based on the formed hierarchical spatial concept.

  10. Clustering and Bayesian hierarchical modeling for the definition of informative prior distributions in hydrogeology

    Science.gov (United States)

    Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.

    2017-12-01

    In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.

  11. Clusters of Galaxies

    Science.gov (United States)

    Huchtmeier, W. K.; Richter, O. G.; Materne, J.

    1981-09-01

    The large-scale structure of the universe is dominated by clustering. Most galaxies seem to be members of pairs, groups, clusters, and superclusters. To that degree we are able to recognize a hierarchical structure of the universe. Our local group of galaxies (LG) is centred on two large spiral galaxies: the Andromeda nebula and our own galaxy. Three smaller galaxies - like M 33 - and at least 23 dwarf galaxies (Kraan-Korteweg and Tammann, 1979, Astronomische Nachrichten, 300, 181) can be found in the environment of these two large galaxies. Neighbouring groups have comparable sizes (about 1 Mpc in extent) and comparable numbers of bright members. Small dwarf galaxies cannot at present be observed at great distances.

  12. Assessment of mechanical properties of isolated bovine intervertebral discs from multi-parametric magnetic resonance imaging.

    Science.gov (United States)

    Recuerda, Maximilien; Périé, Delphine; Gilbert, Guillaume; Beaudoin, Gilles

    2012-10-12

    The treatment planning of spine pathologies requires information on the rigidity and permeability of the intervertebral discs (IVDs). Magnetic resonance imaging (MRI) offers great potential as a sensitive and non-invasive technique for describing the mechanical properties of IVDs. However, the literature reported small correlation coefficients between mechanical properties and MRI parameters. Our hypothesis is that the compressive modulus and the permeability of the IVD can be predicted by a linear combination of MRI parameters. Sixty IVDs were harvested from bovine tails, and randomly separated in four groups (in-situ, digested-6h, digested-18h, digested-24h). Multi-parametric MRI acquisitions were used to quantify the relaxation times T1 and T2, the magnetization transfer ratio MTR, the apparent diffusion coefficient ADC and the fractional anisotropy FA. Unconfined compression, confined compression and direct permeability measurements were performed to quantify the compressive moduli and the hydraulic permeabilities. Differences between groups were evaluated from a one-way ANOVA. Multilinear regressions were performed between dependent mechanical properties and independent MRI parameters to verify our hypothesis. A principal component analysis was used to convert the set of possibly correlated variables into a set of linearly uncorrelated variables. Agglomerative Hierarchical Clustering was performed on the 3 principal components. Multilinear regressions showed that 45 to 80% of the Young's modulus E, the aggregate modulus in absence of deformation HA0, the radial permeability kr and the axial permeability in absence of deformation k0 can be explained by the MRI parameters within both the nucleus pulposus and the annulus pulposus. The principal component analysis reduced our variables to two principal components with a cumulative variability of 52-65%, which increased to 70-82% when considering the third principal component. The dendrograms showed a natural

  13. Exploring the individual patterns of spiritual well-being in people newly diagnosed with advanced cancer: a cluster analysis.

    Science.gov (United States)

    Bai, Mei; Dixon, Jane; Williams, Anna-Leila; Jeon, Sangchoon; Lazenby, Mark; McCorkle, Ruth

    2016-11-01

    Research shows that spiritual well-being correlates positively with quality of life (QOL) for people with cancer, whereas contradictory findings are frequently reported with respect to the differentiated associations between dimensions of spiritual well-being, namely peace, meaning and faith, and QOL. This study aimed to examine individual patterns of spiritual well-being among patients newly diagnosed with advanced cancer. Cluster analysis was based on the twelve items of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale at Time 1. A combination of hierarchical and k-means (non-hierarchical) clustering methods was employed to jointly determine the number of clusters. Self-rated health, depressive symptoms, peace, meaning and faith, and overall QOL were compared at Time 1 and Time 2. Hierarchical and k-means clustering methods both suggested four clusters. Comparison of the four clusters supported statistically significant and clinically meaningful differences in QOL outcomes among clusters while revealing contrasting relations of faith with QOL. Cluster 1, Cluster 3, and Cluster 4 represented high, medium, and low levels of overall QOL, respectively, with correspondingly high, medium, and low levels of peace, meaning, and faith. Cluster 2 was distinguished from other clusters by its medium levels of overall QOL, peace, and meaning and low level of faith. This study provides empirical support for individual difference in response to a newly diagnosed cancer and brings into focus conceptual and methodological challenges associated with the measure of spiritual well-being, which may partly contribute to the attenuated relation between faith and QOL.

  14. Not all stars form in clusters - measuring the kinematics of OB associations with Gaia

    Science.gov (United States)

    Ward, Jacob L.; Kruijssen, J. M. Diederik

    2018-04-01

    It is often stated that star clusters are the fundamental units of star formation and that most (if not all) stars form in dense stellar clusters. In this monolithic formation scenario, low-density OB associations are formed from the expansion of gravitationally bound clusters following gas expulsion due to stellar feedback. N-body simulations of this process show that OB associations formed this way retain signs of expansion and elevated radial anisotropy over tens of Myr. However, recent theoretical and observational studies suggest that star formation is a hierarchical process, following the fractal nature of natal molecular clouds and allowing the formation of large-scale associations in situ. We distinguish between these two scenarios by characterizing the kinematics of OB associations using the Tycho-Gaia Astrometric Solution catalogue. To this end, we quantify four key kinematic diagnostics: the number ratio of stars with positive radial velocities to those with negative radial velocities, the median radial velocity, the median radial velocity normalized by the tangential velocity, and the radial anisotropy parameter. Each quantity presents a useful diagnostic of whether the association was more compact in the past. We compare these diagnostics to models representing random motion and the expanding products of monolithic cluster formation. None of these diagnostics show evidence of expansion, either from a single cluster or multiple clusters, and the observed kinematics are better represented by a random velocity distribution. This result favours the hierarchical star formation model in which a minority of stars forms in bound clusters and large-scale, hierarchically structured associations are formed in situ.
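
    Three of the four kinematic diagnostics listed above can be computed from per-star velocity components, as in the sketch below. The inputs are assumed to be the outward (radial) and tangential velocity components relative to the association centre, the function name and sample values are hypothetical, and taking the median of the per-star ratio is only one possible reading of "median radial velocity normalized by the tangential velocity". The radial anisotropy parameter is omitted because the abstract does not define it.

```python
from statistics import median

def kinematic_diagnostics(v_rad, v_tan):
    """Summarise expansion signatures from radial and tangential velocities.

    v_rad > 0 is taken to mean motion away from the association centre
    (an assumption of this sketch).
    """
    n_pos = sum(1 for v in v_rad if v > 0)
    n_neg = sum(1 for v in v_rad if v < 0)
    return {
        # Ratio of outward-moving to inward-moving stars.
        "number_ratio": n_pos / n_neg if n_neg else float("inf"),
        "median_v_rad": median(v_rad),
        # Median of the per-star ratio (one possible interpretation).
        "median_v_rad_over_v_tan": median(r / t for r, t in zip(v_rad, v_tan)),
    }

# Made-up velocities in km/s for five stars (assumption).
v_rad = [2.0, -1.0, 3.0, -2.0, 1.0]
v_tan = [1.0, 2.0, 3.0, 4.0, 2.0]
diag = kinematic_diagnostics(v_rad, v_tan)
print(diag)
```

    For a randomly moving association one would expect a number ratio near 1 and median radial velocities near zero, whereas systematic expansion would push all three quantities above those values.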

  15. Hierarchical Cluster Analysis of Three-Dimensional Reconstructions of Unbiased Sampled Microglia Shows not Continuous Morphological Changes from Stage 1 to 2 after Multiple Dengue Infections in Callithrix penicillata

    Science.gov (United States)

    Diniz, Daniel G.; Silva, Geane O.; Naves, Thaís B.; Fernandes, Taiany N.; Araújo, Sanderson C.; Diniz, José A. P.; de Farias, Luis H. S.; Sosthenes, Marcia C. K.; Diniz, Cristovam G.; Anthony, Daniel C.; da Costa Vasconcelos, Pedro F.; Picanço Diniz, Cristovam W.

    2016-01-01

    It is known that microglial morphology and function are related, but few studies have explored the subtleties of microglial morphological changes in response to specific pathogens. In the present report we quantitated microglia morphological changes in a monkey model of dengue disease with virus CNS invasion. To mimic multiple infections that usually occur in endemic areas, where higher dengue infection incidence and abundant mosquito vectors carrying different serotypes coexist, subjects received once a week subcutaneous injections of DENV3 (genotype III)-infected culture supernatant followed 24 h later by an injection of anti-DENV2 antibody. Control animals received either weekly anti-DENV2 antibodies, or no injections. Brain sections were immunolabeled for DENV3 antigens and IBA-1. Random and systematic microglial samples were taken from the polymorphic layer of dentate gyrus for 3-D reconstructions, where we found intense immunostaining for TNFα and DENV3 virus antigens. We submitted all bi- or multimodal morphological parameters of microglia to hierarchical cluster analysis and found two major morphological phenotypes designated types I and II. Compared to type I (stage 1), type II microglia were more complex; displaying higher number of nodes, processes and trees and larger surface area and volumes (stage 2). Type II microglia were found only in infected monkeys, whereas type I microglia was found in both control and infected subjects. Hierarchical cluster analysis of morphological parameters of 3-D reconstructions of random and systematic selected samples in control and ADE dengue infected monkeys suggests that microglia morphological changes from stage 1 to stage 2 may not be continuous. PMID:27047345

  16. Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

    Science.gov (United States)

    Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

    2012-07-01

    Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.

  17. Dynamic Hierarchical Sleep Scheduling for Wireless Ad-Hoc Sensor Networks

    OpenAIRE

    Chih-Yu Wen; Ying-Chih Chen

    2009-01-01

    This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show t...

  18. Formal And Informal Macro-Regional Transport Clusters As A Primary Step In The Design And Implementation Of Cluster-Based Strategies

    Directory of Open Access Journals (Sweden)

    Nežerenko Olga

    2015-09-01

    Full Text Available The aim of the study is the identification of a formal macro-regional transport and logistics cluster and its development trends on a macro-regional level in 2007-2011 by means of the hierarchical cluster analysis. The central approach of the study is based on two concepts: 1) the concept of formal and informal macro-regions, and 2) the concept of clustering which is based on the similarities shared by the countries of a macro-region and tightly related to the concept of macro-region. The authors seek to answer the question whether the formation of a formal transport cluster could provide the BSR a stable competitive position in the global transportation and logistics market.

  19. Evolution of the cluster X-ray luminosity function

    DEFF Research Database (Denmark)

    Mullis, C.R.; Vikhlinin, A.; Henry, J.P.

    2004-01-01

    We report measurements of the cluster X-ray luminosity function out to z = 0.8 based on the final sample of 201 galaxy systems from the 160 Square Degree ROSAT Cluster Survey. There is little evidence for any measurable change in cluster abundance out to z ~ 0.6 at luminosities of less...... than a few times $10^{44}\,h_{50}^{-2}$ ergs s$^{-1}$ (0.5 - 2.0 keV). However, for 0.6 cluster deficit using integrated number counts...... independently confirm the presence of evolution. Whereas the bulk of the cluster population does not evolve, the most luminous and presumably most massive structures evolve appreciably between z = 0.8 and the present. Interpreted in the context of hierarchical structure formation, we are probing sufficiently...

  20. HIERARCHICAL FRAGMENTATION AND JET-LIKE OUTFLOWS IN IRDC G28.34+0.06: A GROWING MASSIVE PROTOSTAR CLUSTER

    International Nuclear Information System (INIS)

    Wang Ke; Wu Yuefang; Zhang Huawei; Zhang Qizhou

    2011-01-01

    We present Submillimeter Array (SMA) λ = 0.88 mm observations of an infrared dark cloud G28.34+0.06. Located in the quiescent southern part of the G28.34 cloud, the region of interest is a massive ($>10^{3}\,M_{\odot}$) molecular clump P1 with a luminosity of $\sim10^{3}\,L_{\odot}$, where our previous SMA observations at 1.3 mm have revealed a string of five dust cores of 22-64 $M_{\odot}$ along the 1 pc IR-dark filament. The cores are well aligned at a position angle (P.A.) of 48 deg. and regularly spaced at an average projected separation of 0.16 pc. The new high-resolution, high-sensitivity 0.88 mm image further resolves the five cores into 10 compact condensations of 1.4-10.6 $M_{\odot}$, with sizes of a few thousand AU. The spatial structure at clump (∼1 pc) and core (∼0.1 pc) scales indicates a hierarchical fragmentation. While the clump fragmentation is consistent with a cylindrical collapse, the observed fragment masses are much larger than the expected thermal Jeans masses. All the cores are driving CO (3-2) outflows up to 38 km s$^{-1}$, the majority of which are bipolar, jet-like outflows. The moderate luminosity of the P1 clump sets a limit on the mass of protostars of 3-7 $M_{\odot}$. Because of the large reservoir of dense molecular gas in the immediate medium and ongoing accretion as evidenced by the jet-like outflows, we speculate that P1 will grow and eventually form a massive star cluster. This study provides a first glimpse of massive, clustered star formation that is currently passing through an intermediate-mass stage.

  1. Hierarchical drivers of reef-fish metacommunity structure.

    Science.gov (United States)

    MacNeil, M Aaron; Graham, Nicholas A J; Polunin, Nicholas V C; Kulbicki, Michel; Galzin, René; Harmelin-Vivien, Mireille; Rushton, Steven P

    2009-01-01

    multiple spatial scales; and (3) inter-atoll connectedness was poorly correlated with the nonrandom clustering of reef-fish species. These results demonstrate the importance of modeling hierarchical data and processes in understanding reef-fish metacommunity structure.

  2. Building occupancy diversity and HVAC (heating, ventilation, and air conditioning) system energy efficiency

    International Nuclear Information System (INIS)

    Yang, Zheng; Ghahramani, Ali; Becerik-Gerber, Burcin

    2016-01-01

    Approximately forty percent of total building energy consumption is attributed to HVAC (heating, ventilation, and air conditioning) systems that aim to maintain healthy and comfortable indoor environments. An HVAC system is a network with several subsystems, and there exist heat transfer and balance among the zones of a building, as well as heat gains and losses through a building's envelope. Diverse occupancy (diversity in terms of when and how occupants occupy a building) in spaces could result in increased loads that are not actual demands for an HVAC system, leading to inefficiencies. This paper introduces a framework to quantitatively evaluate the energy implications of occupancy diversity at the building level, where building information modeling is integrated to provide building geometries, HVAC system layouts, and spatial information as inputs for computing potential energy implications if occupancy diversity were to be eliminated. An agglomerative hierarchical clustering-based iterative evaluation algorithm is designed for iteratively eliminating occupancy diversity. Whole building energy simulations for a real-world building, as well as virtual reference buildings, demonstrate that the proposed framework could effectively quantify the HVAC system energy efficiency affected by occupancy diversity and that the framework is generalizable to different building geometries, layouts, and occupancy diversities. - Highlights: • Analyze relationships between occupancy diversity and HVAC energy efficiency. • Integrate BIM for quantifying energy implications of occupancy diversity. • Demonstrate the effectiveness and generalizability of iterative evaluation algorithm. • Improve agglomerative hierarchical clustering process using heap data structure.
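
    The last highlight mentions speeding up agglomerative hierarchical clustering with a heap. The abstract gives no details, so the following is only a generic sketch of that idea: candidate merges are kept in a min-heap and stale entries are skipped lazily. The function name, the average-linkage criterion, and the occupancy data are assumptions made for illustration.

```python
import heapq

def heap_agglomerative(dist, n_clusters):
    """Average-linkage clustering driven by a min-heap of candidate merges."""
    members = {i: [i] for i in range(len(dist))}  # active cluster id -> points
    heap = [(dist[i][j], i, j) for i in members for j in members if i < j]
    heapq.heapify(heap)
    next_id = len(dist)  # merged clusters get fresh ids, so ids are never reused

    while len(members) > n_clusters and heap:
        d, a, b = heapq.heappop(heap)
        if a not in members or b not in members:
            continue  # stale entry: one side was already merged away
        merged = members.pop(a) + members.pop(b)
        # Push distances from the new cluster to every remaining cluster.
        for c, pts in members.items():
            avg = sum(dist[i][j] for i in merged for j in pts) / (len(merged) * len(pts))
            heapq.heappush(heap, (avg, c, next_id))
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(pts) for pts in members.values())

# Hypothetical typical arrival hours for six building zones (assumption).
arrival = [8.0, 8.5, 9.0, 13.0, 13.5, 18.0]
dist = [[abs(x - y) for y in arrival] for x in arrival]
zones = heap_agglomerative(dist, 3)
print(zones)
```

    Popping the closest pair from the heap avoids rescanning every pair of clusters at each merge, which is the usual motivation for pairing a heap with agglomerative clustering.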

  3. Labeling Residential Community Characteristics from Collective Activity Patterns Using Taxi Trip Data

    Science.gov (United States)

    Zhou, Y.; Fang, Z.

    2017-09-01

    Significant social and spatial differentiation exists among the residential communities of a city. People who live in different places have different socioeconomic backgrounds, which results in varied geographical activity patterns. This paper aims to label the characteristics of residential communities in a city using collective activity patterns derived from taxi trip data. Specifically, we first present a method to allocate the O/D (Origin/Destination) points of taxi trips to the land use parcels where the activities take place. Then several indices are employed to describe the collective activity patterns, including the activity intensity, travel distance, travel time, and activity space of residents, by taking account of the geographical distribution of all O/Ds of the taxi trips related to each residential community. After that, an agglomerative hierarchical clustering algorithm is introduced to cluster the residential communities with similar activity patterns. In the case study of Wuhan, the residential communities are clearly divided into eight clusters, which could be labelled as ordinary communities, privileged communities, old isolated communities, suburban communities, and so on. This paper provides a new perspective for labeling land use of the same type from people's mobility patterns, with the support of big trajectory data.

  4. tclust: An R Package for a Trimming Approach to Cluster Analysis

    Directory of Open Access Journals (Sweden)

    2012-04-01

    Full Text Available Outlying data can heavily influence standard clustering methods. At the same time, clustering principles can be useful when robustifying statistical procedures. These two reasons motivate the development of feasible robust model-based clustering approaches. With this in mind, an R package for performing non-hierarchical robust clustering, called tclust, is presented here. Instead of trying to “fit” noisy data, a proportion α of the most outlying observations is trimmed. The tclust package efficiently handles different cluster scatter constraints. Graphical exploratory tools are also provided to help the user make sensible choices for the trimming proportion as well as the number of clusters to search for.

  5. A roadmap of clustering algorithms: finding a match for a biomedical application.

    Science.gov (United States)

    Andreopoulos, Bill; An, Aijun; Wang, Xiaogang; Schroeder, Michael

    2009-05-01

    Clustering is ubiquitously applied in bioinformatics with hierarchical clustering and k-means partitioning being the most popular methods. Numerous improvements of these two clustering methods have been introduced, as well as completely different approaches such as grid-based, density-based and model-based clustering. For improved bioinformatics analysis of data, it is important to match clusterings to the requirements of a biomedical application. In this article, we present a set of desirable clustering features that are used as evaluation criteria for clustering algorithms. We review 40 different clustering algorithms of all approaches and datatypes. We compare algorithms on the basis of desirable clustering features, and outline algorithms' benefits and drawbacks as a basis for matching them to biomedical applications.

  6. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

    Science.gov (United States)

    Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

    2012-07-13

    Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Our results

  7. Accurate detection of hierarchical communities in complex networks based on nonlinear dynamical evolution

    Science.gov (United States)

    Zhuo, Zhao; Cai, Shi-Min; Tang, Ming; Lai, Ying-Cheng

    2018-04-01

    One of the most challenging problems in network science is to accurately detect communities at distinct hierarchical scales. Most existing methods are based on structural analysis and manipulation, which are NP-hard. We articulate an alternative, dynamical evolution-based approach to the problem. The basic principle is to computationally implement a nonlinear dynamical process on all nodes in the network with a general coupling scheme, creating a networked dynamical system. Under a proper system setting and with an adjustable control parameter, the community structure of the network would "come out" or emerge naturally from the dynamical evolution of the system. As the control parameter is systematically varied, the community hierarchies at different scales can be revealed. As a concrete example of this general principle, we exploit clustered synchronization as a dynamical mechanism through which the hierarchical community structure can be uncovered. In particular, for quite arbitrary choices of the nonlinear nodal dynamics and coupling scheme, decreasing the coupling parameter from the global synchronization regime, in which the dynamical states of all nodes are perfectly synchronized, can lead to a weaker type of synchronization organized as clusters. We demonstrate the existence of optimal choices of the coupling parameter for which the synchronization clusters encode accurate information about the hierarchical community structure of the network. We test and validate our method using a standard class of benchmark modular networks with two distinct hierarchies of communities and a number of empirical networks arising from the real world. Our method is computationally extremely efficient, eliminating completely the NP-hard difficulty associated with previous methods. The basic principle of exploiting dynamical evolution to uncover hidden community organizations at different scales represents a "game-change" type of approach to addressing the problem of community

  8. Typing of unknown microorganisms based on quantitative analysis of fatty acids by mass spectrometry and hierarchical clustering

    Energy Technology Data Exchange (ETDEWEB)

    Li Tingting; Dai Ling; Li Lun; Hu Xuejiao; Dong Linjie; Li Jianjian; Salim, Sule Khalfan; Fu Jieying [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China); Zhong Hongying, E-mail: hyzhong@mail.ccnu.edu.cn [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China)

    2011-01-17

    Rapid identification of unknown microorganisms of clinical and agricultural importance is not only critical for accurate diagnosis of infections but also essential for appropriate and prompt treatment. We describe here a rapid method for typing microorganisms based on quantitative analysis of fatty acids by the iFAT approach (Isotope-coded Fatty Acid Transmethylation). In this work, lyophilized cell lysates were directly mixed with 0.5 M NaOH solution in d3-methanol and n-hexane. After 1 min of ultrasonication, the top n-hexane layer was combined with a mixture of standard d0-methanol derived fatty acid methyl esters of known concentration. Measurement of the intensity ratios of d3/d0 labeled fragment ion and molecular ion pairs at the corresponding target fatty acids provides a quantitative basis for hierarchical clustering. In the resultant dendrogram, the Euclidean distance between unknown species and known species quantitatively reveals their differences or shared similarities in fatty acid related pathways. It is of particular interest to apply this method to typing fungal species because, compared with bacteria and animals, fungi have distinctive lipid biosynthetic pathways that have been targeted by many drugs and fungicides. The proposed method does not depend on the availability of genome or proteome databases. Therefore, it is applicable to a broad range of unknown microorganisms or mutant species.

  9. Clusters and Groups of Galaxies : International Meeting

    CERN Document Server

    Giuricin, G; Mezzetti, M

    1984-01-01

    The large-scale structure of the Universe and systems like Superclusters, Clusters, and Groups of galaxies are topics of great interest. They fully justify the meeting on "Clusters and Groups of Galaxies". The topics covered included the spatial distribution and the clustering of galaxies; the properties of Superclusters, Clusters and Groups of galaxies; radio and X-ray observations; the problem of unseen matter; theories concerning hierarchical clustering, pancakes, cluster and galaxy formation and evolution. The meeting was held at the International Center for Theoretical Physics in Trieste (Italy) from September 13 to September 16, 1983. It was attended by about 150 participants from 22 nations who presented 67 invited lectures (il) and contributed papers (cp), and 45 poster papers (pp). The Scientific Organizing Committee consisted of F. Bertola, P. Biermann, A. Cavaliere, N. Dallaporta, D. Gerbal, M. Hack, J. V. Peach, D. Sciama (Chairman), G. Setti, M. Tarenghi. We are particularly indebted to D. Scia...

  10. Recommending the heterogeneous cluster type multi-processor system computing

    International Nuclear Information System (INIS)

    Iijima, Nobukazu

    2010-01-01

    A real-time reactor simulator had been developed by reusing the equipment of the Musashi reactor, and improving its performance became indispensable for its use as a research tool; the sampling rate was to be increased by introducing arithmetic units based on a multi-Digital Signal Processor (DSP) system (cluster). To realize heterogeneous cluster type multi-processor system computing, a combination of two kinds of Control Processors (CPs), the Cluster Control Processor (CCP) and the System Control Processor (SCP), was proposed, with a Large System Control Processor (LSCP) added for hierarchical clustering if needed. The faster computing performance of this system was evaluated through simulations of the simultaneous execution of multiple jobs and of pipeline processing between clusters, which showed that the system led to effective use of the existing system and improved cost performance. (T. Tanaka)

  11. Physicochemical and antioxidant properties of kiwifruit as a function of cultivar and fruit harvested month

    Directory of Open Access Journals (Sweden)

    Ramesh Singh Pal

    2015-04-01

    Full Text Available The present study was carried out to find the effect of fruit harvesting stage (October, November and December) on the physicochemical and antioxidant properties of five kiwi cultivars (Abbot, Bruno, Allison, Hayward, Monty). Results showed that soluble solid content (SSC) and pH increased while ascorbic acid (Vit C), titrated acidity (TAD) and SSC/TAD decreased in all the cultivars with delay in harvesting. Total polyphenols (TP) decreased while total flavonoids (TF) increased in all tested cultivars with delay in harvesting. The highest concentrations of TP (2.02 mg gallic acid equivalent/g fresh weight) and TF (51.12 mg catechin equivalent/100g FW) were found in cultivar 'Allison' in the months of October and December, respectively. Antioxidant activities (AA) were genotype dependent, and no trend was observed with month of harvesting. Principal component analysis (PCA) showed strong correlation between Vit C, TP and antioxidant activities. Two major clusters were computed using agglomerative hierarchical clustering (AHC). All the studied important traits may be used in breeding programmes to increase the variability of different physicochemical and antioxidative characteristics and to make suitable selections that could be acceptable to consumers.

  12. The Application of Data Mining Techniques to Create Promotion Strategy for Mobile Phone Shop

    Science.gov (United States)

    Khasanah, A. U.; Wibowo, K. S.; Dewantoro, H. F.

    2017-12-01

    The number of mobile phone shops is growing very fast in various regions of Indonesia, including Yogyakarta, due to the increasing demand for mobile phones. This leads to high competition among the mobile phone shops. Under these conditions a mobile phone shop, especially a small one, needs a good promotion strategy in order to survive. To create an attractive promotion strategy, companies/shops should know their customer segmentation and the buying patterns of their target market. This kind of analysis can be done using data mining techniques. This study aims to segment customers using Agglomerative Hierarchical Clustering and to identify customer buying patterns using Association Rule Mining. The research was conducted in a mobile phone shop in Sleman, Yogyakarta. The clustering result shows that the biggest customer segment of the shop was male university students who come on weekends. From association rule mining, it can be concluded that tempered glass and smart phone “x”, as well as action camera, waterproof monopod and power bank, have strong relationships. These results are used to create promotion strategies, which are presented at the end of the study.
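
    As a rough illustration of the association rule mining step named in this abstract, the sketch below computes support and confidence for item pairs. It is a simplified stand-in under stated assumptions, not the study's procedure: the function name `pair_rules`, the thresholds, and the four toy transactions are invented for the example, and only two-item rules are considered.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support(A, B)   = share of transactions containing both A and B
    confidence(A→B) = count(A and B) / count(A)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicates in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Made-up baskets; item names only echo those mentioned in the abstract.
    baskets = [
        ["smartphone", "tempered glass"],
        ["smartphone", "tempered glass"],
        ["smartphone", "power bank"],
        ["action camera", "monopod"],
    ]
    for lhs, rhs, sup, conf in pair_rules(baskets, 0.5, 0.6):
        print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

    A full Apriori or FP-growth implementation would also handle larger itemsets; the pair-only version is kept here for brevity.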

  13. Simultaneous determination of 19 flavonoids in commercial trollflowers by using high-performance liquid chromatography and classification of samples by hierarchical clustering analysis.

    Science.gov (United States)

    Song, Zhiling; Hashi, Yuki; Sun, Hongyang; Liang, Yi; Lan, Yuexiang; Wang, Hong; Chen, Shizhong

    2013-12-01

    The flowers of Trollius species, named Jin Lianhua in Chinese, are widely used traditional Chinese herbs with vital biological activity that has been used for several decades in China to treat upper respiratory infections, pharyngitis, tonsillitis, and bronchitis. We developed a rapid and reliable method for simultaneous quantitative analysis of 19 flavonoids in trollflowers by using high-performance liquid chromatography (HPLC). Chromatography was performed on Inertsil ODS-3 C18 column, with gradient elution methanol-acetonitrile-water with 0.02% (v/v) formic acid. Content determination was used to evaluate the quality of commercial trollflowers from different regions in China, while three Trollius species (Trollius chinensis Bunge, Trollius ledebouri Reichb, Trollius buddae Schipcz) were explicitly distinguished by using hierarchical clustering analysis. The linearity, precision, accuracy, limit of detection, and limit of quantification were validated for the quantification method, which proved sensitive, accurate and reproducible indicating that the proposed approach was applicable for the routine analysis and quality control of trollflowers. © 2013.

  14. Superhydrophobic polytetrafluoroethylene thin films with hierarchical roughness deposited using a single step vapor phase technique

    International Nuclear Information System (INIS)

    Gupta, Sushant; Arjunan, Arul Chakkaravarthi; Deshpande, Sameer; Seal, Sudipta; Singh, Deepika; Singh, Rajiv K.

    2009-01-01

    Superhydrophobic polytetrafluoroethylene films with hierarchical surface roughness were deposited using the pulse electron deposition technique. We were able to modulate the roughness of the deposited films by controlling the beam energy and hence the electron penetration depth. The films deposited at higher beam energy showed contact angles as high as 166°. Scanning electron and atomic force microscope studies revealed clustered growth and two-level sub-micron asperities on films deposited at higher energies. The observed contact angle, and thus the superhydrophobic nature of the films, was attributed to this dual-scale hierarchical roughness and to heterogeneities at the water-surface interface.

  15. Superhydrophobic polytetrafluoroethylene thin films with hierarchical roughness deposited using a single step vapor phase technique

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, Sushant, E-mail: sushant3@ufl.ed [Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611 (United States); Arjunan, Arul Chakkaravarthi [Sinmat Incorporated, 2153 SE Hawthorne Road, 129, Gainesville, Florida 32641 (United States); Deshpande, Sameer; Seal, Sudipta [Advanced Material Processing and Analysis Center, University of Central Florida, Orlando, Florida 32816 (United States); Singh, Deepika [Sinmat Incorporated, 2153 SE Hawthorne Road, 129, Gainesville, Florida 32641 (United States); Singh, Rajiv K. [Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611 (United States)

    2009-06-30

    Superhydrophobic polytetrafluoroethylene films with hierarchical surface roughness were deposited using the pulse electron deposition technique. We were able to modulate the roughness of the deposited films by controlling the beam energy and hence the electron penetration depth. The films deposited at higher beam energy showed contact angles as high as 166°. Scanning electron and atomic force microscope studies revealed clustered growth and two-level sub-micron asperities on films deposited at higher energies. The observed contact angle, and thus the superhydrophobic nature of the films, was attributed to this dual-scale hierarchical roughness and to heterogeneities at the water-surface interface.

  16. Investigation of major international and Turkish companies via hierarchical methods and bootstrap approach

    Science.gov (United States)

    Kantar, E.; Deviren, B.; Keskin, M.

    2011-11-01

    We present a study, within the scope of econophysics, of the hierarchical structure of 98 among the largest international companies including 18 among the largest Turkish companies, namely Banks, Automobile, Software-hardware, Telecommunication Services, Energy and the Oil-Gas sectors, viewed as a network of interacting companies. We analyze the daily time series data of the Boerse-Frankfurt and Istanbul Stock Exchange. We examine the topological properties among the companies over the period 2006-2010 by using the concept of hierarchical structure methods (the minimal spanning tree (MST) and the hierarchical tree (HT)). The period is divided into three subperiods, namely 2006-2007, 2008 which was the year of global economic crisis, and 2009-2010, in order to test various time-windows and observe temporal evolution. We carry out bootstrap analyses to associate the value of statistical reliability to the links of the MSTs and HTs. We also use average linkage clustering analysis (ALCA) in order to better observe the cluster structure. From these studies, we find that the interactions among the Banks/Energy sectors and the other sectors were reduced after the global economic crisis; hence the effects of the Banks and Energy sectors on the correlations of all companies were decreased. Telecommunication Services were also greatly affected by the crisis. We also observed that the Automobile and Banks sectors, including Turkish companies as well as some companies from the USA, Japan and Germany were strongly correlated with each other in all periods.
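
    To make the minimal spanning tree (MST) step concrete, here is a small sketch of how an MST can be built from return correlations. It rests on assumptions the abstract does not state: the correlation-to-distance mapping d = sqrt(2(1 - rho)) is the one commonly used in econophysics, and the names `pearson` and `minimal_spanning_tree` and the toy series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def minimal_spanning_tree(returns):
    """Kruskal's algorithm on distances d = sqrt(2 * (1 - rho)).

    `returns` maps a company name to its return series. The distance
    formula is an assumption; the abstract does not specify it.
    """
    names = sorted(returns)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(returns[a], returns[b])
            # max() guards against tiny negative values from rounding.
            edges.append((math.sqrt(max(0.0, 2.0 * (1.0 - rho))), a, b))
    edges.sort()

    parent = {name: name for name in names}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for d, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:       # edge joins two components: keep it
            parent[root_a] = root_b
            tree.append((a, b, d))
    return tree


if __name__ == "__main__":
    toy = {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [2.0, 4.0, 6.0, 8.5],
        "C": [4.0, 3.0, 2.0, 1.0],
        "D": [4.2, 2.9, 2.0, 1.0],
    }
    for a, b, d in minimal_spanning_tree(toy):
        print(a, b, round(d, 3))
```

    Strongly correlated pairs get short distances and are linked first, so the tree keeps the N - 1 strongest connections that still span all companies.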

  17. Needle Terpenes as Chemotaxonomic Markers in Pinus: Subsections Pinus and Pinaster.

    Science.gov (United States)

    Mitić, Zorica S; Jovanović, Snežana Č; Zlatković, Bojan K; Nikolić, Biljana M; Stojanović, Gordana S; Marin, Petar D

    2017-05-01

    Chemical compositions of needle essential oils of 27 taxa from the section Pinus, including 20 and 7 taxa of the subsections Pinus and Pinaster, respectively, were compared in order to determine chemotaxonomic significance of terpenes at infrageneric level. According to analysis of variance, six out of 31 studied terpene characters were characterized by a high level of significance, indicating statistically significant difference between the examined subsections. Agglomerative hierarchical cluster analysis has shown separation of eight groups, where representatives of subsect. Pinaster were distributed within the first seven groups on the dendrogram together with P. nigra subsp. laricio and P. merkusii from the subsect. Pinus. On the other hand, the eighth group included the majority of the members of subsect. Pinus. Our findings, based on terpene characters, complement those obtained from morphological, biochemical, and molecular parameters studied over the past two decades. In addition, results presented in this article confirmed that terpenes are good markers at infrageneric level. © 2017 Wiley-VHCA AG, Zurich, Switzerland.

  18. Investigation on IMCP based clustering in LTE-M communication for smart metering applications

    Directory of Open Access Journals (Sweden)

    Kartik Vishal Deshpande

    2017-06-01

    Full Text Available Machine to Machine (M2M) is foreseen as an emerging technology for smart metering applications where devices communicate seamlessly for information transfer. M2M communication makes use of long term evolution (LTE) as its backbone network, which results in the long-term evolution for machine type communication (LTE-M) network. As a huge number of M2M devices is to be handled by a single eNB (evolved Node B), clustering is exploited for efficient processing of the network. This paper investigates the proposed Improved M2M Clustering Process (IMCP) based clustering technique and compares it with two well-known clustering algorithms, namely, the Low Energy Adaptive Clustering Hierarchical (LEACH) and Energy Aware Multihop Multipath Hierarchical (EAMMH) techniques. Further, the IMCP algorithm is analyzed with two-tier and three-tier M2M systems for various mobility conditions. The proposed IMCP algorithm improves the last node death by 63.15% and 51.61% as compared to LEACH and EAMMH, respectively. Further, the average energy of each node in IMCP is increased by 89.85% and 81.15%, as compared to LEACH and EAMMH, respectively.

  19. Study on distributed re-clustering algorithm for mobile wireless sensor networks

    Directory of Open Access Journals (Sweden)

    XU Chaojie

    2016-04-01

    Full Text Available In mobile wireless sensor networks, node mobility influences the topology of the hierarchically clustered network and thus affects the packet delivery ratio and the energy consumption of communications in clusters. To reduce the influence of node mobility, a distributed re-clustering algorithm is proposed in this paper. In this algorithm, based on the clustered network, nodes estimate their current locations with a particle algorithm and predict their most probable next locations based on the mobility model. Each boundary node of a cluster periodically estimates the need for re-clustering and, when needed, re-clusters itself to the optimal cluster by communicating with the cluster headers. The simulation results indicate that, with small re-clustering periods, the proposed algorithm is effective in keeping an appropriate communication distance and outperforms existing schemes in packet delivery ratio and energy consumption.

  20. Geographical Characterization of Tunisian Olive Tree Leaves (cv. Chemlali) Using HPLC-ESI-TOF and IT/MS Fingerprinting with Hierarchical Cluster Analysis

    Science.gov (United States)

    Arráez Román, David; Gómez Caravaca, Ana María; Zarrouk, Mokhtar

    2018-01-01

    The olive plant has been extensively studied for its nutritional value, whereas its leaves have been specifically recognized as a processing by-product. Leaves are considered by-products of olive farming, representing a significant material arriving to the olive mill. They have been considered for centuries as an important herbal remedy in Mediterranean countries. Their beneficial properties are generally attributed to the presence of a range of phytochemicals such as secoiridoids, triterpenes, lignans, and flavonoids. With the aim to study the impact of geographical location on the phenolic compounds, Olea europaea leaves were handpicked from the Tunisian cultivar “Chemlali” from nine regions in the north, center, and south of Tunisia. The ground leaves were then extracted with methanol : water 80% (v/v) and analyzed by using high-performance liquid chromatography coupled to electrospray time of flight and ion trap mass spectrometry analyzers. A total of 38 compounds could be identified. Their contents showed significant variation among samples from different regions. Hierarchical cluster analysis was applied to highlight similarities in the phytochemical composition observed between the samples of different regions. PMID:29725553

  1. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Directory of Open Access Journals (Sweden)

    Morris John H

    2011-11-01

    Full Text Available Abstract Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions: The Cytoscape plugin cluster

  2. Hierarchical functional modularity in the resting-state human brain.

    Science.gov (United States)

    Ferrarini, Luca; Veer, Ilya M; Baerends, Evelinda; van Tol, Marie-José; Renken, Remco J; van der Wee, Nic J A; Veltman, Dirk J; Aleman, André; Zitman, Frans G; Penninx, Brenda W J H; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, Julien

    2009-07-01

    Functional magnetic resonance imaging (fMRI) studies have shown that anatomically distinct brain regions are functionally connected during the resting state. Basic topological properties in the brain functional connectivity (BFC) map have highlighted the BFC's small-world topology. Modularity, a more advanced topological property, has been hypothesized to be evolutionarily advantageous, contributing to adaptive aspects of anatomical and functional brain connectivity. However, current definitions of modularity for complex networks focus on nonoverlapping clusters, and are seriously limited by disregarding inclusive relationships. Therefore, BFC's modularity has been mainly qualitatively investigated. Here, we introduce a new definition of modularity, based on a recently improved clustering measurement, which overcomes limitations of previous definitions, and apply it to the study of BFC in resting state fMRI of 53 healthy subjects. Results show hierarchical functional modularity in the brain. Copyright 2009 Wiley-Liss, Inc.

  3. Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems.

    Directory of Open Access Journals (Sweden)

    Martin Rosvall

    Full Text Available To comprehend the hierarchical organization of large integrated systems, we introduce the hierarchical map equation, which reveals multilevel structures in networks. In this information-theoretic approach, we exploit the duality between compression and pattern detection; by compressing a description of a random walker as a proxy for real flow on a network, we find regularities in the network that induce this system-wide flow. Finding the shortest multilevel description of the random walker therefore gives us the best hierarchical clustering of the network--the optimal number of levels and modular partition at each level--with respect to the dynamics on the network. With a novel search algorithm, we extract and illustrate the rich multilevel organization of several large social and biological networks. For example, from the global air traffic network we uncover countries and continents, and from the pattern of scientific communication we reveal more than 100 scientific fields organized in four major disciplines: life sciences, physical sciences, ecology and earth sciences, and social sciences. In general, we find shallow hierarchical structures in globally interconnected systems, such as neural networks, and rich multilevel organizations in systems with highly separated regions, such as road networks.

  4. Assessment of the quality of water by hierarchical cluster and variance analyses of the Koudiat Medouar Watershed, East Algeria

    Science.gov (United States)

    Tiri, Ammar; Lahbari, Noureddine; Boudoukha, Abderrahmane

    2017-12-01

    The assessment of surface water in the Koudiat Medouar watershed is very important, especially with regard to pollution of the dam waters by wastewater discharged from neighboring towns into Oued Timgad, which flows into the basin of the dam, and by agricultural lands located along Oued Reboa. To this end, a multivariate method was used to evaluate the spatial and temporal variation of the surface water quality of the Koudiat Medouar dam, eastern Algeria. The Stiff diagram identified two main hydrochemical facies. The first facies, Mg-HCO3, is found at the first sampling station (Oued Reboa) and at the second (Oued Timgad), while the second facies, Mg-SO4, is found at the third station (Basin Dam). The results of the analysis of variance show that all parameters are significant at the three stations, except for Na, K and HCO3 at the first station (Oued Reboa), EC at the second station (Oued Timgad), and NO3 and pH at the third station (Basin Dam). Q-mode hierarchical cluster analysis showed two main groups at each sampling station. The chemistry of the major ions (Mg, Ca, HCO3 and SO4) at the three stations results from anthropogenic impacts and water-rock interaction.

  5. XML documents cluster research based on frequent subpatterns

    Science.gov (United States)

    Ding, Tienan; Li, Wei; Li, Xiongfei

    2015-12-01

    XML data is widely used for information exchange on the Internet, and clustering of XML documents is an active research topic. In XML document clustering, measuring the difference between two XML documents is time-consuming and limits clustering efficiency. This paper proposes an XML document clustering method based on the frequent patterns of an XML document dataset. It first proposes a coding tree structure for encoding XML documents, which translates frequent pattern mining over XML documents into frequent pattern mining over strings. The XML document dataset is then clustered by its frequent patterns using cosine similarity and a cohesive (agglomerative) hierarchical clustering method. Because the frequent patterns are subsets of the original XML document data, the time needed to measure XML document similarity is reduced. Experiments on a synthetic dataset and on real datasets show that the method is efficient.
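
    As a minimal sketch of the similarity step described above, the following computes cosine similarity between documents represented by frequent-pattern counts. The pattern strings and counts are hypothetical placeholders, and the dictionary representation is an assumption; the abstract does not specify the encoding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    # Only patterns present in both documents contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical frequent-pattern counts for three XML documents.
doc_a = {"a/b": 2, "a/c": 1}
doc_b = {"a/b": 2, "a/c": 1}
doc_c = {"x/y": 3}

print(cosine_similarity(doc_a, doc_b))  # identical pattern profiles
print(cosine_similarity(doc_a, doc_c))  # no shared patterns
```

    Documents with no shared frequent patterns get similarity zero, so a hierarchical method working on 1 minus this similarity would merge them last.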

  6. ProtoBee: Hierarchical classification and annotation of the honey bee proteome

    OpenAIRE

    Kaplan, Noam; Linial, Michal

    2006-01-01

    The recently sequenced genome of the honey bee (Apis mellifera) has produced 10,157 predicted protein sequences, calling for a computational effort to extract biological insights from them. We have applied an unsupervised hierarchical protein-clustering method, which was previously used in the ProtoNet system, to nearly 200,000 proteins consisting of the predicted honey bee proteins, the SWISS-PROT protein database, and the complete set of proteins of the mouse (Mus musculus) and the fruit fl...

  7. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    Science.gov (United States)

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
This study shows that SCC is an alternative to the Pearson

  8. Cluster Oriented Spatio Temporal Multidimensional Data Visualization of Earthquakes in Indonesia

    Directory of Open Access Journals (Sweden)

    Mohammad Nur Shodiq

    2016-03-01

    Full Text Available Spatio-temporal data clustering is a challenging task. The results of clustering are used to investigate seismic parameters, which describe the characteristics of earthquake behavior. Visualization is an effective technique for studying multidimensional spatio-temporal data, but visualizing multidimensional data is a complicated problem because the analysis involves both the observed data clusters and the seismic parameters. In this paper, we propose a visualization system, called IES (Indonesia Earthquake System), for cluster analysis, spatio-temporal analysis, and visualization of the multidimensional seismic parameter data. We perform the cluster analysis using automatic clustering, which consists of determining the optimal number of clusters followed by hierarchical K-means clustering. We explore the visual clusters and the multidimensional data in a low-dimensional visualization space. We experimented with observed seismic data from around the Indonesian archipelago from 2004 to 2014. Keywords: clustering, visualization, multidimensional data, seismic parameters.

  9. Investigating the provenance of iron artifacts of the Royal Iron Factory of Sao Joao de Ipanema by hierarchical cluster analysis of EDS microanalyses of slag inclusions

    Energy Technology Data Exchange (ETDEWEB)

    Mamani-Calcina, Elmer Antonio; Landgraf, Fernando Jose Gomes; Azevedo, Cesar Roberto de Farias, E-mail: c.azevedo@usp.br [Universidade de Sao Paulo (USP), Sao Paulo, SP (Brazil). Escola Politecnica. Departmento de Engenharia Metalurgica e de Materiais

    2017-01-15

    Microstructural characterization techniques, including EDX (Energy Dispersive X-ray Analysis) microanalyses, were used to investigate the slag inclusions in the microstructure of ferrous artifacts of the Royal Iron Factory of Sao Joao de Ipanema (first steel plant of Brazil, XIX century), the D. Pedro II Bridge (located in Bahia, assembled in XIX century and produced in Scotland) and the archaeological sites of Sao Miguel de Missoes (Rio Grande do Sul, Brazil, production site of iron artifacts, the XVIII century) and Afonso Sardinha (Sao Paulo, Brazil production site of iron artifacts, XVI century). The microanalyses results of the main micro constituents of the microstructure of the slag inclusions were investigated by hierarchical cluster analysis and the dendrogram with the microanalyses results of the wüstite phase (using as critical variables the contents of MnO, MgO, Al2O3, V2O5 and TiO2) allowed the identification of four clusters, which successfully represented the samples of the four investigated sites (Ipanema, Sardinha, Missoes and Bahia). Finally, the comparatively low volumetric fraction of slag inclusions in the samples of Ipanema (∼1%) suggested the existence of technological expertise at the iron making processing in the Royal Iron Factory of Sao Joao de Ipanema. (author)

  10. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Full Text Available Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention, but very few studies have evaluated the effect of the hierarchical structure of data on DIF detection for polytomously scored items. Methods. Ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  11. Transferability of STS markers in studying genetic relationships of marvel grass (Dichanthium annulatum).

    Science.gov (United States)

    Saxena, Raghvendra; Chandra, Amaresh

    2011-11-01

    Transferability of sequence-tagged-site (STS) markers was assessed for studying genetic relationships among accessions of marvel grass (Dichanthium annulatum Forsk.). In total, 17 STS primers of Stylosanthes origin were tested for their reactivity with thirty accessions of Dichanthium annulatum. Of these, 14 (82.4%) reacted and a total of 106 bands (84 polymorphic) were scored. The number of bands generated by individual primer pairs ranged from 4 to 11 with an average of 7.57 bands, whereas polymorphic bands ranged from 4 to 9 with an average of 6.0 bands, corresponding to an average polymorphism of 80.1%. Polymorphic information content (PIC) ranged from 0.222 to 0.499 and marker index (MI) from 1.33 to 4.49. A dendrogram was generated from the Dice coefficient of genetic similarity using the unweighted pair group method with arithmetic mean (UPGMA) algorithm. Further, clustering through the sequential agglomerative hierarchical and nested (SAHN) method resulted in three main clusters containing all accessions except IGBANG-D-2. Although a few accessions from one agro-climatic region were intermixed with those of another, accessions largely grouped with their regions of collection. Bootstrap analysis with 1000 replicates also showed a large number of nodes (11 to 17) with strong clustering (> 50). Thus, the results demonstrate the utility of STS markers of Stylosanthes in studying the genetic relationships among accessions of Dichanthium.
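
    The Dice-plus-UPGMA step named in this abstract can be sketched as follows. This is an illustration only: the accession names and band profiles are invented, and the naive average-linkage loop is an assumption rather than the software the authors used.

```python
def dice_similarity(x, y):
    """Dice coefficient 2a / (2a + b + c) for two binary band profiles."""
    shared = sum(1 for i, j in zip(x, y) if i and j)
    total = sum(x) + sum(y)
    return 2 * shared / total if total else 0.0


def upgma(profiles):
    """Return the merge history [(cluster_a, cluster_b, distance), ...]."""
    names = list(profiles)
    # Pairwise distances between single accessions: 1 - Dice similarity.
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist[frozenset((a, b))] = 1 - dice_similarity(profiles[a], profiles[b])

    def average_distance(c1, c2):
        # UPGMA: unweighted mean over all member pairs of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        pairs = [(average_distance(c1, c2), c1, c2)
                 for i, c1 in enumerate(clusters) for c2 in clusters[i + 1:]]
        d, c1, c2 = min(pairs, key=lambda p: p[0])
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges


# Hypothetical presence/absence band profiles for three accessions.
profiles = {"acc1": [1, 1, 0, 1], "acc2": [1, 1, 0, 0], "acc3": [0, 0, 1, 1]}
for left, right, d in upgma(profiles):
    print(left, right, round(d, 3))
```

    The merge history is the information a dendrogram draws: each tuple records which two groups joined and at what average distance.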

  12. An evaluation of centrality measures used in cluster analysis

    Science.gov (United States)

    Engström, Christopher; Silvestrov, Sergei

    2014-12-01

    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large, as they often are in, for example, bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work has been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as those mentioned above and others such as PageRank and related measures.
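
    One way to picture centrality-based seeding is the sketch below, which picks initial centroids by degree centrality. The toy graph and the rule of skipping nodes adjacent to an already chosen seed are assumptions made for illustration; the article evaluates several centrality measures and is not limited to this one.

```python
def degree_seeds(adjacency, k):
    """Pick up to k seed nodes by descending degree, skipping neighbours of chosen seeds."""
    # Sort by degree (highest first); break ties by node name for determinism.
    ranked = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
    seeds = []
    for node in ranked:
        # Assumed heuristic: avoid two seeds that are directly connected.
        if all(node not in adjacency[s] for s in seeds):
            seeds.append(node)
        if len(seeds) == k:
            break
    return seeds


# Hypothetical undirected graph: two small stars joined through node "d".
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f", "g"},
    "f": {"e"},
    "g": {"e"},
}
print(degree_seeds(adjacency, 2))
```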

  13. MCBT: Multi-Hop Cluster Based Stable Backbone Trees for Data Collection and Dissemination in WSNs

    Directory of Open Access Journals (Sweden)

    Tae-Jin Lee

    2009-07-01

    Full Text Available We propose a stable backbone tree construction algorithm using multi-hop clusters for wireless sensor networks (WSNs). The hierarchical cluster structure has advantages in data fusion and aggregation. Energy consumption can be decreased by managing nodes with cluster heads. Backbone nodes, which are responsible for performing and managing multi-hop communication, can reduce the communication overhead such as control traffic and minimize the number of active nodes. Previous backbone construction algorithms, such as Hierarchical Cluster-based Data Dissemination (HCDD) and Multicluster, Mobile, Multimedia radio network (MMM), consume energy quickly. They are designed without regard to appropriate factors such as residual energy and degree (the number of connections or edges to other nodes) of a node for WSNs. Thus, the network is quickly disconnected or has to reconstruct a backbone. We propose a distributed algorithm to create a stable backbone by selecting the nodes with higher energy or degree as the cluster heads. This increases the overall network lifetime. Moreover, the proposed method balances energy consumption by distributing the traffic load among nodes around the cluster head. In the simulation, the proposed scheme outperforms previous clustering schemes in terms of the average and the standard deviation of residual energy or degree of backbone nodes, the average residual energy of backbone nodes after disseminating the sensed data, and the network lifetime.
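
    The idea of choosing cluster heads by residual energy or degree can be illustrated with a purely local rule, sketched below. The rule (a node becomes a head when it beats all of its neighbours on energy, then degree, then node id), the topology and the energy values are all assumptions for illustration; the abstract does not give the actual election procedure of MCBT.

```python
def elect_cluster_heads(neighbors, energy):
    """Return nodes that outrank every neighbour on (energy, degree, id)."""
    def score(node):
        # Assumed priority: residual energy first, then degree, then node id.
        return (energy[node], len(neighbors[node]), node)

    return sorted(
        node for node in neighbors
        if all(score(node) > score(other) for other in neighbors[node])
    )


# Hypothetical 5-node topology with residual energies between 0 and 1.
neighbors = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
energy = {1: 0.9, 2: 0.5, 3: 0.7, 4: 0.4, 5: 0.8}
print(elect_cluster_heads(neighbors, energy))
```

    Because each node only compares itself with its direct neighbours, a rule of this kind can run in a distributed fashion without global knowledge of the network.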

  14. Dynamic Hierarchical Sleep Scheduling for Wireless Ad-Hoc Sensor Networks

    Directory of Open Access Journals (Sweden)

    Chih-Yu Wen

    2009-05-01

    Full Text Available This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show that the proposed algorithms provide efficient network power control and can achieve high scalability in wireless sensor networks.

  15. Dynamic hierarchical sleep scheduling for wireless ad-hoc sensor networks.

    Science.gov (United States)

    Wen, Chih-Yu; Chen, Ying-Chih

    2009-01-01

    This paper presents two scheduling management schemes for wireless sensor networks, which manage the sensors by utilizing the hierarchical network structure and allocate network resources efficiently. A local criterion is used to simultaneously establish the sensing coverage and connectivity such that dynamic cluster-based sleep scheduling can be achieved. The proposed schemes are simulated and analyzed to abstract the network behaviors in a number of settings. The experimental results show that the proposed algorithms provide efficient network power control and can achieve high scalability in wireless sensor networks.

  16. New Heterogeneous Clustering Protocol for Prolonging Wireless Sensor Networks Lifetime

    Directory of Open Access Journals (Sweden)

    Md. Golam Rashed

    2014-06-01

    Full Text Available Clustering is one of the crucial methods for increasing the lifetime of wireless sensor networks. Existing classical clustering protocols for wireless sensor networks assume homogeneous network characteristics. Such clustering protocols fail to maintain the stability of the system, especially when nodes are heterogeneous. We have seen that the behavior of the Heterogeneous-Hierarchical Energy Aware Routing Protocol (H-HEARP) becomes very unstable once the first node dies, especially in the presence of node heterogeneity. In this paper we propose a new clustering protocol with heterogeneous network characteristics for prolonging network lifetime. The computer simulation results demonstrate that the proposed clustering algorithm outperforms other clustering algorithms in terms of the time interval before the death of the first node (which we refer to as the stability period). The simulation results also show the high performance of the proposed clustering algorithm for higher values of extra energy brought by more powerful nodes.

  17. Hierarchical temporal structure in music, speech and animal vocalizations: jazz is like a conversation, humpbacks sing like hermit thrushes.

    Science.gov (United States)

    Kello, Christopher T; Bella, Simone Dalla; Médé, Butovens; Balasubramaniam, Ramesh

    2017-10-01

    Humans talk, sing and play music. Some species of birds and whales sing long and complex songs. All these behaviours and sounds exhibit hierarchical structure: syllables and notes are positioned within words and musical phrases, words and motives in sentences and musical phrases, and so on. We developed a new method to measure and compare hierarchical temporal structures in speech, song and music. The method identifies temporal events as peaks in the sound amplitude envelope, and quantifies event clustering across a range of timescales using Allan factor (AF) variance. AF variances were analysed and compared for over 200 different recordings from more than 16 different categories of signals, including recordings of speech in different contexts and languages, musical compositions and performances from different genres. Non-human vocalizations from two bird species and two types of marine mammals were also analysed for comparison. The resulting patterns of AF variance across timescales were distinct to each of four natural categories of complex sound: speech, popular music, classical music and complex animal vocalizations. Comparisons within and across categories indicated that nested clustering in longer timescales was more prominent when prosodic variation was greater, and when sounds came from interactions among individuals, including interactions between speakers, musicians, and even killer whales. Nested clustering also was more prominent for music compared with speech, and reflected beat structure for popular music and self-similarity across timescales for classical music. In summary, hierarchical temporal structures reflect the behavioural and social processes underlying complex vocalizations and musical performances. © 2017 The Author(s).
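
    For readers unfamiliar with the statistic, the sketch below computes the standard Allan factor for a single timescale: the mean squared difference of event counts in adjacent windows, divided by twice the mean count. The event times are invented, and the exact windowing and normalization used in the paper are not stated in the abstract, so this is an assumed textbook form.

```python
def allan_factor(event_times, window, duration):
    """Allan factor at one timescale: E[(N[i+1] - N[i])^2] / (2 * E[N])."""
    n_windows = int(duration // window)
    counts = [0] * n_windows
    for t in event_times:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    if n_windows < 2 or sum(counts) == 0:
        raise ValueError("need at least two windows and one event")
    # Squared differences between counts in adjacent windows.
    sq_diffs = [(counts[i + 1] - counts[i]) ** 2 for i in range(n_windows - 1)]
    mean_sq_diff = sum(sq_diffs) / len(sq_diffs)
    mean_count = sum(counts) / n_windows
    return mean_sq_diff / (2 * mean_count)


# Hypothetical event trains over 8 time units: evenly spaced vs. bursty.
regular = [0.5 + i for i in range(8)]
bursty = [0.1, 0.2, 0.3, 0.4, 4.1, 4.2, 4.3, 4.4]
print(allan_factor(regular, 1, 8))
print(allan_factor(bursty, 1, 8))
```

    Evenly spaced events give a value of zero, while bursty events give values above one; repeating the calculation over a range of window sizes yields the kind of timescale profile the abstract describes.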

  18. Decentralized cooperation between the Communauté d’Agglomération de Castres-Mazamet (France) and the City of Guédiawaye (Senegal)

    Directory of Open Access Journals (Sweden)

    Françoise Desbordes

    2016-02-01

    Full Text Available The working community brought together at the regional meetings of the eAtlas francophone de l’Afrique de l’ouest expressed its intention to set up a collaborative platform, an information tool on the relationships between societies, ICT and territories in West Africa. The design of a collaborative portal dedicated to local governance and sustainable development in the Dakar agglomeration was selected as one of these projects. Its objective is to contribute to the observation and ...

  19. Hierarchical clustering of Alzheimer and 'normal' brains using elemental concentrations and glucose metabolism determined by PIXE, INAA and PET

    International Nuclear Information System (INIS)

    Cutts, D.A.; Spyrou, N.M.

    2001-01-01

    Brain tissue samples, obtained from the Alzheimer Disease Brain Bank, Institute of Psychiatry, London, were taken from both left and right hemispheres of three regions of the cerebrum, namely the frontal, parietal and occipital lobes for both Alzheimer and 'normal' subjects. Trace element concentrations in the frontal lobe were determined for twenty six Alzheimer (15 male, 11 female) and twenty six 'normal' (8 male, 18 female) brain tissue samples. In the parietal lobe ten Alzheimer (2 male, 8 female) and ten 'normal' (8 male, 2 female) samples were taken along with ten Alzheimer (4 male, 6 female) and ten 'normal' (6 male, 4 female) from the occipital lobe. For the frontal lobe trace element concentrations were determined using proton induced X-ray emission (PIXE) analysis while in parietal and occipital regions instrumental neutron activation analysis (INAA) was used. Additionally eighteen Alzheimer (9 male, 9 female) and eighteen age matched 'normal' (8 male, 10 female) living subjects were examined using positron emission tomography (PET) in order to determine regional cerebral metabolic rates of glucose (rCMRGlu). The rCMRGlu of 36 regions of the brain was investigated including frontal, occipital and parietal lobes as in the trace element study. Hierarchical cluster analysis was applied to the trace element and glucose metabolism data to discover which variables in the resulting dendrograms displayed the most significant separation between Alzheimer and 'normal' subjects. (author)

  20. Clusters of Tourism Consumers in Romania

    Directory of Open Access Journals (Sweden)

    Pelau Corina

    2018-03-01

    Full Text Available The analysis and determination of typologies of tourism consumers has been a major concern for scientists, specialists and companies alike. Knowing the demographic and motivational factors that lead consumers to buy tourism products can have a major impact on marketing strategy through more efficient targeting of customers. This article presents the results of research that aims to determine the factors which influence the buying decision for tourism products and the clusters of consumers resulting from these factors. Ninety persons were surveyed to determine the most important factors for buying a tourism product and the correlations between them. Factor analysis and cluster analysis were applied using the SPSS program. The factor analysis groups the items into six factors. In a second phase, the consumers were divided into three categories based on a hierarchical Ward cluster analysis. The three clusters were defined and analyzed, and recommendations for future research are given.

  1. A rapid ATR-FTIR spectroscopic method for detection of sibutramine adulteration in tea and coffee based on hierarchical cluster and principal component analyses.

    Science.gov (United States)

    Cebi, Nur; Yilmaz, Mustafa Tahsin; Sagdic, Osman

    2017-08-15

    Sibutramine may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. Considering public health and legal regulations, there is an urgent need for effective, rapid and reliable techniques to detect sibutramine in dietetic herbal foods, teas and dietary supplements. This research comprehensively explored, for the first time, detection of sibutramine in green tea, green coffee and mixed herbal tea using the ATR-FTIR spectroscopic technique combined with chemometrics. Hierarchical cluster analysis and principal component analysis (PCA) were employed in the spectral range 2746-2656 cm⁻¹ for classification and discrimination using Euclidean distance and Ward's algorithm. Unadulterated and adulterated samples were classified and discriminated with respect to their sibutramine contents with perfect accuracy and without any false prediction. The results suggest that the presence of the active substance could be successfully determined at levels in the range of 0.375-12 mg in a total of 1.75 g of green tea, green coffee and mixed herbal tea by using the FTIR-ATR technique combined with chemometrics. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Aspects of Sentence Retrieval

    Science.gov (United States)

    2006-09-01

    length of pauses between speakers, to determine story segments in ASR texts. Eichmann and Srinivasan [39] use agglomerative clustering, starting with...Austrians believe they fought nobly or at least dutifully, Eichmann ran the Third Reich’s racial pogroms from Vienna and, after the war, the Allies for...39] Eichmann, David, and Srinivasan, Padmini. A cluster-based approach to broadcast news. In Topic Detection and Tracking, James Allan, Ed

  3. Self-similar hierarchical energetics in the ICM of massive galaxy clusters

    Science.gov (United States)

    Miniati, Francesco; Beresnyak, Andrey

    Massive galaxy clusters (GC) are filled with a hot, turbulent and magnetised intra-cluster medium (ICM). They are still forming under the action of gravitational instability, which drives supersonic mass accretion flows. These partially dissipate into heat through a complex network of large scale shocks, and partly excite giant turbulent eddies and cascade. Turbulence dissipation not only contributes to heating of the ICM but also amplifies magnetic energy by way of dynamo action. The pattern of gravitational energy turning into kinetic, thermal, turbulent and magnetic is a fundamental feature of GC hydrodynamics but quantitative modelling has remained a challenge. In this contribution we present results from a recent high resolution, fully cosmological numerical simulation of a massive Coma-like galaxy cluster in which the time dependent turbulent motions of the ICM are resolved (Miniati 2014) and their statistical properties are quantified for the first time (Miniati 2015, Beresnyak & Miniati 2015). We combine these results with independent state-of-the-art numerical simulations of MHD turbulence (Beresnyak 2012), which shows that in the nonlinear regime of turbulent dynamo (for magnetic Prandtl numbers $\gtrsim 1$) the growth rate of the magnetic energy corresponds to a fraction $C_E \simeq (4\text{-}5) \times 10^{-2}$ of the turbulent dissipation rate. We thus determine without adjustable parameters the thermal, turbulent and magnetic history of giant GC (Miniati & Beresnyak 2015). We find that the energy components of the ICM are ordered according to a permanent hierarchy, in which the sonic Mach number at the turbulent injection scale is of order unity, the beta of the plasma of order forty and the ratio of turbulent injection scale to Alfvén scale is of order one hundred. 
These dimensionless numbers remain virtually unaltered throughout the cluster's history, despite evolution of each individual component and the drive towards equipartition of the turbulent dynamo, thus revealing a new

  4. A Bone Sample Containing a Bone Graft Substitute Analyzed by Correlating Density Information Obtained by X-ray Micro Tomography with Compositional Information Obtained by Raman Microscopy

    Directory of Open Access Journals (Sweden)

    Johann Charwat-Pessler

    2015-06-01

    Full Text Available The ability of bone graft substitutes to promote new bone formation has been increasingly used in the medical field to repair skeletal defects or to replace missing bone in a broad range of applications in dentistry and orthopedics. A common way to assess such materials is via micro computed tomography (µ-CT), through the density information content provided by the absorption of X-rays. Information on the chemical composition of a material can be obtained via Raman spectroscopy. By investigating a bone sample from miniature pigs containing the bone graft substitute Bio Oss®, we aimed to assess to what extent the density information gained by µ-CT imaging matches the chemical information content provided by Raman spectroscopic imaging. Raman images and Raman correlation maps of the investigated sample were used to generate a Raman-based segmented image by means of an agglomerative hierarchical cluster analysis. The resulting segments, showing chemically related areas, were subsequently compared with the µ-CT image by means of a one-way ANOVA. We found that, to a certain extent, typical gray-level values (and the related histograms) in the µ-CT image can be reliably related to specific segments within the image resulting from the cluster analysis.

  5. Spatial Air Quality Modelling Using Chemometrics Techniques: A Case Study in Peninsular Malaysia

    International Nuclear Information System (INIS)

    Azman Azid; Hafizan Juahir; Mohammad Azizi Amran; Zarizal Suhaili; Mohamad Romizan Osman; Asyaari Muhamad; Ismail Zainal Abidin; Nur Hishaam Sulaiman; Ahmad Shakir Mohd Saudi

    2015-01-01

    This study shows the effectiveness of hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), principal component analysis (PCA), and multiple linear regression (MLR) for the assessment of air quality data and the recognition of air pollution sources. Twelve months of data (January-December 2007) from 14 stations in Peninsular Malaysia with 14 parameters were used. Three significant clusters, low pollution source (LPS), moderate pollution source (MPS), and slightly high pollution source (SHPS), were generated via HACA. Forward stepwise DA discriminated eight variables, whereas backward stepwise DA discriminated nine variables out of fourteen. The PCA and FA results show that the main contributor to air pollution in Peninsular Malaysia is the combustion of fossil fuel from industrial activities, transportation and agriculture systems. Four MLR models show that PM10 is the largest pollution contributor to Malaysian air quality. From the study, it can be concluded that the application of chemometric techniques can disclose meaningful information on the spatial variability of large and complex air quality data. A clearer view of the air quality and a novel design of the air quality monitoring network for better management of air pollution can be achieved via these methods. (author)

  6. Chemical Polymorphism of Essential Oils of Artemisia vulgaris Growing Wild in Lithuania.

    Science.gov (United States)

    Judzentiene, Asta; Budiene, Jurga

    2018-02-01

    Compositional variability of mugwort (Artemisia vulgaris L.) essential oils was investigated in this study. Plant material (above-ground parts at full flowering stage) was collected from forty-four wild populations in Lithuania. The oils from aerial parts were obtained by hydrodistillation and analyzed by GC(FID) and GC/MS. In total, up to 111 components were determined in the oils. The major constituents were sabinene, 1,8-cineole, artemisia ketone, both thujone isomers, camphor, cis-chrysanthenyl acetate, davanone and davanone B. The compositional data were subjected to statistical analysis. The application of PCA (Principal Component Analysis) and AHC (Agglomerative Hierarchical Clustering) allowed the oils to be grouped into six clusters. AHC made it possible to distinguish an artemisia ketone chemotype, which, to the best of our knowledge, is very scarce. Additionally, two rare oil types, cis-chrysanthenyl acetate and sabinene, were determined for the plants growing in Lithuania. Moreover, davanone was found for the first time as a principal component in mugwort oils. The study revealed significant chemical polymorphism of essential oils in mugwort plants native to Lithuania; it has expanded our chemotaxonomic knowledge of both the A. vulgaris species and the Artemisia genus. © 2018 Wiley-VHCA AG, Zurich, Switzerland.

  7. IDENTIFICAÇÃO DE CLUSTERS INTERNACIONAIS COM BASE NAS DIMENSÕES CULTURAIS DE HOFSTEDE. / Identification of international clusters based on Hofstede’s cultural dimensions

    Directory of Open Access Journals (Sweden)

    Valderí de Castro Alcântara

    2012-08-01

    , K-Means Cluster Analysis and Discriminant Analysis to determine and validate groupings of countries based on Hofstede’s cultural dimensions (Distance Index, Individualism, Masculinity and Uncertainty Avoidance Index). The results led to four clusters: Cluster 1 - countries with masculine and individualistic culture; Cluster 2 - collectivistic and uncertainty averse; Cluster 3 - feminine culture and low hierarchical distance; and Cluster 4 - culture with high hierarchical distance and propensity to uncertainty.

  8. Hierarchical Solution of the Traveling Salesman Problem with Random Dyadic Tilings

    Science.gov (United States)

    Kalmár-Nagy, Tamás; Bak, Bendegúz Dezső

    We propose a hierarchical heuristic approach for solving the Traveling Salesman Problem (TSP) in the unit square. The points are partitioned with a random dyadic tiling and clusters are formed by the points located in the same tile. Each cluster is represented by its geometrical barycenter and a “coarse” TSP solution is calculated for these barycenters. Midpoints are placed at the middle of each edge in the coarse solution. Near-optimal (or optimal) minimum tours are computed for each cluster. The tours are concatenated using the midpoints yielding a solution for the original TSP. The method is tested on random TSPs (independent, identically distributed points in the unit square) up to 10,000 points as well as on a popular benchmark problem (att532 — coordinates of 532 American cities). Our solutions are 8-13% longer than the optimal ones. We also present an optimization algorithm for the partitioning to improve our solutions. This algorithm further reduces the solution errors (by several percent using 1000 iteration steps). The numerical experiments demonstrate the viability of the approach.
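
    The first two steps of the heuristic described above (grouping points by tile and representing each cluster by its barycenter) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes a uniform k x k grid in place of the random dyadic tiling, and the greedy nearest-neighbour ordering is only a stand-in for whatever solver produces the coarse tour. The function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def tile_barycenters(points, k):
    """Group points of the unit square into a k x k grid and return the
    barycenter of each non-empty tile.

    Assumption: a uniform grid is used here, not the random dyadic tiling
    described in the abstract.
    """
    tiles = defaultdict(list)
    for x, y in points:
        col = min(int(x * k), k - 1)  # clamp x == 1.0 into the last column
        row = min(int(y * k), k - 1)  # clamp y == 1.0 into the last row
        tiles[(row, col)].append((x, y))
    return {
        tile: (
            sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts),
        )
        for tile, pts in tiles.items()
    }


def nearest_neighbour_tour(nodes):
    """Return a greedy visiting order over nodes, starting at index 0.

    Assumption: this is a simple stand-in for the coarse TSP solution and
    is not optimal in general.
    """
    if not nodes:
        return []
    unvisited = set(range(1, len(nodes)))
    tour = [0]
    while unvisited:
        last = nodes[tour[-1]]
        nxt = min(unvisited, key=lambda i: dist(last, nodes[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

    In the full method, a tour would also be computed inside each tile and the pieces joined at the edge midpoints of the coarse tour; that part is omitted here.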

  9. Detecting Hierarchical Structure in Networks

    DEFF Research Database (Denmark)

    Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard

    2012-01-01

    Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data have focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure. On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account of the network.

  10. Forecasting building energy consumption with hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)

    2010-11-15

    There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model, is developed. In this model, the hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are compared with those of a regular NN method, indicating that the GA-HANFIS model outperforms NNs in forecasting accuracy. (author)

  11. TWO-STAGE CHARACTER CLASSIFICATION : A COMBINED APPROACH OF CLUSTERING AND SUPPORT VECTOR CLASSIFIERS

    NARCIS (Netherlands)

    Vuurpijl, L.; Schomaker, L.

    2000-01-01

    This paper describes a two-stage classification method for (1) classification of isolated characters and (2) verification of the classification result. Character prototypes are generated using hierarchical clustering. For those prototypes known to sometimes produce wrong classification results, a

  12. A Cluster Analysis of Personality Style in Adults with ADHD

    Science.gov (United States)

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita

    2008-01-01

    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  13. Interactive visual exploration and refinement of cluster assignments.

    Science.gov (United States)

    Kern, Michael; Lex, Alexander; Gehlenborg, Nils; Johnson, Chris R

    2017-09-12

    With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.

  14. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    Science.gov (United States)

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weakness and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.

  15. An Integrated Intrusion Detection Model of Cluster-Based Wireless Sensor Network.

    Science.gov (United States)

    Sun, Xuemei; Yan, Bo; Zhang, Xinzhong; Rong, Chuitian

    2015-01-01

    Considering wireless sensor network characteristics, this paper combines anomaly and misuse detection and proposes an integrated detection model for cluster-based wireless sensor networks, aiming to enhance the detection rate and reduce the false rate. The Adaboost algorithm with hierarchical structures is used for anomaly detection of sensor nodes, cluster-head nodes and Sink nodes. Back Propagation optimized by the Cultural Algorithm and the Artificial Fish Swarm Algorithm is applied to misuse detection at the Sink node. Extensive simulations demonstrate that this integrated model performs strongly in intrusion detection.

  16. Clustering cliques for graph-based summarization of the biomedical research literature

    DEFF Research Database (Denmark)

    Zhang, Han; Fiszman, Marcelo; Shin, Dongwook

    2013-01-01

    Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). Results: SemRep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm...

  17. An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins.

    Directory of Open Access Journals (Sweden)

    Angela F Harper

    2017-02-01

    Full Text Available Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially: MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is

  18. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    International Nuclear Information System (INIS)

    Harmon, S; Wendelberger, B; Jeraj, R

    2014-01-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). The Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (mean JI = 0.3429 ± 0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms, we found
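
    The abstract does not say how the Jaccard Index was computed for two sets of cluster assignments. One common choice, assumed here purely for illustration, is the pair-counting form: the number of patient pairs placed together in both clusterings divided by the number placed together in at least one. The function name `pair_jaccard` is hypothetical.

```python
from itertools import combinations


def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index between two clusterings of the same items.

    Assumption: this is one standard definition, not necessarily the one
    used in the study. Cluster label values themselves do not matter, only
    which items share a label.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        if same_a or same_b:
            either += 1
    # If no pair is co-clustered in either clustering (all singletons),
    # the two clusterings agree trivially.
    return both / either if either else 1.0
```

    Under this definition a value of 1 means the two clusterings group the patients identically, even if the cluster labels are permuted.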

  19. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Harmon, S; Wendelberger, B [University of Wisconsin-Madison, Madison, WI (United States); Jeraj, R [University of Wisconsin-Madison, Madison, WI (United States); University of Ljubljana (Slovenia)

    2014-06-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). The Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (mean JI = 0.3429 ± 0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms

  20. Catalysis with hierarchical zeolites

    DEFF Research Database (Denmark)

    Holm, Martin Spangsberg; Taarning, Esben; Egeblad, Kresten

    2011-01-01

    Hierarchical (or mesoporous) zeolites have attracted significant attention during the first decade of the 21st century, and so far this interest continues to increase. There have already been several reviews giving detailed accounts of the developments emphasizing different aspects of this research topic. Until now, the main reason for developing hierarchical zeolites has been to achieve heterogeneous catalysts with improved performance but this particular facet has not yet been reviewed in detail. Thus, the present paper summarizes and categorizes the catalytic studies utilizing hierarchical zeolites that have been reported hitherto. Prototypical examples from some of the different categories of catalytic reactions that have been studied using hierarchical zeolite catalysts are highlighted. This clearly illustrates the different ways that improved performance can be achieved with this family...

  1. The Design of Cluster Randomized Trials with Random Cross-Classifications

    Science.gov (United States)

    Moerbeek, Mirjam; Safarkhani, Maryam

    2018-01-01

    Data from cluster randomized trials do not always have a pure hierarchical structure. For instance, students are nested within schools that may be crossed by neighborhoods, and soldiers are nested within army units that may be crossed by mental health-care professionals. It is important that the random cross-classification is taken into account…

  2. A Hybrid Fuzzy Multi-hop Unequal Clustering Algorithm for Dense Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Shawkat K. Guirguis

    2017-01-01

    Full Text Available Clustering is carried out to explore and solve the power dissipation problem in wireless sensor networks (WSNs). Hierarchical network architecture, based on clustering, can reduce energy consumption, balance traffic load, improve scalability, and prolong network lifetime. However, clustering faces two main challenges: the hotspot problem and the search for effective techniques to perform clustering. This paper introduces a fuzzy unequal clustering technique for heterogeneous dense WSNs to determine both final cluster heads and their radii. The proposed fuzzy system blends three effective parameters: the distance to the base station, the density of the cluster, and the deviation of the node's residual energy from the average network energy. Our objectives are to achieve gains in network lifetime, energy distribution, and energy consumption. To evaluate the proposed algorithm, WSN clustering-based routing algorithms are analyzed, simulated, and compared with the obtained results. These protocols are LEACH, SEP, HEED, EEUC, and MOFCA.

  3. Canonical PSO Based K-Means Clustering Approach for Real Datasets.

    Science.gov (United States)

    Dey, Lopamudra; Chakraborty, Sanjay

    2014-01-01

    Clustering is a technique whose significance and applications span various fields. It is an unsupervised process in data mining, which is why the proper evaluation of the results and the measurement of the compactness and separability of the clusters are important issues. The procedure of evaluating the results of a clustering algorithm is known as cluster validity measurement. Different types of indices are used to solve different types of problems, and index selection depends on the kind of available data. This paper first proposes a Canonical PSO based K-means clustering algorithm, then analyses some important clustering indices (intercluster, intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and Hierarchical clustering algorithms. This paper also describes the nature of the clusters and finally compares the performances of these clustering algorithms according to the validity assessment. It also identifies which of these algorithms is most desirable for producing properly compact clusters on these particular real-life datasets. It deals with the behaviour of these clustering algorithms with respect to validation indices and presents their evaluation results in mathematical and graphical forms.

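    The abstract above names intracluster and intercluster indices without defining them. The sketch below assumes two simple centroid-based definitions (mean point-to-centroid distance for compactness, minimum centroid-to-centroid distance for separation); the paper may use different ones, and the function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def cluster_centroids(points, labels):
    """Return a mapping from cluster label to the centroid of its points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for label, pts in groups.items()
    }


def intracluster_distance(points, labels):
    """Mean distance from each point to its own centroid (lower = more compact).

    Assumption: one possible compactness index, chosen for illustration.
    """
    cents = cluster_centroids(points, labels)
    total = sum(dist(p, cents[label]) for p, label in zip(points, labels))
    return total / len(points)


def intercluster_distance(points, labels):
    """Smallest distance between two centroids (higher = better separated).

    Assumption: one possible separation index; needs at least two clusters.
    """
    cents = list(cluster_centroids(points, labels).values())
    if len(cents) < 2:
        raise ValueError("at least two clusters are required")
    return min(dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:])
```

    A clustering with a small intracluster value and a large intercluster value would count as compact and well separated under these assumed definitions.
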
  4. Gene expression data clustering and its application in differential analysis of leukemia

    Directory of Open Access Journals (Sweden)

    M. Vahedi

    2008-02-01

    Full Text Available Introduction: The DNA microarray technique is one of the most important areas of bioinformatics. By making it possible to monitor thousands of expressed genes, it has recently led to the creation of giant databases of gene expression data. Statistical analysis of such databases includes normalization, clustering, classification, etc. Materials and Methods: Golub et al. (1999) collected leukemia databases based on the oligonucleotide method. The data are available on the internet. In this paper, we analyzed gene expression data. It was clustered by several methods, including multi-dimensional scaling, hierarchical and non-hierarchical clustering. The data set included 20 Acute Lymphoblastic Leukemia (ALL) patients and 14 Acute Myeloid Leukemia (AML) patients. The results of two methods of clustering were compared with regard to the real grouping (ALL & AML). R software was used for data analysis. Results: Specificity and sensitivity of divisive hierarchical clustering in diagnosing ALL patients were 75% and 92%, respectively. Specificity and sensitivity of partitioning around medoids in diagnosing ALL patients were 90% and 93%, respectively. These results showed that both clustering methods performed well. Notably, according to the clustering results, one of the samples was placed in the ALL group although it was in the AML group in the clinical test. Conclusion: Given the concordance of the results with the real grouping of the data, we can use these methods in cases where we do not have accurate information on the real grouping of the data. Moreover, the results of clustering might distinguish subgroups of data in a way that would be necessary for concordance with clinical outcomes, laboratory results and so on.
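
    Sensitivity and specificity of a two-cluster solution against the known ALL/AML grouping can be computed as sketched below. The sketch assumes each cluster has already been mapped to a diagnosis (for example by majority vote), which the abstract does not describe; the function name and the toy labels in the usage note are hypothetical and are not the study's data.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="ALL"):
    """Return (sensitivity, specificity) treating `positive` as the target class.

    Assumption: predicted_labels already carry class names, i.e. clusters
    have been mapped to diagnoses beforehand.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly recovered
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case wrongly called positive
            else:
                tn += 1  # negative case correctly rejected
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

    For instance, with true labels ALL, ALL, ALL, ALL, AML, AML and cluster-derived labels ALL, ALL, ALL, AML, AML, ALL, the function returns a sensitivity of 0.75 and a specificity of 0.5.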

  5. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  6. Hierarchical prisoner’s dilemma in hierarchical game for resource competition

    Science.gov (United States)

    Fujimoto, Yuma; Sagawa, Takahiro; Kaneko, Kunihiko

    2017-07-01

    Dilemmas in cooperation are one of the major concerns in game theory. In a public goods game, each individual cooperates by paying a cost or defecting without paying it, and receives a reward from the group out of the collected cost. Thus, defecting is beneficial for each individual, while cooperation is beneficial for the group. Now, groups (say, countries) consisting of individuals also play games. To study such a multi-level game, we introduce a hierarchical game in which multiple groups compete for limited resources by utilizing the collected cost in each group, where the power to appropriate resources increases with the population of the group. Analyzing this hierarchical game, we found a hierarchical prisoner’s dilemma, in which groups choose the defecting policy (say, armament) as a Nash strategy to optimize each group’s benefit, while cooperation optimizes the total benefit. On the other hand, for each individual, refusing to pay the cost (say, tax) is a Nash strategy, which turns out to be a cooperation policy for the group, thus leading to a hierarchical dilemma. Here the group reward increases with the group size. However, we find that there exists an optimal group size that maximizes the individual payoff. Furthermore, when the population asymmetry between two groups is large, the smaller group will choose a cooperation policy (say, disarmament) to avoid excessive response from the larger group, and the prisoner’s dilemma between the groups is resolved. Accordingly, the relevance of this hierarchical game on policy selection in society and the optimal size of human or animal groups are discussed.

  7. Steroids from Poison Hemlock (Conium maculatum L.): A GC-MS analysis

    Directory of Open Access Journals (Sweden)

    Radulović Niko S.

    2011-01-01

    Full Text Available The steroid content of Conium maculatum L. (Poison Hemlock, Apiaceae), a well-known weed plant species, was studied herein for the first time. This was achieved by detailed GC-MS analyses of twenty-two samples (dichloromethane extracts of different plant organs of C. maculatum at three or four different stages of phenological development, collected from three locations). In total, twenty-four different steroids were identified. Six steroids had an ergostane nucleus, while the others possessed a stigmastane carbon framework. The identity of these compounds was determined by spectral means (MS fragmentation), GC co-injections with authentic standards and chemical transformation (silylation). Steroid compounds were noted to be the main chemical constituents of root extracts (up to 70%) of this plant species in the last phase of development. The predominant ones were stigmasta-5,22-dien-3β-ol (stigmasterol) and stigmasta-5-en-3β-ol (β-sitosterol). In an attempt to classify the samples, principal component analysis (PCA) and agglomerative hierarchical clustering (AHC) were performed using steroid percentages as variables.
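
    Agglomerative hierarchical clustering of samples described by percentage variables, as mentioned above, can be illustrated with a naive pure-Python sketch. The abstract does not state the distance or linkage used, so Euclidean distance and average linkage are assumptions here; the function name is hypothetical, and a production analysis would normally use a dedicated statistics package.

```python
from math import dist


def agglomerative_clusters(samples, n_clusters):
    """Merge samples bottom-up until n_clusters groups remain.

    Assumptions: Euclidean distance between samples and average linkage
    between clusters. Returns lists of sample indices. This naive version
    recomputes all linkages at every step, so it only suits small inputs.
    """
    clusters = [[i] for i in range(len(samples))]

    def average_linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(dist(samples[i], samples[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (first pair wins on ties).
        _, i, j = min(
            (average_linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

    Each sample would be a tuple of compound percentages; stopping at different values of n_clusters corresponds to cutting the dendrogram at different heights.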

  8. Correlations between somatic cell count and physicochemical and microbiological parameters of milk quality

    Directory of Open Access Journals (Sweden)

    José Laerte Nörnberg

    2014-12-01

    Full Text Available The study aimed to evaluate the correlations between somatic cell count (SCC) and milk components, and to verify the associations of environmental conditions with SCC. Data were obtained from 1,541 dairy farms located in 15 municipalities in the dairy region of Vale do Taquari, Rio Grande do Sul. The data on SCC, total bacterial count (TBC) and milk composition, from June 2008 to December 2011, were tabulated, totaling 44,089 samples. The environment temperature showed a positive and significant correlation with the somatic cell score, while rainfall and air relative humidity showed no correlation. The fat, protein, minerals and total solids were directly correlated with the SCC, while non-fat solids and lactose showed the opposite behavior. By principal component analysis (PCA) followed by the agglomerative hierarchical clustering method, the seven treatments in the present study were reduced to five groups according to similarity, showing that milk with SCC above 400,000 to 750,000 cells mL-1 presents the same quality, not justifying the interval stratification within this range of variation.

  9. A functional interaction approach to the definition of meso regions: The case of the Czech Republic

    Directory of Open Access Journals (Sweden)

    Erlebach Martin

    2016-06-01

    Full Text Available The definition of functional meso regions for the territory of the Czech Republic is articulated in this article. Functional regions reflect horizontal interactions in space and are presented as a useful tool for various types of geographical analyses, and also for spatial planning, economic policy designs, etc. This paper attempts to add to the discussion on the need to delineate areal units at different hierarchical levels, and to understand the functional flows and spatial behaviours of the population in a given space. Three agglomerative methods are applied in the paper (the CURDS regionalisation algorithm, Intramax, and cluster analysis); they have not been used previously in Czech geography for the delineation of functional meso regions. Existing functional regions at the micro-level, based on daily travel-to-work flows from the 2001 census, have served as the building blocks. The analyses have produced five regional systems at the meso level, based on daily labour commuting movements of the population. Basic statistics and a characterisation of these systems are provided in this paper.

  10. Dittrichia graveolens (L.) Greuter Essential Oil: Chemical Composition, Multivariate Analysis, and Antimicrobial Activity.

    Science.gov (United States)

    Mitic, Violeta; Stankov Jovanovic, Vesna; Ilic, Marija; Jovanovic, Olga; Djordjevic, Aleksandra; Stojanovic, Gordana

    2016-01-01

    The chemical composition and in vitro antimicrobial activities of Dittrichia graveolens (L.) Greuter essential oil were studied. Moreover, using agglomerative hierarchical cluster (AHC) and principal component analyses (PCA), the interrelationships of the D. graveolens essential-oil profiles characterized so far (including the sample from this study) were investigated. To evaluate the chemical composition of the essential oil, GC-FID and GC/MS analyses were performed. Altogether, 54 compounds were identified, accounting for 92.9% of the total oil composition. The D. graveolens oil belongs to the monoterpenoid chemotype, with monoterpenoids comprising 87.4% of the totally identified compounds. The major components were borneol (43.6%) and bornyl acetate (38.3%). Multivariate analysis showed that the compounds borneol and bornyl acetate exerted the greatest influence on the spatial differences in the composition of the reported oils. The antimicrobial activity against five bacterial and one fungal strain was determined using a disk-diffusion assay. The studied essential oil was active only against Gram-positive bacteria. Copyright © 2016 Verlag Helvetica Chimica Acta AG, Zürich.

  11. The outskirts of the Coma cluster

    Science.gov (United States)

    Gavazzi, Giuseppe

    Evolved Coma-like clusters of galaxies consist of relaxed cores composed of “old” early-type galaxies, embedded in large-scale structures made up mostly of unevolved (late-type) systems. According to the hierarchical theory of cluster formation, the central regions are being fed with unevolved, low-mass systems infalling from the surroundings that are gradually transformed into elliptical/S0 galaxies by tidal galaxy-galaxy and galaxy-cluster interactions, taking place at some boundary distance. The Coma cluster, the most studied of all local clusters, provides us with the ideal test-bed for such an evolutionary study because of the completeness of the photometric and kinematic information already at hand. The field of view of the planned GALEX observations is not big enough to include the boundary interface where most transformation processes are expected to take place, including the truncation of the current star formation. We propose to complete the outskirts of Coma with an additional corona of 11 GALEX imaging fields of 1500 sec exposure each, matching the depth (UV_{AB}=23.5 mag) of the fields observed in guaranteed time. Given the priority of the target, we also propose one optional central pointing that includes one bright star marginally exceeding the detector brightness limit.

  12. Hierarchical cluster analysis and chemical characterisation of Myrtus communis L. essential oil from Yemen region and its antimicrobial, antioxidant and anti-colorectal adenocarcinoma properties.

    Science.gov (United States)

    Anwar, Sirajudheen; Crouch, Rebecca A; Awadh Ali, Nasser A; Al-Fatimi, Mohamed A; Setzer, William N; Wessjohann, Ludger

    2017-09-01

    The hydrodistilled essential oil obtained from the dried leaves of Myrtus communis, collected in Yemen, was analysed by GC-MS. Forty-one compounds were identified, representing 96.3% of the total oil. The major constituents of the essential oil were oxygenated monoterpenoids (87.1%), linalool (29.1%), 1,8-cineole (18.4%), α-terpineol (10.8%), geraniol (7.3%) and linalyl acetate (7.4%). The essential oil was assessed for its antimicrobial activity using a disc diffusion assay and showed moderate to potent antibacterial and antifungal activities, mainly against Bacillus subtilis, Staphylococcus aureus and Candida albicans. The oil moderately reduced the diphenylpicrylhydrazyl radical (IC50 = 4.2 μL/mL or 4.1 mg/mL). In vitro cytotoxicity evaluation against HT29 (human colonic adenocarcinoma cells) showed that the essential oil exhibited a moderate antitumor effect with an IC50 of 110 ± 4 μg/mL. Hierarchical cluster analysis of M. communis has been carried out based on the chemical compositions of 99 samples reported in the literature, including the Yemeni sample.

  13. A Link-Based Cluster Ensemble Approach For Improved Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    P.Balaji

    2015-01-01

    Full Text Available Abstract: Given the large number of clustering algorithms and gene expression data sets, it is difficult to select the most suitable and effective clustering algorithm for a given set of gene expression data. At present many researchers prefer hierarchical clustering in its various forms, but it is not always optimal. Cluster ensemble research can address this problem by automatically merging multiple data partitions from a wide range of different clusterings, of any dimension, to improve both the quality and robustness of the clustering result. However, many existing ensemble approaches use an association matrix to condense sample-cluster and co-occurrence statistics, so relations within the ensemble are captured only at a raw level while the relations among clusters are neglected entirely. Finding these missing associations can greatly expand the capability of those ensemble methodologies for microarray data clustering. We propose a general K-means cluster ensemble approach for clustering general categorical data into the required number of partitions.

  14. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2004-01-01

    Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experimental data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noise, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchical clustering method, the k-means clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).

  15. Delineation of Stenotrophomonas maltophilia isolates from cystic fibrosis patients by fatty acid methyl ester profiles and matrix-assisted laser desorption/ionization time-of-flight mass spectra using hierarchical cluster analysis and principal component analysis.

    Science.gov (United States)

    Vidigal, Pedrina Gonçalves; Mosel, Frank; Koehling, Hedda Luise; Mueller, Karl Dieter; Buer, Jan; Rath, Peter Michael; Steinmann, Joerg

    2014-12-01

    Stenotrophomonas maltophilia is an opportunist multidrug-resistant pathogen that causes a wide range of nosocomial infections. Various cystic fibrosis (CF) centres have reported an increasing prevalence of S. maltophilia colonization/infection among patients with this disease. The purpose of this study was to assess specific fingerprints of S. maltophilia isolates from CF patients (n = 71) by investigating fatty acid methyl esters (FAMEs) through gas chromatography (GC) and highly abundant proteins by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), and to compare them with isolates obtained from intensive care unit (ICU) patients (n = 20) and the environment (n = 11). Principal component analysis (PCA) of GC-FAME patterns did not reveal a clustering corresponding to distinct CF, ICU or environmental types. Based on the peak area index, it was observed that S. maltophilia isolates from CF patients produced significantly higher amounts of fatty acids in comparison with ICU patients and the environmental isolates. Hierarchical cluster analysis (HCA) based on the MALDI-TOF MS peak profiles of S. maltophilia revealed the presence of five large clusters, suggesting a high phenotypic diversity. Although HCA of MALDI-TOF mass spectra did not result in distinct clusters predominantly composed of CF isolates, PCA revealed the presence of a distinct cluster composed of S. maltophilia isolates from CF patients. Our data suggest that S. maltophilia colonizing CF patients tend to modify not only their fatty acid patterns but also their protein patterns as a response to adaptation in the unfavourable environment of the CF lung. © 2014 The Authors.

  16. Anti-inflammatory and anti-angiogenic activities in vitro of eight diterpenes from Daphne genkwa based on hierarchical cluster and principal component analysis.

    Science.gov (United States)

    Wang, Ling; Lan, Xin-Yi; Ji, Jun; Zhang, Chun-Feng; Li, Fei; Wang, Chong-Zhi; Yuan, Chun-Su

    2018-06-01

    Rheumatoid arthritis (RA) is one of the most prevalent chronic inflammatory and angiogenic diseases. The aim of this study was to evaluate the anti-inflammatory and anti-angiogenic activities in vitro of eight diterpenoids isolated from Daphne genkwa. LC-MS was used to identify diterpenes isolated from D. genkwa. The anti-inflammatory and anti-angiogenic activities of eight diterpenoids were evaluated on LPS-induced macrophage RAW264.7 cells and TNF-α-stimulated human umbilical vein endothelial cells (HUVECs) using hierarchical cluster analysis (HCA) and principal component analysis (PCA). The eight diterpenes isolated from D. genkwa were identified as yuanhuaphnin, isoyuanhuacine, 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, yuanhuagine, isoyuanhuadine, yuanhuadine, yuanhuaoate C and yuanhuacine. All the eight diterpenes significantly down-regulated the excessive secretion of TNF-α, IL-6, IL-1β and NO in LPS-induced RAW264.7 macrophages. However, only 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl markedly reduced production of VEGF, MMP-3, ICAM and VCAM in TNF-α-stimulated HUVECs. HCA obtained 4 clusters, containing 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, isoyuanhuacine, isoyuanhuadine and five other compounds. PCA showed that the ranking of diterpenes sorted by efficacy from highest to lowest was 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl, yuanhuaphnin, isoyuanhuacine, yuanhuacine, yuanhuaoate C, yuanhuagine, isoyuanhuadine, yuanhuadine. In conclusion, eight diterpenes isolated from D. genkwa showed different levels of activity in LPS-induced RAW264.7 cells and TNF-α-stimulated HUVECs. The comprehensive evaluation of activity by HCA and PCA indicated that of the eight diterpenes, 12-O-(2'E,4'E-decadienoyl)-4-hydroxyphorbol-13-acetyl was the best, and can be developed as a new drug for RA therapy.

  17. ClustOfVar: An R Package for the Clustering of Variables

    Directory of Open Access Journals (Sweden)

    Marie Chavent

    2012-09-01

    Full Text Available Clustering of variables is a way to arrange variables into homogeneous clusters, i.e., groups of variables which are strongly related to each other and thus bring the same information. These approaches can then be useful for dimension reduction and variable selection. Several specific methods have been developed for the clustering of numerical variables. However, for qualitative variables or mixtures of quantitative and qualitative variables, far fewer methods have been proposed. The R package ClustOfVar was specifically developed for this purpose. The homogeneity criterion of a cluster is defined as the sum of correlation ratios (for qualitative variables) and squared correlations (for quantitative variables) to a synthetic quantitative variable, summarizing "as well as possible" the variables in the cluster. This synthetic variable is the first principal component obtained with the PCAMIX method. Two clustering algorithms are proposed to optimize the homogeneity criterion: an iterative relocation algorithm and ascendant hierarchical clustering. We also propose a bootstrap approach to determine suitable numbers of clusters. We illustrate the methodologies and the associated package on small datasets.

  18. Applying Clustering Methods in Drawing Maps of Science: Case Study of the Map For Urban Management Science

    Directory of Open Access Journals (Sweden)

    Mohammad Abuei Ardakan

    2010-04-01

    Full Text Available The present paper offers a basic introduction to data clustering and demonstrates the application of clustering methods in drawing maps of science. All approaches towards classification and clustering of information are briefly discussed. Their application to the visualization of conceptual information and the drawing of science maps is illustrated by reviewing similar research in this field. By implementing an aggregated hierarchical clustering algorithm based on the complete-link method, the map for urban management science, an emerging interdisciplinary scientific field, is analyzed and reviewed.
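
    The complete-link merging rule named in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that "aggregated" refers to bottom-up (agglomerative) merging, uses Euclidean distance, and the function name `complete_link` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist


def complete_link(points, n_clusters):
    """Naive complete-link agglomerative clustering (illustrative, roughly O(n^3))."""
    # Start with every point in its own cluster (clusters hold point indices).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete link: cluster distance is the LARGEST pairwise distance.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest complete-link distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Merge cluster b into cluster a (a < b, so deleting b keeps a valid).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical 2-D points: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
result = complete_link(points, 3)
print(result)
```

    Because the largest pairwise distance decides each merge, complete link tends to produce compact clusters, which is one plausible reason to prefer it when the groups are later drawn as regions of a map.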

  19. Enhancement of Adaptive Cluster Hierarchical Routing Protocol using Distance and Energy for Wireless Sensor Networks

    International Nuclear Information System (INIS)

    Nawar, N.M.; Soliman, S.E.; Kelash, H.M.; Ayad, N.M.

    2014-01-01

    Wireless networking is widely used in nuclear applications, including reactor control and fire detection systems. This paper is devoted to the application of this concept in the intrusion system of the Radioisotope Production Facility (RPF) of the Egyptian Atomic Energy Authority, including the tracking, monitoring and control components of this system. The design and implementation of wireless sensor networks has become a hot area of research due to the extensive use of sensor networks to enable applications that connect the physical world to the virtual world [1-2]. The original LEACH is a clustering-based communication protocol; an extended version adds a deterministic component to LEACH's stochastic cluster head selection algorithm. Depending on the network configuration, an increase in network lifetime can be achieved [3]. The proposed routing mechanism, after enhancement, divides the nodes into clusters. A cluster head performs a task that is considerably more energy-intensive than that of the other nodes in the sensor network. Nodes therefore rotate the cluster head role among themselves in different rounds throughout the lifetime of the network to balance energy dissipation [4-5]. The enhanced algorithm improves performance by taking into consideration the distance and the remaining energy, obtained from the advertisement message, when choosing the cluster head. The network simulator ns-2 is used to show that the enhanced LEACH performs better than the original LEACH protocol in terms of average energy, network lifetime, delay, throughput and overhead.

  20. A bio-inspired N-doped porous carbon electrocatalyst with hierarchical superstructure for efficient oxygen reduction reaction

    Science.gov (United States)

    Miao, Yue-E.; Yan, Jiajie; Ouyang, Yue; Lu, Hengyi; Lai, Feili; Wu, Yue; Liu, Tianxi

    2018-06-01

    The bio-inspired hierarchical "grape cluster" superstructure provides an effective integration of one-dimensional carbon nanofibers (CNF) with isolated carbonaceous nanoparticles into three-dimensional (3D) conductive frameworks for efficient electron and mass transfer. Herein, a 3D N-doped porous carbon electrocatalyst consisting of carbon nanofibers with grape-like N-doped hollow carbon particles (CNF@NC) has been prepared through a simple electrospinning strategy combined with in-situ growth and carbonization processes. Such a bio-inspired hierarchically organized conductive network largely facilitates both the mass diffusion and electron transfer during the oxygen reduction reactions (ORR). Therefore, the metal-free CNF@NC catalyst demonstrates superior catalytic activity with an absolute four-electron transfer mechanism, strong methanol tolerance and good long-term stability towards ORR in alkaline media.

  1. Cluster Analysis of the Newcastle Electronic Corpus of Tyneside English: A Comparison of Methods

    NARCIS (Netherlands)

    Moisl, Hermann; Jones, Valerie M.

    2005-01-01

    This article examines the feasibility of an empirical approach to sociolinguistic analysis of the Newcastle Electronic Corpus of Tyneside English using exploratory multivariate methods. It addresses a known problem with one class of such methods, hierarchical cluster analysis—that different

  2. A study of hierarchical structure on South China industrial electricity-consumption correlation

    Science.gov (United States)

    Yao, Can-Zhong; Lin, Ji-Nan; Liu, Xiao-Feng

    2016-02-01

    Based on industrial electricity-consumption data of five southern provinces of China from 2005 to 2013, we study the industrial correlation mechanism with MST (minimal spanning tree) and HT (hierarchical tree) models. First, we comparatively analyze the industrial electricity-consumption correlation structure in the pre-crisis and post-crisis periods using the MST model and a bootstrap technique for testing the statistical reliability of links. The results show that all industrial electricity-consumption trees of the five provinces in both periods form chains, and the "center-periphery structure" of those chain-like trees is consistent with industrial specialization in classical industrial chain theory. Additionally, the industrial structure of some provinces is reorganized and transferred between the pre-crisis and post-crisis periods. Further, the comparative analysis with the hierarchical tree and the bootstrap technique demonstrates that, for both GD and the overall NF, the industrial electricity-consumption correlation is not significantly clustered in the pre-crisis period, whereas it becomes significantly clustered in the post-crisis period. We therefore propose that, from the perspective of electricity consumption, their industrial structures are moving toward optimized organization and global correlation. Finally, the analysis of HT distances verifies that industrial reorganization and development may strengthen market integration, coordination and correlation of industrial production. Except for GZ, the other four provinces have a shorter distance of industrial electricity-consumption correlation in the post-crisis period, revealing better regional specialization and integration.
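
    The MST construction mentioned in the abstract above can be sketched in a few lines. The abstract does not state how correlations are turned into distances; the sketch assumes the commonly used mapping d = sqrt(2(1 - rho)), and the correlation matrix and function names are hypothetical, not data or code from the study.

```python
from math import sqrt


def correlation_distance(rho):
    """Map a correlation coefficient in [-1, 1] to a distance in [0, 2] (assumed convention)."""
    return sqrt(2.0 * (1.0 - rho))


def minimum_spanning_tree(corr):
    """Prim's algorithm on the complete graph weighted by correlation distance."""
    n = len(corr)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that connects the current tree to a new node.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: correlation_distance(corr[e[0]][e[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical correlation matrix for four industries.
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
mst_edges = minimum_spanning_tree(corr)
print(mst_edges)
```

    With this toy matrix every node links only to its neighbour in the sequence, so the tree is a chain, the same shape the abstract reports for the provincial trees. Merging nodes in order of increasing MST edge length would give the single-linkage hierarchy that is commonly used as the hierarchical tree.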

  3. How hierarchical is language use?

    Science.gov (United States)

    Frank, Stefan L.; Bod, Rens; Christiansen, Morten H.

    2012-01-01

    It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent neurophysiological, behavioural and computational studies show that sequential sentence structure has considerable explanatory power and that hierarchical processing is often not involved. In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions. If linguistic phenomena can be explained by sequential rather than hierarchical structure, this will have considerable impact in a wide range of fields, such as linguistics, ethology, cognitive neuroscience, psychology and computer science. PMID:22977157

  4. Hierarchical model generation for architecture reconstruction using laser-scanned point clouds

    Science.gov (United States)

    Ning, Xiaojuan; Wang, Yinghui; Zhang, Xiaopeng

    2014-06-01

    Architecture reconstruction using terrestrial laser scanner is a prevalent and challenging research topic. We introduce an automatic, hierarchical architecture generation framework to produce full geometry of architecture based on a novel combination of facade structures detection, detailed windows propagation, and hierarchical model consolidation. Our method highlights the generation of geometric models automatically fitting the design information of the architecture from sparse, incomplete, and noisy point clouds. First, the planar regions detected in raw point clouds are interpreted as three-dimensional clusters. Then, the boundary of each region extracted by projecting the points into its corresponding two-dimensional plane is classified to obtain detailed shape structure elements (e.g., windows and doors). Finally, a polyhedron model is generated by calculating the proposed local structure model, consolidated structure model, and detailed window model. Experiments on modeling the scanned real-life buildings demonstrate the advantages of our method, in which the reconstructed models not only correspond to the information of architectural design accurately, but also satisfy the requirements for visualization and analysis.

  5. Cluster fusion-fission dynamics in the Singapore stock exchange

    Science.gov (United States)

    Teh, Boon Kin; Cheong, Siew Ann

    2015-10-01

    In this paper, we investigate how the cross-correlations between stocks in the Singapore stock exchange (SGX) evolve over 2008 and 2009 within overlapping one-month time windows. In particular, we examine how these cross-correlations change before, during, and after the Sep-Oct 2008 Lehman Brothers Crisis. To do this, we extend the complete-linkage hierarchical clustering algorithm to obtain robust clusters of stocks with stronger intracluster correlations and weaker intercluster correlations. After we identify the robust clusters in all time windows, we visualize how these change in the form of a fusion-fission diagram. Such a diagram depicts graphically how the cluster sizes evolve, the exchange of stocks between clusters, as well as how strongly the clusters mix. From the fusion-fission diagram, we see a giant cluster growing and disintegrating in the SGX, up until the Lehman Brothers Crisis in September 2008 and the market crashes of October 2008. After the Lehman Brothers Crisis, clusters in the SGX remain small for a few months before giant clusters emerge once again. In the aftermath of the crisis, we also find strong mixing of component stocks between clusters. As a result, the correlation between initially strongly correlated pairs of stocks decays exponentially with an average lifetime of about a month. These observations strongly affect how portfolios and trading strategies should be formulated.

  6. Clustering Methods Application for Customer Segmentation to Manage Advertisement Campaign

    Directory of Open Access Journals (Sweden)

    Maciej Kutera

    2010-10-01

    Full Text Available Clustering methods have recently become such advanced algorithms for analyzing large data collections that they are now counted among data mining methods. They form an ever larger, rapidly evolving group of methods with more and more varied applications. In this article, we present our research on the usefulness of clustering methods in customer segmentation for managing an advertisement campaign. We report results obtained with four selected methods, chosen because their characteristics suggested their applicability to our purposes. One of the analyzed methods, k-means clustering with randomly selected initial cluster seeds, gave very good results in customer segmentation for managing an advertisement campaign, and these results are presented in detail in the article. In contrast, one of the methods (hierarchical average linkage) was found useless in customer segmentation. Further investigation of the benefits of clustering methods in customer segmentation for managing advertisement campaigns is worth continuing, particularly because solutions in this field can bring measurable profits for marketing activity.

  7. Using cluster analysis and a classification and regression tree model to develop cover types in the Sky Islands of southeastern Arizona [Abstract]

    Science.gov (United States)

    Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey

    2005-01-01

    The objective of this study was to develop a rule-based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such a system, we qualitatively and quantitatively compared a hierarchical (Ward’s) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups and plots...

  8. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
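
    The centered log-ratio transformation (CLT) that the abstract above recommends can be illustrated briefly. This is a sketch under stated assumptions, not the authors' code: it applies the standard definition (the log of each part minus the mean of the logs), and the function name `clr` and the sample concentrations are hypothetical.

```python
from math import isclose, log


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition."""
    logs = [log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean of the parts
    return [v - mean_log for v in logs]


sample = [10.0, 30.0, 60.0]         # hypothetical element concentrations
scaled = [x * 7.5 for x in sample]  # the same composition in different units
clr_sample = clr(sample)
clr_scaled = clr(scaled)

# The transformed parts sum to zero, and rescaling the sample changes nothing.
print(isclose(sum(clr_sample), 0.0, abs_tol=1e-12))
print(all(isclose(a, b, abs_tol=1e-12) for a, b in zip(clr_sample, clr_scaled)))
```

    The scale invariance is the relevant property for Q-type clustering: distances between transformed samples then depend on the relative proportions of the elements rather than on their absolute totals. The transform needs strictly positive values, so zeros or values below the detection limit would have to be handled beforehand.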

  9. Hierarchical architecture of active knits

    International Nuclear Information System (INIS)

    Abel, Julianna; Luntz, Jonathan; Brei, Diann

    2013-01-01

    Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture, active knits, provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions is displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm. (paper)

  10. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius

    2017-07-03

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  11. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius; Huser, Raphaël; Prasad, Avinash

    2017-01-01

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  12. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) are very successful, but have recently suffered from slow updating because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest splitting some large superfamilies into smaller clusters. © 2016 Elsevier B.V.

  13. Hierarchical modularization of biochemical pathways using fuzzy-c means clustering.

    Science.gov (United States)

    de Luis Balaguer, Maria A; Williams, Cranos M

    2014-08-01

    Biological systems that are representative of regulatory, metabolic, or signaling pathways can be highly complex. Mathematical models that describe such systems inherit this complexity. As a result, these models can often fail to provide a path toward the intuitive comprehension of these systems. More coarse information that allows a perceptive insight of the system is sometimes needed in combination with the model to understand control hierarchies or lower level functional relationships. In this paper, we present a method to identify relationships between components of dynamic models of biochemical pathways that reside in different functional groups. We find primary relationships and secondary relationships. The secondary relationships reveal connections that are present in the system, which current techniques that only identify primary relationships are unable to show. We also identify how relationships between components dynamically change over time. This results in a method that provides the hierarchy of the relationships among components, which can help us to understand the low level functional structure of the system and to elucidate potential hierarchical control. As a proof of concept, we apply the algorithm to the epidermal growth factor signal transduction pathway, and to the C3 photosynthesis pathway. We identify primary relationships among components that are in agreement with previous computational decomposition studies, and identify secondary relationships that uncover connections among components that current computational approaches were unable to reveal.

  14. Edge Principal Components and Squash Clustering: Using the Special Structure of Phylogenetic Placement Data for Sample Comparison

    Science.gov (United States)

    Matsen IV, Frederick A.; Evans, Steven N.

    2013-01-01

    Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415

  15. Evolution of galaxy cluster scaling and structural properties from XMM observations: probing the physics of structure formation

    International Nuclear Information System (INIS)

    Anokhin, Sergey

    2008-01-01

    Clusters of galaxies are the largest gravitationally bound objects in the Universe. It is possible to study hierarchical structure formation based on these youngest objects in the Universe. To complement the results found with hot clusters, we chose cold, distant galaxy clusters selected from the Southern SHARC catalogue. At the same time, we studied archived galaxy clusters to test the theory and the treatment analysis. To study these weak clusters of galaxies, we optimized our treatment analysis, in particular by searching for the best background subtraction and modeling it for our surface brightness profiles and spectra. Our results are in good agreement with the scaling relations obtained from hot galaxy clusters. (author) [fr

  16. Synchronous Firefly Algorithm for Cluster Head Selection in WSN

    Directory of Open Access Journals (Sweden)

    Madhusudhanan Baskaran

    2015-01-01

    A Wireless Sensor Network (WSN) consists of small, low-cost, low-power multifunctional nodes interconnected to efficiently aggregate and transmit data to a sink. Cluster-based approaches use some nodes as Cluster Heads (CHs) and organize WSNs efficiently for data aggregation and energy saving. A CH conveys information gathered by cluster nodes and aggregates/compresses data before transmitting it to a sink. However, this additional responsibility results in a higher energy drain on the node, leading to uneven network degradation. Low Energy Adaptive Clustering Hierarchy (LEACH) offsets this by probabilistically rotating the cluster-head role among nodes with energy above a set threshold. CH selection in WSN is NP-Hard, as optimal data aggregation with efficient energy savings cannot be solved in polynomial time. In this work, a modified firefly heuristic, the synchronous firefly algorithm, is proposed to improve network performance. Extensive simulation shows that the proposed technique performs well compared to LEACH and energy-efficient hierarchical clustering (EEHC). Simulations show the effectiveness of the proposed method in decreasing the packet loss ratio by an average of 9.63% and improving the energy efficiency of the network when compared to LEACH and EEHC.
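
    The probabilistic rotation used by LEACH, which the abstract takes as its baseline, can be sketched as follows. This is a minimal sketch of the classic LEACH election threshold, not the synchronous firefly algorithm proposed in the paper. The energy cut-off `min_energy` is an assumed parameter added only to mirror the abstract's wording ("nodes with energy above a set threshold"); all function and parameter names are hypothetical.

```python
import random


def leach_threshold(p, round_no):
    """Election threshold T for an eligible node in round `round_no`.

    p is the desired fraction of cluster heads per round. The threshold grows
    within each epoch of 1/p rounds, so every eligible node serves once per epoch.
    """
    epoch = round(1 / p)  # number of rounds in one rotation epoch
    return p / (1 - p * (round_no % epoch))


def elect_cluster_heads(energies, eligible, p, round_no, min_energy, rng):
    """Return indices of nodes that elect themselves cluster head this round."""
    t = leach_threshold(p, round_no)
    heads = []
    for node, energy in enumerate(energies):
        # Assumption: only nodes that have not yet served in this epoch
        # (`eligible`) and that hold at least `min_energy` may compete.
        if node in eligible and energy >= min_energy and rng.random() < t:
            heads.append(node)
    return heads


# Last round of an epoch with p = 0.1: the threshold reaches 1, so every
# eligible node with enough energy becomes a cluster head.
heads = elect_cluster_heads([1.0, 0.2, 1.0], {0, 1, 2}, 0.1, 9, 0.5,
                            random.Random(0))
```

    A caller would remove each elected node from `eligible` until the epoch ends and then reset the set, which is what produces the rotation of the cluster-head role.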

  17. A Comprehensive Survey on Hierarchical-Based Routing Protocols for Mobile Wireless Sensor Networks: Review, Taxonomy, and Future Directions

    Directory of Open Access Journals (Sweden)

    Nabil Sabor

    2017-01-01

    Introducing mobility to Wireless Sensor Networks (WSNs) poses new challenges, particularly in the design of routing protocols. Mobility can be applied to the sensor nodes and/or the sink node in the network. Many routing protocols have been developed to support the mobility of WSNs. These protocols are divided, depending on the routing structure, into hierarchical-based, flat-based, and location-based routing protocols. However, the hierarchical-based routing protocols outperform the other routing types in saving energy, scalability, and extending the lifetime of Mobile WSNs (MWSNs). Selecting an appropriate hierarchical routing protocol for specific applications is an important and difficult task. Therefore, this paper focuses on reviewing some of the recent hierarchical-based routing protocols developed in the last five years for MWSNs. This survey divides the hierarchical-based routing protocols into two broad groups, namely, classical-based and optimized-based routing protocols. Also, we present a detailed classification of the reviewed protocols according to the routing approach, control manner, mobile element, mobility pattern, network architecture, clustering attributes, protocol operation, path establishment, communication paradigm, energy model, protocol objectives, and applications. Moreover, the reviewed protocols are compared in terms of delay, network size, energy efficiency, and scalability, and the advantages and drawbacks of each protocol are noted. Finally, we summarize and conclude the paper with future directions.

  18. Intracluster age gradients in numerous young stellar clusters

    Science.gov (United States)

    Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.

    2018-05-01

    The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc^-1. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - has a strong foundation with other observational studies and with astrophysical models such as the global hierarchical collapse model of Vázquez-Semadeni et al.

  19. Cluster Analysis of the Newcastle Electronic Corpus of Tyneside English: A Comparison of Methods

    NARCIS (Netherlands)

    Moisl, Hermann; Jones, Valerie M.

    2005-01-01

    This article examines the feasibility of an empirical approach to sociolinguistic analysis of the Newcastle Electronic Corpus of Tyneside English using exploratory multivariate methods. It addresses a known problem with one class of such methods, hierarchical cluster analysis—that different

  20. Clustering Suicide Attempters: Impulsive-Ambivalent, Well-Planned, or Frequent.

    Science.gov (United States)

    Lopez-Castroman, Jorge; Nogue, Erika; Guillaume, Sebastien; Picot, Marie Christine; Courtet, Philippe

    2016-06-01

    Attempts to predict suicidal behavior within high-risk populations have so far shown insufficient accuracy. Although several psychosocial and clinical features have been consistently associated with suicide attempts, investigations of latent structure in well-characterized populations of suicide attempters are lacking. We analyzed a sample of 1,009 hospitalized suicide attempters that were recruited between 1999 and 2012. Eleven clinically relevant items related to the characteristics of suicidal behavior were submitted to a Hierarchical Ascendant Classification. Phenotypic profiles were compared between the resulting clusters. A decisional tree was constructed to facilitate the differentiation of individuals classified within the first 2 clusters. Most individuals were included in a cluster characterized by less lethal means and planning ("impulse-ambivalent"). A second cluster featured more carefully planned attempts ("well-planned"), more alcohol or drug use before the attempt, and more precautions to avoid interruptions. Finally, a small, third cluster included individuals reporting more attempts ("frequent"), more often serious or violent attempts, and an earlier age at first attempt. Differences across clusters by demographic and clinical characteristics were also found, particularly with the third cluster whose participants had experienced high levels of childhood abuse. Cluster analysis consistently supported 3 distinct clusters of individuals with specific features in their suicidal behaviors and phenotypic profiles that could help clinicians to better focus prevention strategies. © Copyright 2016 Physicians Postgraduate Press, Inc.

  1. A new clustering algorithm for scanning electron microscope images

    Science.gov (United States)

    Yousef, Amr; Duraisamy, Prakash; Karim, Mohammad

    2016-04-01

    A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may suffer from insufficient brightness, anomalous contrast, jagged edges, and poor quality due to low signal-to-noise ratio, grained topography and poor surface details. Segmentation of SEM images is a challenging problem in the presence of the previously mentioned distortions. In this paper, we focus on the clustering of this type of image. In that sense, we evaluate the performance of well-known unsupervised clustering and classification techniques such as connectivity-based clustering (hierarchical clustering), centroid-based clustering, distribution-based clustering and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of image and compare its results against these regular techniques in terms of clustering validation metrics.

  2. Hierarchical clustering of gene expression patterns in the Eomes + lineage of excitatory neurons during early neocortical development

    Directory of Open Access Journals (Sweden)

    Cameron David A

    2012-08-01

    Background: Cortical neurons display dynamic patterns of gene expression during the coincident processes of differentiation and migration through the developing cerebrum. To identify genes selectively expressed by the Eomes+ (Tbr2) lineage of excitatory cortical neurons, GFP-expressing cells from Tg(Eomes::eGFP) Gsat embryos were isolated to > 99% purity and profiled. Results: We report the identification, validation and spatial grouping of genes selectively expressed within the Eomes+ cortical excitatory neuron lineage during early cortical development. In these neurons 475 genes were expressed ≥ 3-fold, and 534 genes ≤ 3-fold, compared to the reference population of neuronal precursors. Of the up-regulated genes, 328 were represented in the Genepaint in situ hybridization database and 317 (97%) were validated as having spatial expression patterns consistent with the lineage of differentiating excitatory neurons. A novel approach for quantifying in situ hybridization patterns (QISP) across the cerebral wall was developed that allowed the hierarchical clustering of genes into putative co-regulated groups. Forty-four candidate genes were identified that show spatial expression with Intermediate Precursor Cells, 49 candidate genes show spatial expression with Multipolar Neurons, while the remaining 224 genes achieved peak expression in the developing cortical plate. Conclusions: This analysis of differentiating excitatory neurons revealed the expression patterns of 37 transcription factors, many chemotropic signaling molecules (including the Semaphorin, Netrin and Slit signaling pathways), and unexpected evidence for non-canonical neurotransmitter signaling and changes in mechanisms of glucose metabolism. Over half of the 317 identified genes are associated with neuronal disease, making these findings a valuable resource for studies of neurological development and disease.

  3. CATCHprofiles: Clustering and Alignment Tool for ChIP Profiles

    DEFF Research Database (Denmark)

    G. G. Nielsen, Fiona; Galschiøt Markus, Kasper; Møllegaard Friborg, Rune

    2012-01-01

    IP-profiling data and detect potentially meaningful patterns, the areas of enrichment must be aligned and clustered, which is an algorithmically and computationally challenging task. We have developed CATCHprofiles, a novel tool for exhaustive pattern detection in ChIP profiling data. CATCHprofiles is built upon...... a computationally efficient implementation for the exhaustive alignment and hierarchical clustering of ChIP profiling data. The tool features a graphical interface for examination and browsing of the clustering results. CATCHprofiles requires no prior knowledge about functional sites, detects known binding patterns...... it an invaluable tool for explorative research based on ChIP profiling data. CATCHprofiles and the CATCH algorithm run on all platforms and is available for free through the CATCH website: http://catch.cmbi.ru.nl/. User support is available by subscribing to the mailing list catch-users@bioinformatics.org....

  4. HLM in Cluster-Randomised Trials--Measuring Efficacy across Diverse Populations of Learners

    Science.gov (United States)

    Hegedus, Stephen; Tapper, John; Dalton, Sara; Sloane, Finbarr

    2013-01-01

    We describe the application of Hierarchical Linear Modelling (HLM) in a cluster-randomised study to examine learning algebraic concepts and procedures in an innovative, technology-rich environment in the US. HLM is applied to measure the impact of such treatment on learning and on contextual variables. We provide a detailed description of such…

  5. Sleep, Dietary, and Exercise Behavioral Clusters Among Truck Drivers With Obesity: Implications for Interventions.

    Science.gov (United States)

    Olson, Ryan; Thompson, Sharon V; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L; Anger, W Kent; Bodner, Todd; Hammer, Leslie B; Hohn, Elliot; Perrin, Nancy A

    2016-03-01

    The objectives of the study were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health, safety, and psychosocial factors. Participants' (n = 452, body mass index M = 37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior covariation. Cluster differences were tested with generalized estimating equations. Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in body mass index. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures.

  6. Calculations of light scattering matrices for stochastic ensembles of nanosphere clusters

    International Nuclear Information System (INIS)

    Bunkin, N.F.; Shkirin, A.V.; Suyazov, N.V.; Starosvetskiy, A.V.

    2013-01-01

    Results of the calculation of the light scattering matrices for systems of stochastic nanosphere clusters are presented. A mathematical model of spherical particle clustering with allowance for cluster–cluster aggregation is used. The fractal properties of cluster structures are explored at different values of the model parameter that governs cluster–cluster interaction. General properties of the light scattering matrices of nanosphere-cluster ensembles as dependent on their mean fractal dimension have been found. The scattering-matrix calculations were performed for finite samples of 10^3 random clusters, made up of polydisperse spherical nanoparticles, having lognormal size distribution with the effective radius 50 nm and effective variance 0.02; the mean number of monomers in a cluster and its standard deviation were set to 500 and 70, respectively. The implemented computation environment, modeling the scattering matrices for overall sequences of clusters, is based upon T-matrix program code for a given single cluster of spheres, which was developed in [1]. The ensemble-averaged results have been compared with orientation-averaged ones calculated for individual clusters. -- Highlights: ► We suggested a hierarchical model of cluster growth allowing for cluster–cluster aggregation. ► We analyzed the light scattering by whole ensembles of nanosphere clusters. ► We studied the evolution of the light scattering matrix when changing the fractal dimension

  7. Identifying Hierarchical and Overlapping Protein Complexes Based on Essential Protein-Protein Interactions and “Seed-Expanding” Method

    Directory of Open Access Journals (Sweden)

    Jun Ren

    2014-01-01

    Much evidence has demonstrated that protein complexes are overlapping and hierarchically organized in PPI networks. Meanwhile, the large size of PPI networks requires complex-detection methods to have low time complexity. Up to now, few methods can identify overlapping and hierarchical protein complexes in a PPI network quickly. In this paper, a novel method, called MCSE, is proposed based on λ-module and “seed-expanding.” First, it chooses seeds as essential PPIs or edges with high edge clustering values. Then, it identifies protein complexes by expanding each seed to a λ-module. MCSE is suitable for large PPI networks because of its low time complexity. MCSE can identify overlapping protein complexes naturally because a protein can be visited by different seeds. MCSE uses the parameter λ_th to control the range of seed expanding and can detect a hierarchical organization of protein complexes by tuning the value of λ_th. Experimental results on S. cerevisiae show that this hierarchical organization is similar to that of known complexes in the MIPS database. The experimental results also show that MCSE outperforms previous competing algorithms, such as CPM, CMC, Core-Attachment, Dpclus, HC-PIN, MCL, and NFC, in terms of functional enrichment and matching with known protein complexes.

  8. SATELLITE DWARF GALAXIES IN A HIERARCHICAL UNIVERSE: THE PREVALENCE OF DWARF-DWARF MAJOR MERGERS

    OpenAIRE

    Deason, A; Wetzel, A; Garrison-Kimmel, S

    2014-01-01

    Mergers are a common phenomenon in hierarchical structure formation, especially for massive galaxies and clusters, but their importance for dwarf galaxies in the Local Group remains poorly understood. We investigate the frequency of major mergers between dwarf galaxies in the Local Group using the ELVIS suite of cosmological zoom-in dissipationless simulations of Milky Way- and M31-like host halos. We find that ~10% of satellite dwarf galaxies with M_star > 10^6 M_sun that are within the host...

  9. Morphological quantification of hierarchical geomaterials by X-ray nano-CT bridges the gap from nano to micro length scales

    KAUST Repository

    Brisard, S.

    2012-01-30

    Morphological quantification of the complex structure of hierarchical geomaterials is of great relevance for Earth science and environmental engineering, among others. To date, methods that quantify the 3D morphology on length scales ranging from a few tens of nanometers to several hundred nanometers have had limited success. We demonstrate, for the first time, that it is possible to go beyond visualization and to extract quantitative morphological information from X-ray images in the aforementioned length scales. As examples, two different hierarchical geomaterials exhibiting complex porous structures ranging from nanometer to macroscopic scale are studied: a flocculated clay water suspension and two hydrated cement pastes. We show that from a single projection image it is possible to perform a direct computation of the ultra-small angle-scattering spectra. The predictions matched very well the experimental data obtained by the best ultra-small angle-scattering experimental setups as observed for the cement paste. In this context, we demonstrate that the structure of the flocculated clay suspension exhibits two distinct regimes of aggregation, a dense mass fractal aggregation at short distance and a more open structure at large distance, which can be generated by a 3D reaction limited cluster-cluster aggregation process. For the first time, a high-resolution 3D image of a fibrillar cement paste cluster was obtained from limited angle nanotomography.

  10. EVOLUTION AND DISTRIBUTION OF MAGNETIC FIELDS FROM ACTIVE GALACTIC NUCLEI IN GALAXY CLUSTERS. II. THE EFFECTS OF CLUSTER SIZE AND DYNAMICAL STATE

    International Nuclear Information System (INIS)

    Xu Hao; Li Hui; Collins, David C.; Li, Shengtai; Norman, Michael L.

    2011-01-01

    Theory and simulations suggest that magnetic fields from radio jets and lobes powered by their central supermassive black holes can be an important source of magnetic fields in galaxy clusters. This is Paper II in a series of studies where we present self-consistent high-resolution adaptive mesh refinement cosmological magnetohydrodynamic simulations that simultaneously follow the formation of a galaxy cluster and the evolution of magnetic fields ejected by an active galactic nucleus. We studied 12 different galaxy clusters with virial masses ranging from 1 x 10^14 to 2 x 10^15 M_sun. In this work, we examine the effects of the mass and merger history on the final magnetic properties. We find that the evolution of magnetic fields is qualitatively similar to those of previous studies. In most clusters, the injected magnetic fields can be transported throughout the cluster and be further amplified by the intracluster medium (ICM) turbulence during the cluster formation process with hierarchical mergers, while the amplification history and the magnetic field distribution depend on the cluster formation and magnetism history. This can be very different for different clusters. The total magnetic energies in these clusters are between 4 x 10^57 and 10^61 erg, which is mainly decided by the cluster mass, scaling approximately with the square of the total mass. Dynamically older relaxed clusters usually have more magnetic fields in their ICM. The dynamically very young clusters may be magnetized weakly since there is not enough time for magnetic fields to be amplified.

  11. Coping profiles, perceived stress and health-related behaviors: a cluster analysis approach.

    Science.gov (United States)

    Doron, Julie; Trouillet, Raphael; Maneveau, Anaïs; Ninot, Grégory; Neveu, Dorine

    2015-03-01

    Using a cluster analytical procedure, this study aimed (i) to determine whether people could be differentiated on the basis of coping profiles (or unique combinations of coping strategies); and (ii) to examine the relationships between these profiles and perceived stress and health-related behaviors. A sample of 578 French students (345 females, 233 males; M(age) = 21.78, SD(age) = 2.21) completed the Perceived Stress Scale-14 (Bruchon-Schweitzer, 2002), the Brief COPE (Muller and Spitz, 2003) and a series of items measuring health-related behaviors. A two-phased cluster analytic procedure (i.e. hierarchical and non-hierarchical k-means) was employed to derive clusters of coping strategy profiles. The results yielded four distinctive coping profiles: High Copers, Adaptive Copers, Avoidant Copers and Low Copers. The results showed that clusters differed significantly in perceived stress and health-related behaviors. High Copers and Avoidant Copers displayed higher levels of perceived stress and engaged more in unhealthy behavior, compared with Adaptive Copers and Low Copers who reported lower levels of stress and engaged more in healthy behaviors. These findings suggested that individuals' relative reliance on some strategies and de-emphasis on others may be a more advantageous way of understanding the manner in which individuals cope with stress. Therefore, a cluster analysis approach may provide an advantage over more traditional statistical techniques by identifying distinct coping profiles that might best benefit from interventions. Future research should consider coping profiles to provide a deeper understanding of the relationships between coping strategies and health outcomes and to identify risk groups. © The Author 2014. Published by Oxford University Press. All rights reserved.
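
    A two-phased procedure of the kind named in the abstract (a hierarchical pass followed by k-means) is commonly run by letting the hierarchical solution supply the starting centers for k-means. The sketch below illustrates that idea on toy 2-D points. The centroid-based merging rule, the function names, and the data are all assumptions made for illustration; the abstract does not state which linkage, distance, or software the authors used.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def hierarchical_seeds(points, k):
    """Phase 1: merge the two clusters with the closest centroids until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return [centroid(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Phase 2: Lloyd's k-means iterations started from the given centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:               # keep the old center if a cluster empties
                centers[c] = centroid(members)
    return labels, centers


def two_phase(points, k):
    """Hierarchical seeding followed by k-means refinement."""
    return kmeans(points, hierarchical_seeds(points, k))


# Toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centers = two_phase(toy, 2)
```

    Seeding k-means this way removes its dependence on random initialization, which is one common reason for combining the two steps; with real questionnaire scores the variables would normally be standardized first.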

  12. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  13. Spatial-area selective retrieval of multiple object-place associations in a hierarchical cognitive map formed by theta phase coding.

    Science.gov (United States)

    Sato, Naoyuki; Yamaguchi, Yoko

    2009-06-01

    The human cognitive map is known to be hierarchically organized consisting of a set of perceptually clustered landmarks. Patient studies have demonstrated that these cognitive maps are maintained by the hippocampus, while the neural dynamics are still poorly understood. The authors have shown that the neural dynamic "theta phase precession" observed in the rodent hippocampus may be capable of forming hierarchical cognitive maps in humans. In the model, a visual input sequence consisting of object and scene features in the central and peripheral visual fields, respectively, results in the formation of a hierarchical cognitive map for object-place associations. Surprisingly, it is possible for such a complex memory structure to be formed in a few seconds. In this paper, we evaluate the memory retrieval of object-place associations in the hierarchical network formed by theta phase precession. The results show that multiple object-place associations can be retrieved with the initial cue of a scene input. Importantly, according to the wide-to-narrow unidirectional connections among scene units, the spatial area for object-place retrieval can be controlled by the spatial area of the initial cue input. These results indicate that the hierarchical cognitive maps have computational advantages on a spatial-area selective retrieval of multiple object-place associations. Theta phase precession dynamics is suggested as a fundamental neural mechanism of the human cognitive map.

  14. WESTERN CHARPATHIAN RURAL MOUNTAIN TOURISM MAPPING THROUGH CLUSTER METHODOLOGY

    Directory of Open Access Journals (Sweden)

    Elena TOMA

    2013-10-01

    Rural tourism in the Western Carpathian Mountains has been characterized in recent years by a low occupancy rate and a decline in tourist arrivals, due, besides the direct effects of the economic crisis, to the remote location of mountain villages and the low quality of infrastructure. For this reason we consider that the implementation of complex and integrated products based on thematic tour circuits represents a real opportunity to develop the local rural tourism industry. The aim of this paper is to identify the best networking solution, based on clustering analysis. The Multidimensional Scaling Method and Hierarchical Cluster Method allowed us to demonstrate and identify the best way of clustering and, in this way, the best route for a potential tourist circuit. Relative to the counties to which the villages belong, the identified cluster concentrates 57.7% of rural tourist accommodations and 65.0% of tourist arrivals, but it has an occupancy rate of only 5.9%. We consider that implementing new complex tourism products can raise this tourism dimension of the cluster, and we propose more in-depth studies regarding the profile of potential customers.

  15. Endohedral gallide cluster superconductors and superconductivity in ReGa5.

    Science.gov (United States)

    Xie, Weiwei; Luo, Huixia; Phelan, Brendan F; Klimczuk, Tomasz; Cevallos, Francois Alexandre; Cava, Robert Joseph

    2015-12-22

    We present transition metal-embedded (T@Gan) endohedral Ga-clusters as a favorable structural motif for superconductivity and develop empirical, molecule-based, electron counting rules that govern the hierarchical architectures that the clusters assume in binary phases. Among the binary T@Gan endohedral cluster systems, Mo8Ga41, Mo6Ga31, Rh2Ga9, and Ir2Ga9 are all previously known superconductors. The well-known exotic superconductor PuCoGa5 and related phases are also members of this endohedral gallide cluster family. We show that electron-deficient compounds like Mo8Ga41 prefer architectures with vertex-sharing gallium clusters, whereas electron-rich compounds, like PdGa5, prefer edge-sharing cluster architectures. The superconducting transition temperatures are highest for the electron-poor, corner-sharing architectures. Based on this analysis, the previously unknown endohedral cluster compound ReGa5 is postulated to exist at an intermediate electron count and a mix of corner sharing and edge sharing cluster architectures. The empirical prediction is shown to be correct and leads to the discovery of superconductivity in ReGa5. The Fermi levels for endohedral gallide cluster compounds are located in deep pseudogaps in the electronic densities of states, an important factor in determining their chemical stability, while at the same time limiting their superconducting transition temperatures.

  16. Self Organizing Maps to efficiently cluster and functionally interpret protein conformational ensembles

    Directory of Open Access Journals (Sweden)

    Fabio Stella

    2013-09-01

    An approach that combines Self-Organizing Maps, hierarchical clustering and network components is presented, aimed at comparing protein conformational ensembles obtained from multiple Molecular Dynamics simulations. As a first result, the original ensembles can be summarized by using only the representative conformations of the clusters obtained. In addition, the network components analysis makes it possible to discover and interpret the dynamic behavior of the conformations won by each neuron. The results showed the ability of this approach to efficiently derive a functional interpretation of the protein dynamics described by the original conformational ensemble, highlighting its potential as a support for protein engineering.

  17. Hierarchical modeling of genome-wide Short Tandem Repeat (STR) markers infers native American prehistory.

    Science.gov (United States)

    Lewis, Cecil M

    2010-02-01

    This study examines a genome-wide dataset of 678 Short Tandem Repeat loci characterized in 444 individuals representing 29 Native American populations as well as the Tundra Netsi and Yakut populations from Siberia. Using these data, the study tests four current hypotheses regarding the hierarchical distribution of neutral genetic variation in native South American populations: (1) the western region of South America harbors more variation than the eastern region of South America, (2) Central American and western South American populations cluster exclusively, (3) populations speaking the Chibchan-Paezan and Equatorial-Tucanoan language stock emerge as a group within an otherwise South American clade, (4) Chibchan-Paezan populations in Central America emerge together at the tips of the Chibchan-Paezan cluster. This study finds that hierarchical models with the best fit place Central American populations, and populations speaking the Chibchan-Paezan language stock, at a basal position or separated from the South American group, which is more consistent with a serial founder effect into South America than that previously described. Western (Andean) South America is found to harbor similar levels of variation as eastern (Equatorial-Tucanoan and Ge-Pano-Carib) South America, which is inconsistent with an initial west coast migration into South America. Moreover, in all relevant models, the estimates of genetic diversity within geographic regions suggest a major bottleneck or founder effect occurring within the North American subcontinent, before the peopling of Central and South America. 2009 Wiley-Liss, Inc.

  18. 2 x 2 Achievement Goals and Achievement Emotions: A Cluster Analysis of Students' Motivation

    Science.gov (United States)

    Jang, Leong Yeok; Liu, Woon Chia

    2012-01-01

    This study sought to better understand the adoption of multiple achievement goals at an intra-individual level, and its links to emotional well-being, learning, and academic achievement. Participants were 480 Secondary Two students (aged between 13 and 14 years) from two coeducational government schools. Hierarchical cluster analysis revealed the…

  19. The Hierarchical Perspective

    Directory of Open Access Journals (Sweden)

    Daniel Sofron

    2015-05-01

    Full Text Available This paper is focused on the hierarchical perspective, one of the methods for representing space that was used before the discovery of the Renaissance linear perspective. The hierarchical perspective has a more or less pronounced scientific character and its study offers us a clear image of the way the representatives of the cultures that developed it used to perceive the sensitive reality. This type of perspective is an original method of representing three-dimensional space on a flat surface, which characterises the art of Ancient Egypt and much of the art of the Middle Ages, being identified in the Eastern European Byzantine art, as well as in the Western European Pre-Romanesque and Romanesque art. At the same time, the hierarchical perspective is also present in naive painting and infantile drawing. Reminiscences of this method can be recognised also in the works of some precursors of the Italian Renaissance. The hierarchical perspective can be viewed as a subjective ranking criterion, according to which the elements are visually represented by taking into account their relevance within the image while perception is ignored. This paper aims to show how the main objective of the artists of those times was not to faithfully represent the objective reality, but rather to emphasize the essence of the world and its perennial aspects. This may represent a possible explanation for the refusal of perspective in the Egyptian, Romanesque and Byzantine painting, characterised by a marked two-dimensionality.

  20. Comparisons of Flow Patterns over a Hierarchical and a Non-hierarchical Surface in Relation to Biofouling Control

    Directory of Open Access Journals (Sweden)

    Bin Ahmad Fawzan Mohammed Ridha

    2018-01-01

    Full Text Available Biofouling can be defined as the unwanted deposition and development of organisms on submerged surfaces. It is a major problem because it causes water contamination, infrastructure damage and increased maintenance and operational costs, especially in the shipping industry. There are a few methods that can prevent this problem. One of the most effective methods, the use of chemicals, particularly tributyltin, has been banned due to adverse effects on the environment. One non-toxic method found to be effective is surface modification, which involves altering the surface topography so that it becomes a low-fouling or non-stick surface to biofouling organisms. Current literature suggests that non-hierarchical topographies have lower antifouling performance than hierarchical topographies. It is still unclear whether the effects of the flow on these topographies could have aided their antifouling properties. This research uses Computational Fluid Dynamics (CFD) simulations to study the flow over these two topographies, which also involves a comparison of the topographies used. The results show that the hierarchical topography has higher antifouling performance than the non-hierarchical topography. This is because the fluid characteristics at the hierarchical topography are more favorable for controlling biofouling. In addition, the hierarchical topography has a higher wall shear stress distribution than the non-hierarchical topography.

  1. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. The aim was to identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used for identification of the clusters. We identified four clusters: Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function; Cluster 2 (n=13) - late-onset, atopic asthma; Cluster 3 (n=6) - late-onset, aspirin-sensitive, eosinophilic asthma; and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. The variables with the greatest power for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping of the disease and for a personalized approach to the treatment of patients.

  2. Hopfield-K-Means clustering algorithm: A proposal for the segmentation of electricity customers

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, Jose J.; Aguado, Jose A.; Martin, F.; Munoz, F.; Rodriguez, A.; Ruiz, Jose E. [Department of Electrical Engineering, University of Malaga, C/ Dr. Ortiz Ramos, sn., Escuela de Ingenierias, 29071 Malaga (Spain)

    2011-02-15

    Customer classification aims at providing electric utilities with a volume of information to enable them to establish different types of tariffs. Several methods have been used to segment electricity customers, including, among others, the hierarchical clustering, Modified Follow the Leader and K-Means methods. These, however, entail problems with the pre-allocation of the number of clusters (Follow the Leader), randomness of the solution (K-Means) and improvement of the solution obtained (hierarchical algorithm). Another segmentation method used is Hopfield's autonomous recurrent neural network, although the solution obtained is only guaranteed to be a local minimum. In this paper, we present the Hopfield-K-Means algorithm in order to overcome these limitations. This approach eliminates the randomness of the initial solution provided by K-Means based algorithms and moves closer to the global optimum. The proposed algorithm is also compared against other customer segmentation and characterization techniques, on the basis of relative validation indexes. Finally, the results obtained by this algorithm with a set of 230 electricity customers (residential, industrial and administrative) are presented. (author)

  3. Hopfield-K-Means clustering algorithm: A proposal for the segmentation of electricity customers

    International Nuclear Information System (INIS)

    Lopez, Jose J.; Aguado, Jose A.; Martin, F.; Munoz, F.; Rodriguez, A.; Ruiz, Jose E.

    2011-01-01

    Customer classification aims at providing electric utilities with a volume of information to enable them to establish different types of tariffs. Several methods have been used to segment electricity customers, including, among others, the hierarchical clustering, Modified Follow the Leader and K-Means methods. These, however, entail problems with the pre-allocation of the number of clusters (Follow the Leader), randomness of the solution (K-Means) and improvement of the solution obtained (hierarchical algorithm). Another segmentation method used is Hopfield's autonomous recurrent neural network, although the solution obtained is only guaranteed to be a local minimum. In this paper, we present the Hopfield-K-Means algorithm in order to overcome these limitations. This approach eliminates the randomness of the initial solution provided by K-Means based algorithms and moves closer to the global optimum. The proposed algorithm is also compared against other customer segmentation and characterization techniques, on the basis of relative validation indexes. Finally, the results obtained by this algorithm with a set of 230 electricity customers (residential, industrial and administrative) are presented. (author)
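
    The randomness that the abstract attributes to K-Means comes from its random choice of initial centers. The sketch below is plain Lloyd-style K-Means, not the authors' Hopfield-K-Means algorithm, and is meant only to show how the final sum of squared errors can depend on the seed. The function name `kmeans`, its parameters and the toy data are illustrative assumptions.

```python
import random
from math import dist


def kmeans(points, k, seed, iterations=100):
    """Plain Lloyd-style K-Means with randomly sampled initial centers.

    Returns (centers, sse). Illustrative only; not Hopfield-K-Means.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random start: the source of run-to-run variation
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break  # converged to a local minimum
        centers = new_centers
    sse = sum(min(dist(p, c) for c in centers) ** 2 for p in points)
    return centers, sse


if __name__ == "__main__":
    # Hypothetical toy data: three loose groups of points.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (5.0, 5.0), (5.0, 6.0), (6.0, 5.0),
            (10.0, 0.0), (10.0, 1.0), (11.0, 0.0)]
    for seed in range(5):
        _, sse = kmeans(data, k=3, seed=seed)
        print(f"seed={seed} sse={sse:.3f}")
```

    Running the loop with several seeds shows whether different starts settle on different local minima for a given data set; a deterministic initialization removes that source of variation.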

  4. Adaptive hierarchical multi-agent organizations

    NARCIS (Netherlands)

    Ghijsen, M.; Jansweijer, W.N.H.; Wielinga, B.J.; Babuška, R.; Groen, F.C.A.

    2010-01-01

    In this chapter, we discuss the design of adaptive hierarchical organizations for multi-agent systems (MAS). Hierarchical organizations have a number of advantages such as their ability to handle complex problems and their scalability to large organizations. By introducing adaptivity in the

  5. Stochastic clustering of material surface under high-heat plasma load

    Science.gov (United States)

    Budaev, Viacheslav P.

    2017-11-01

    The results of a study of surfaces formed by high-temperature plasma loads on various materials such as tungsten, carbon and stainless steel are presented. High-temperature plasma irradiation leads to an inhomogeneous stochastic clustering of the surface with self-similar granularity (fractality) on scales from the nanoscale to the macroscale. Cauliflower-like structures of tungsten and carbon materials are formed under high-heat plasma load in fusion devices. The statistical characteristics of hierarchical granularity and scale invariance are estimated. They differ qualitatively from the roughness of the ordinary Brownian surface, which is possibly due to the universal mechanisms of stochastic clustering of material surfaces under the influence of high-temperature plasma.

  6. Identification of Counterfeit Alcoholic Beverages Using Cluster Analysis in Principal-Component Space

    Science.gov (United States)

    Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.

    2017-07-01

    A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.

  7. 3D Partition-Based Clustering for Supply Chain Data Management

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2015-10-01

    Supply Chain Management (SCM) is the management of the flow of products and goods from the point of origin to the point of consumption. The information and datasets gathered during the SCM process are massive and complex, because SCM involves several processes such as procurement, product development and commercialization, physical distribution, outsourcing and partnerships. For a practical application, SCM datasets need to be managed and maintained to provide better service to the three main categories of users: distributor, customer and supplier. To manage these datasets, a data constellation structure is used to accommodate the data in the spatial database. However, the geospatial database setting creates a few problems; for example, the performance of the database deteriorates, especially during query operations. We strongly believe that a more practical hierarchical tree structure is required for efficient SCM processing. In addition, a three-dimensional approach is required for the management of SCM datasets, since they involve multi-level locations such as shop lots and residential apartments. The 3D R-Tree has been increasingly used for 3D geospatial database management due to its simplicity and extendibility. However, it suffers from serious overlaps between nodes. In this paper, we propose a partition-based clustering for the construction of a hierarchical tree structure. Several datasets are tested using the proposed method, and the percentage of overlapping nodes and the volume coverage are computed and compared with the original 3D R-Tree and other practical approaches. The experiments demonstrated in this paper substantiate that the hierarchical structure of the proposed partition-based clustering is capable of preserving minimal overlap and coverage. The query performance was tested using 300,000 points of an SCM dataset and the results are presented in this paper. This paper also discusses the outlook of the structure for future reference.

  8. Dynamics of the baryonic component in hierarchical clustering universes

    Science.gov (United States)

    Navarro, Julio

    1993-01-01

    I present self-consistent 3-D simulations of the formation of virialized systems containing both gas and dark matter in a flat universe. A fully Lagrangian code based on the Smoothed Particle Hydrodynamics technique and a tree data structure has been used to evolve regions of comoving radius 2-3 Mpc. Tidal effects are included by coarse-sampling the density of the outer regions up to a radius of approximately 20 Mpc. Initial conditions are set at high redshift (z > 7) using a standard Cold Dark Matter perturbation spectrum and a baryon mass fraction of 10 percent (Ω_b = 0.1). Simulations in which the gas evolves either adiabatically or radiates energy at a rate determined locally by its cooling function were performed. This allows us to investigate with the same set of simulations the importance of radiative losses in the formation of galaxies and the equilibrium structure of virialized systems where cooling is very inefficient. In the absence of radiative losses, the simulations can be rescaled to the density and radius typical of galaxy clusters. A summary of the main results is presented.

  9. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Full Text Available Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  10. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
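
    The adjusted Rand index used above to score a clustering against known classes can be computed from the contingency table of the two labelings. The following is a minimal standard-library sketch of the usual Hubert-Arabie form of the index; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items.

    1.0 means identical partitions (up to relabeling); values near 0 are
    what random labelings would give; negative values are possible.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency table cells and its row/column sums.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Numbers of item pairs placed together under each view.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one single cluster
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    known_classes = ["A", "A", "A", "B", "B", "B"]  # hypothetical known classes
    found_clusters = [0, 0, 1, 1, 1, 1]             # hypothetical clustering result
    print(adjusted_rand_index(known_classes, found_clusters))
```

    Because the index is corrected for chance, it lets clusterings with different numbers of clusters be compared on the same scale, which suits a comparison across many method combinations.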

  11. TH-E-BRF-08: Subpopulations of Similarly-Responding Lesions in Metastatic Prostate Cancer

    International Nuclear Information System (INIS)

    Lin, C; Harmon, S; Perk, T; Jeraj, R

    2014-01-01

    Purpose: In patients with multiple lesions, resistance to cancer treatments and subsequent disease recurrence may be due to heterogeneity of response across lesions. This study aims to identify subpopulations of similarly-responding metastatic prostate cancer lesions in bone using quantitative PET metrics. Methods: Seven metastatic prostate cancer patients treated with AR-directed therapy received pre-treatment and mid-treatment [F-18]NaF PET/CT scans. Images were registered using an articulated CT registration algorithm and transformations were applied to PET segmentations. Mid-treatment response was calculated on PET-based texture features. Hierarchical agglomerative clustering was used to form groups of similarly-responding lesions, with the number of natural clusters (K) determined by the inconsistency coefficient. Lesion clustering was performed within each patient, and for the pooled population. The cophenetic coefficient (C) quantified how well the data were clustered. The Jaccard Index (JI) assessed similarity of cluster assignments from patient clustering and from population clustering. Results: 188 lesions in seven patients were identified for analysis (between 6 and 53 lesions per patient). Lesion response was defined as percent change relative to pre-treatment for 23 uncorrelated PET-based feature identifiers. High response heterogeneity was found across all lesions (i.e. range ΔSUVmax = −95.98% to 775.00%). For intra-patient clustering, K ranged from 1–20. Population-based clustering resulted in 75 clusters, of 1-6 lesions each. Intra-patient clustering resulted in higher quality clusters than population clustering (mean C=0.95, range=0.89 to 1.00). For all patients, cluster assignments from population clustering showed good agreement with intra-patient clustering (mean JI=0.87, range=0.68 to 1.00). Conclusion: Subpopulations of similarly-responding lesions were identified in patients with multiple metastatic lesions. Good agreement was found between
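
    As a rough illustration of the hierarchical agglomerative clustering and cophenetic coefficient mentioned in the Methods, the sketch below builds a naive bottom-up clustering and correlates the merge heights with the original pairwise distances. The abstract does not state the linkage rule or distance metric, so average linkage and Euclidean distance are assumptions here, as are the function names and toy points; the inconsistency-coefficient cut is not implemented.

```python
from itertools import combinations
from math import dist, sqrt


def average_linkage(points):
    """Naive agglomerative clustering (average linkage, Euclidean distance).

    Returns (merges, cophenetic): merges is a list of (id_a, id_b, height),
    and cophenetic maps each point pair (i, j), i < j, to the height at
    which the two points first end up in the same cluster.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges, cophenetic = [], {}
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Average linkage: mean distance over all cross-cluster point pairs.
            total = sum(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            height = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                cophenetic[(min(i, j), max(i, j))] = height
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, height))
        next_id += 1
    return merges, cophenetic


def cophenetic_correlation(points, cophenetic):
    """Pearson correlation between original and cophenetic distances."""
    pairs = sorted(cophenetic)
    x = [dist(points[i], points[j]) for i, j in pairs]
    y = [cophenetic[p] for p in pairs]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors standing in for lesion responses.
    lesions = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    merges, coph = average_linkage(lesions)
    print(merges)
    print(cophenetic_correlation(lesions, coph))
```

    A cophenetic correlation close to 1 indicates that the dendrogram preserves the original pairwise distances well, which is the sense in which C is used above to compare intra-patient and population clustering.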

  12. Farm Typology in the Berambadi Watershed (India): Farming Systems Are Determined by Farm Size and Access to Groundwater

    Directory of Open Access Journals (Sweden)

    Marion Robert

    2017-01-01

    Full Text Available Farmers’ production decisions and agricultural practices directly and indirectly influence the quantity and quality of natural resources, some of which are depleted common resources such as groundwater. Representing farming systems while accounting for their flexibility is needed to evaluate targeted, regional water management policies. Farmers’ decisions regarding investing in irrigation and adopting cropping systems are inherently dynamic and must adapt to changes in climate and in agronomic, economic, social, and institutional conditions. To represent this diversity, we developed a typology of Indian farmers from a survey of 684 farms in Berambadi, an agricultural watershed in southern India (state of Karnataka). The survey provided information on farm structure, the cropping system and farm practices, water management for irrigation, and the economic performance of the farm. Descriptive statistics and multivariate analysis (Multiple Correspondence Analysis and Agglomerative Hierarchical Clustering) were used to analyze relationships between observed factors and establish the farm typology. We identified three main types of farms: (1) large diversified and productivist farms; (2) small and marginal rainfed farms; and (3) small irrigated marketing farms. This typology represents the heterogeneity of farms in the Berambadi watershed.

  13. Recent foraminifera of the Sao Francisco river delta, Sergipe, Brazil: a proposal for the ecological and environmental diagnostic model; Foraminiferos recentes do delta do Rio Sao Francisco, Sergipe (Brasil): uma proposta de modelo ecologico e de diagnostico ambiental

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2003-07-01

    Recent foraminifera assemblages were studied to determine an ecological model of species distribution, diversity, equability and confining degree, with implications for environmental diagnosis and paleoenvironmental reconstructions. The study area lies in a sector of the Sao Francisco River delta, with interconnected channels and a lagoon connected to the ocean. Besides the biotic variables, sediment salinity and granulometry were considered. An agglomerative hierarchical clustering (AHC) was performed and three biofacies distributed along the sector were recognized: Miliammina/Arenoparrella in the channels, and Ammonia/Elphidium and Quinqueloculina in the lagoon. The diversity and equability values increase from the Miliammina/Arenoparrella biofacies to the Ammonia/Elphidium and Quinqueloculina biofacies. The confining indices show environments ranging from confined (channels) to different degrees of low restricted to marine influence. From the results obtained, it is possible to recognize that environments dominated by textulariines are characterized by low diversity, low equability and a high confining degree. Environments dominated by rotaliines and miliolines tend to be more diversified, equitable and low restricted to marine influence. The results are similar to those obtained in other Brazilian estuarine environments, differing only in the dominance of some species (author)

  14. Correlation between the pattern volatiles and the overall aroma of wild edible mushrooms.

    Science.gov (United States)

    de Pinho, P Guedes; Ribeiro, Bárbara; Gonçalves, Rui F; Baptista, Paula; Valentão, Patrícia; Seabra, Rosa M; Andrade, Paula B

    2008-03-12

    Volatile and semivolatile components of 11 wild edible mushrooms, Suillus bellini, Suillus luteus, Suillus granulatus, Tricholomopsis rutilans, Hygrophorus agathosmus, Amanita rubescens, Russula cyanoxantha, Boletus edulis, Tricholoma equestre, Fistulina hepatica, and Cantharellus cibarius, were determined by headspace solid-phase microextraction (HS-SPME) and by liquid extraction combined with gas chromatography-mass spectrometry (GC-MS). Fifty volatile and nonvolatile components were formally identified and 13 others were tentatively identified. Using sensorial analysis, the descriptors "mushroomlike", "farm-feed", "floral", "honeylike", "hay-herb", and "nutty" were obtained. A correlation between sensory descriptors and volatiles was observed by applying multivariate analysis (principal component analysis and agglomerative hierarchical cluster analysis) to the sensorial and chemical data. The studied edible mushrooms can be divided into three groups. One of them is rich in C8 derivatives, such as 3-octanol, 1-octen-3-ol, trans-2-octen-1-ol, 3-octanone, and 1-octen-3-one; another one is rich in terpenic volatile compounds; and the last one is rich in methional. The presence and contents of these compounds contribute considerably to the sensory characteristics of the analyzed species.

  15. Chemical Composition of Ballota macedonica Vandas and Ballota nigra L. ssp. foetida (Vis.) Hayek Essential Oils - The Chemotaxonomic Approach.

    Science.gov (United States)

    Đorđević, Aleksandra S; Jovanović, Olga P; Zlatković, Bojan K; Stojanović, Gordana S

    2016-06-01

    The essential oils isolated from fresh aerial parts of Ballota macedonica (two populations) and Ballota nigra ssp. foetida were analyzed by GC and GC/MS. Eighty five components were identified in total; 60 components in B. macedonica oil (population from the Former Yugoslav Republic of Macedonia), 34 components in B. macedonica oil (population from the Republic of Serbia), and 33 components in the oil of B. nigra ssp. foetida accounting for 93.9%, 98.4%, and 95.8% of the total oils, respectively. The most abundant components in B. macedonica oils were carotol (13.7 - 52.1%), germacrene D (8.6 - 24.6%), and (E)-caryophyllene (6.5 - 16.5%), while B. nigra ssp. foetida oil was dominated by (E)-phytol (56.9%), germacrene D (10.0%), and (E)-caryophyllene (4.7%). Multivariate statistical analyses (agglomerative hierarchical cluster analysis and principal component analysis) were used to compare and discuss relationships among Ballota species examined so far based on their volatile profiles. The chemical compositions of B. macedonica essential oils are reported for the first time. © 2016 Verlag Helvetica Chimica Acta AG, Zürich.

  16. Observations of fluorescent aerosol-cloud interactions in the free troposphere at the Sphinx high Alpine research station, Jungfraujoch

    Science.gov (United States)

    Crawford, I.; Lloyd, G.; Bower, K. N.; Connolly, P. J.; Flynn, M. J.; Kaye, P. H.; Choularton, T. W.; Gallagher, M. W.

    2015-09-01

    The fluorescent nature of aerosol at a high Alpine site was studied using a wide-band integrated bioaerosol (WIBS-4) single particle multi-channel ultraviolet light-induced fluorescence (UV-LIF) spectrometer. This was supported by comprehensive cloud microphysics and meteorological measurements with the aims of cataloguing concentrations of bio-fluorescent aerosols at this high altitude site and also investigating possible influences of UV-fluorescent particle types on cloud-aerosol processes. Analysis of background free tropospheric air masses, using a total aerosol inlet, showed there to be a minor but statistically insignificant increase in the fluorescent aerosol fraction during in-cloud cases compared to out-of-cloud cases. The size dependence of the fluorescent aerosol fraction showed the larger aerosol to be more likely to be fluorescent, with 80 % of 10 μm particles being fluorescent. Whilst the fluorescent particles were in the minority (NFl/NAll = 0.27±0.19), a new hierarchical agglomerative cluster analysis approach (Crawford et al., 2015) revealed that the majority of the fluorescent aerosol was likely to be representative of fluorescent mineral dust. A minor episodic contribution from a cluster likely to be representative of primary biological aerosol particles (PBAP) was also observed, with a wintertime baseline concentration of 0.1±0.4 L⁻¹. Given the low concentration of this cluster and the typically low ice active fraction of studied PBAP (e.g. Pseudomonas syringae), we suggest that the contribution to the observed ice crystal concentration at this location is not significant during the wintertime.

  17. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Full Text Available Abstract Background Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  18. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  19. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  20. Magnetic ordering of CoCl2-GIC, a spin ceramic: hierarchical successive transitions and the intermediate glassy phase

    International Nuclear Information System (INIS)

    Suzuki, Masatsugu; Suzuki, Itsuko S; Matsuura, Motohiro

    2007-01-01

    Stage-2 CoCl$_2$-graphite intercalation compound (GIC) is a spin ceramic which shows hierarchical successive transitions at $T_{cu}$ (= 8.9 K) and $T_{cl}$ (= 7.0 K) from the paramagnetic phase into an intra-cluster (two-dimensional ferromagnetic) order with inter-cluster disorder and then to an inter-cluster (three-dimensional antiferromagnetic like) order over the whole system. The nature of the inter-cluster disorder was suggested to be of spin glass by nonlinear magnetic response analyses around $T_{cu}$ and by studies on dynamical aspects of ordering between $T_{cu}$ and $T_{cl}$. Here, we present a further extensive examination of a series of time dependence of zero-field cooled magnetization $M_{ZFC}$ after the ageing protocol below $T_{cu}$. The time dependence of the relaxation rates $S_{ZFC}(t) = (1/H)\, dM_{ZFC}(t)/d\ln t$ dramatically changes from the curves of simple spin glass ageing effect below $T_{cl}$ to those of two peaks above $T_{cl}$. The characteristic relaxation behaviour apparently indicates that there coexist two different kinds of glassy correlated region below $T_{cu}$.
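    The relaxation rate quoted in this record, S(t) = (1/H) dM(t)/d ln t, can be estimated from sampled magnetization data by finite differences. The sketch below is only an illustration under stated assumptions: the function name, the sampling times, and the synthetic logarithmic relaxation M(t) = 0.5 + 0.02 ln t are all hypothetical and are not taken from the paper.

```python
import math


def relaxation_rate(times, magnetization, field):
    """Central-difference estimate of S(t) = (1/H) dM/d(ln t) at interior samples."""
    log_t = [math.log(t) for t in times]
    rates = []
    for k in range(1, len(times) - 1):
        slope = (magnetization[k + 1] - magnetization[k - 1]) / (
            log_t[k + 1] - log_t[k - 1]
        )
        rates.append(slope / field)
    return rates


# Synthetic data (assumption): purely logarithmic relaxation in a unit field.
H = 1.0
times = [10.0 * 2 ** k for k in range(8)]
M = [0.5 + 0.02 * math.log(t) for t in times]
S = relaxation_rate(times, M, H)
print(S)
```

    For a purely logarithmic relaxation the estimate is constant in time; a peak in S(t) would instead mark a characteristic relaxation time, which is how single-peak and two-peak curves are distinguished.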

  1. Taxonomic Analysis of Voivodships Development in Terms of ICT Usage in Enterprises

    Directory of Open Access Journals (Sweden)

    Kaczmarczyk Paweł

    2017-12-01

    Full Text Available The aim of this paper is to analyse the development of voivodships in terms of ICT usage in enterprises by means of taxonomic methods. The theoretical part of the paper has been devoted to the role of modern information and communication technologies in post-industrial enterprises, in particular to the meaning of these technologies for new management concepts (e-business, online marketing, CRM, network management, X-engineering). The research methodology is described, with special attention paid to the agglomerative clustering method and the optimization clustering method.

  2. Hierarchically Nanostructured Materials for Sustainable Environmental Applications

    Directory of Open Access Journals (Sweden)

    Zheng Ren

    2013-11-01

    Full Text Available This article presents a comprehensive overview of the hierarchical nanostructured materials with either geometry or composition complexity in environmental applications. The hierarchical nanostructures offer advantages of high surface area, synergistic interactions and multiple functionalities towards water remediation, environmental gas sensing and monitoring as well as catalytic gas treatment. Recent advances in synthetic strategies for various hierarchical morphologies such as hollow spheres and urchin-shaped architectures have been reviewed. In addition to the chemical synthesis, the physical mechanisms associated with the materials design and device fabrication have been discussed for each specific application. The development and application of hierarchical complex perovskite oxide nanostructures have also been introduced in photocatalytic water remediation, gas sensing and catalytic converter. Hierarchical nanostructures will open up many possibilities for materials design and device fabrication in environmental chemistry and technology.

  3. The VMC Survey. XXIX. Turbulence-controlled Hierarchical Star Formation in the Small Magellanic Cloud

    Science.gov (United States)

    Sun, Ning-Chen; de Grijs, Richard; Cioni, Maria-Rosa L.; Rubele, Stefano; Subramanian, Smitha; van Loon, Jacco Th.; Bekki, Kenji; Bell, Cameron P. M.; Ivanov, Valentin D.; Marconi, Marcella; Muraveva, Tatiana; Oliveira, Joana M.; Ripepi, Vincenzo

    2018-05-01

    In this paper we report a clustering analysis of upper main-sequence stars in the Small Magellanic Cloud, using data from the VMC survey (the VISTA near-infrared $YJK_s$ survey of the Magellanic system). Young stellar structures are identified as surface overdensities on a range of significance levels. They are found to be organized in a hierarchical pattern, such that larger structures at lower significance levels contain smaller ones at higher significance levels. They have very irregular morphologies, with a perimeter–area dimension of 1.44 ± 0.02 for their projected boundaries. They have a power-law mass–size relation, power-law size/mass distributions, and a log-normal surface density distribution. We derive a projected fractal dimension of 1.48 ± 0.03 from the mass–size relation, or of 1.4 ± 0.1 from the size distribution, reflecting significant lumpiness of the young stellar structures. These properties are remarkably similar to those of a turbulent interstellar medium, supporting a scenario of hierarchical star formation regulated by supersonic turbulence.
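    A power-law mass–size relation M ∝ R^D yields a dimension D as the slope of a straight-line fit in log-log space. The sketch below shows that fit with ordinary least squares; the sizes, the prefactor, and the exponent 1.5 are synthetic values chosen for illustration and are not the survey data, and the authors' actual fitting procedure may differ.

```python
import math


def power_law_slope(sizes, masses):
    """Least-squares slope of log(mass) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic structures (assumption): mass follows size with exponent 1.5.
sizes = [10.0, 20.0, 40.0, 80.0, 160.0]
masses = [3.0 * r ** 1.5 for r in sizes]
D = power_law_slope(sizes, masses)
print(round(D, 3))
```

    On real catalogues the scatter around the line gives the uncertainty on the fitted dimension, which this noise-free example does not show.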

  4. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong; Wu, Tao

    2017-01-01

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced

  5. Heuristics for Hierarchical Partitioning with Application to Model Checking

    DEFF Research Database (Denmark)

    Möller, Michael Oliver; Alur, Rajeev

    2001-01-01

    Given a collection of connected components, it is often desired to cluster together parts of strong correspondence, yielding a hierarchical structure. We address the automation of this process and apply heuristics to battle the combinatorial and computational complexity. We define a cost function...... that captures the quality of a structure relative to the connections and favors shallow structures with a low degree of branching. Finding a structure with minimal cost is NP-complete. We present a greedy polynomial-time algorithm that approximates good solutions incrementally by local evaluation of a heuristic...... function. We argue for a heuristic function based on four criteria: the number of enclosed connections, the number of components, the number of touched connections and the depth of the structure. We report on an application in the context of formal verification, where our algorithm serves as a preprocessor...

  6. Hierarchically organized layout for visualization of biochemical pathways.

    Science.gov (United States)

    Tsay, Jyh-Jong; Wu, Bo-Liang; Jeng, Yu-Sen

    2010-01-01

    Many complex pathways are described as hierarchical structures in which a pathway is recursively partitioned into several sub-pathways, and organized hierarchically as a tree. The hierarchical structure provides a natural way to visualize the global structure of a complex pathway. However, none of the previous research on pathway visualization explores the hierarchical structures provided by many complex pathways. In this paper, we aim to develop algorithms that can take advantage of hierarchical structures, and give layouts that explore the global structures as well as local structures of pathways. We present a new hierarchically organized layout algorithm to produce layouts for hierarchically organized pathways. Our algorithm first decomposes a complex pathway into sub-pathway groups along the hierarchical organization, and then partitions each sub-pathway group into basic components. It then applies conventional layout algorithms, such as hierarchical layout and force-directed layout, to compute the layout of each basic component. Finally, component layouts are joined to form a final layout of the pathway. Our main contribution is the development of algorithms for decomposing pathways and joining layouts. Experiments show that our algorithm is able to give comprehensible visualization for pathways with hierarchies, cycles as well as complex structures. It clearly renders the global component structures as well as the local structure in each component. In addition, it runs very fast, and gives better visualization for many examples from previous related research. 2009 Elsevier B.V. All rights reserved.

  7. Hierarchical Cu precipitation in lamellated steel after multistage heat treatment

    Science.gov (United States)

    Liu, Qingdong; Gu, Jianfeng

    2017-09-01

    The hierarchical distribution of Cu-rich precipitates (CRPs) and related partitioning and segregation behaviours of solute atoms were investigated in a 1.54 Cu-3.51 Ni (wt.%) low-carbon high-strength low-alloy (HSLA) steel after multistage heat treatment by using the combination of electron backscatter diffraction (EBSD), transmission electron microscopy (TEM) and atom probe tomography (APT). Intercritical tempering at 725 °C of the as-quenched lathlike martensitic structure leads to the coprecipitation of CRPs at the periphery of a carbide precipitate which is possibly in its paraequilibrium state due to distinct solute segregation at the interface. The alloyed carbide and CRPs provide constituent elements for each other and make the coprecipitation thermodynamically favourable. Meanwhile, austenite reversion occurs to form a fresh secondary martensite (FSM) zone which is rich in Cu and pertinent Ni and Mn atoms, which gives rise to a different distributional morphology of CRPs with large size and high density. In addition, conventional tempering at 500 °C leads to the formation of nanoscale Cu-rich clusters in the α-Fe matrix. As a consequence, three populations of CRPs are hierarchically formed around the carbide precipitate, at the FSM zone and in the α-Fe matrix. The formation of different precipitated features can be tuned by controlling diffusion pathways of related solute atoms, and mechanical properties can be further tailored via proper multistage heat treatments.

  8. Adaptive clustering procedure for continuous gravitational wave searches

    Science.gov (United States)

    Singh, Avneet; Papa, Maria Alessandra; Eggenstein, Heinz-Bernd; Walsh, Sinéad

    2017-10-01

    In hierarchical searches for continuous gravitational waves, clustering of candidates is an important post-processing step because it reduces the number of noise candidates that are followed up at successive stages [J. Aasi et al., Phys. Rev. Lett. 88, 102002 (2013), 10.1103/PhysRevD.88.102002; B. Behnke, M. A. Papa, and R. Prix, Phys. Rev. D 91, 064007 (2015), 10.1103/PhysRevD.91.064007; M. A. Papa et al., Phys. Rev. D 94, 122006 (2016), 10.1103/PhysRevD.94.122006]. Previous clustering procedures bundled together nearby candidates ascribing them to the same root cause (be it a signal or a disturbance), based on a predefined cluster volume. In this paper, we present a procedure that adapts the cluster volume to the data itself and checks for consistency of such volume with what is expected from a signal. This significantly improves the noise rejection capabilities at fixed detection threshold and, at fixed computing resources for the follow-up stages, results in an overall more sensitive search. This new procedure was employed in the first Einstein@Home search on data from the first science run of the advanced LIGO detectors (O1) [LIGO Scientific Collaboration and Virgo Collaboration, arXiv:1707.02669 [Phys. Rev. D (to be published)]].

  9. Hierarchical screening for multiple mental disorders.

    Science.gov (United States)

    Batterham, Philip J; Calear, Alison L; Sunderland, Matthew; Carragher, Natacha; Christensen, Helen; Mackinnon, Andrew J

    2013-10-01

    There is a need for brief, accurate screening when assessing multiple mental disorders. Two-stage hierarchical screening, consisting of brief pre-screening followed by a battery of disorder-specific scales for those who meet diagnostic criteria, may increase the efficiency of screening without sacrificing precision. This study tested whether more efficient screening could be gained using two-stage hierarchical screening than by administering multiple separate tests. Two Australian adult samples (N=1990) with high rates of psychopathology were recruited using Facebook advertising to examine four methods of hierarchical screening for four mental disorders: major depressive disorder, generalised anxiety disorder, panic disorder and social phobia. Using K6 scores to determine whether full screening was required did not increase screening efficiency. However, pre-screening based on two decision tree approaches or item gating led to considerable reductions in the mean number of items presented per disorder screened, with estimated item reductions of up to 54%. The sensitivity of these hierarchical methods approached 100% relative to the full screening battery. Further testing of the hierarchical screening approach based on clinical criteria and in other samples is warranted. The results demonstrate that a two-phase hierarchical approach to screening multiple mental disorders leads to considerable efficiency gains without reducing accuracy. Screening programs should take advantage of prescreeners based on gating items or decision trees to reduce the burden on respondents. © 2013 Elsevier B.V. All rights reserved.

  10. Self-assembled biomimetic superhydrophobic hierarchical arrays.

    Science.gov (United States)

    Yang, Hongta; Dou, Xuan; Fang, Yin; Jiang, Peng

    2013-09-01

    Here, we report a simple and inexpensive bottom-up technology for fabricating superhydrophobic coatings with hierarchical micro-/nano-structures, which are inspired by the binary periodic structure found on the superhydrophobic compound eyes of some insects (e.g., mosquitoes and moths). Binary colloidal arrays consisting of exemplary large (4 and 30 μm) and small (300 nm) silica spheres are first assembled by a scalable Langmuir-Blodgett (LB) technology in a layer-by-layer manner. After surface modification with fluorosilanes, the self-assembled hierarchical particle arrays become superhydrophobic with an apparent water contact angle (CA) larger than 150°. The throughput of the resulting superhydrophobic coatings with hierarchical structures can be significantly improved by templating the binary periodic structures of the LB-assembled colloidal arrays into UV-curable fluoropolymers by a soft lithography approach. Superhydrophobic perfluoroether acrylate hierarchical arrays with large CAs and small CA hysteresis can be faithfully replicated onto various substrates. Both experiments and theoretical calculations based on Cassie's dewetting model demonstrate the importance of the hierarchical structure in achieving the final superhydrophobic surface states. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Ensemble clustering in visual working memory biases location memories and reduces the Weber noise of relative positions.

    Science.gov (United States)

    Lew, Timothy F; Vul, Edward

    2015-01-01

    People seem to compute the ensemble statistics of objects and use this information to support the recall of individual objects in visual working memory. However, there are many different ways that hierarchical structure might be encoded. We examined the format of structured memories by asking subjects to recall the locations of objects arranged in different spatial clustering structures. Consistent with previous investigations of structured visual memory, subjects recalled objects biased toward the center of their clusters. Subjects also recalled locations more accurately when they were arranged in fewer clusters containing more objects, suggesting that subjects used the clustering structure of objects to aid recall. Furthermore, subjects had more difficulty recalling larger relative distances, consistent with subjects encoding the positions of objects relative to clusters and recalling them with magnitude-proportional (Weber) noise. Our results suggest that clustering improved the fidelity of recall by biasing the recall of locations toward cluster centers to compensate for uncertainty and by reducing the magnitude of encoded relative distances.

  12. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong

    2017-08-03

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCM) with asymmetric and hierarchical pore architecture that can be produced on a large scale. The unique HNDCMs hold great promise as components in separation and advanced carbon devices because they could offer unconventional fluidic transport phenomena on the nanoscale. Overall, the invention set forth herein covers hierarchically structured, nitrogen-doped carbon membranes and methods of making and using such membranes.

  13. Hierarchical Rhetorical Sentence Categorization for Scientific Papers

    Science.gov (United States)

    Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.

    2018-03-01

    Important information in scientific papers can be conveyed by rhetorical sentences that fall into certain categories. To extract this information, text categorization should be conducted. Some previous works on this task have employed word frequency, semantically similar words, hierarchical classification, and other techniques. This paper therefore presents rhetorical sentence categorization for scientific papers that employs TF-IDF and Word2Vec to capture word frequency and semantically similar words, together with hierarchical classification. Every experiment is tested with two classifiers, namely Naïve Bayes and linear SVM. The results show that the hierarchical classifier is better than the flat classifier with either TF-IDF or Word2Vec, although the improvement is only about 2%, from 27.82% with the flat classifier to 29.61% with the hierarchical classifier. They also show that a different learning model can be built for each child category by the hierarchical classifier.

  14. THE METALLICITY BIMODALITY OF GLOBULAR CLUSTER SYSTEMS: A TEST OF GALAXY ASSEMBLY AND OF THE EVOLUTION OF THE GALAXY MASS-METALLICITY RELATION

    International Nuclear Information System (INIS)

    Tonini, Chiara

    2013-01-01

    We build a theoretical model to study the origin of the globular cluster metallicity bimodality in the hierarchical galaxy assembly scenario. The model is based on empirical relations such as the galaxy mass-metallicity relation [O/H]-$M_{\mathrm{star}}$ as a function of redshift, and on the observed galaxy stellar mass function up to redshift z ∼ 4. We make use of the theoretical merger rates as a function of mass and redshift from the Millennium simulation to build galaxy merger trees. We derive a new galaxy [Fe/H]-$M_{\mathrm{star}}$ relation as a function of redshift, and by assuming that globular clusters share the metallicity of their original parent galaxy at the time of their formation, we populate the merger tree with globular clusters. We perform a series of Monte Carlo simulations of the galaxy hierarchical assembly, and study the properties of the final globular cluster population as a function of galaxy mass, assembly and star formation history, and under different assumptions for the evolution of the galaxy mass-metallicity relation. The main results and predictions of the model are the following. (1) The hierarchical clustering scenario naturally predicts a metallicity bimodality in the galaxy globular cluster population, where the metal-rich subpopulation is composed of globular clusters formed in the galaxy main progenitor around redshift z ∼ 2, and the metal-poor subpopulation is composed of clusters accreted from satellites, and formed at redshifts z ∼ 3-4. (2) The model reproduces the observed relations by Peng et al. for the metallicities of the metal-rich and metal-poor globular cluster subpopulations as a function of galaxy mass; the positions of the metal-poor and metal-rich peaks depend exclusively on the evolution of the galaxy mass-metallicity relation and the [O/Fe], both of which can be constrained by this method. In particular, we find that the galaxy [O/Fe] evolves linearly with redshift from a value of ∼0.5 at redshift z ∼ 4 to a value of ∼0.1 at

  15. A generic algorithm for constructing hierarchical representations of geometric objects

    International Nuclear Information System (INIS)

    Xavier, P.G.

    1995-01-01

    For a number of years, robotics researchers have exploited hierarchical representations of geometrical objects and scenes in motion-planning, collision-avoidance, and simulation. However, few general techniques exist for automatically constructing them. We present a generic, bottom-up algorithm that uses a heuristic clustering technique to produce balanced, coherent hierarchies. Its worst-case running time is $O(N^2 \log N)$, but for non-pathological cases it is $O(N \log N)$, where N is the number of input primitives. We have completed a preliminary C++ implementation for input collections of 3D convex polygons and 3D convex polyhedra and conducted simple experiments with scenes of up to 12,000 polygons, which take only a few minutes to process. We present examples using spheres and convex hulls as hierarchy primitives.

  16. Modulated modularity clustering as an exploratory tool for functional genomic inference.

    Directory of Open Access Journals (Sweden)

    Eric A Stone

    2009-05-01

    Full Text Available In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference, clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.
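    The modularity objective mentioned in this record can be made concrete with a short sketch. It assumes the standard Newman form for a weighted undirected graph, Q = (1/2m) Σ_ij [A_ij − k_i k_j / (2m)] δ(c_i, c_j), and only scores a given partition; the modulation of edge strengths and the maximization that define MMC are not implemented. The six-vertex graph is a hypothetical toy example.

```python
def modularity(weights, labels):
    """Newman modularity of a partition of a weighted undirected graph.

    weights: symmetric adjacency matrix (list of lists).
    labels:  community label for each vertex.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree k_i
    two_m = sum(strength)                     # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


# Toy graph (assumption): two triangles joined by a single edge (2-3).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
good = modularity(W, [0, 0, 0, 1, 1, 1])  # one community per triangle
bad = modularity(W, [0, 1, 0, 1, 0, 1])   # communities cut across triangles
print(good, bad)
```

    The partition that follows the triangles scores higher than the one that cuts across them, which is the sense in which maximizing modularity lets tightly connected groups of vertices emerge.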

  17. An Invitation to Open Innovation in Malaria Drug Discovery: 47 Quality Starting Points from the TCAMS.

    Science.gov (United States)

    Calderón, Félix; Barros, David; Bueno, José María; Coterón, José Miguel; Fernández, Esther; Gamo, Francisco Javier; Lavandera, José Luís; León, María Luisa; Macdonald, Simon J F; Mallo, Araceli; Manzano, Pilar; Porras, Esther; Fiandor, José María; Castro, Julia

    2011-10-13

    In 2010, GlaxoSmithKline published the structures of 13533 chemical starting points for antimalarial lead identification. By using an agglomerative structural clustering technique followed by computational filters such as antimalarial activity, physicochemical properties, and dissimilarity to known antimalarial structures, we have identified 47 starting points for lead optimization. Their structures are provided. We invite potential collaborators to work with us to discover new clinical candidates.

  18. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Full Text Available Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  19. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Science.gov (United States)

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  20. Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.

    Science.gov (United States)

    Lu, Xiaoqiang; Chen, Yaxiong; Li, Xuelong

    Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep learning architectures can learn more effective image representation features. However, these methods only use semantic features to generate hash codes by shallow projection but ignore texture details. In this paper, we propose a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), which exploits a hierarchical recurrent neural network to generate effective hash codes. There are three contributions of this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information, in which we leverage hierarchical convolutional features to construct an image pyramid representation. Second, our proposed deep network can directly exploit convolutional feature maps as input to preserve the spatial structure of convolutional feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into the discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.

  1. One-step synthesis of SnCo nanoconfined in hierarchical carbon nanostructures for lithium ion battery anode.

    Science.gov (United States)

    Qin, Jian; Liu, Dongye; Zhang, Xiang; Zhao, Naiqin; Shi, Chunsheng; Liu, En-Zuo; He, Fang; Ma, Liying; Li, Qunying; Li, Jiajun; He, Chunnian

    2017-10-26

    A new strategy for the one-step synthesis of a 0D SnCo nanoparticles-1D carbon nanotubes-3D hollow carbon submicrocube cluster (denoted as SnCo@CNT-3DC) hierarchical nanostructured material was developed via a simple chemical vapor deposition (CVD) process with the assistance of a water-soluble salt (NaCl). The adopted NaCl not only acted as a cubic template for inducing the formation of the 3D hollow carbon submicrocube cluster but also provided a substrate for the SnCo catalysts impregnation and CNT growth, ultimately leading to the successful construction of the unique 0D-1D-3D structured SnCo@CNT-3DC during the CVD of C₂H₂. When utilized as a lithium-ion battery anode, the SnCo@CNT-3DC composite electrode demonstrated an excellent rate performance and cycling stability for Li-ion storage. Specifically, an impressive reversible capacity of 826 mA h g⁻¹ after 100 cycles at 0.1 A g⁻¹ and a high rate capacity of 278 mA h g⁻¹ even after 1000 cycles at 5 A g⁻¹ were achieved. This remarkable electrochemical performance could be ascribed to the unique hierarchical nanostructure of SnCo@CNT-3DC, which guarantees a deep permeation of electrolytes and a shortened lithium salt diffusion pathway in the solid phase as well as numerous hyperchannels for electron transfer.

  2. Comparison of multianalyte proficiency test results by sum of ranking differences, principal component analysis, and hierarchical cluster analysis.

    Science.gov (United States)

    Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša

    2013-10-01

    Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing of the SRD applicability contained the results reported during one of the proficiency tests (PTs) organized by EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, the SRD was also tested as a discriminant method alternative to existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores, the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using sevenfold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. The SRD technique is general in nature, i.e., it can
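
    The core SRD computation described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `ranks` and `srd` are hypothetical, and ties are assumed to be handled with average (fractional) ranks, which the abstract does not specify. The analytes are ranked once by the reference series and once by each laboratory's results, and the absolute rank differences are summed.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions
    (an assumed tie rule, chosen here for illustration only)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def srd(lab_values, reference_values):
    """Sum of absolute differences between the lab's ranks and the reference ranks."""
    lab_ranks = ranks(lab_values)
    ref_ranks = ranks(reference_values)
    return sum(abs(a - b) for a, b in zip(lab_ranks, ref_ranks))


# Hypothetical concentrations of four analytes (illustrative numbers only).
reference = [1.0, 2.0, 3.0, 4.0]
lab_a = [1.1, 2.1, 2.9, 4.2]   # same ordering as the reference
lab_b = [4.0, 3.0, 2.0, 1.0]   # fully reversed ordering
srd_a = srd(lab_a, reference)
srd_b = srd(lab_b, reference)
```

    A smaller SRD means the laboratory orders the analytes more like the reference does; the validation steps mentioned in the abstract (randomization test, cross-validation) are not covered by this sketch.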

  3. Zeolitic materials with hierarchical porous structures.

    Science.gov (United States)

    Lopez-Orozco, Sofia; Inayat, Amer; Schwab, Andreas; Selvam, Thangaraj; Schwieger, Wilhelm

    2011-06-17

    During the past several years, different kinds of hierarchically structured zeolitic materials have been synthesized due to their highly attractive properties, such as superior mass/heat transfer characteristics, lower restriction of the diffusion of reactants in the mesopores, and low pressure drop. Our contribution provides general information regarding types and preparation methods of hierarchical zeolitic materials and their relative advantages and disadvantages. Thereafter, recent advances in the preparation and characterization of hierarchical zeolitic structures within the crystallites by post-synthetic treatment methods, such as dealumination or desilication, and of structured devices by in situ and ex situ zeolite coatings on open-cellular ceramic foams as (non-reactive as well as reactive) supports are highlighted. Specific advantages of using hierarchical zeolitic catalysts/structures in selected catalytic reactions, such as benzene to phenol (BTOP) and methanol to olefins (MTO) are presented. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Processing of hierarchical syntactic structure in music.

    Science.gov (United States)

    Koelsch, Stefan; Rohrmeier, Martin; Torrecuso, Renzo; Jentschke, Sebastian

    2013-09-17

    Hierarchical structure with nested nonlocal dependencies is a key feature of human language and can be identified theoretically in most pieces of tonal music. However, previous studies have argued against the perception of such structures in music. Here, we show processing of nonlocal dependencies in music. We presented chorales by J. S. Bach and modified versions in which the hierarchical structure was rendered irregular whereas the local structure was kept intact. Brain electric responses differed between regular and irregular hierarchical structures, in both musicians and nonmusicians. This finding indicates that, when listening to music, humans apply cognitive processes that are capable of dealing with long-distance dependencies resulting from hierarchically organized syntactic structures. Our results reveal that a brain mechanism fundamental for syntactic processing is engaged during the perception of music, indicating that processing of hierarchical structure with nested nonlocal dependencies is not just a key component of human language, but a multidomain capacity of human cognition.

  5. Cluster analysis of obesity and asthma phenotypes.

    Directory of Open Access Journals (Sweden)

    E Rand Sutherland

    Full Text Available Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC). Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized an hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype. In a cohort of clinical trial participants (n = 250), minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα) and induction of MAP kinase phosphatase-1 (MKP-1) expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m²) and severity of asthma symptoms (AEQ score) the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively). Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ) and control (ACQ), exhaled nitric oxide concentration (FENO) and airway hyperresponsiveness (methacholine PC20) but were similar with regard to measures of lung function (FEV1 (%) and FEV1/FVC), airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP). Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasone. Obesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals

  6. Methodology comparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    Full Text Available The article is devoted to researching the possibilities of applying multidimensional statistical analysis in the study of industrial production on the basis of comparing its growth rates and structure with other developed and developing countries of the world. The purpose of this article is to determine the optimal set of statistical methods and the results of their application to industrial production data, which would give the best access to the analysis of the result. Data includes such indicators as output, gross value added, the number of employed and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industries of the countries of the Customs Union, the United States, Japan and Europe in 2005-2015. The research tools used were the simplest methods of transformation, graphical and tabular visualization of data, and methods of statistical analysis. In particular, based on a specialized software package (SPSS), the principal components method, discriminant analysis, hierarchical methods of cluster analysis, Ward’s method and k-means were applied. The application of the method of principal components to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. Thus, for example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing), high-tech industries and consumer goods (medium-technology sectors). At the same time, as a result of comparison of the results of application of cluster analysis to the initial data and data obtained on the basis of the principal components method, it was established that clustering industrial production data on the basis of new factors significantly improves the results of clustering. As a result of analyzing the parameters of

  7. Hierarchical Nanoceramics for Industrial Process Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Ruud, James, A.; Brosnan, Kristen, H.; Striker, Todd; Ramaswamy, Vidya; Aceto, Steven, C.; Gao, Yan; Willson, Patrick, D.; Manoharan, Mohan; Armstrong, Eric, N.; Wachsman, Eric, D.; Kao, Chi-Chang

    2011-07-15

    This project developed a robust, tunable, hierarchical nanoceramics materials platform for industrial process sensors in harsh environments. Control of material structure at multiple length scales from nano to macro increased the sensing response of the materials to combustion gases. These materials operated at relatively high temperatures, enabling detection close to the source of combustion. It is anticipated that these materials can form the basis for a new class of sensors enabling widespread use of efficient combustion processes with closed loop feedback control in the energy-intensive industries. The first phase of the project focused on materials selection and process development, leading to hierarchical nanoceramics that were evaluated for sensing performance. The second phase focused on optimizing the materials processes and microstructures, followed by validation of performance of a prototype sensor in a laboratory combustion environment. The objectives of this project were achieved by: (1) synthesizing and optimizing hierarchical nanostructures; (2) synthesizing and optimizing sensing nanomaterials; (3) integrating sensing functionality into hierarchical nanostructures; (4) demonstrating material performance in a sensing element; and (5) validating material performance in a simulated service environment. The project developed hierarchical nanoceramic electrodes for mixed potential zirconia gas sensors with increased surface area and demonstrated tailored electrocatalytic activity operable at high temperatures enabling detection of products of combustion such as NOx close to the source of combustion. Methods were developed for synthesis of hierarchical nanostructures with high, stable surface area, integrated catalytic functionality within the structures for gas sensing, and demonstrated materials performance in harsh lab and combustion gas environments.

  8. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
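
    The idea of a network built from mutual nearest neighborhoods can be illustrated with a short sketch. It covers only the graph-construction step, not the clique-based clustering of NNN itself. The function name `mutual_knn_edges`, the parameter `k`, and the use of Euclidean distance are assumptions made for illustration; the abstract does not state which similarity measure the authors use.

```python
import math


def mutual_knn_edges(points, k):
    """Return edges (i, j), i < j, where i and j are each among the other's
    k nearest neighbors under Euclidean distance (assumed metric)."""
    n = len(points)
    neighbors = []
    for i in range(n):
        # Sort all other items by distance to item i and keep the k closest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbors.append(set(others[:k]))
    return {
        (i, j)
        for i in range(n)
        for j in neighbors[i]
        if i < j and i in neighbors[j]
    }


# Hypothetical 2-D profiles: two tight pairs and one outlier.
profiles = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
edges = mutual_knn_edges(profiles, k=1)
```

    With these toy points the outlier at index 4 picks a nearest neighbor, but that neighbor does not pick it back, so it receives no edge. This mirrors the property noted in the abstract that items without sufficiently similar partners can remain unclustered.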

  9. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing, sets of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...

  10. Improved Adhesion and Compliancy of Hierarchical Fibrillar Adhesives.

    Science.gov (United States)

    Li, Yasong; Gates, Byron D; Menon, Carlo

    2015-08-05

    The gecko relies on van der Waals forces to cling onto surfaces with a variety of topography and composition. The hierarchical fibrillar structures on their climbing feet, ranging from mesoscale to nanoscale, are hypothesized to be key elements for the animal to conquer both smooth and rough surfaces. An epoxy-based artificial hierarchical fibrillar adhesive was prepared to study the influence of the hierarchical structures on the properties of a dry adhesive. The presented experiments highlight the advantages of a hierarchical structure despite a reduction of overall density and aspect ratio of nanofibrils. In contrast to an adhesive containing only nanometer-size fibrils, the hierarchical fibrillar adhesives exhibited a higher adhesion force and better compliancy when tested on an identical substrate.

  11. The state of the residential fire fatality problem in Sweden: Epidemiology, risk factors, and event typologies.

    Science.gov (United States)

    Jonsson, Anders; Bonander, Carl; Nilson, Finn; Huss, Fredrik

    2017-09-01

    Residential fires represent the largest category of fatal fires in Sweden. The purpose of this study was to describe the epidemiology of fatal residential fires in Sweden and to identify clusters of events. Data was collected from a database that combines information on fatal fires with data from forensic examinations and the Swedish Cause of Death-register. Mortality rates were calculated for different strata using population statistics and rescue service turnout reports. Cluster analysis was performed using multiple correspondence analysis with agglomerative hierarchical clustering. Male sex, old age, smoking, and alcohol were identified as risk factors, and the most common primary injury diagnosis was exposure to toxic gases. Compared to non-fatal fires, fatal residential fires more often originated in the bedroom, were more often caused by smoking, and were more likely to occur at night. Six clusters were identified. The first two clusters were both smoking-related, but were separated into (1) fatalities that often involved elderly people, usually female, whose clothes were ignited (17% of the sample), and (2) middle-aged (45-64 years old), often intoxicated men, where the fire usually originated in furniture (30%). Other clusters that were identified in the analysis were related to (3) fires caused by technical fault, started in electrical installations in single houses (13%), (4) cooking appliances left on (8%), (5) events with unknown cause, room and object of origin (25%), and (6) deliberately set fires (7%). Fatal residential fires were unevenly distributed in the Swedish population. To further reduce the incidence of fire mortality, specialized prevention efforts that focus on the different needs of each cluster are required. Cooperation between various societal functions, e.g. rescue services, elderly care, psychiatric clinics and other social services, with an application of both human and technological interventions, should reduce residential fire
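
    Agglomerative hierarchical clustering, as named in this record, can be sketched in a few lines. The abstract does not state the linkage criterion, so Ward's minimum-variance criterion is assumed here purely for illustration, and the input is assumed to be numeric coordinates (for example, scores on correspondence-analysis dimensions). The function name `ward_clustering` is hypothetical, and the naive search over all pairs is meant for small examples only.

```python
def ward_clustering(points, n_clusters):
    """Naive agglomerative clustering with Ward's criterion (assumed linkage).

    Repeatedly merges the pair of clusters whose merge gives the smallest
    increase in within-cluster sum of squares:
        cost = n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
    """
    clusters = [
        {"members": [i], "centroid": list(p), "size": 1}
        for i, p in enumerate(points)
    ]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca["centroid"], cb["centroid"]))
                cost = ca["size"] * cb["size"] / (ca["size"] + cb["size"]) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ca, cb = clusters[a], clusters[b]
        size = ca["size"] + cb["size"]
        merged = {
            "members": ca["members"] + cb["members"],
            # The merged centroid is the size-weighted mean of the two centroids.
            "centroid": [
                (ca["size"] * x + cb["size"] * y) / size
                for x, y in zip(ca["centroid"], cb["centroid"])
            ],
            "size": size,
        }
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(c["members"]) for c in clusters)


# Hypothetical coordinates: two tight pairs and one isolated event.
coords = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
groups = ward_clustering(coords, n_clusters=3)
```

    Stopping at a fixed number of clusters is a simplification; a full analysis would usually record the whole merge sequence as a dendrogram and choose the cut afterwards.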

  12. Maximum-likelihood model averaging to profile clustering of site types across discrete linear sequences.

    Directory of Open Access Journals (Sweden)

    Zhang Zhang

    2009-06-01

    Full Text Available A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.
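
    The three information criteria named in this record have standard definitions, sketched below together with Akaike-style model weights. That the paper's "weighted model likelihoods" correspond to these weights is an assumption, and all function names and numbers are hypothetical. Here `log_lik` is the maximized log-likelihood, `k` the number of free parameters, and `n` the sample size.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """Small-sample corrected AIC: AIC + 2k(k + 1) / (n - k - 1)."""
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def model_weights(scores):
    """Relative weights exp(-delta / 2), normalized to sum to 1, where delta
    is each score's difference from the best (lowest) score."""
    best = min(scores)
    rel = [math.exp(-(s - best) / 2) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical fits of three candidate clustering models to n = 50 sites.
fits = [(-120.0, 2), (-115.0, 4), (-114.5, 7)]   # (log-likelihood, parameters)
scores = [aicc(ll, k, 50) for ll, k in fits]
weights = model_weights(scores)
```

    A model-averaged per-site profile would then be the weighted sum of each model's estimate, using `weights` as the coefficients.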

  13. A novel community detection method in bipartite networks

    Science.gov (United States)

    Zhou, Cangqi; Feng, Liang; Zhao, Qianchuan

    2018-02-01

    Community structure is a common and important feature in many complex networks, including bipartite networks, which are used as a standard model for many empirical networks comprised of two types of nodes. In this paper, we propose a two-stage method for detecting community structure in bipartite networks. Firstly, we extend the widely-used Louvain algorithm to bipartite networks. The effectiveness and efficiency of the Louvain algorithm have been proved by many applications. However, a Louvain-like algorithm specially modified for bipartite networks has been lacking. Based on bipartite modularity, a measure that extends unipartite modularity and that quantifies the strength of partitions in bipartite networks, we fill the gap by developing the Bi-Louvain algorithm that iteratively groups the nodes in each part by turns. This algorithm in bipartite networks often produces a balanced network structure with equal numbers of two types of nodes. Secondly, for the balanced network yielded by the first algorithm, we use an agglomerative clustering method to further cluster the network. We demonstrate that the calculation of the gain of modularity of each aggregation, and the operation of joining two communities can be compactly calculated by matrix operations for all pairs of communities simultaneously. Finally, a complete hierarchical community structure is unfolded. We apply our method to two benchmark data sets and a large-scale data set from an e-commerce company, showing that it effectively identifies community structure in bipartite networks.
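
    The bipartite modularity that this record builds on can be sketched as below. The abstract does not give the formula, so the sketch assumes Barber's commonly used definition, Q = (1/m) Σ (B_ij − k_i d_j / m) δ(c_i, c_j), where B is the biadjacency matrix, k_i and d_j are the degrees of the two node types, and m is the number of edges. The function name and the toy network are hypothetical; the Bi-Louvain procedure itself is not reproduced.

```python
def bipartite_modularity(biadj, top_labels, bottom_labels):
    """Bipartite modularity under Barber's definition (an assumption here).

    biadj[i][j] is 1 if top node i is linked to bottom node j, else 0.
    top_labels / bottom_labels give the community of each node.
    """
    m = sum(sum(row) for row in biadj)            # total number of edges
    row_deg = [sum(row) for row in biadj]         # degrees of top nodes
    col_deg = [sum(col) for col in zip(*biadj)]   # degrees of bottom nodes
    q = 0.0
    for i, row in enumerate(biadj):
        for j, b in enumerate(row):
            if top_labels[i] == bottom_labels[j]:
                # Observed edge minus its expectation under the null model.
                q += b - row_deg[i] * col_deg[j] / m
    return q / m


# Toy network: two disconnected blocks of 2 top nodes x 2 bottom nodes.
B = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
q_split = bipartite_modularity(B, [0, 0, 1, 1], [0, 0, 1, 1])
q_single = bipartite_modularity(B, [0, 0, 0, 0], [0, 0, 0, 0])
```

    An agglomerative stage like the one described would repeatedly merge the pair of communities giving the largest gain in this quantity; the matrix formulation of that gain is specific to the paper and is not attempted here.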

  14. Oligosaccharides in feces of breast- and formula-fed babies.

    Science.gov (United States)

    Albrecht, Simone; Schols, Henk A; van Zoeren, Diny; van Lingen, Richard A; Groot Jebbink, Liesbeth J M; van den Heuvel, Ellen G H M; Voragen, Alphons G J; Gruppen, Harry

    2011-10-18

    So far, little is known on the fate of oligosaccharides in the colon of breast- and formula-fed babies. Using capillary electrophoresis with laser induced fluorescence detector coupled to a mass spectrometer (CE-LIF-MS(n)), we studied the fecal oligosaccharide profiles of 27 two-month-old breast-, formula- and mixed-fed preterm babies. The interpretation of the complex oligosaccharide profiles was facilitated by beforehand clustering the CE-LIF data points by agglomerative hierarchical clustering (AHC). In the feces of breast-fed babies, characteristic human milk oligosaccharide (HMO) profiles, showing genetic fingerprints known for human milk of secretors and non-secretors, were recognized. Alternatively, advanced degradation and bioconversion of HMOs, resulting in an accumulation of acidic HMOs or HMO bioconversion products was observed. Independent of the prebiotic supplementation of the formula with galactooligosaccharides (GOS) at the level used, similar oligosaccharide profiles of low peak abundance were obtained for formula-fed babies. Feeding influences the presence of diet-related oligosaccharides in baby feces and gastrointestinal adaptation plays an important role herein. Four fecal oligosaccharides, characterized as HexNAc-Hex-Hex, Hex-[Fuc]-HexNAc-Hex, HexNAc-[Fuc]-Hex-Hex and HexNAc-[Fuc]-Hex-HexNAc-Hex-Hex, highlighted an active gastrointestinal metabolization of the feeding-related oligosaccharides. Their presence was linked to the gastrointestinal mucus layer and the blood-group determinant oligosaccharides therein, which are characteristic for the host's genotype. Copyright © 2011 Elsevier Ltd. All rights reserved.

  15. Phenotypes of asthma in low-income children and adolescents: cluster analysis.

    Science.gov (United States)

    Cabral, Anna Lucia Barros; Sousa, Andrey Wirgues; Mendes, Felipe Augusto Rodrigues; Carvalho, Celso Ricardo Fernandes de

    2017-01-01

    Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. We included 306 children and adolescents (6-18 years of age) with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a two-step agglomerative test and log-likelihood distance measure. Three clusters were defined for our population. Cluster 1 (n = 94) included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87) included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108) included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.

  16. Shape Analysis of HII Regions - I. Statistical Clustering

    Science.gov (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  17. Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.

    Science.gov (United States)

    Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo

    2018-07-01

    Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiologies and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12 months occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Ant Colony Optimization Approaches to Clustering of Lung Nodules from CT Images

    Directory of Open Access Journals (Sweden)

    Ravichandran C. Gopalakrishnan

    2014-01-01

    Full Text Available Lung cancer is becoming a threat to mankind. Applying machine learning algorithms for detection and segmentation of irregularly shaped lung nodules remains a remarkable milestone in CT scan image analysis research. In this paper, we apply the ACO algorithm for lung nodule detection. We have compared the performance against three other algorithms, namely, the Otsu algorithm, the watershed algorithm, and global region based segmentation. In addition, we suggest a novel approach which involves variations of ACO, namely, refined ACO, logical ACO, and variant ACO. Variant ACO shows better reduction in false positives. In addition, we propose a black circular neighborhood approach to detect nodule centers from the edge detected image. Genetic algorithm based clustering is performed to cluster the nodules based on intensity, shape, and size. The performance of the overall approach is compared with hierarchical clustering to establish the improvement in the proposed approach.

  19. The Assessment of Hydrogen Energy Systems for Fuel Cell Vehicles Using Principal Component Analysis and Cluster Analysis

    DEFF Research Database (Denmark)

    Ren, Jingzheng; Tan, Shiyu; Dong, Lichun

    2012-01-01

    and analysis of the hydrogen systems is meaningful for decision makers to select the best scenario. Principal component analysis (PCA) has been used to evaluate the integrated performance of different hydrogen energy systems and select the best scenario, and hierarchical cluster analysis (CA) has been used...... for transportation of hydrogen, hydrogen gas tank for the storage of hydrogen at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles has been recognized as the best scenario. Also, the clustering results calculated by CA are consistent with those determined by PCA, denoting...

  20. Leadership styles across hierarchical levels in nursing departments.

    Science.gov (United States)

    Stordeur, S; Vandenberghe, C; D'hoore, W

    2000-01-01

    Some researchers have reported on the cascading effect of transformational leadership across hierarchical levels. One study examined this effect in nursing, but it was limited to a single hospital. To examine the cascading effect of leadership styles across hierarchical levels in a sample of nursing departments and to investigate the effect of hierarchical level on the relationships between leadership styles and various work outcomes. Based on a sample of eight hospitals, the cascading effect was tested using correlation analysis. The main sources of variation among leadership scores were determined with analyses of variance (ANOVA), and the interaction effect of hierarchical level and leadership styles on criterion variables was tested with moderated regression analysis. No support was found for a cascading effect of leadership across hierarchical levels. Rather, the variation of leadership scores was explained primarily by the organizational context. Transformational leadership had a stronger impact on criterion variables than transactional leadership. Interaction effects between leadership styles and hierarchical level were observed only for perceived unit effectiveness. The hospital's structure and culture are major determinants of leadership styles.

  1. Learning with hierarchical-deep models.

    Science.gov (United States)

    Salakhutdinov, Ruslan; Tenenbaum, Joshua B; Torralba, Antonio

    2013-08-01

    We introduce HD (or “Hierarchical-Deep”) models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian (HB) models. Specifically, we show how we can learn a hierarchical Dirichlet process (HDP) prior over the activities of the top-level features in a deep Boltzmann machine (DBM). This compound HDP-DBM model learns to learn novel concepts from very few training examples by learning low-level generic features, high-level features that capture correlations among low-level features, and a category hierarchy for sharing priors over the high-level features that are typical of different kinds of concepts. We present efficient learning and inference algorithms for the HDP-DBM model and show that it is able to learn new concepts from very few examples on CIFAR-100 object recognition, handwritten character recognition, and human motion capture datasets.

  2. Multi-wavelength study of young and massive galaxy clusters

    International Nuclear Information System (INIS)

    Lemonon, Ludovic

    1999-01-01

    Clusters of galaxies are the most massive gravitationally bound objects observed. They result from the evolution of the largest perturbations in the cosmic microwave background. Their formation depends strongly on the cosmology, so they are key objects for understanding the Universe. The aim of this thesis is to study the formation processes of clusters of galaxies at greater distances than previous studies have done, using high-resolution observations obtained with the most powerful telescope at each wavelength studied: X-ray, visible, infrared and radio. After reducing the data for 12 clusters located at 0.1 < z < 0.3, I was able to classify them into three categories: dynamically perturbed clusters, with substructures in their X-ray/optical images or in the velocity distribution of their galaxies; cooling-flow clusters, more relaxed than the former, with a huge amount of gas cooling at their center; and AGN-contaminated clusters, where the central dominant galaxy is an AGN that considerably contaminates the X-ray emission. I obtained a measurement of the baryonic fraction of the mass of the Universe and an estimate of the matter density parameter of the Universe at the megaparsec scale, which argues for a low-density universe. The ISOCAM data showed the effect of ICM interactions on star formation in cluster galaxies and demonstrated that star formation deduced from optical data and from mid-IR data are not basically compatible. They also showed how IR-emitting galaxies are distributed in clusters, most noticeably that 15 μm galaxies are preferentially located at the edges of clusters. X-ray and radio data showed that clusters at z 0.25 can be found in several dynamical states, as with nearby ones, from relaxed to severely perturbed. All clusters present signs of past or present merging, in agreement with the hierarchical structure formation scenario. This cluster database is an excellent starting point for studying the merging process in clusters, since the clusters show different aspects of this evolution. (author)

  3. Continuous Autoregulatory Indices Derived from Multi-Modal Monitoring: Each One Is Not Like the Other.

    Science.gov (United States)

    Zeiler, Frederick A; Donnelly, Joseph; Menon, David K; Smielewski, Peter; Zweifel, Christian; Brady, Ken; Czosnyka, Marek

    2017-11-15

    We assess the relationships between various continuous measures of autoregulatory capacity in a cohort of adults with traumatic brain injury (TBI). We assessed relationships between autoregulatory indices derived from intracranial pressure (ICP: PRx, PAx, RAC), transcranial Doppler (TCD: Mx, Sx, Dx), brain tissue-oxygenation (ORx), and spatially resolved near infrared spectroscopy (NIRS resolved: TOx, THx). Relationships between indices were assessed using Pearson correlation coefficient, Friedman test, principal component analysis (PCA), agglomerative hierarchical clustering (AHC) and k-means cluster analysis (KMCA). All analytic techniques were repeated for a range of temporal resolutions of data, including minute-by-minute averages, moving means of 30 samples, and grand mean for each patient. Thirty-seven patients were studied. The PRx displayed strong association with PAx/RAC across all the analytical techniques: Pearson correlation (r = 0.682/r = 0.677, p indices (Mx, Dx) were correlated and co-clustered on PCA, AHC, and KMCA. The Sx was found to be more closely associated with ICP-derived indices on Pearson correlation, PCA, AHC, and KMCA. The NIRS indices displayed variable correlation with each other and with indices derived from ICP and TCD signals. Of interest, TOx and THx co-cluster with ICP-based indices on PCA and AHC. The ORx failed to display any meaningful correlations with other indices in any of the analytical methods used. Thirty-minute moving average and minute-by-minute data sets displayed similar results across all the methods. The RAC, Mx, and Sx were the strongest predictors of outcome at six months. Continuously updating autoregulatory indices are not all correlated with one another. Caution must be advised when utilizing less commonly described autoregulation indices (i.e., ORx) for the clinical assessment of autoregulatory capacity, because they appear to not be related to commonly measured/established indices, such as PRx
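
    As a rough illustration of the first technique listed in this record, the Pearson correlation coefficient between two index time series can be computed as sketched below. The function name `pearson` and the sample values for `prx` and `pax` are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations (unnormalised covariance)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical minute-by-minute averages of two indices (illustrative only)
prx = [0.10, 0.25, 0.15, 0.40, 0.30]
pax = [0.12, 0.30, 0.14, 0.41, 0.33]
r = pearson(prx, pax)
print(round(r, 3))
```

    A value of r close to 1 would indicate that the two indices rise and fall together; it says nothing by itself about which index is the better measure.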

  4. Classification of Tropical River Using Chemometrics Technique: Case Study in Pahang River, Malaysia

    International Nuclear Information System (INIS)

    Mohd Khairul Amri Kamarudin; Mohd Ekhwan Toriman; Nur Hishaam Sulaiman

    2015-01-01

    River classification is important for characterizing the rivers in a study area, since the resulting database helps in understanding river behaviour. This article discusses river classification using chemometric techniques in the mainstream of the Pahang River. Based on river survey, GIS and remote sensing databases, chemometric analysis techniques were used to identify clusters on the Pahang River using Hierarchical Agglomerative Cluster Analysis (HACA). A calibration and validation process using Discriminant Analysis (DA) was used to confirm the HACA result, and Principal Component Analysis (PCA) was used to identify the variables with strong coefficients by which the Pahang River was classed. The results indicated that the main Pahang River was classed into three main clusters: upstream, middle stream and downstream. Based on the DA, the calibration and validation model confirmed the clusters at 100%. The PCA indicates that three variables have a significant correlation: domination slope with R² 0.796, L/D ratio with R² -0.868 and sinuosity with R² 0.557. A map of the river classification with moving class was also produced, in which green denotes the valley erosion zone, yellow the low terrace of land near the channels, and red the flood plain and valley deposition zone. From this result, basic information can be produced to understand the characteristics of the main Pahang River. This result is important for local authorities making decisions according to the clusters, and as a guideline for future studies of the Pahang River, Malaysia, specifically and of tropical rivers generally. The research findings also provide basic data as a guideline for integrated river management of the Pahang River and of tropical rivers in general. (author)

  5. Analysis of indoor air pollutants checklist using environmetric technique for health risk assessment of sick building complaint in nonindustrial workplace.

    Science.gov (United States)

    Syazwan, Ai; Rafee, B Mohd; Juahir, Hafizan; Azman, Azf; Nizar, Am; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, Aa; Yunos, Ma Syafiq; Anita, Ar; Hanafiah, J Muhamad; Shaharuddin, Ms; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, Mn Mohamad; Azizan, Hs; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, Ft

    2012-01-01

    To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated using a semiquantitative approach. Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significant high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ

  6. Review on Control of DC Microgrids and Multiple Microgrid Clusters

    DEFF Research Database (Denmark)

    Meng, Lexuan; Shafiee, Qobad; Ferrari-Trecate, Giancarlo

    2017-01-01

    This paper performs an extensive review on control schemes and architectures applied to DC microgrids. It covers multi-layer hierarchical control schemes, coordinated control strategies, plug-and-play operations, stability and active damping aspects as well as nonlinear control algorithms....... Islanding detection, protection and microgrid clusters control are also briefly summarized. All the mentioned issues are discussed with the goal of providing control design guidelines for DC microgrids. The future research challenges, from the authors’ point of view, are also provided in the final...

  7. Hierarchical nanoparticle morphology for platinum supported on SrTiO3 (0 0 1): A combined microscopy and X-ray scattering study

    International Nuclear Information System (INIS)

    Christensen, Steven T.; Lee, Byeongdu; Feng Zhenxing; Hersam, Mark C.; Bedzyk, Michael J.

    2009-01-01

    The morphology of metal nanoparticles supported on oxide substrates plays an important role in heterogeneous catalysis and in the nucleation of thin films. For platinum evaporated onto SrTiO3 (0 0 1) and vacuum annealed we find an unexpected growth formation of Pt nanoparticles that aggregate into clusters without coalescence. This hierarchical nanoparticle morphology with an enhanced surface-to-volume ratio for Pt is analyzed by grazing incidence small-angle X-ray scattering (GISAXS), X-ray fluorescence (XRF), atomic force microscopy (AFM) and high-resolution scanning electron microscopy (SEM). The nanoparticle constituents of the clusters measure 2-4 nm in size and are nearly contiguously spaced where the average edge-to-edge spacing is less than 1 nm. These particles make up the clusters, which are 10-50 nm in diameter and are spaced on the order of 100 nm apart.

  8. A hybrid clustering approach to recognition of protein families in 114 microbial genomes

    Directory of Open Access Journals (Sweden)

    Gogarten J Peter

    2004-04-01

    Background: Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function). Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families. Results: We describe a hybrid approach to sequence-based clustering of proteins that combines the advantages of standard and Markov clustering. We have implemented this hybrid approach over a relational database environment, and describe its application to clustering a large subset of PDB, and to 328577 proteins from 114 fully sequenced microbial genomes. To demonstrate utility with difficult problems, we show that hybrid clustering allows us to constitute the paralogous family of ATP synthase F1 rotary motor subunits into a single, biologically interpretable hierarchical grouping that was not accessible using either single-linkage or Markov clustering alone. We describe validation of this method by hybrid clustering of PDB and mapping SCOP families and domains onto the resulting clusters. Conclusion: Hybrid (Markov followed by single-linkage) clustering combines the advantages of the Markov Cluster algorithm (avoidance of non-specific clusters resulting from matches to promiscuous domains) and single-linkage clustering (preservation of topological information as a function of threshold). Within the individual Markov clusters, single-linkage clustering is a more-precise instrument, discerning sub-clusters of biological relevance. Our hybrid approach thus provides a computationally efficient
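
    The threshold behaviour of single-linkage clustering described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `single_linkage`, the six protein indices and the similarity `scores` are all hypothetical, and the Markov clustering stage of the hybrid method is not shown. Clusters are taken as the connected components of the graph that keeps only pairs scoring at or above the threshold.

```python
def single_linkage(n_items, scores, threshold):
    """Connected components of the graph whose edges score >= threshold."""
    parent = list(range(n_items))

    def find(x):
        # Follow parent links to the representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in range(n_items):
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())

# Hypothetical pairwise similarity scores for proteins 0..5.
# The weak (2, 3) match stands in for a hit to a promiscuous domain.
scores = {(0, 1): 90, (1, 2): 85, (3, 4): 88, (4, 5): 80, (2, 3): 40}

for threshold in (85, 60, 30):
    print(threshold, single_linkage(6, scores, threshold))
# prints:
# 85 [[0, 1, 2], [3, 4], [5]]
# 60 [[0, 1, 2], [3, 4, 5]]
# 30 [[0, 1, 2, 3, 4, 5]]
```

    In this toy input, lowering the threshold far enough to complete the second family (protein 5) is still safe at 60, but at 30 the single weak link merges the two unrelated families, which is the non-specificity the record attributes to single-linkage clustering used alone.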

  9. THE XMM CLUSTER SURVEY: THE BUILD-UP OF STELLAR MASS IN BRIGHTEST CLUSTER GALAXIES AT HIGH REDSHIFT

    International Nuclear Information System (INIS)

    Stott, J. P.; Collins, C. A.; Hilton, M.; Capozzi, D.; Sahlen, M.; Lloyd-Davies, E.; Hosmer, M.; Liddle, A. R.; Mehrtens, N.; Romer, A. K.; Miller, C. J.; Stanford, S. A.; Viana, P. T. P.; Davidson, M.; Hoyle, B.; Kay, S. T.; Nichol, R. C.

    2010-01-01

    We present deep J- and Ks-band photometry of 20 high-redshift galaxy clusters between z = 0.8 and 1.5, 19 of which are observed with the MOIRCS instrument on the Subaru telescope. By using near-infrared light as a proxy for stellar mass we find the surprising result that the average stellar mass of Brightest Cluster Galaxies (BCGs) has remained constant at ∼9 × 10^11 M_sun since z ∼ 1.5. We investigate the effect on this result of differing star formation histories generated by three well-known and independent stellar population codes and find it to be robust for reasonable, physically motivated choices of age and metallicity. By performing Monte Carlo simulations we find that the result is unaffected by any correlation between BCG mass and cluster mass in either the observed or model clusters. The large stellar masses imply that the assemblage of these galaxies took place at the same time as the initial burst of star formation. This result leads us to conclude that dry merging has had little effect on the average stellar mass of BCGs over the last 9-10 Gyr in stark contrast to the predictions of semi-analytic models, based on the hierarchical merging of dark matter halos, which predict a more protracted mass build-up over a Hubble time. However, we discuss that there is potential for reconciliation between observation and theory if there is a significant growth of material in the intracluster light over the same period.

  10. Hierarchical analysis of acceptable use policies

    Directory of Open Access Journals (Sweden)

    P. A. Laughton

    2008-01-01

    Acceptable use policies (AUPs) are vital tools for organizations to protect themselves and their employees from misuse of the computer facilities provided. A well-structured, thorough AUP is essential for any organization. It is impossible for an effective AUP to deal with every clause and remain readable. For this reason, some sections of an AUP carry more weight than others, denoting importance. The methodology used to develop the hierarchical analysis is a literature review, where various sources were consulted. This hierarchical approach to AUP analysis attempts to highlight important sections and clauses dealt with in an AUP. The emphasis of the hierarchical analysis is to prioritize the objectives of an AUP.

  11. Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Gulnaz Ahmed

    2017-02-01

    A longer network lifetime for Wireless Sensor Networks (WSNs) is a goal directly related to energy consumption. The energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. A hierarchical clustering architecture is the best choice for this kind of issue. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs) selection for WSNs. In our proposed model, we introduce a simple strategy for selecting the optimal number of cluster heads to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulations using five quality measures, namely: the lifetime of the network, stable and unstable region in the lifetime of the network, throughput of the network, the number of cluster heads in the network, and the transmission time of the network to analyze the proposed model. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED) clustering, Artificial Bee Colony (ABC), Zone Based Routing (ZBR), and Centralized Energy Efficient Clustering (CEEC) using the above-discussed quality metrics and find that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps) greater than SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and network throughput.

  12. Robust fiber clustering of cerebral fiber bundles in white matter

    Science.gov (United States)

    Yao, Xufeng; Wang, Yongxiong; Zhuang, Songlin

    2014-11-01

    Diffusion tensor imaging fiber tracking (DTI-FT) has been widely accepted in the diagnosis and treatment of brain diseases. During the rendering pipeline of specific fiber tracts, the image noise and low resolution of DTI can lead to false propagations. In this paper, we propose a robust fiber clustering (FC) approach to remove false fibers from a fiber tract. Our algorithm consists of three steps. First, the optimized fiber assignment continuous tracking (FACT) is implemented to reconstruct one fiber tract; then each curved fiber in the fiber tract is mapped to a point by kernel principal component analysis (KPCA); finally, the point cloud of the fiber tract is clustered by hierarchical clustering, which can distinguish false fibers from true fibers in one tract. In our experiment, the corticospinal tract (CST) in one case of human data in vivo was used to validate our method. Our method showed reliable capability in decreasing the false fibers in one tract. In conclusion, our method could effectively optimize the visualization of fiber bundles and would be of considerable help in fiber evaluation.

  13. Cluster analysis and its application to healthcare claims data: a study of end-stage renal disease patients who initiated hemodialysis.

    Science.gov (United States)

    Liao, Minlei; Li, Yunfeng; Kianifard, Farid; Obi, Engels; Arcona, Stephen

    2016-03-02

    Cluster analysis (CA) is a frequently used applied statistical technique that helps to reveal hidden structures and "clusters" found in large data sets. However, this method has not been widely used in large healthcare claims databases where the distribution of expenditure data is commonly severely skewed. The purpose of this study was to identify cost change patterns of patients with end-stage renal disease (ESRD) who initiated hemodialysis (HD) by applying different clustering methods. A retrospective, cross-sectional, observational study was conducted using the Truven Health MarketScan® Research Databases. Patients aged ≥18 years with ≥2 ESRD diagnoses who initiated HD between 2008 and 2010 were included. The K-means CA method and hierarchical CA with various linkage methods were applied to all-cause costs within baseline (12-months pre-HD) and follow-up periods (12-months post-HD) to identify clusters. Demographic, clinical, and cost information was extracted from both periods, and then examined by cluster. A total of 18,380 patients were identified. Meaningful all-cause cost clusters were generated using K-means CA and hierarchical CA with either flexible beta or Ward's methods. Based on cluster sample sizes and change of cost patterns, the K-means CA method and 4 clusters were selected: Cluster 1: Average to High (n = 113); Cluster 2: Very High to High (n = 89); Cluster 3: Average to Average (n = 16,624); or Cluster 4: Increasing Costs, High at Both Points (n = 1554). Median cost changes in the 12-month pre-HD and post-HD periods increased from $185,070 to $884,605 for Cluster 1 (Average to High), decreased from $910,930 to $157,997 for Cluster 2 (Very High to High), were relatively stable and remained low from $15,168 to $13,026 for Cluster 3 (Average to Average), and increased from $57,909 to $193,140 for Cluster 4 (Increasing Costs, High at Both Points). Relatively stable costs after starting HD were associated with more stable scores
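
    To make the K-means step in this record concrete, the sketch below runs plain Lloyd's algorithm on paired pre-HD and post-HD costs. It is an illustration under stated assumptions, not the study's procedure: the `costs` values are invented, the log10 scaling is an assumption made here to tame the skew the record mentions, and the starting centers are supplied by hand so that the run is deterministic.

```python
import math

def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm with caller-supplied starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its members; keep it if the group is empty
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
              for p in points]
    return labels, centers

# Hypothetical (pre-HD, post-HD) annual costs in dollars; not data from the study
costs = [(15000, 13000), (16000, 14000), (14000, 12500),
         (180000, 880000), (190000, 900000),
         (900000, 160000), (920000, 150000)]

# Assumption: cluster on log10 costs so the largest values do not dominate
points = [(math.log10(pre), math.log10(post)) for pre, post in costs]

labels, centers = kmeans(points, centers=[points[0], points[3], points[5]])
print(labels)  # prints [0, 0, 0, 1, 1, 2, 2]
```

    Each label then corresponds to a cost-change pattern (stable low, rising, falling) that can be profiled separately, in the spirit of examining demographic and clinical information by cluster.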

  14. Deciphering complex patterns of class-I HLA-peptide cross-reactivity via hierarchical grouping.

    Science.gov (United States)

    Mukherjee, Sumanta; Warwicker, Jim; Chandra, Nagasuma

    2015-07-01

    T-cell responses in humans are initiated by the binding of a peptide antigen to a human leukocyte antigen (HLA) molecule. The peptide-HLA complex then recruits an appropriate T cell, leading to cell-mediated immunity. More than 2000 HLA class-I alleles are known in humans, and they vary only in their peptide-binding grooves. The polymorphism they exhibit enables them to bind a wide range of peptide antigens from diverse sources. HLA molecules and peptides present a complex molecular recognition pattern, as many peptides bind to a given allele and a given peptide can be recognized by many alleles. A powerful grouping scheme that not only provides an insightful classification, but is also capable of dissecting the physicochemical basis of recognition specificity is necessary to address this complexity. We present a hierarchical classification of 2010 class-I alleles by using a systematic divisive clustering method. All-pair distances of alleles were obtained by comparing binding pockets in the structural models. By varying the similarity thresholds, a multilevel classification was obtained, with 7 supergroups, each further subclassifying to yield 72 groups. An independent clustering performed based only on similarities in their epitope pools correlated highly with pocket-based clustering. Physicochemical feature combinations that best explain the basis of clustering are identified. Mutual information calculated for the set of peptide ligands enables identification of binding site residues contributing to peptide specificity. The grouping of HLA molecules achieved here will be useful for rational vaccine design, understanding disease susceptibilities and predicting risk of organ transplants.

  15. Virtual timers in hierarchical real-time systems

    NARCIS (Netherlands)

    Heuvel, van den M.M.H.P.; Holenderski, M.J.; Cools, W.A.; Bril, R.J.; Lukkien, J.J.; Zhu, D.

    2009-01-01

    Hierarchical scheduling frameworks (HSFs) provide means for composing complex real-time systems from well-defined subsystems. This paper describes an approach to provide hierarchically scheduled real-time applications with virtual event timers, motivated by the need for integrating priority

  16. Order from the disorder: hierarchical nanostructures self-assembled from the gas phase (Conference Presentation)

    Science.gov (United States)

    Di Fonzo, Fabio

    2017-02-01

    The assembly of nanoscale building blocks in engineered mesostructures is one of the fundamental goals of nanotechnology. Among the various processes developed to date, self-assembly emerges as one of the most promising, since it relies solely on basic physico-chemical forces. Our research is focused on a new type of self-assembly strategy from the gas phase: Scattered Ballistic Deposition (SBD). SBD arises from the interaction of a supersonic molecular beam with a static gas and enables the growth of quasi-1D hierarchical mesostructures. Overall, they resemble a forest composed of individual, high aspect-ratio, tree-like structures, assembled from amorphous or crystalline nanoparticles. SBD is a generally occurring phenomenon and can be obtained with different vapour or cluster sources. In particular, SBD by Pulsed Laser Deposition (PLD) is a convenient physical vapor technique that allows the generation of supersonic plasma jets from any inorganic material irrespective of melting temperature, preserving even the most complex stoichiometries. One of the advantages of PLD over other vapour deposition techniques is its extremely wide operational pressure range, from UHV to ambient pressure. These characteristics allowed us to develop quasi-1D hierarchical nanostructures from different transition metal oxides, semiconductors and metals. The precise control offered by the SBD-PLD technique over material properties at the nanoscale allowed us to fabricate ultra-thin, high efficiency hierarchical porous photonic crystals with Bragg reflectivity up to 85%. In this communication we will discuss the application of these materials to solar energy harvesting and storage, stimuli responsive photonic crystals and smart surfaces with digital control of their wettability behaviour.

  17. The identification of high potential archers based on fitness and motor ability variables: A Support Vector Machine approach.

    Science.gov (United States)

    Taha, Zahari; Musa, Rabiu Muazu; P P Abdul Majeed, Anwar; Alim, Muhammad Muaz; Abdullah, Mohamad Razali

    2018-02-01

    Support Vector Machine (SVM) has been shown to be an effective learning algorithm for classification and prediction. However, SVM has rarely been applied in specific sports to discriminate between low- and high-performance athletes. The present study classified and predicted high- and low-potential archers from a set of fitness and motor ability variables using SVMs trained with different kernel functions. Fifty youth archers with a mean age of 17.0 ± 0.6 years (mean ± standard deviation), drawn from various archery programmes, completed a six-arrow shooting score test. Standard fitness and ability measurements, namely hand grip, vertical jump, standing broad jump, static balance, upper muscle strength and core muscle strength, were also recorded. Hierarchical agglomerative cluster analysis (HACA) was used to cluster the archers based on the performance variables tested. SVM models with linear, quadratic, cubic, fine RBF, medium RBF and coarse RBF kernel functions were trained on the measured performance variables. The HACA clustered the archers into high-potential archers (HPA) and low-potential archers (LPA). The linear, quadratic, cubic and medium RBF kernel models demonstrated a classification accuracy of 97.5% (2.5% error rate) for the prediction of the HPA and the LPA. The findings of this investigation can be valuable to coaches and sports managers for recognising high-potential athletes from a small set of measured fitness and motor ability variables, which would consequently save cost, time and effort during talent identification programmes. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Cluster Analysis as an Analytical Tool of Population Policy

    Directory of Open Access Journals (Sweden)

    Oksana Mikhaylovna Shubat

    2017-12-01

    The predicted negative trends in Russian demography (falling birth rates, population decline) make it more urgent to strengthen family and population policy measures. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance to segment Russian regions. Clustering is based on four variables that allow the family institution in a region to be assessed. The authors used data of the Federal State Statistics Service from 2010 to 2015. Clustering and profiling of each segment made it possible to form a model of Russian regions depending on the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures for each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution across the regions. This suggests that a unified approach to solving population problems is far from effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could adapt multidimensional classification methods to the analysis of population problems in Russian regions. In particular, the algorithms of nonparametric cluster analysis may be of relevance in future studies.
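
    The Ward method with Euclidean distance mentioned in this record can be sketched in a few lines. This is a naive illustration, not the authors' code: the function names and the two-variable `regions` data are hypothetical (the study used four variables). At each step the pair of clusters whose merger least increases the within-cluster sum of squares is joined; for clusters A and B with centroids c_A and c_B that increase is |A||B| / (|A| + |B|) times the squared Euclidean distance between c_A and c_B.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return [sum(axis) / len(points) for axis in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance

def ward_clusters(points, k):
    """Agglomerate singleton clusters until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # combinations yields i < j, so deleting index j leaves index i valid
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical standardized indicators for six regions (two variables each)
regions = [(0.1, 0.2), (0.0, 0.3), (2.0, 2.1),
           (2.2, 1.9), (-2.0, 2.0), (-2.1, 2.2)]

groups = ward_clusters(regions, k=3)
print([len(g) for g in groups])  # prints [2, 2, 2]
```

    This exhaustive search is cubic in the number of points and is only meant to show the merge criterion; a library routine would normally be used for real data.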

  19. Criteria for forming asset portfolios by means of hierarchical clusters

    Directory of Open Access Journals (Sweden)

    Pierre Lucena

    2010-04-01

    The main objective of this article is to present and test a multivariate statistics tool in financial models. This methodology, known as cluster analysis, separates observations into groups according to their characteristics, in contrast to the traditional methodology, which simply orders them by quantiles. The tool was applied to 213 stocks traded on the São Paulo Stock Exchange (Bovespa), with groups separated by size and book-to-market. The new portfolios were then applied to the Fama and French (1996) model, comparing the results of portfolio formation by quantile and by cluster analysis. Better results were found with the second methodology. The authors conclude that cluster analysis may be more appropriate because it tends to form more homogeneous groups, making its application useful for portfolio formation and for financial theory.

  20. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, their application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology and environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics. Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and dat