WorldWideScience

Sample records for hierarchical clustering methods

  1. Hybridization of Fuzzy Clustering and Hierarchical Method for Link Discovery

    Directory of Open Access Journals (Sweden)

    Enseih Davoodi Jam

    2012-11-01

    Clustering is an active research topic in data mining and different methods have been proposed in the literature. Most of these methods are based on numerical attributes. Recently, there have been several proposals to develop clustering methods that support mixed attributes. There are three basic groups of clustering methods: partitional methods, hierarchical methods and density-based methods. This paper proposes a hybrid clustering algorithm that combines the advantages of hierarchical clustering and fuzzy clustering techniques and considers mixed attributes. The proposed algorithm improves the fuzzy algorithm by making it less dependent on initial parameters such as randomly chosen initial cluster centers, and it can determine the number of clusters based on the complexity of the cluster structure. Our approach is organized in two phases: first, the division of the data into two clusters; then the determination of the worst cluster and its splitting. The number of clusters is unknown in advance, but the algorithm can find this parameter based on the complexity of the cluster structure. We demonstrate the effectiveness of the clustering approach by evaluating datasets of linked data. We applied the proposed algorithm to three different datasets. Experimental results show that the proposed algorithm is suitable for link discovery between datasets of linked data. Clustering can decrease the number of comparisons before link discovery.
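
    The two-phase scheme in this abstract (split the data in two, then repeatedly pick the worst cluster and split it) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: a plain 2-means split stands in for the fuzzy step, "worst" is assumed to mean the largest within-cluster sum of squared errors, the stopping rule is a hypothetical max_clusters argument rather than the paper's structure-based criterion, and the sample points are invented.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def sse(points):
    """Within-cluster sum of squared errors (assumed 'badness' measure)."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)

def two_means(points, iters=20):
    """Split one cluster in two with a deterministic 2-means (stand-in for the fuzzy step)."""
    c = centroid(points)
    a = max(points, key=lambda p: sq_dist(p, c))   # farthest from the centroid
    b = max(points, key=lambda p: sq_dist(p, a))   # farthest from that point
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, a) <= sq_dist(p, b)]
        right = [p for p in points if sq_dist(p, a) > sq_dist(p, b)]
        if not left or not right:
            break
        new_a, new_b = centroid(left), centroid(right)
        if new_a == a and new_b == b:
            break
        a, b = new_a, new_b
    return left, right

def bisecting_clusters(points, max_clusters):
    """Phase 1: start from one cluster. Phase 2: keep splitting the worst cluster."""
    clusters = [list(points)]
    while len(clusters) < max_clusters:
        worst = max(clusters, key=sse)
        if sse(worst) == 0.0:          # nothing left to split
            break
        left, right = two_means(worst)
        if not left or not right:
            break
        clusters.remove(worst)
        clusters.extend([left, right])
    return clusters

# Invented sample data with three visually separate groups.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
result = sorted(sorted(c) for c in bisecting_clusters(sample, 3))
print(result)
```

    Because each split only looks at one cluster, the initial cluster centers matter less than in a single flat run, which is the dependence the abstract says the hybrid approach reduces.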

  2. Hierarchical clustering for graph visualization

    CERN Document Server

    Clémençon, Stéphan; Rossi, Fabrice; Tran, Viet Chi

    2012-01-01

    This paper describes a graph visualization methodology based on hierarchical maximal modularity clustering, with interactive and significant coarsening and refining possibilities. An application of this method to HIV epidemic analysis in Cuba is outlined.

  3. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    Non-Hierarchical Clustering as a method to analyse an open-ended questionnaire on algebraic thinking. ... Student responses to written questions and multiple-choice tests have been characterised and studied using several qualitative and/or quantitative analysis methods. However, there are inherent difficulties in the ...

  4. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    South African Journal of Education, Volume 36, Number 1, February 2016. 1. Art. # 1142, 13 pages, doi: 10.15700/saje.v36n1a1142. Non-Hierarchical Clustering as a method to analyse an open-ended questionnaire on algebraic thinking. Benedetto Di Paola. Department of Mathematics and Informatics, University of ...

  5. Statistical significance for hierarchical clustering.

    Science.gov (United States)

    Kimes, Patrick K; Liu, Yufeng; Neil Hayes, David; Marron, James Stephen

    2017-09-01

    Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high-dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this article, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. © 2017, The International Biometric Society.

  6. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Science.gov (United States)

    Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio-hydro-atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen-Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of
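
    The combination this record reports as best-performing (z-score or range normalisation followed by Ward linkage) can be sketched in a few lines. This is a minimal, naive illustration and not the authors' analysis package: the function names and the six "particle" rows (size, fluorescence) are invented, and the cubic-time merge loop would not scale to the data set sizes mentioned in the abstract.

```python
import statistics
from itertools import combinations

def zscore(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
            for row in rows]

def range_scale(rows):
    """Scale each column to the interval [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / sp if sp > 0 else 0.0 for v, lo, sp in zip(row, lows, spans)]
            for row in rows]

def centroid(rows, members):
    """Mean position of the rows whose indices are in `members`."""
    return [statistics.fmean(rows[i][d] for i in members) for d in range(len(rows[0]))]

def ward_cost(rows, a, b):
    """Increase in within-cluster sum of squares caused by merging clusters a and b."""
    ca, cb = centroid(rows, a), centroid(rows, b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * gap

def ward_clusters(rows, k):
    """Agglomerate singletons, always merging the pair with the smallest Ward cost."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: ward_cost(rows, *pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)

# Invented rows: (optical size, fluorescence intensity) on very different scales.
particles = [(1.0, 100), (1.2, 110), (0.9, 95), (3.0, 900), (3.2, 950), (2.9, 1000)]
by_zscore = ward_clusters(zscore(particles), 2)
by_range = ward_clusters(range_scale(particles), 2)
print(by_zscore, by_range)
```

    Normalising first keeps a column with large raw values, such as the fluorescence column here, from dominating the squared distances that Ward linkage minimises.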

  7. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the

  8. Hierarchical and Non-Hierarchical Linear and Non-Linear Clustering Methods to “Shakespeare Authorship Question”

    Directory of Open Access Journals (Sweden)

    Refat Aljumily

    2015-09-01

    A few literary scholars have long claimed that Shakespeare did not write some of his best plays (history plays and tragedies) and have proposed, at one time or another, various candidate authors. Most modern-day scholars of Shakespeare have rejected this claim, arguing that there is strong evidence that Shakespeare wrote the plays and poems, since his name appears on them as the author. This has led to a long-running scholarly debate. Stylometry is a fast-growing field often used to attribute authorship to anonymous or disputed texts. Stylometric attempts to resolve this literary puzzle have raised interesting questions over the past few years. The following paper contributes to “the Shakespeare authorship question” by using a mathematically based methodology to examine the hypothesis that Shakespeare wrote all the disputed plays traditionally attributed to him. More specifically, the methodology used here is based on Mean Proximity, as a linear hierarchical clustering method, and on Principal Components Analysis, as a non-hierarchical linear clustering method. It is also based, for the first time in the domain, on Self-Organizing Map U-Matrix and Voronoi Map, as non-linear clustering methods, to cover the possibility that our data contains significant non-linearities. The Vector Space Model (VSM) is used to convert texts into vectors in a high-dimensional space. The aim is to compare the degrees of similarity within and between limited samples of text (the disputed plays): the various works and plays assumed to have been written by Shakespeare and by possible authors, notably Sir Francis Bacon, Christopher Marlowe, John Fletcher, and Thomas Kyd, where “similarity” is defined in terms of a correlation/distance coefficient measure based on the frequency-of-usage profiles of function words, word bi-grams, and character triple-grams. The claim that Shakespeare authored all the disputed

  9. Prioritizing the risk of plant pests by clustering methods; self-organising maps, k-means and hierarchical clustering

    Directory of Open Access Journals (Sweden)

    Susan Worner

    2013-09-01

    For greater preparedness, pest risk assessors are required to prioritise long lists of pest species with potential to establish and cause significant impact in an endangered area. Such prioritization is often qualitative, subjective, and sometimes biased, relying mostly on expert and stakeholder consultation. In recent years, cluster based analyses have been used to investigate regional pest species assemblages or pest profiles to indicate the risk of new organism establishment. Such an approach is based on the premise that the co-occurrence of well-known global invasive pest species in a region is not random, and that the pest species profile or assemblage integrates complex functional relationships that are difficult to tease apart. In other words, the assemblage can help identify and prioritise species that pose a threat in a target region. A computational intelligence method called a Kohonen self-organizing map (SOM), a type of artificial neural network, was the first clustering method applied to analyse assemblages of invasive pests. The SOM is a well known dimension reduction and visualization method especially useful for high dimensional data that more conventional clustering methods may not analyse suitably. Like all clustering algorithms, the SOM can give details of clusters that identify regions with similar pest assemblages, possible donor and recipient regions. More importantly, however, SOM connection weights that result from the analysis can be used to rank the strength of association of each species within each regional assemblage. Species with high weights that are not already established in the target region are identified as high risk. However, the SOM analysis is only the first step in a process to assess risk to be used alongside or incorporated within other measures. Here we illustrate the application of SOM analyses in a range of contexts in invasive species risk assessment, and discuss other clustering methods such as k

  10. Using hierarchical clustering methods to classify motor activities of COPD patients from wearable sensor data

    Directory of Open Access Journals (Sweden)

    Reilly John J

    2005-06-01

    Background: Advances in miniature sensor technology have led to the development of wearable systems that allow one to monitor motor activities in the field. A variety of classifiers have been proposed in the past, but little has been done toward developing systematic approaches to assess the feasibility of discriminating the motor tasks of interest and to guide the choice of the classifier architecture. Methods: A technique is introduced to address this problem according to a hierarchical framework and its use is demonstrated for the application of detecting motor activities in patients with chronic obstructive pulmonary disease (COPD) undergoing pulmonary rehabilitation. Accelerometers were used to collect data for 10 different classes of activity. Features were extracted to capture essential properties of the data set and reduce the dimensionality of the problem at hand. Cluster measures were utilized to find natural groupings in the data set and then construct a hierarchy of the relationships between clusters to guide the process of merging clusters that are too similar to distinguish reliably. It provides a means to assess whether the benefits of merging for performance of a classifier outweigh the loss of resolution incurred through merging. Results: Analysis of the COPD data set demonstrated that motor tasks related to ambulation can be reliably discriminated from tasks performed in a seated position with the legs in motion or stationary using two features derived from one accelerometer. Classifying motor tasks within the category of activities related to ambulation requires more advanced techniques. While in certain cases all the tasks could be accurately classified, in others merging clusters associated with different motor tasks was necessary. When merging clusters, it was found that the proposed method could lead to more than 12% improvement in classifier accuracy while retaining resolution of 4 tasks. Conclusion: Hierarchical

  11. Agglomerative hierarchical cluster method to analyze landslide displacements and assess risk scenarios

    Science.gov (United States)

    Bossi, Giulia; Crema, Stefano; Mantovani, Matteo; Schenato, Luca; Cavalli, Marco; Marcato, Gianluca; Frigerio, Simone; Pasuto, Alessandro

    2017-04-01

    In the Rotolon catchment (eastern Italian Alps) a large Deep-seated Gravitational Slope Deformation (DGSD) induces secondary phenomena that are threatening the local population. In 2010 a mass of 340,000 m³ detached from the frontal part of the DGSD and then flowed into the draining channel in the form of a debris flow, damaging a bridge and almost overflowing, endangering the houses located 3 km downstream. For this reason, an Automated Total Station (ATS) was installed in 2012 to monitor surface displacements so as to identify the most active regions of the slope, in order to estimate the volume of material that could be mobilized in the next paroxysmal event and to assess the related risk. 42 benchmarks (5 stable control points and 37 on the active slope) have been monitored for two periods: the first of 22 months between 2012 and 2014 and the second of 12 months between 2015 and 2016. By analyzing the time series of displacements with the agglomerative hierarchical cluster method, calculated with a simple single linkage algorithm, groups of similarly moving benchmarks have been clustered. For these groups the trend of acceleration and deceleration of displacements follows similar patterns. Even though the methodology does not take into account the position of the benchmarks, matching patterns are found in contiguous benchmarks within the groups, thus confirming the effectiveness of the approach. The possibility to identify areas with homogeneous behavior is fundamental to delineate the volume of possible new debris flow phenomena and therefore to produce reliable risk scenarios.
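
    As a rough illustration of single-linkage grouping of displacement time series, the sketch below uses the fact that cutting a single-linkage hierarchy at a distance threshold gives the connected components of the graph linking every pair of series closer than that threshold. The benchmark names, displacement values, Euclidean distance and threshold are all assumptions made for the example; the abstract does not state which distance or cut level the authors used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long displacement series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage_groups(series, threshold):
    """Groups obtained by cutting a single-linkage hierarchy at `threshold`."""
    names = list(series)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if euclidean(series[a], series[b]) <= threshold:
                parent[find(a)] = find(b)   # link the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())

# Invented cumulative displacements (arbitrary units) at four survey dates.
benchmarks = {
    "B01": [0.0, 2.0, 5.0, 9.0],
    "B02": [0.0, 2.5, 5.5, 9.5],
    "B03": [0.0, 0.1, 0.1, 0.2],
    "B04": [0.0, 0.0, 0.2, 0.1],
    "B05": [0.0, 3.0, 6.0, 10.0],
}
groups = single_linkage_groups(benchmarks, threshold=1.0)
print(groups)
```

    In this toy data B01 and B05 are farther apart than the threshold yet land in the same group because both are close to B02. That chaining behaviour is characteristic of single linkage and is worth keeping in mind when interpreting groups of benchmarks.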

  12. MtHc: a motif-based hierarchical method for clustering massive 16S rRNA sequences into OTUs.

    Science.gov (United States)

    Wei, Ze-Gang; Zhang, Shao-Wu

    2015-07-01

    The recent sequencing revolution driven by high-throughput technologies has led to rapid accumulation of 16S rRNA sequences for microbial communities. Clustering short sequences into operational taxonomic units (OTUs) is a crucial initial step in analyzing metagenomic data. Although many methods have been proposed for OTU inference, a major challenge is the balance between inference accuracy and computational efficiency. To address this challenge, we present a novel motif-based hierarchical method (namely MtHc) for clustering massive 16S rRNA sequences into OTUs with high clustering accuracy and low memory usage. Suppose all the 16S rRNA sequences are used to construct a complete weighted network, where sequences are viewed as nodes, each pair of sequences is connected by an imaginary edge, and the distance between a pair of sequences represents the weight of the edge. MtHc consists of three main phases. First, heuristically search for a motif, defined as an n-node sub-graph (in the present study, n = 3, 4, 5) in which the distance between any two nodes is less than a threshold. Second, use the motif as a seed to form candidate clusters by computing the distances of other sequences to the motif. Finally, hierarchically merge the candidate clusters to generate the OTUs by calculating only the distances between the motifs of two clusters. Compared with existing methods on several simulated and real-life metagenomic datasets, we demonstrate that MtHc has higher clustering performance, lower memory usage and robustness to parameter settings, and that it handles large-scale metagenomic datasets more effectively. The MtHc software is freely available to academic users.
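
    The first two phases described in this abstract (find an n-node motif whose pairwise distances are all below a threshold, then grow a candidate cluster around it) might look like the sketch below. It is a toy under several assumptions and not the MtHc implementation: the distance is a simple mismatch fraction over equal-length strings, the search is a greedy scan, a sequence joins the cluster only if it is within the threshold of every motif node, and the sequences are invented.

```python
def mismatch_fraction(a, b):
    """Fraction of positions that differ (assumes equal-length, pre-aligned sequences)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def find_motif(seqs, n, threshold):
    """Greedy search for n sequence indices whose pairwise distances are all < threshold."""
    for start in range(len(seqs)):
        motif = [start]
        for cand in range(len(seqs)):
            if cand in motif:
                continue
            if all(mismatch_fraction(seqs[cand], seqs[m]) < threshold for m in motif):
                motif.append(cand)
                if len(motif) == n:
                    return motif
    return None   # no motif of the requested size was found

def grow_cluster(seqs, motif, threshold):
    """Candidate cluster: the motif plus every sequence close to all motif nodes."""
    return [i for i in range(len(seqs))
            if i in motif
            or all(mismatch_fraction(seqs[i], seqs[m]) < threshold for m in motif)]

# Invented 8-base reads; real 16S rRNA fragments are far longer.
reads = ["ACGTACGT", "ACGTACGA", "ACGTACCT", "TTTTGGGG", "ACGAACGT", "TTTTGGGA"]
motif = find_motif(reads, n=3, threshold=0.3)
cluster = grow_cluster(reads, motif, threshold=0.3)
print(motif, cluster)
```

    Comparing a new sequence against a handful of motif nodes rather than against every cluster member is what keeps the number of distance computations small, which matches the memory and efficiency goal stated in the abstract.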

  13. Convex Clustering: An Attractive Alternative to Hierarchical Clustering

    Science.gov (United States)

    Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

    2015-01-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/. PMID: 25965340

  14. Convex clustering: an attractive alternative to hierarchical clustering.

    Directory of Open Access Journals (Sweden)

    Gary K Chen

    2015-05-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/.

  15. Convex clustering: an attractive alternative to hierarchical clustering.

    Science.gov (United States)

    Chen, Gary K; Chi, Eric C; Ranola, John Michael O; Lange, Kenneth

    2015-05-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/.

  16. Hierarchical clustering of RGB surface water images based on MIA ...

    African Journals Online (AJOL)

    Thus characterised images were partitioned into clusters of similar images using hierarchical clustering. The best defined clusters were obtained when the Ward's method was applied. Images were partitioned into the 2 main clusters in terms of similar colours of displayed objects. Each main cluster was further partitioned ...

  17. Hierarchical Control for Multiple DC Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    This paper presents a distributed hierarchical control framework to ensure reliable operation of dc Microgrid (MG) clusters. In this hierarchy, primary control is used to regulate the common bus voltage inside each MG locally. An adaptive droop method is proposed for this level which determines... Another distributed policy is then employed to regulate the power flow among the MGs according to their local SOCs. The proposed distributed controllers on each MG communicate with only the neighbor MGs through a communication infrastructure. Finally, the small signal model is expanded for dc MG clusters...

  18. Tanzania: A Hierarchical Cluster Analysis Approach | Ngaruko ...

    African Journals Online (AJOL)

    Using survey data from Kibondo district, west Tanzania, we use hierarchical cluster analysis to classify borrower farmers according to their borrowing behaviour into four distinctive clusters. The appreciation of the existence of heterogeneous farmer clusters is vital in forging credit delivery policies that are not only ...

  19. A comparison of hierarchical cluster analysis and league table rankings as methods for analysis and presentation of district health system performance data in Uganda.

    Science.gov (United States)

    Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart

    2016-03-01

    In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis was performed using SPSS version 20, and hierarchical cluster analysis used Ward's method. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include perceived unfairness, as it did not take into consideration district peculiarities, and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the beginning point for identifying factors behind the observed performance of districts. Although league table rankings emphasize summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind observed performance of the different clusters. Other countries, especially low-income countries that share many similarities with Uganda, can learn from these experiences. © The Author 2015

  20. Graphical Evaluation of Hierarchical Clustering Schemes. Technical Report No. 1.

    Science.gov (United States)

    Halff, Henry M.

    Graphical methods for evaluating the fit of Johnson's hierarchical clustering schemes are presented together with an example. These evaluation methods examine the extent to which the clustering algorithm can minimize the overlap of the distributions of intracluster and intercluster distances. (Author)
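
    The quantity these evaluation methods examine, the overlap between the distributions of intracluster and intercluster distances, can be approximated numerically. The sketch below is an assumption-laden stand-in for the report's graphical methods: it summarises the overlap as the fraction of (intracluster, intercluster) distance pairs in which the intracluster distance is at least as large, and the points and labels are invented.

```python
import math
from itertools import combinations

def split_distances(points, labels):
    """Return (intracluster, intercluster) lists of pairwise Euclidean distances."""
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(math.dist(p, q))
    return intra, inter

def overlap_fraction(intra, inter):
    """Share of (intra, inter) pairs where the intracluster distance is not smaller."""
    bad = sum(1 for a in intra for b in inter if a >= b)
    return bad / (len(intra) * len(inter))

points = [(0.0,), (1.0,), (10.0,), (11.0,)]

# A labelling that matches the two obvious groups: no overlap at all.
good = overlap_fraction(*split_distances(points, [0, 0, 1, 1]))

# A labelling that mixes the groups: most intracluster distances exceed intercluster ones.
poor = overlap_fraction(*split_distances(points, [0, 1, 0, 1]))

print(good, poor)
```

    A value near 0 means the two distributions are well separated, while a value near 1 means the clustering places distant points together, so plotting the two lists as histograms would give the kind of graphical check the abstract describes.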

  1. Technique for fast and efficient hierarchical clustering

    Science.gov (United States)

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.

  2. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  3. Star Cluster Structure from Hierarchical Star Formation

    Science.gov (United States)

    Grudic, Michael; Hopkins, Philip; Murray, Norman; Lamberts, Astrid; Guszejnov, David; Schmitz, Denise; Boylan-Kolchin, Michael

    2018-01-01

    Young massive star clusters (YMCs) spanning 10^4-10^8 M⊙ in mass generally have similar radial surface density profiles, with an outer power-law index typically between -2 and -3. This similarity suggests that they are shaped by scale-free physics at formation. Recent multi-physics MHD simulations of YMC formation have also produced populations of YMCs with this type of surface density profile, allowing us to narrow down the physics necessary to form a YMC with properties as observed. We show that the shallow density profiles of YMCs are a natural result of phase-space mixing that occurs as they assemble from the clumpy, hierarchically-clustered configuration imprinted by the star formation process. We develop physical intuition for this process via analytic arguments and collisionless N-body experiments, elucidating the connection between star formation physics and star cluster structure. This has implications for the early-time structure and evolution of proto-globular clusters, and prospects for simulating their formation in the FIRE cosmological zoom-in simulations.

  4. Global considerations in hierarchical clustering reveal meaningful patterns in data.

    Directory of Open Access Journals (Sweden)

    Roy Varshavsky

    Full Text Available BACKGROUND: A hierarchy, characterized by tree-like relationships, is a natural method of organizing data in various domains. When considering an unsupervised machine learning routine, such as clustering, a bottom-up hierarchical (BU, agglomerative) algorithm is used as a default and is often the only method applied. METHODOLOGY/PRINCIPAL FINDINGS: We show that hierarchical clustering that involves global considerations, such as top-down (TD, divisive) or glocal (global-local) algorithms, is better suited to reveal meaningful patterns in the data. This is demonstrated by testing the correspondence between the results of several algorithms (TD, glocal and BU) and the correct annotations provided by experts. The correspondence was tested in multiple domains including gene expression experiments, stock trade records and functional protein families. The performance of each of the algorithms is evaluated by statistical criteria that are assigned to clusters (nodes of the hierarchy tree) based on expert-labeled data. Whereas TD algorithms perform better on global patterns, BU algorithms perform well and are advantageous when finer granularity of the data is sought. In addition, a novel TD algorithm that is based on genuine density of the data points is presented and is shown to outperform other divisive and agglomerative methods. Application of the algorithm to more than 500 protein sequences belonging to ion-channels illustrates the potential of the method for inferring overlooked functional annotations. ClustTree, a graphical Matlab toolbox for applying various hierarchical clustering algorithms and testing their quality is made available. CONCLUSIONS: Although currently rarely used, global approaches, in particular, TD or glocal algorithms, should be considered in the exploratory process of clustering. In general, applying unsupervised clustering methods can leverage the quality of manually-created mapping of protein families. As demonstrated, it can

  5. Global considerations in hierarchical clustering reveal meaningful patterns in data.

    Science.gov (United States)

    Varshavsky, Roy; Horn, David; Linial, Michal

    2008-05-21

    A hierarchy, characterized by tree-like relationships, is a natural method of organizing data in various domains. When considering an unsupervised machine learning routine, such as clustering, a bottom-up hierarchical (BU, agglomerative) algorithm is used as a default and is often the only method applied. We show that hierarchical clustering that involves global considerations, such as top-down (TD, divisive) or glocal (global-local) algorithms, is better suited to reveal meaningful patterns in the data. This is demonstrated by testing the correspondence between the results of several algorithms (TD, glocal and BU) and the correct annotations provided by experts. The correspondence was tested in multiple domains including gene expression experiments, stock trade records and functional protein families. The performance of each of the algorithms is evaluated by statistical criteria that are assigned to clusters (nodes of the hierarchy tree) based on expert-labeled data. Whereas TD algorithms perform better on global patterns, BU algorithms perform well and are advantageous when finer granularity of the data is sought. In addition, a novel TD algorithm that is based on genuine density of the data points is presented and is shown to outperform other divisive and agglomerative methods. Application of the algorithm to more than 500 protein sequences belonging to ion-channels illustrates the potential of the method for inferring overlooked functional annotations. ClustTree, a graphical Matlab toolbox for applying various hierarchical clustering algorithms and testing their quality is made available. Although currently rarely used, global approaches, in particular, TD or glocal algorithms, should be considered in the exploratory process of clustering. In general, applying unsupervised clustering methods can leverage the quality of manually-created mapping of protein families. As demonstrated, it can also provide insights into erroneous and missed annotations.

  6. Robust Pseudo-Hierarchical Support Vector Clustering

    DEFF Research Database (Denmark)

    Hansen, Michael Sass; Sjöstrand, Karl; Olafsdóttir, Hildur

    2007-01-01

    Support vector clustering (SVC) has proven an efficient algorithm for clustering of noisy and high-dimensional data sets, with applications within many fields of research. An inherent problem, however, has been setting the parameters of the SVC algorithm. Using the recent emergence of a method fo...

  7. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  8. Improved gravitation field algorithm and its application in hierarchical clustering.

    Directory of Open Access Journals (Sweden)

    Ming Zheng

    Full Text Available BACKGROUND: Gravitation field algorithm (GFA) is a new optimization algorithm which is based on an imitation of natural phenomena. GFA can do well both for searching global minimum and multi-minima in computational biology. But GFA needs to be improved for increasing efficiency, and modified for applying to some discrete data problems in system biology. METHOD: An improved GFA called IGFA was proposed in this paper. Two parts were improved in IGFA. The first one is the rule of random division, which is a reasonable strategy and makes running time shorter. The other one is rotation factor, which can improve the accuracy of IGFA. And to apply IGFA to the hierarchical clustering, the initial part and the movement operator were modified. RESULTS: Two kinds of experiments were used to test IGFA. And IGFA was applied to hierarchical clustering. The global minimum experiment was used with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). Multi-minima experiment was used with IGFA and GFA. The two experiments results were compared with each other and proved the efficiency of IGFA. IGFA is better than GFA both in accuracy and running time. For the hierarchical clustering, IGFA is used to optimize the smallest distance of genes pairs, and the results were compared with GA and SA, singular-linkage clustering, UPGMA. The efficiency of IGFA is proved.

  9. Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods.

    Science.gov (United States)

    Liang, Xianrui; Ma, Meiling; Su, Weike

    2013-07-01

    A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Ten batches of Hibiscus mutabilis L. leaf samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in precision, repeatability and stability tests were less than 3%, and the method of fingerprint analysis was validated to be suitable for the Hibiscus mutabilis L. leaves. Similarity analysis, based on the correlation coefficients between each pair of fingerprints, showed abundant qualitative diversity of chemical constituents in the 10 batches of leaf samples from different locations. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent with the SA result to some extent.

  10. Applied Bayesian hierarchical methods

    National Research Council Canada - National Science Library

    Congdon, P

    2010-01-01

    ... 1.2 Posterior Inference from Bayes Formula; 1.3 Markov Chain Monte Carlo Sampling in Relation to Monte Carlo Methods: Obtaining Posterior...

  11. Hierarchical clustering using correlation metric and spatial continuity constraint

    Science.gov (United States)

    Stork, Christopher L.; Brewer, Luke N.

    2012-10-02

    Large data sets are analyzed by hierarchical clustering using correlation as a similarity measure. This provides results that are superior to those obtained using a Euclidean distance similarity measure. A spatial continuity constraint may be applied in hierarchical clustering analysis of images.
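
    The difference between a correlation-based and a Euclidean measure can be illustrated with a minimal sketch. It is not the method of the record above: the `spectra` values, the `1 - r` dissimilarity, and the naive average-linkage routine are all assumptions chosen for illustration. Profiles that share a shape but differ in scale are grouped together under correlation and split apart under Euclidean distance.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Dissimilarity 1 - r: near 0 for correlated profiles, near 2 for anti-correlated ones."""
    return 1.0 - pearson(x, y)


def euclidean(x, y):
    """Ordinary Euclidean distance, used here only for comparison."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def average_linkage(samples, n_clusters, dist):
    """Naive agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        def linkage(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            total = sum(dist(samples[i], samples[j]) for i in a for j in b)
            return total / (len(a) * len(b))
        i, j = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles: rows 0 and 1 share a shape but differ in scale, as do rows 2 and 3.
spectra = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [4.0, 3.0, 2.0, 1.0],
    [40.0, 30.0, 20.0, 10.0],
]

by_correlation = average_linkage(spectra, 2, correlation_distance)
by_euclidean = average_linkage(spectra, 2, euclidean)
```

    In this constructed case the correlation measure groups rows by shape and the Euclidean measure groups them by magnitude; whether that is preferable depends on the data being analyzed.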

  12. A novel hierarchical clustering algorithm for gene sequences

    Directory of Open Access Journals (Sweden)

    Wei Dan

    2012-07-01

    Full Text Available Abstract. Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. This method transforms DNA sequences into the feature vectors which contain the occurrence, location and order relation of k-tuples in DNA sequence. Afterwards, a hierarchical procedure is applied to clustering DNA sequences based on the feature vectors. Results: The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. This method is also compared with BlastClust, CD-HIT-EST and some others. The experimental results show our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationship among the sequences. Conclusions: We introduced a novel clustering algorithm which is based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationship among the sequences.

  13. Dynamic networks from hierarchical bayesian graph clustering.

    Directory of Open Access Journals (Sweden)

    Yongjin Park

    Full Text Available Biological networks change dynamically as protein components are synthesized and degraded. Understanding the time-dependence and, in a multicellular organism, tissue-dependence of a network leads to insight beyond a view that collapses time-varying interactions into a single static map. Conventional algorithms are limited to analyzing evolving networks by reducing them to a series of unrelated snapshots. Here we introduce an approach that groups proteins according to shared interaction patterns through a dynamical hierarchical stochastic block model. Protein membership in a block is permitted to evolve as interaction patterns shift over time and space, representing the spatial organization of cell types in a multicellular organism. The spatiotemporal evolution of the protein components is inferred from transcript profiles, using Arabidopsis root development (5 tissues, 3 temporal stages) as an example. The new model requires essentially no parameter tuning, out-performs existing snapshot-based methods, identifies protein modules recruited to specific cell types and developmental stages, and could have broad application to social networks and other similar dynamic systems.

  14. Hierarchical Clustering and the Concept of Space Distortion.

    Science.gov (United States)

    Hubert, Lawrence; Schultz, James

    An empirical assessment of the space distortion properties of two prototypic hierarchical clustering procedures is given in terms of an occupancy model developed from combinatorics. Using one simple example, the single-link and complete-link clustering strategies now in common use in the behavioral sciences are empirically shown to be space…

  15. The Hierarchical Clustering of Tax Burden in the EU27

    Directory of Open Access Journals (Sweden)

    Simkova Nikola

    2015-09-01

    Full Text Available The issue of taxation has become more important due to a significant share of the government revenue. There are several ways of expressing the tax burden of countries. This paper describes the traditional approach as a share of tax revenue to GDP which is applied to the total taxation and the capital taxation as a part of tax systems affecting investment decisions. The implicit tax rate on capital created by Eurostat also offers a possible explanation of the tax burden on capital, so its components are analysed in detail. This study uses one of the econometric methods called the hierarchical clustering. The data on which the clustering is based comprises countries in the EU27 for the period of 1995 – 2012. The aim of this paper is to reveal clusters of countries in the EU27 with similar tax burden or tax changes. The findings suggest that mainly newly acceding countries (2004 and 2007) are in a group of countries with a low tax burden which tried to encourage investors by favourable tax rates. On the other hand, there are mostly countries from the original EU15. Some clusters may be explained by similar historical development, geographic and demographic characteristics.

  16. Hierarchical Star Formation with Young Stellar Clusters with LEGUS

    Science.gov (United States)

    Grasha, Kathryn; Calzetti, Daniela

    2018-01-01

    We present findings on using young star clusters to trace the unbound hierarchical star-forming structures in nearby galaxies drawn from the Legacy ExtraGalactic UV Survey (LEGUS) program. LEGUS is a cycle 21 Hubble Space Telescope Treasury program designed to characterize the link between star formation of individual stars, stellar clusters and associations on parsec scales, and that of galaxy disks on kiloparsec scales. We find that star clusters are strongly clustered with respect to each other and that the spatial clustering disappears rapidly across all galaxies for ages as young as a few tens of Myr. This indicates that most, if not all, recent star formation occurs within rapidly dispersing hierarchical complexes. The observed correlations are consistent with arising in a turbulent-driven interstellar medium. We also present recent investigations of correlating CO gas to the star clusters to better understand the environmental connection between gas and recent star formation. We find a clear excess of young star clusters being spatially located near their natal molecular gas, disassociating in as little as 4-6 Myr. Lastly, we find that the spatial clustering of GMCs is significantly suppressed compared to that of star clusters. This suggests that GMCs must produce more than one star cluster, improving our ability to link the products of star formation to the properties of natal gas from which they originate.

  17. Splitting Methods for Convex Clustering.

    Science.gov (United States)

    Chi, Eric C; Lange, Kenneth

    Clustering is a fundamental problem in many scientific applications. Standard methods such as k-means, Gaussian mixture models, and hierarchical clustering, however, are beset by local minima, which are sometimes drastically suboptimal. Recently introduced convex relaxations of k-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. In this work we present two splitting methods for solving the convex clustering problem. The first is an instance of the alternating direction method of multipliers (ADMM); the second is an instance of the alternating minimization algorithm (AMA). In contrast to previously considered algorithms, our ADMM and AMA formulations provide simple and unified frameworks for solving the convex clustering problem under the previously studied norms and open the door to potentially novel norms. We demonstrate the performance of our algorithm on both simulated and real data examples. While the differences between the two algorithms appear to be minor on the surface, complexity analysis and numerical experiments show AMA to be significantly more efficient. This article has supplemental materials available online.
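
    For orientation, the convex clustering problem referred to above is commonly written in the following form. The symbols are assumptions not given in the abstract: x_i are the n data points, u_i is the centroid assigned to point i, w_ij are nonnegative pairwise weights, and γ ≥ 0 is a tuning parameter that controls how strongly centroids are fused.

```latex
\min_{u_1,\dots,u_n}\;
\frac{1}{2}\sum_{i=1}^{n}\lVert x_i - u_i\rVert_2^2
\;+\;
\gamma \sum_{i<j} w_{ij}\,\lVert u_i - u_j\rVert
```

    At γ = 0 every point is its own centroid; as γ grows the penalty shrinks centroids toward one another until they coincide, and points whose centroids coincide are read as one cluster. The unsubscripted norm in the penalty stands for whichever norm is chosen, which is where the "previously studied norms" mentioned in the abstract enter.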

  18. Evaluation by hierarchical clustering of multiple cytokine expression after phytohemagglutinin stimulation

    Directory of Open Access Journals (Sweden)

    Yang Chunhe

    2016-01-01

    Full Text Available The hierarchical clustering method has been used for exploration of gene expression and proteomic profiles; however, little research into its application in the examination of expression of multiple cytokine/chemokine responses to stimuli has been reported. Thus, little progress has been made on how phytohemagglutinin (PHA) affects cytokine expression profiling on a large scale in the human hematological system. To investigate the characteristic expression pattern under PHA stimulation, Luminex, a multiplex bead-based suspension array, was performed. The data set collected from human peripheral blood mononuclear cells (PBMC) was analyzed using the hierarchical clustering method. It was revealed that two specific chemokines (CCL3 and CCL4) underwent significantly greater quantitative changes during induction of expression than other tested cytokines/chemokines after PHA stimulation. This result indicates that hierarchical clustering is a useful tool for detecting fine patterns during exploration of biological data, and that it can play an important role in comparative studies.

  19. A rapid ATR-FTIR spectroscopic method for detection of sibutramine adulteration in tea and coffee based on hierarchical cluster and principal component analyses.

    Science.gov (United States)

    Cebi, Nur; Yilmaz, Mustafa Tahsin; Sagdic, Osman

    2017-08-15

    Sibutramine may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. Considering public health and legal regulations, there is an urgent need for effective, rapid and reliable techniques to detect sibutramine in dietetic herbal foods, teas and dietary supplements. This research comprehensively explored, for the first time, detection of sibutramine in green tea, green coffee and mixed herbal tea using ATR-FTIR spectroscopic technique combined with chemometrics. Hierarchical cluster analysis and principal component analysis (PCA) techniques were employed in the spectral range (2746-2656 cm⁻¹) for classification and discrimination through Euclidean distance and Ward's algorithm. Unadulterated and adulterated samples were classified and discriminated with respect to their sibutramine contents with perfect accuracy without any false prediction. The results suggest that existence of the active substance could be successfully determined at levels in the range of 0.375-12 mg in a total of 1.75 g of green tea, green coffee and mixed herbal tea by using FTIR-ATR technique combined with chemometrics. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Improved gravitation field algorithm and its application in hierarchical clustering.

    Science.gov (United States)

    Zheng, Ming; Sun, Ying; Liu, Gui-Xia; Zhou, You; Zhou, Chun-Guang

    2012-01-01

    Gravitation field algorithm (GFA) is a new optimization algorithm which is based on an imitation of natural phenomena. GFA can do well both for searching global minimum and multi-minima in computational biology. But GFA needs to be improved for increasing efficiency, and modified for applying to some discrete data problems in system biology. An improved GFA called IGFA was proposed in this paper. Two parts were improved in IGFA. The first one is the rule of random division, which is a reasonable strategy and makes running time shorter. The other one is rotation factor, which can improve the accuracy of IGFA. And to apply IGFA to the hierarchical clustering, the initial part and the movement operator were modified. Two kinds of experiments were used to test IGFA. And IGFA was applied to hierarchical clustering. The global minimum experiment was used with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). Multi-minima experiment was used with IGFA and GFA. The two experiments results were compared with each other and proved the efficiency of IGFA. IGFA is better than GFA both in accuracy and running time. For the hierarchical clustering, IGFA is used to optimize the smallest distance of genes pairs, and the results were compared with GA and SA, singular-linkage clustering, UPGMA. The efficiency of IGFA is proved.

  1. Hierarchical cluster-tendency analysis of the group structure in the foreign exchange market

    Science.gov (United States)

    Wu, Xin-Ye; Zheng, Zhi-Gang

    2013-08-01

    A hierarchical cluster-tendency (HCT) method in analyzing the group structure of networks of the global foreign exchange (FX) market is proposed by combining the advantages of both the minimal spanning tree (MST) and the hierarchical tree (HT). Fifty currencies of the top 50 World GDP in 2010 according to World Bank's database are chosen as the underlying system. By using the HCT method, all nodes in the FX market network can be "colored" and distinguished. We reveal that the FX networks can be divided into two groups, i.e., the Asia-Pacific group and the Pan-European group. The results given by the hierarchical cluster-tendency method agree well with the formerly observed geographical aggregation behavior in the FX market. Moreover, an oil-resource aggregation phenomenon is discovered by using our method. We find that gold could be a better numeraire for the weekly-frequency FX data.
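
    The abstract does not spell out the HCT procedure, so the following is only a sketch of one building block it names, the minimal spanning tree. Everything here is an assumption for illustration: the four currency codes, the correlation values, and the distance d = sqrt(2(1 - ρ)), which is a common choice in the correlation-network literature rather than something stated in the record.

```python
from math import sqrt


def correlation_to_distance(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 2] (assumed metric)."""
    return sqrt(2.0 * (1.0 - rho))


def minimal_spanning_tree(names, corr):
    """Kruskal's algorithm on the complete graph of pairwise distances."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the component root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (correlation_to_distance(corr[a][b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    tree = []
    for d, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so it closes no cycle
            parent[root_a] = root_b
            tree.append((a, b, round(d, 3)))
    return tree


# Hypothetical correlations between currency returns (not data from the paper).
names = ["AUD", "NZD", "EUR", "CHF"]
corr = {
    "AUD": {"NZD": 0.9, "EUR": 0.3, "CHF": 0.2},
    "NZD": {"EUR": 0.4, "CHF": 0.1},
    "EUR": {"CHF": 0.8},
}
mst = minimal_spanning_tree(names, corr)
```

    With these made-up numbers the tree links AUD to NZD and EUR to CHF first and joins the two pairs through a single longer edge, which is the kind of group structure a tree-based method would then label.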

  2. APPLICATION OF HIERARCHICAL CLUSTERING ALGORITHM FOR STRUCTURAL CHARACTERISTIC OF MOVING PHYSICAL OBJECTS

    Directory of Open Access Journals (Sweden)

    N.I. Babenko

    2014-04-01

    Full Text Available Approach to the development of management solutions using the cluster analysis can qualitatively improve the management system by moving objects through the adequate response to the impact of the key factors influencing the characteristics of physical objects. The aim is to attempt to solve the problem of identifying key factors and physical signs of moving physical objects needed to make appropriate management decisions by using cluster analysis. The article defines the types of clustering algorithms; the system of information parameters directly or indirectly characterizing the analyzed characteristics is emphasized, hierarchical and non-hierarchical cluster analysis methods are considered. The research finding is the construction of tree diagram using the program STATISTICA 8, which gives the idea of possible clusters’ number combining physical indicators under the dynamic changes of moving objects. The advantage of cluster analysis usage is the use of factors relating to both internal and external environments of the physical properties’ interaction of moving objects.

  3. Automated tetraploid genotype calling by hierarchical clustering

    Science.gov (United States)

    SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, however, the relationship between signal intensity and allele dosage must be inferred independently for each marker. We developed an improved computational method to automate this process, ...

  4. Unsupervised learning of electrocorticography motifs with binary descriptors of wavelet features and hierarchical clustering.

    Science.gov (United States)

    Pluta, Tim; Bernardo, Roman; Shin, Hae Won; Bernardo, D R

    2014-01-01

    We describe a novel method for data mining spectro-spatiotemporal network motifs from electrocorticographic (ECoG) data. The method utilizes wavelet feature extraction from ECoG data, generation of compact binary vectors from these features, and binary vector hierarchical clustering. The potential utility of this method in the discovery of recurring neural patterns is demonstrated in an example showing clustering of ictal and post-ictal gamma activity patterns. The method allows for the efficient and scalable retrieval and clustering of neural motifs occurring in massive amounts of neural data, such as in prolonged EEG/ECoG recordings and in brain computer interfaces.

  5. Hierarchical trie packet classification algorithm based on expectation-maximization clustering.

    Science.gov (United States)

    Bi, Xia-An; Zhao, Junxia

    2017-01-01

    With the development of computer network bandwidth, packet classification algorithms which are able to deal with large-scale rule sets are in urgent need. Among the existing algorithms, researches on packet classification algorithms based on hierarchical trie have become an important packet classification research branch because of their widely practical use. Although hierarchical trie is beneficial to save large storage space, it has several shortcomings such as the existence of backtracking and empty nodes. This paper proposes a new packet classification algorithm, Hierarchical Trie Algorithm Based on Expectation-Maximization Clustering (HTEMC). Firstly, this paper uses the formalization method to deal with the packet classification problem by means of mapping the rules and data packets into a two-dimensional space. Secondly, this paper uses expectation-maximization algorithm to cluster the rules based on their aggregate characteristics, and thereby diversified clusters are formed. Thirdly, this paper proposes a hierarchical trie based on the results of expectation-maximization clustering. Finally, this paper respectively conducts simulation experiments and real-environment experiments to compare the performances of our algorithm with other typical algorithms, and analyzes the results of the experiments. The hierarchical trie structure in our algorithm not only adopts trie path compression to eliminate backtracking, but also solves the problem of low efficiency of trie updates, which greatly improves the performance of the algorithm.

  6. Extending stability through hierarchical clusters in Echo State Networks

    Directory of Open Access Journals (Sweden)

    Sarah Jarvis

    2010-07-01

    Full Text Available Echo State Networks (ESN) are reservoir networks that satisfy well-established criteria for stability when constructed as feedforward networks. Recent evidence suggests that stability criteria are altered in the presence of reservoir substructures, such as clusters. Understanding how the reservoir architecture affects stability is thus important for the appropriate design of any ESN. To quantitatively determine the influence of the most relevant network parameters, we analysed the impact of reservoir substructures on stability in hierarchically clustered ESNs (HESN), as they allow a smooth transition from highly structured to increasingly homogeneous reservoirs. Previous studies used the largest eigenvalue of the reservoir connectivity matrix (spectral radius) as a predictor for stable network dynamics. Here, we evaluate the impact of clusters, hierarchy and intercluster connectivity on the predictive power of the spectral radius for stability. Both hierarchy and low relative cluster sizes extend the range of spectral radius values, leading to stable networks, while increasing intercluster connectivity decreased maximal spectral radius.

  7. Multi-mode clustering model for hierarchical wireless sensor networks

    Science.gov (United States)

    Hu, Xiangdong; Li, Yongfu; Xu, Huifen

    2017-03-01

    The topology management, i.e., clusters maintenance, of wireless sensor networks (WSNs) is still a challenge due to its numerous nodes, diverse application scenarios and limited resources as well as complex dynamics. To address this issue, a multi-mode clustering model (M2CM) is proposed to maintain the clusters for hierarchical WSNs in this study. In particular, unlike the traditional time-trigger model based on the whole-network and periodic style, the M2CM is proposed based on the local and event-trigger operations. In addition, an adaptive local maintenance algorithm is designed for the broken clusters in the WSNs using the spatial-temporal demand changes accordingly. Numerical experiments are performed using the NS2 network simulation platform. Results validate the effectiveness of the proposed model with respect to the network maintenance costs, node energy consumption and transmitted data as well as the network lifetime.

  8. Mapping informative clusters in a hierarchical [corrected] framework of FMRI multivariate analysis.

    Directory of Open Access Journals (Sweden)

    Rui Xu

    Full Text Available Pattern recognition methods have become increasingly popular in fMRI data analysis, which are powerful in discriminating between multi-voxel patterns of brain activities associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical framework of multivariate approach that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers were served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters were highly overlapped for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical framework of multivariate approach is suitable for both pattern classification and brain mapping in fMRI studies.

  9. The multiple outliers detection using agglomerative hierarchical methods in circular regression model

    Science.gov (United States)

    Zanariah Satari, Siti; Di, Nur Faraidah Muhammad; Zakaria, Roslinazairimah

    2017-09-01

    Two agglomerative hierarchical clustering algorithms for identifying multiple outliers in a circular regression model have been developed in this study. An agglomerative hierarchical clustering algorithm starts with every data point in its own cluster and repeatedly merges the closest pair of clusters according to some similarity criterion until all the data are grouped in one cluster. The single-linkage method is one of the simplest agglomerative hierarchical methods and is commonly used to detect outliers. In this study, we compared the performance of the single-linkage method with another agglomerative hierarchical method, namely average linkage, for detecting outliers in a circular regression model. The performance of both methods was examined via simulation studies by measuring their “success” probability, masking effect, and swamping effect for different sample sizes and levels of contamination. The results show that the single-linkage method performs very well in detecting multiple outliers, with lower masking and swamping effects.
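
    The merge procedure described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it clusters plain 1-D values rather than circular regression residuals, and the function names `agglomerate` and `cut` as well as the sample values are assumptions made for the example.

```python
from itertools import combinations

def agglomerate(points, linkage="single"):
    """Naive agglomerative clustering of 1-D values.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a sorted tuple of point indices.
    """
    clusters = [(i,) for i in range(len(points))]  # every point starts alone
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            dists = [abs(points[i] - points[j]) for i in a for j in b]
            # single linkage: closest pair; average linkage: mean of all pairs
            d = min(dists) if linkage == "single" else sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        history.append((a, b, d))
    return history

def cut(points, k, linkage="single"):
    """Replay the merge history until only k clusters remain."""
    clusters = {(i,) for i in range(len(points))}
    for a, b, _ in agglomerate(points, linkage):
        if len(clusters) == k:
            break
        clusters -= {a, b}
        clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)

# Hypothetical values: five similar observations and one outlier (index 5).
values = [0.10, 0.20, 0.15, 0.30, 0.25, 5.00]
print(cut(values, 2, "single"))
print(cut(values, 2, "average"))
```

    With data this clean, both linkages leave the outlier in a cluster of its own; the masking and swamping effects studied in the paper only appear with noisier data and several outliers.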

  10. The clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses

    OpenAIRE

    Matthias Schonlau

    2002-01-01

    In hierarchical cluster analysis, dendrograms are used to visualize how clusters are formed. I propose an alternative graph called a "clustergram" to examine how cluster members are assigned to clusters as the number of clusters increases. This graph is useful in exploratory analysis for nonhierarchical clustering algorithms such as k means and for hierarchical cluster algorithms when the number of observations is large enough to make dendrograms impractical. I present the Stata code and give...

  11. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    Science.gov (United States)

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  12. Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.

    Directory of Open Access Journals (Sweden)

    Romeo Rizzi

    Full Text Available Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcomes difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.

  13. Applied Hierarchical Cluster Analysis with Average Linkage Algoritm

    Directory of Open Access Journals (Sweden)

    Cindy Cahyaning Astuti

    2017-11-01

    Full Text Available This research was conducted in Sidoarjo District, using secondary data from the book "Kabupaten Sidoarjo Dalam Angka 2016". The authors chose 12 variables that represent sub-district characteristics in Sidoarjo. These variables cover four sectors, namely geography, education, agriculture and industry. To determine how evenly geographical conditions, education, agriculture and industry are distributed across sub-districts, an analysis is required that classifies sub-districts based on their characteristics. Hierarchical cluster analysis is an analytical technique used to classify or categorize objects into relatively homogeneous groups called clusters. The results are expected to provide information about the dominant and non-dominant sub-district characteristics in the four sectors, based on the clusters that are formed.

  14. Globular cluster formation with multiple stellar populations from hierarchical star cluster complexes

    Science.gov (United States)

    Bekki, Kenji

    2017-05-01

    Most old globular clusters (GCs) in the Galaxy are observed to have internal chemical abundance spreads in light elements. We discuss a new GC formation scenario based on hierarchical star formation within fractal molecular clouds. In the new scenario, a cluster of bound and unbound star clusters ('star cluster complex', SCC) that have a power-law cluster mass function with a slope (β) of 2 is first formed from a massive gas clump developed in a dwarf galaxy. Such cluster complexes and β = 2 are observed and expected from hierarchical star formation. The most massive star cluster ('main cluster'), which is the progenitor of a GC, can accrete gas ejected from asymptotic giant branch (AGB) stars initially in the cluster and other low-mass clusters before the clusters are tidally stripped or destroyed to become field stars in the dwarf. The SCC is initially embedded in a giant gas hole created by numerous supernovae of the SCC so that cold gas outside the hole can be accreted on to the main cluster later. New stars formed from the accreted gas have chemical abundances that are different from those of the original SCC. Using hydrodynamical simulations of GC formation based on this scenario, we show that the main cluster with the initial mass as large as [2-5] × 10^5 M⊙ can accrete more than 10^5 M⊙ gas from AGB stars of the SCC. We suggest that merging of hierarchical SSCs can play key roles in stellar halo formation around GCs and self-enrichment processes in the early phase of GC formation.

  15. Time-Hierarchical Clustering and Visualization of Weather Forecast Ensembles.

    Science.gov (United States)

    Ferstl, Florian; Kanzler, Mathias; Rautenhaus, Marc; Westermann, Rudiger

    2017-01-01

    We propose a new approach for analyzing the temporal growth of the uncertainty in ensembles of weather forecasts which are started from perturbed but similar initial conditions. As an alternative to traditional approaches in meteorology, which use juxtaposition and animation of spaghetti plots of iso-contours, we make use of contour clustering and provide means to encode forecast dynamics and spread in one single visualization. Based on a given ensemble clustering in a specified time window, we merge clusters in time-reversed order to indicate when and where forecast trajectories start to diverge. We present and compare different visualizations of the resulting time-hierarchical grouping, including space-time surfaces built by connecting cluster representatives over time, and stacked contour variability plots. We demonstrate the effectiveness of our visual encodings with forecast examples of the European Centre for Medium-Range Weather Forecasts, which convey the evolution of specific features in the data as well as the temporally increasing spatial variability.

  16. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    Science.gov (United States)

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-05

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (Pgait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Hierarchical clustering of RGB surface water images based on MIA ...

    African Journals Online (AJOL)

    2009-11-25

    Nov 25, 2009 ... found that the MIA-LSI approach complemented with a suitable clustering method is able to recognise the similar images of ... Keywords: multivariate image analysis (MIA), latent semantic indexing (LSI), RGB image, Ward's clustering, ..... for vision-based surveillance in heavy industry – convergence.

  18. 3D reconstruction from non-uniform point clouds via local hierarchical clustering

    Science.gov (United States)

    Yang, Jiaqi; Li, Ruibo; Xiao, Yang; Cao, Zhiguo

    2017-07-01

    Raw scanned 3D point clouds are usually irregularly distributed due to the inherent shortcomings of laser sensors, which poses a great challenge for high-quality 3D surface reconstruction. This paper tackles this problem by proposing a local hierarchical clustering (LHC) method to improve the consistency of point distribution. Specifically, LHC consists of two steps: 1) adaptive octree-based decomposition of 3D space, and 2) hierarchical clustering. The former aims at reducing the computational complexity and the latter transforms the non-uniform point set into a uniform one. Experimental results on real-world scanned point clouds validate the effectiveness of our method from both qualitative and quantitative aspects.

  19. Exploiting Homogeneity of Density in Incremental Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Dwi H. Widiyantoro

    2006-11-01

    Full Text Available Hierarchical clustering is an important tool in many applications. As it involves a large data set that proliferates over time, reclustering the data set periodically is not an efficient process. Therefore, the ability to incorporate a new data set incrementally into an existing hierarchy is increasingly in demand. This article describes Homogen, a system that employs a new algorithm for generating a hierarchy of concepts and clusters incrementally from a stream of observations. The system aims to construct a hierarchy that satisfies the homogeneity and the monotonicity properties. Working in a bottom-up fashion, a new observation is placed in the hierarchy and a sequence of hierarchy restructuring processes is performed only in regions that have been affected by the presence of the new observation. Additionally, it combines multiple restructuring techniques that address different restructuring objectives to get a synergistic effect. The system has been tested on a variety of domains including structured and unstructured data sets. The experimental results reveal that the system is able to construct a concept hierarchy that is consistent regardless of the input data order and whose quality is comparable to the quality of those produced by non-incremental clustering algorithms.

  20. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    Science.gov (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
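
    The Euclidean, Manhattan and Minkowski distances named in this abstract belong to one family, which a short sketch can make concrete. The station labels and rainfall figures below are hypothetical and are not taken from the study; `minkowski` and `dissimilarity_matrix` are names chosen for this example.

```python
def minkowski(u, v, p):
    """Minkowski distance of order p; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def dissimilarity_matrix(rows, p):
    """Pairwise dissimilarities, the usual input to hierarchical clustering."""
    return [[minkowski(u, v, p) for v in rows] for u in rows]

# Hypothetical (pre-monsoon, monsoon) rainfall totals in mm for three stations.
stations = {"A": (300.0, 1200.0), "B": (320.0, 1250.0), "C": (900.0, 3500.0)}

for p in (1, 2, 3):
    matrix = dissimilarity_matrix(list(stations.values()), p)
    # Distances from station A to A, B and C under this order p.
    print(p, [round(d, 1) for d in matrix[0]])
```

    Changing `p` rescales the dissimilarities, which is one reason the same linkage rule can produce different dendrograms under different distance measures.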

  1. Novel density-based and hierarchical density-based clustering algorithms for uncertain data.

    Science.gov (United States)

    Zhang, Xianchao; Liu, Han; Zhang, Xiaotong

    2017-09-01

    Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we firstly propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN from the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulties for clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing

  2. Clinical fracture risk evaluated by hierarchical agglomerative clustering

    DEFF Research Database (Denmark)

    Kruse, Christian; Eiken, P; Vestergaard, P

    2017-01-01

    Clustering analysis can identify subgroups of patients based on similarities of traits. From data on 10,775 subjects, we document nine patient clusters of different fracture risks. Differences emerged after age 60 and treatment compliance differed by hip and lumbar spine bone mineral density...... as low fracture risk with high to very high BMD. A mean age of 60 years was the earliest that allowed for separation of high-risk clusters. DXA scan results could identify high-risk subjects with different antiresorptive treatment compliance levels based on similarities and differences in lumbar spine...... profiles. INTRODUCTION: The purposes of this study were to establish and quantify patient clusters of high, average and low fracture risk using an unsupervised machine learning algorithm. METHODS: Regional and national Danish patient data on dual-energy X-ray absorptiometry (DXA) scans, medication...

  3. Graph kernels, hierarchical clustering, and network community structure: experiments and comparative analysis

    Science.gov (United States)

    Zhang, S.; Ning, X.-M.; Zhang, X.-S.

    2007-05-01

    There has been a quickly growing interest in properties of complex networks, such as the small world property, power-law degree distribution, network transitivity, and community structure, which seem to be common to many real world networks. In this study, we consider the community property which is also found in many real networks. Based on the diffusion kernels of networks, a hierarchical clustering approach is proposed to uncover the community structure of different extent of complex networks. We test the method on some networks with known community structures and find that it can detect significant community structure in these networks. Comparison with related methods shows the effectiveness of the method.

  4. Hierarchical clusters in families with type 2 diabetes

    Science.gov (United States)

    García-Solano, Beatriz; Gallegos-Cabriales, Esther C; Gómez-Meza, Marco V; García-Madrid, Guillermina; Flores-Merlo, Marcela; García-Solano, Mauro

    2015-01-01

    Families represent more than a set of individuals; a family is more than the sum of its individual members. With this classification, nurses can identify family health-illness beliefs under the concept of the family as a unit, and plan the inclusion of the family in type 2 diabetes treatment; the family is not considered in public policy, even though families share diet, exercise, and self-monitoring with a member who has type 2 diabetes. The aim of this study was to determine whether characteristics, functionality, routines, and family and individual health in type 2 diabetes describe the differences and similarities between families so that they can be considered as a unit. We performed an exploratory, descriptive hierarchical cluster analysis of 61 families using three instruments and a questionnaire, in addition to weight, height, body fat percentage, hemoglobin A1c, total cholesterol, triglycerides, low-density lipoprotein and high-density lipoprotein. The analysis produced three groups of families. Wilks' lambda demonstrated statistically significant differences provided by age (Λ = 0.778, F = 2.098, p = 0.010) and family health (Λ = 0.813, F = 2.650, p = 0.023). A post hoc Tukey test coincided with the three subsets. Families with type 2 diabetes have common elements that make them similar, while sharing differences that make them unique. PMID:27347419

  5. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environmental protection. Previous studies have developed many numeric modeling methods and data-driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from the coastal waters of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate water quality. To evaluate validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted in previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance to coastal water quality assessment.
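
    Why Mahalanobis distance matters for correlated variables can be shown with a small two-variable sketch. This is an illustration under stated assumptions, not the authors' code: the measurements are invented, only two variables are handled so that the covariance matrix can be inverted by hand, and the function names are hypothetical.

```python
def covariance_2d(rows):
    """Sample covariance terms (sxx, sxy, syy) of two-variable observations."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return sxx, sxy, syy

def mahalanobis(u, v, cov):
    """sqrt(d^T S^-1 d) for a 2x2 covariance S, using its closed-form inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = u[0] - v[0], u[1] - v[1]
    return ((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det) ** 0.5

# Hypothetical, strongly correlated pair of water quality variables.
samples = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
cov = covariance_2d(samples)

# Both displacements have the same Euclidean length (sqrt 2), but one follows
# the correlation trend and the other runs against it.
along = mahalanobis((0.0, 0.0), (1.0, 1.0), cov)
across = mahalanobis((0.0, 0.0), (1.0, -1.0), cov)
print(round(along, 2), round(across, 2))
```

    A displacement that runs against the correlation is treated as far larger than one that follows it, which Euclidean distance cannot express. A pairwise matrix of these distances can then be passed to any agglomerative clustering routine.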

  6. Applying of hierarchical clustering to analysis of protein patterns in the human cancer-associated liver.

    Science.gov (United States)

    Petushkova, Natalia A; Pyatnitskiy, Mikhail A; Rudenko, Vladislav A; Larina, Olesya V; Trifonova, Oxana P; Kisrieva, Julya S; Samenkova, Natalia F; Kuznetsova, Galina P; Karuzina, Irina I; Lisitsa, Andrey V

    2014-01-01

    There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers to identify diseases and to predict outcomes using the training dataset with established diagnosis for each sample. When the training dataset is not available the task can be to mine for presence of meaningful groups (clusters) of samples and to explore underlying data structure (unsupervised learning). We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected upon surgical treatment of hepatic metastases in colorectal cancer. Unsupervised hierarchical clustering of 2DE gel images (n = 18) revealed a pair of clusters, containing 11 and 7 samples. Previously we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that samples were clearly divided into two well-separated groups by cluster analysis. It turned out that groups by enzyme activity almost perfectly match to the groups identified from proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 to distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species. Our results highlight the importance of hierarchical cluster analysis of proteomic data, and showed concordance between results of biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into differing clusters may provide insights into possible molecular mechanism of drug metabolism and creates a rationale for personalized treatment.

  7. Applying of hierarchical clustering to analysis of protein patterns in the human cancer-associated liver.

    Directory of Open Access Journals (Sweden)

    Natalia A Petushkova

    Full Text Available There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers to identify diseases and to predict outcomes using the training dataset with established diagnosis for each sample. When the training dataset is not available the task can be to mine for presence of meaningful groups (clusters) of samples and to explore underlying data structure (unsupervised learning). We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected upon surgical treatment of hepatic metastases in colorectal cancer. Unsupervised hierarchical clustering of 2DE gel images (n = 18) revealed a pair of clusters, containing 11 and 7 samples. Previously we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that samples were clearly divided into two well-separated groups by cluster analysis. It turned out that groups by enzyme activity almost perfectly match to the groups identified from proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 to distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species. Our results highlight the importance of hierarchical cluster analysis of proteomic data, and showed concordance between results of biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into differing clusters may provide insights into possible molecular mechanism of drug metabolism and creates a rationale for personalized treatment.

  8. Using Dynamic Quantum Clustering to Analyze Hierarchically Heterogeneous Samples on the Nanoscale

    Energy Technology Data Exchange (ETDEWEB)

    Hume, Allison; /Princeton U. /SLAC

    2012-09-07

    Dynamic Quantum Clustering (DQC) is an unsupervised, highly visual data mining technique. DQC was tested as an analysis method for X-ray Absorption Near Edge Structure (XANES) data from the Transmission X-ray Microscopy (TXM) group. The TXM group images hierarchically heterogeneous materials with nanoscale resolution and a large field of view. XANES data consist of energy spectra for each pixel of an image. It was determined that DQC successfully identifies structure in data of this type without prior knowledge of the components in the sample. Clusters and sub-clusters clearly reflected features of the spectra that identified chemical component, chemical environment, and density in the image. DQC can also be used in conjunction with the established data analysis technique, which does require knowledge of the components present.

  9. THE EVOLUTION OF BRIGHTEST CLUSTER GALAXIES IN A HIERARCHICAL UNIVERSE

    Energy Technology Data Exchange (ETDEWEB)

    Tonini, Chiara; Bernyk, Maksym; Croton, Darren [Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Melbourne, VIC 3122 (Australia); Maraston, Claudia; Thomas, Daniel [Institute of Cosmology and Gravitation, University of Portsmouth, Portsmouth PO1 3FX (United Kingdom)

    2012-11-01

    We investigate the evolution of brightest cluster galaxies (BCGs) from redshift z ≈ 1.6 to z = 0. We upgrade the hierarchical semi-analytic model of Croton et al. with a new spectro-photometric model that produces realistic galaxy spectra, making use of the Maraston stellar populations and a new recipe for the dust extinction. We compare the model predictions of the K-band luminosity evolution and the J - K, V - I, and I - K color evolution with a series of data sets, including those of Collins et al. who argued that semi-analytic models based on the Millennium simulation cannot reproduce the red colors and high luminosity of BCGs at z > 1. We show instead that the model is well in range of the observed luminosity and correctly reproduces the color evolution of BCGs in the whole redshift range up to z ≈ 1.6. We argue that the success of the semi-analytic model is in large part due to the implementation of a more sophisticated spectro-photometric model. An analysis of the model BCGs shows an increase in mass by a factor of 2-3 since z ≈ 1, and star formation activity down to low redshifts. While the consensus regarding BCGs is that they are passively evolving, we argue that this conclusion is affected by the degeneracy between star formation history and stellar population models used in spectral energy distribution fitting, and by the inefficacy of toy models of passive evolution to capture the complexity of real galaxies, especially those with rich merger histories like BCGs. Following this argument, we also show that in the semi-analytic model the BCGs show a realistic mix of stellar populations, and that these stellar populations are mostly old. In addition, the age-redshift relation of the model BCGs follows that of the universe, meaning that given their merger history and star formation history, the ageing of BCGs is always dominated by the ageing of their stellar populations. In a ΛCDM universe, we define such evolution as '...

  10. A data-driven approach to estimating the number of clusters in hierarchical clustering [version 1; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Antoine E. Zambelli

    2016-12-01

    Full Text Available DNA microarray and gene expression problems often require a researcher to perform clustering on their data in a bid to better understand its structure. In cases where the number of clusters is not known, one can resort to hierarchical clustering methods. However, there currently exist very few automated algorithms for determining the true number of clusters in the data. We propose two new methods (mode and maximum difference) for estimating the number of clusters in a hierarchical clustering framework to create a fully automated process with no human intervention. These methods are compared to the established elbow and gap statistic algorithms using simulated datasets and the Biobase Gene ExpressionSet. We also explore a data mixing procedure inspired by cross-validation techniques. We find that the overall performance of the maximum difference method is comparable to or greater than that of the gap statistic in multi-cluster scenarios, and that it achieves this performance at a fraction of the computational cost. This method also responds well to our mixing procedure, which opens the door to future research. We conclude that both the mode and maximum difference methods warrant further study related to their mixing and cross-validation potential. We particularly recommend the use of the maximum difference method in multi-cluster scenarios given its accuracy and execution times, and present it as an alternative to existing algorithms.
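
    One simple way to automate the choice of the number of clusters is to cut a dendrogram at the largest jump between successive merge heights. The sketch below shows that generic largest-gap heuristic only; it is not claimed to reproduce the authors' mode or maximum difference methods, and the function name `estimate_k` and the merge heights are assumptions for the example.

```python
def estimate_k(merge_heights):
    """Estimate the number of clusters from dendrogram merge heights.

    merge_heights holds the n-1 merge distances of a dendrogram built on
    n observations, in the order in which the merges happened.
    """
    n = len(merge_heights) + 1
    if len(merge_heights) < 2:
        return 1  # too few merges to compare gaps
    gaps = [merge_heights[i + 1] - merge_heights[i]
            for i in range(len(merge_heights) - 1)]
    # Index of the merge just before the largest jump in height.
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # Keeping the first i+1 merges leaves n - (i+1) clusters.
    return n - (i + 1)

# Hypothetical merge heights for 7 observations: four cheap merges,
# then a large jump before the remaining groups are joined.
heights = [0.1, 0.2, 0.2, 0.3, 4.0, 4.5]
print(estimate_k(heights))
```

    The heuristic needs only the merge heights that any hierarchical clustering already produces, so it adds little cost, but it can be misled when clusters differ greatly in scale.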

  11. Hierarchical Information-based Clustering for Connectivity-based Cortex Parcellation

    Directory of Open Access Journals (Sweden)

    Nico Stephan Gorbach

    2011-09-01

    Full Text Available One of the most promising avenues for compiling connectivity data originates from the notion that individual brain regions maintain individual connectivity profiles; the functional repertoire of a cortical area ("the functional fingerprint") is closely related to its anatomical connections ("the connectional fingerprint") and, hence, a segregated cortical area may be characterized by a highly coherent connectivity pattern. Diffusion tractography can be used to identify borders between such cortical areas. Each cortical area is defined based upon a unique probabilistic tractogram and such a tractogram is representative of a group of tractograms, thereby forming the cortical area. The underlying methodology is called connectivity-based cortex parcellation, and essentially requires clustering or grouping of similar diffusion tractograms. Despite the relative success of this technique in producing anatomically sensible results, existing clustering techniques in the context of connectivity-based parcellation typically depend on several nontrivial assumptions. In this paper, we present an unsupervised hierarchical information-based framework for clustering probabilistic tractograms that avoids many drawbacks of previous methods. Cortex parcellation of the inferior frontal gyrus together with the precentral gyrus demonstrates a proof of concept of the proposed method: the automatic parcellation reveals cortical subunits consistent with cytoarchitectonic maps and previous studies including connectivity-based parcellation. Further insight into the hierarchically modular architecture of cortical subunits is given by revealing coarser cortical structures that differentiate between primary as well as pre-motoric areas and those associated with pre-frontal areas.

  12. Dynamic stopping criteria of turbo codes for clustered set partitioning in hierarchical trees encoded image transmission

    Science.gov (United States)

    Fang, Jiunn-Tsair; Wu, Cheng-Shong

    2011-10-01

    Turbo codes adopt iterative decoding to increase their error-correction ability. However, the iterative method increases the decoding delay and power consumption. An effective approach is to decrease the number of iterations while tolerating slight performance degradation. We apply clustered set partitioning in hierarchical trees for image coding. Unlike other early-stop criteria, we use the bit-error sensitivities from the image data, so the stop criterion is directly determined by the importance of the image data. Simulation results show that our scheme can eliminate more iterations with less degradation in peak signal-to-noise ratio or structural similarity performance.

  13. Climate Regionalization through Hierarchical Clustering: Options and Recommendations for Africa

    Science.gov (United States)

    Badr, H. S.; Zaitchik, B. F.; Dezfuli, A. K.

    2014-12-01

    Climate regionalization is an important but often under-emphasized step in studies of climate variability and predictions. While most investigations of regional climate or statistical/dynamical predictions do make at least an implicit attempt to focus on a study region or sub-regions that are climatically coherent in some respect, rigorous climate regionalization--in which the study area is divided on the basis of the most relevant climate metrics and at a resolution most appropriate to the data and the scientific question--has the potential to enhance the precision and explanatory power of climate studies in many cases. This is particularly true for climatically complex regions such as the Greater Horn of Africa (GHA) and Equatorial West Africa. Here we present an improved clustering method and a flexible, open-source software tool (R package "HiClimR") designed specifically for climate regionalization. As a demonstration, we apply HiClimR to regionalize the GHA on the basis of interannual precipitation variability in each calendar month and for three-month running seasons. Different clustering methods are tested to show the behavior of each method and provide recommendations for specific problems. This would underscore the applicability of our work to a wide range of climate issues, and enable researchers to easily and quickly learn how to apply our tools to their own problems. Both the proposed methodology and the R package can be easily used for a broad range of climate applications.

  14. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis.

    Science.gov (United States)

    Fernández-Arjona, María Del Mar; Grondona, Jesús M; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed further classification of microglia into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed us to relate specific morphotypes to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points: Microglia undergo a quantifiable...

  15. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Science.gov (United States)

    Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Messa, M.; Pellerin, A.; Ryon, J. E.; Smith, L. J.; Shabani, F.; Thilker, D.; Ubeda, L.

    2017-05-01

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3-15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ~40-60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct from, and more strongly correlated than, that of compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  16. The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Adamo, A.; Messa, M. [Dept. of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Kim, H. [Gemini Observatory, La Serena (Chile); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Hts., NY (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Dale, D. A. [Dept. of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Durham University, Durham (United Kingdom); Grebel, E. K.; Shabani, F. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120 Heidelberg (Germany); Johnson, K. E. [Dept. of Astronomy, University of Virginia, Charlottesville, VA (United States); Kahre, L. [Dept. of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kennicutt, R. C. [Institute of Astronomy, University of Cambridge, Cambridge (United Kingdom); Pellerin, A. [Dept. of Physics and Astronomy, State University of New York at Geneseo, Geneseo NY (United States); Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Smith, L. J. [European Space Agency/Space Telescope Science Institute, Baltimore, MD (United States); Thilker, D., E-mail: kgrasha@astro.umass.edu [Dept. of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD (United States)

    2017-05-10

    We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct from, and more strongly correlated than, that of compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.

  17. Multidimensional Dynamic Programming Algorithm for N-Level Batching with Hierarchical Clustering Structure

    Directory of Open Access Journals (Sweden)

    Seung-Kil Lim

    2017-01-01

    Full Text Available This study focuses on the N-level batching problem with a hierarchical clustering structure. Clustering is the task of grouping a set of item types in such a way that item types in the same cluster are more similar (in some sense or another) to each other than to those in other clusters. In a hierarchical clustering structure, more and more different item types are clustered together as the level of the hierarchy increases. N-level batching is the process by which items of different types are grouped into several batches passed from level 1 to level N sequentially for a given hierarchical clustering structure, such that batches in each level satisfy the maximum and minimum batch size requirements of the level. We consider two types of processing costs of the batches: unit processing cost and batch processing cost. We formulate the N-level batching problem with a hierarchical clustering structure as a nonlinear integer programming model with the objective of minimizing the total processing cost. To solve the problem optimally, we propose a multidimensional dynamic programming algorithm and illustrate it with an example.

  18. Non-perturbative Methods For Hierarchical Models

    CERN Document Server

    Oktay, M B

    2001-01-01

    The goal of this thesis is to provide a practical method to calculate, in scalar field theory, accurate numerical values of the renormalized quantities which could be used to test any kind of approximate calculation. We use finite truncations of the Fourier transform of the recursion formula for Dyson's hierarchical model in the symmetric and broken phases to perform high precision calculations of the Green's functions at zero momentum. We use the well-known correspondence between statistical mechanics and field theory in which the large cut-off limit is obtained by letting β reach a critical value β_c. We show that the round-off errors on the magnetic susceptibility grow like (β_c − β)^(−1) near criticality. We show that the systematic errors (finite truncation and volume) can be controlled with an exponential precision and reduced to a level lower than numerical errors. We probe the numerical errors made in Renormalization Group (RG) calculations by varyin...

  19. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  20. Energy Efficient Hierarchical Clustering Approaches in Wireless Sensor Networks: A Survey

    Directory of Open Access Journals (Sweden)

    Bilal Jan

    2017-01-01

    Full Text Available Wireless sensor networks (WSNs) are one of the significant technologies due to their diverse applications such as health care monitoring, smart phones, military, disaster management, and other surveillance systems. Sensor nodes are usually deployed in large numbers and work independently in unattended harsh environments. Due to constrained resources, typically scarce battery power, these wireless nodes are grouped into clusters for energy-efficient communication. In clustering, hierarchical schemes have attracted great interest for minimizing energy consumption. Hierarchical schemes are generally categorized as cluster-based and grid-based approaches. In cluster-based approaches, nodes are grouped into clusters, where a resourceful sensor node is nominated as a cluster head (CH), while in grid-based approaches the network is divided into confined virtual grids, a task usually performed by the base station. This paper highlights and discusses the design challenges for cluster-based schemes, the important cluster formation parameters, and the classification of hierarchical clustering protocols. Moreover, existing cluster-based and grid-based techniques are evaluated by considering certain parameters to help users select an appropriate technique. Furthermore, a detailed summary of these protocols is presented with their advantages, disadvantages, and applicability in particular cases.

  1. Hierarchical Control for Multiple DC-Microgrids Clusters

    DEFF Research Database (Denmark)

    Shafiee, Qobad; Dragicevic, Tomislav; Vasquez, Juan Carlos

    2014-01-01

    DC microgrids (MGs) have gained research interest during the recent years because of many potential advantages as compared to the ac system. To ensure reliable operation of a low-voltage dc MG as well as its intelligent operation with the other DC MGs, a hierarchical control is proposed in this p...

  2. The method of parallel-hierarchical transformation for rapid recognition of dynamic images using GPGPU technology

    Science.gov (United States)

    Timchenko, Leonid; Yarovyi, Andrii; Kokriatskaya, Nataliya; Nakonechna, Svitlana; Abramenko, Ludmila; Ławicki, Tomasz; Popiel, Piotr; Yesmakhanova, Laura

    2016-09-01

    The paper presents a method of parallel-hierarchical transformations for rapid recognition of dynamic images using GPU technology. Direct parallel-hierarchical transformations are based on a cluster CPU- and GPU-oriented hardware platform. Mathematical models of training of the parallel-hierarchical (PH) network for the transformation are developed, as well as a training method of the PH network for recognition of dynamic images. This research is most topical for problems of organizing high-performance computations on very large arrays of information designed to implement multi-stage sensing and processing as well as compaction and recognition of data in informational structures and computer devices. The method has such advantages as high performance through the use of recent advances in parallelization, the ability to work with images of very large dimension, ease of scaling when the number of nodes in the cluster changes, and automatic scanning of the local network to detect compute nodes.

  3. Classification of bioinformatics workflows using weighted versions of partitioning and hierarchical clustering algorithms.

    Science.gov (United States)

    Lord, Etienne; Diallo, Abdoulaye Baniré; Makarenkov, Vladimir

    2015-03-03

    Workflows, or computational pipelines, consisting of collections of multiple linked tasks are becoming more and more popular in many scientific fields, including computational biology. For example, simulation studies, which are now a must for statistical validation of new bioinformatics methods and software, are frequently carried out using the available workflow platforms. Workflows are typically organized to minimize the total execution time and to maximize the efficiency of the included operations. Clustering algorithms can be applied either for regrouping similar workflows for their simultaneous execution on a server, or for dispatching some lengthy workflows to different servers, or for classifying the available workflows with a view to performing a specific keyword search. In this study, we consider four different workflow encoding and clustering schemes which are representative of bioinformatics projects. Some of them allow for clustering workflows with similar topological features, while the others regroup workflows according to their specific attributes (e.g. associated keywords) or execution time. The four types of workflow encoding examined in this study were compared using the weighted versions of k-means and k-medoids partitioning algorithms. The Calinski-Harabasz, Silhouette and logSS clustering indices were considered. Hierarchical classification methods, including the UPGMA, Neighbor Joining, Fitch and Kitsch algorithms, were also applied to classify bioinformatics workflows. Moreover, a novel pairwise measure of clustering solution stability, which can be computed in situations when a series of independent program runs is carried out, was introduced. Our findings based on the analysis of 220 real-life bioinformatics workflows suggest that the weighted clustering models based on keyword information or task execution times provide the most appropriate clustering solutions. Using datasets generated by the Armadillo and Taverna scientific workflow...
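
    Of the clustering indices named in this record, the Silhouette index is simple enough to sketch. The block below is a minimal, unweighted version under stated assumptions: workflows are already encoded as numeric vectors, Euclidean distance is used, the points are not all identical, and the function name `silhouette` is hypothetical. It is not the weighted variant the authors evaluate.

```python
from math import dist  # Python 3.8+


def silhouette(points, labels):
    """Mean silhouette width of a labelling with at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (dist(p, p) is 0, so summing over the whole cluster is harmless).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)
```

    Values near 1 indicate compact, well-separated clusters; values below 0 indicate that many points sit closer to another cluster than to their own.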

  4. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. The study included ninety-five eyes of 95 OAG patients who received medical treatment and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were formed by the hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 dB baseline MD, 77.54% ± 12.98% baseline VFI, -0.72 ± 0.55 dB per year MD slope, -2.22% ± 1.89% per year VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used significantly more antiglaucoma eye drops than cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was higher than in the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  5. Hierarchical and non-hierarchical clustering and artificial neural networks for the characterization of groups of feedlot-finished male cattle

    Directory of Open Access Journals (Sweden)

    Wignez Henrique

    2015-03-01

    Full Text Available The individual experimental results of 1,393 feedlot-finished cattle of different genetic groups obtained at different research institutions were collected. Exploratory multivariate hierarchical analysis was applied, which permitted the division of cattle into seven groups containing animals with similar performance patterns. The following variables were studied: weight of the animal at feedlot entry and exit, concentrate percentage, time spent in the feedlot, dry matter intake, weight gain, and feed efficiency. The data were submitted to non-hierarchical k-means cluster analysis, which revealed that all traits should be considered. In addition to the variables used in the previous analysis, the following variables were included: dietary nutrient content, crude protein and total digestible nutrient intake, hot carcass weight and yield, fat coverage, and loin eye area. Using all of these data, structures of 3 to 14 groups were formed, which were analyzed using Kohonen self-organizing maps. Specimens of the Nellore breed, either intact or castrated, were spread across groups in the hierarchical and non-hierarchical analyses, as well as in the analysis of artificial neural networks. Nellore animals therefore cannot be characterized as having a single behavior when finished in feedlots, since they participate in groups formed with animals of other Zebu breeds (Gyr, Guzerá) and with animals of European breeds (Hereford, Aberdeen Angus, Caracu) that exhibit different performance potentials.

  6. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Science.gov (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard

    2015-01-01

    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected, especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean Distance as the similarity measure to determine the clusters. Contingency tables, χ² tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, pART (highest proportions for clusters two and three, p=0.012); global distress (F=26.8, p<0.001), physical distress (F=36.3, p<0.001) and psychological distress subscale (F=21.8, p<0.001) (all subscales worst for cluster one, best for cluster four). The greatest burden is associated with cluster one, and should be prioritised in clinical management. Further symptom cluster research in people living with HIV with longitudinally collected symptom data to test cluster
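
    For readers unfamiliar with Ward's method, the merge criterion can be written out explicitly. This is the standard textbook form rather than anything stated in the abstract, and the symbols (clusters A and B, centroids \mu_A and \mu_B) are introduced here for illustration only. At each step the pair of clusters with the smallest cost is merged, where the cost is the resulting increase in the within-cluster sum of squares.

```latex
% Ward's merge cost for clusters A and B with centroids \mu_A and \mu_B
\Delta(A, B) = \frac{|A|\,|B|}{|A| + |B|}\,\lVert \mu_A - \mu_B \rVert^{2}

% For two single observations x and y this reduces to
\Delta(\{x\}, \{y\}) = \tfrac{1}{2}\,\lVert x - y \rVert^{2}
```

    The second line shows why squared Euclidean distance is the natural input for this method: at the first step, the merge cost is exactly half the squared distance between two observations.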

  7. Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture

    Science.gov (United States)

    Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA

    2009-12-22

    Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.

  8. Hierarchical cluster analysis of ignitable liquids based on the total ion spectrum.

    Science.gov (United States)

    Waddell, Erin E; Frisch-Daiello, Jessica L; Williams, Mary R; Sigman, Michael E

    2014-09-01

    Gas chromatography-mass spectrometry (GC-MS) data of ignitable liquids in the Ignitable Liquids Reference Collection (ILRC) database were processed to obtain 445 total ion spectra (TIS), that is, average mass spectra across the chromatographic profile. Hierarchical cluster analysis, an unsupervised learning technique, was applied to find features useful for classification of ignitable liquids. A combination of the correlation distance and average linkage was utilized for grouping ignitable liquids with similar chemical composition. This study evaluated whether hierarchical cluster analysis of the TIS would cluster together ignitable liquids of the same ASTM class assignment, as designated in the ILRC database. The ignitable liquids clustered based on their chemical composition, and the ignitable liquids within each cluster were predominantly from one ASTM E1618-11 class. These results reinforce use of the TIS as a tool to aid in forensic fire debris analysis. © 2014 American Academy of Forensic Sciences.
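
    The combination of correlation distance and average linkage described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: spectra are assumed to be equal-length lists of intensities with non-zero variance, correlation distance is taken as one minus the Pearson correlation, and the names `correlation_distance` and `average_linkage` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def correlation_distance(x, y):
    """1 - Pearson correlation; assumes neither vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(spectra, n_clusters):
    """Agglomerate until n_clusters remain; return lists of spectrum indices."""
    clusters = [[i] for i in range(len(spectra))]

    def avg(ci, cj):
        # Average linkage: mean pairwise distance between the two clusters.
        total = sum(correlation_distance(spectra[a], spectra[b])
                    for a in ci for b in cj)
        return total / (len(ci) * len(cj))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters
```

    Because correlation distance ignores overall scale, two spectra with the same relative peak pattern but different total intensity end up close together, which suits grouping by chemical composition.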

  9. Hierarchical Clustering for Development Equality Analysis: Indonesian Data of Educational Support (2011 - 2014)

    Science.gov (United States)

    Wijayanto, Feri

    2017-03-01

    Indonesia, which comprises more than 30 provinces under a decentralized system, needs to assess its development equality, because inequality in development performance brings disparity among provinces. At present, development is monitored by comparing indicator values among provinces. There are no tools for identifying general development performance, let alone equality. This research examines the possibility of using hierarchical clustering to observe this equality, especially in educational support development. As a result, a graph plotted from the dissimilarity values, a by-product of hierarchical clustering, can describe the trend of the equality.

  10. Semi-supervised clustering methods

    Science.gov (United States)

    Bair, Eric

    2013-01-01

    Cluster analysis methods seek to partition a data set into homogeneous subgroups. Cluster analysis is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable, nor is anything known about the relationship between the observations in the data set. In many situations, however, information about the clusters is available in addition to the values of the features. For example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. In other cases, one may wish to identify clusters that are associated with a particular outcome variable. This review describes several clustering algorithms (known as “semi-supervised clustering” methods) that can be applied in these situations. The majority of these methods are modifications of the popular k-means clustering method, and several of them will be described in detail. A brief description of some other semi-supervised clustering algorithms is also provided. PMID:24729830
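
    One simple way to modify k-means when some cluster labels are known is to initialise each centroid from the labelled observations and then run ordinary k-means iterations. The sketch below illustrates that seeding idea only; it is not taken from the review, and the input format (a dict mapping each label to its labelled points) and the function names are assumptions.

```python
from math import dist  # Python 3.8+


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def seeded_kmeans(points, seeds, iters=20):
    """k-means whose initial centroids come from labelled seed points.

    seeds: dict mapping a label to a non-empty list of points known to
    carry that label (hypothetical input format).
    Returns (labels for points, final centroids).
    """
    order = sorted(seeds)
    centroids = [mean(seeds[lab]) for lab in order]
    assign = []
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: recompute centroids, keeping empty clusters in place.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == k]
            updated.append(mean(members) if members else centroids[k])
        if updated == centroids:
            break
        centroids = updated
    return [order[a] for a in assign], centroids
```

    Seeding fixes the number of clusters and removes the dependence on random initial centres; in this version the seed labels guide the start but are not enforced afterwards.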

  11. DATA CLASSIFICATION WITH NEURAL CLASSIFIER USING RADIAL BASIS FUNCTION WITH DATA REDUCTION USING HIERARCHICAL CLUSTERING

    Directory of Open Access Journals (Sweden)

    M. Safish Mary

    2012-04-01

    Full Text Available Classifying large amounts of data is time-consuming but crucial for analysis and decision making. Radial basis function (RBF) networks are widely used for classification and regression analysis. In this paper, we study the performance of RBF neural networks in classifying car sales by demand, using a kernel density estimation algorithm whose classification accuracy is comparable to that of support vector machines. We propose a new instance-based data selection method in which redundant instances are removed with the help of a threshold, which reduces the time complexity while improving classification accuracy. Instance-based selection of the data set reduces the number of clusters formed and thereby the number of centers considered when building the RBF network. Training efficiency is further improved by applying a hierarchical clustering technique to reduce the number of clusters formed at every step. The paper explains the algorithm used for classification and for conditioning the data, as well as the complexities involved in classifying sales data for analysis and decision making.
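
    The abstract does not give the redundancy criterion, so the following is only one plausible reading of threshold-based instance selection: an instance is dropped when it lies within a distance threshold of an instance that has already been kept. The function name `drop_redundant` and the use of Euclidean distance are assumptions.

```python
import math


def drop_redundant(instances, threshold):
    """Keep an instance only if it is farther than `threshold`
    (Euclidean distance) from every instance kept so far."""
    kept = []
    for x in instances:
        if all(math.dist(x, k) > threshold for k in kept):
            kept.append(x)
    return kept


data = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (1.0, 1.02), (3.0, 3.0)]
reduced = drop_redundant(data, threshold=0.1)
```

    Fewer retained instances mean fewer candidate centers for the RBF network, which is the effect the abstract describes.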

  12. Modeling Hierarchically Clustered Longitudinal Survival Processes with Applications to Child Mortality and Maternal Health

    Directory of Open Access Journals (Sweden)

    Kuate-Defo, Bathélémy

    2001-01-01

    Full Text Available English: This paper merges two parallel developments since the 1970s of new statistical tools for data analysis: statistical methods known as hazard models that are used for analyzing event-duration data and statistical methods for analyzing hierarchically clustered data known as multilevel models. These developments have rarely been integrated in research practice and the formalization and estimation of models for hierarchically clustered survival data remain largely uncharted. I attempt to fill some of this gap and demonstrate the merits of formulating and estimating multilevel hazard models with longitudinal data. French (translated): This study integrates two leading statistical approaches to quantitative data analysis developed since the 1970s: statistical methods for analyzing event-history data, or survival methods, and statistical methods for analyzing hierarchical data, or multilevel methods. These two approaches have rarely been combined in research practice; consequently, the formulation and estimation of models suited to longitudinal and hierarchically nested data remain essentially unexplored territory. I try to fill this gap and use real public health data to demonstrate the merits of, and contexts for, formulating and estimating multilevel and multistate models for event-history and longitudinal data.

  13. ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time.

    Science.gov (United States)

    Cai, Yunpeng; Sun, Yijun

    2011-08-01

    Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree constructed using a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits a quasilinear time and space complexity comparable to greedy heuristic clustering algorithms, while achieving a similar accuracy to the standard hierarchical clustering algorithm.

  14. Hierarchical Clustering and Visualization of Aggregate Cyber Data

    Energy Technology Data Exchange (ETDEWEB)

    Patton, Robert M [ORNL; Beaver, Justin M [ORNL; Steed, Chad A [ORNL; Potok, Thomas E [ORNL; Treadwell, Jim N [ORNL

    2011-01-01

    Most commercial intrusion detection systems (IDS) can produce a very high volume of alerts, and are typically plagued by a high false positive rate. The approach described here uses Splunk to aggregate IDS alerts. The aggregated IDS alerts are retrieved from Splunk programmatically and are then clustered using text analysis and visualized using a sunburst diagram to provide an additional understanding of the data. Obtaining the equivalent of what the cluster analysis and visualization provide would require numerous detailed queries using Splunk and considerable manual effort.

  15. Hierarchical clustering of HPV genotype patterns in the ASCUS-LSIL triage study

    Science.gov (United States)

    Wentzensen, Nicolas; Wilson, Lauren E.; Wheeler, Cosette M.; Carreon, Joseph D.; Gravitt, Patti E.; Schiffman, Mark; Castle, Philip E.

    2010-01-01

    Anogenital cancers are associated with about 13 carcinogenic HPV types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common, which complicates the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the ASCUS-LSIL triage study using unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered using complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: Cluster 1 included all CIN3 histology with abnormal cytology; Cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion (HSIL) cytology; Cluster 3 included older women with normal or low grade histology/cytology and low viral load; Cluster 4 included younger women with low grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: Group 1 included only HPV16; Group 2 included nine carcinogenic types plus non-carcinogenic HPV53 and HPV66; and Group 3 included non-carcinogenic types plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and HSIL. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to studying multiple genotype infections in cervical disease using unsupervised hierarchical clustering can address complex genotype distributions on a population level. PMID:20959485
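
    Complete linkage, the criterion named above, merges at each step the two clusters whose farthest pair of members is closest. The sketch below illustrates that rule on toy numeric data; it is not the study's pipeline, and the function name `complete_linkage`, the Euclidean distance, and the example points are assumptions.

```python
from itertools import combinations


def complete_linkage(points, num_clusters):
    """Agglomerative clustering with complete linkage.

    points: list of equal-length tuples of numbers.
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > num_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the farthest member pair.
            d = max(dist(points[a], points[b])
                    for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)


toy = [(0.0,), (0.1,), (5.0,), (5.2,), (10.0,)]
three = complete_linkage(toy, 3)
two = complete_linkage(toy, 2)
```

    Because the farthest pair decides each merge, complete linkage tends to produce compact clusters, at the cost of quadratic pairwise comparisons in this naive form.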

  16. [Study of the clinical phenotype of symptomatic chronic airways disease by hierarchical cluster analysis and two-step cluster analyses].

    Science.gov (United States)

    Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M

    2016-09-01

    To study the distinct clinical phenotype of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow (PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). Subjects' clinical phenotypes were identified by hierarchical cluster analysis and two-step cluster analysis. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma. Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow curve (MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar (DLCO)/(VA)% pred, residual volume (RV)% pred, total serum IgE level, smoking history (pack-years), St.George's respiratory questionnaire

  17. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    DEFF Research Database (Denmark)

    Suhaibah, A.; Uznir, U.; Antón Castro, Francesc/François

    2016-01-01

    , with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our...

  18. Prediction of in vitro and in vivo oestrogen receptor activity using hierarchical clustering

    Science.gov (United States)

    In this study, hierarchical clustering classification models were developed to predict in vitro and in vivo oestrogen receptor (ER) activity. Classification models were developed for binding, agonist, and antagonist in vitro ER activity and for mouse in vivo uterotrophic ER bindi...

  19. Hierarchical Spectral Consensus Clustering for Group Analysis of Functional Brain Networks.

    Science.gov (United States)

    Ozdemir, Alp; Bolaños, Marcos; Bernat, Edward; Aviyente, Selin

    2015-09-01

    A central question in cognitive neuroscience is how cognitive functions depend on the integration of specialized widely distributed brain regions. In recent years, graph theoretical methods have been used to characterize the structure of the brain functional connectivity. In order to understand the organization of functional connectivity networks, it is important to determine the community structure underlying these complex networks. Moreover, the study of brain functional networks is confounded by the fact that most neurophysiological studies consist of data collected from multiple subjects; thus, it is important to identify communities representative of all subjects. Typically, this problem is addressed by averaging the data across subjects, which omits the variability across subjects, or by using voting methods, which require a priori knowledge of cluster labels. In this paper, we propose a hierarchical consensus spectral clustering approach to address these problems. Furthermore, new information-theoretic criteria are introduced for selecting the optimal community structure. The proposed framework is applied to electroencephalogram data collected during a study of error-related negativity to better understand the community structure of functional networks involved in the cognitive control.

  20. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    Science.gov (United States)

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for a standard biological process measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor a higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using a proximity measure based on an improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.

  1. Carbon composition with hierarchical porosity, and methods of preparation

    Science.gov (United States)

    Mayes, Richard T; Dai, Sheng

    2014-10-21

    A method for fabricating a porous carbon material possessing a hierarchical porosity, the method comprising subjecting a precursor composition to a curing step followed by a carbonization step, the precursor composition comprising: (i) a templating component comprised of a block copolymer, (ii) a phenolic component, (iii) a dione component in which carbonyl groups are adjacent, and (iv) an acidic component, wherein said carbonization step comprises heating the precursor composition at a carbonizing temperature for sufficient time to convert the precursor composition to a carbon material possessing a hierarchical porosity comprised of mesopores and macropores. Also described are the resulting hierarchical porous carbon material, a capacitive deionization device in which the porous carbon material is incorporated, as well as methods for desalinating water by use of said capacitive deionization device.

  2. Hierarchical modelling for the environmental sciences statistical methods and applications

    CERN Document Server

    Clark, James S

    2006-01-01

    New statistical tools are changing the way in which scientists analyze and interpret data and models. Hierarchical Bayes and Markov Chain Monte Carlo methods for analysis provide a consistent framework for inference and prediction where information is heterogeneous and uncertain, processes are complicated, and responses depend on scale. Nowhere are these methods more promising than in the environmental sciences.

  3. Principal component analysis vs. self-organizing maps combined with hierarchical clustering for pattern recognition in volcano seismic spectra

    Science.gov (United States)

    Unglert, K.; Radić, V.; Jellinek, A. M.

    2016-06-01

    Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Kīlauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.

  4. An Intrusion Detection System Based on Multi-Level Clustering for Hierarchical Wireless Sensor Networks

    Science.gov (United States)

    Butun, Ismail; Ra, In-Ho; Sankar, Ravi

    2015-01-01

    In this work, an intrusion detection system (IDS) framework based on multi-level clustering for hierarchical wireless sensor networks is proposed. The framework employs two types of intrusion detection approaches: (1) “downward-IDS (D-IDS)” to detect the abnormal behavior (intrusion) of the subordinate (member) nodes; and (2) “upward-IDS (U-IDS)” to detect the abnormal behavior of the cluster heads. By using analytical calculations, the optimum parameters for the D-IDS (number of maximum hops) and U-IDS (monitoring group size) of the framework are evaluated and presented. PMID:26593915

  5. An Intrusion Detection System Based on Multi-Level Clustering for Hierarchical Wireless Sensor Networks.

    Science.gov (United States)

    Butun, Ismail; Ra, In-Ho; Sankar, Ravi

    2015-11-17

    In this work, an intrusion detection system (IDS) framework based on multi-level clustering for hierarchical wireless sensor networks is proposed. The framework employs two types of intrusion detection approaches: (1) "downward-IDS (D-IDS)" to detect the abnormal behavior (intrusion) of the subordinate (member) nodes; and (2) "upward-IDS (U-IDS)" to detect the abnormal behavior of the cluster heads. By using analytical calculations, the optimum parameters for the D-IDS (number of maximum hops) and U-IDS (monitoring group size) of the framework are evaluated and presented.

  6. An Intrusion Detection System Based on Multi-Level Clustering for Hierarchical Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Ismail Butun

    2015-11-01

    Full Text Available In this work, an intrusion detection system (IDS) framework based on multi-level clustering for hierarchical wireless sensor networks is proposed. The framework employs two types of intrusion detection approaches: (1) “downward-IDS (D-IDS)” to detect the abnormal behavior (intrusion) of the subordinate (member) nodes; and (2) “upward-IDS (U-IDS)” to detect the abnormal behavior of the cluster heads. By using analytical calculations, the optimum parameters for the D-IDS (number of maximum hops) and U-IDS (monitoring group size) of the framework are evaluated and presented.

  7. An Energy Efficient Cooperative Hierarchical MIMO Clustering Scheme for Wireless Sensor Networks

    Science.gov (United States)

    Nasim, Mehwish; Qaisar, Saad; Lee, Sungyoung

    2012-01-01

    In this work, we present an energy efficient hierarchical cooperative clustering scheme for wireless sensor networks. Communication cost is a crucial factor in depleting the energy of sensor nodes. In the proposed scheme, nodes cooperate to form clusters at each level of network hierarchy ensuring maximal coverage and minimal energy expenditure with relatively uniform distribution of load within the network. Performance is enhanced by cooperative multiple-input multiple-output (MIMO) communication ensuring energy efficiency for WSN deployments over large geographical areas. We test our scheme using TOSSIM and compare the proposed scheme with cooperative multiple-input multiple-output (CMIMO) clustering scheme and traditional multihop Single-Input-Single-Output (SISO) routing approach. Performance is evaluated on the basis of number of clusters, number of hops, energy consumption and network lifetime. Experimental results show significant energy conservation and increase in network lifetime as compared to existing schemes. PMID:22368459

  8. An energy efficient cooperative hierarchical MIMO clustering scheme for wireless sensor networks.

    Science.gov (United States)

    Nasim, Mehwish; Qaisar, Saad; Lee, Sungyoung

    2012-01-01

    In this work, we present an energy efficient hierarchical cooperative clustering scheme for wireless sensor networks. Communication cost is a crucial factor in depleting the energy of sensor nodes. In the proposed scheme, nodes cooperate to form clusters at each level of network hierarchy ensuring maximal coverage and minimal energy expenditure with relatively uniform distribution of load within the network. Performance is enhanced by cooperative multiple-input multiple-output (MIMO) communication ensuring energy efficiency for WSN deployments over large geographical areas. We test our scheme using TOSSIM and compare the proposed scheme with cooperative multiple-input multiple-output (CMIMO) clustering scheme and traditional multihop Single-Input-Single-Output (SISO) routing approach. Performance is evaluated on the basis of number of clusters, number of hops, energy consumption and network lifetime. Experimental results show significant energy conservation and increase in network lifetime as compared to existing schemes.

  9. An Energy Efficient Cooperative Hierarchical MIMO Clustering Scheme for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2011-12-01

    Full Text Available In this work, we present an energy efficient hierarchical cooperative clustering scheme for wireless sensor networks. Communication cost is a crucial factor in depleting the energy of sensor nodes. In the proposed scheme, nodes cooperate to form clusters at each level of network hierarchy ensuring maximal coverage and minimal energy expenditure with relatively uniform distribution of load within the network. Performance is enhanced by cooperative multiple-input multiple-output (MIMO) communication ensuring energy efficiency for WSN deployments over large geographical areas. We test our scheme using TOSSIM and compare the proposed scheme with cooperative multiple-input multiple-output (CMIMO) clustering scheme and traditional multihop Single-Input-Single-Output (SISO) routing approach. Performance is evaluated on the basis of number of clusters, number of hops, energy consumption and network lifetime. Experimental results show significant energy conservation and increase in network lifetime as compared to existing schemes.

  10. A Multidimensional and Multimembership Clustering Method for Social Networks and Its Application in Customer Relationship Management

    Directory of Open Access Journals (Sweden)

    Peixin Zhao

    2013-01-01

    Full Text Available Community detection in social networks plays an important role in cluster analysis. Many traditional techniques for one-dimensional problems have proven inadequate for high-dimensional or mixed-type datasets because of data sparseness and attribute redundancy. In this paper we propose a graph-based clustering method for multidimensional datasets. This novel method has two distinguishing features: a nonbinary hierarchical tree and multimembership clusters. The nonbinary hierarchical tree clearly highlights meaningful clusters, while the multimembership feature may provide more useful service strategies. Experimental results on customer relationship management confirm the effectiveness of the new method.

  11. Bayesian hierarchical models for cost-effectiveness analyses that use data from cluster randomized trials.

    Science.gov (United States)

    Grieve, Richard; Nixon, Richard; Thompson, Simon G

    2010-01-01

    Cost-effectiveness analyses (CEA) may be undertaken alongside cluster randomized trials (CRTs) where randomization is at the level of the cluster (for example, the hospital or primary care provider) rather than the individual. Costs (and outcomes) within clusters may be correlated so that the assumption made by standard bivariate regression models, that observations are independent, is incorrect. This study develops a flexible modeling framework to acknowledge the clustering in CEA that use CRTs. The authors extend previous Bayesian bivariate models for CEA of multicenter trials to recognize the specific form of clustering in CRTs. They develop new Bayesian hierarchical models (BHMs) that allow mean costs and outcomes, and also variances, to differ across clusters. They illustrate how each model can be applied using data from a large (1732 cases, 70 primary care providers) CRT evaluating alternative interventions for reducing postnatal depression. The analyses compare cost-effectiveness estimates from BHMs with standard bivariate regression models that ignore the data hierarchy. The BHMs show high levels of cost heterogeneity across clusters (intracluster correlation coefficient, 0.17). Compared with standard regression models, the BHMs yield substantially increased uncertainty surrounding the cost-effectiveness estimates, and altered point estimates. The authors conclude that ignoring clustering can lead to incorrect inferences. The BHMs that they present offer a flexible modeling framework that can be applied more generally to CEA that use CRTs.
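
    The intracluster correlation coefficient quoted above can be read as the share of total cost variance that lies between clusters. A standard definition for a random-intercept model is given below; the symbols $\sigma_u^2$ (between-cluster variance) and $\sigma_e^2$ (within-cluster variance) are assumed notation, not taken from the abstract.

```latex
\[
\rho_{\mathrm{ICC}} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\]
```

    Under this definition, the reported value of 0.17 would correspond to about 17% of the variance in costs arising between clusters rather than between individuals within a cluster.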

  12. CA-tree: a hierarchical structure for efficient and scalable coassociation-based cluster ensembles.

    Science.gov (United States)

    Wang, Tsaipei

    2011-06-01

    Cluster ensembles have attracted a lot of research interest in recent years, and their applications continue to expand. Among the various algorithms for cluster ensembles, those based on coassociation matrices are probably the most studied and used because coassociation matrices are easy to understand and implement. However, the main limitation of coassociation matrices as the data structure for combining multiple clusterings is their complexity, which is at least quadratic in the number of patterns N. In this paper, we propose CA-tree, a dendrogram-like hierarchical data structure, to facilitate efficient and scalable cluster ensembles for coassociation-matrix-based algorithms. All the properties of the CA-tree are derived from base cluster labels and do not require access to the original data features. We then apply a threshold to the CA-tree to obtain a set of nodes, which are then used in place of the original patterns for ensemble-clustering algorithms. The experiments demonstrate that the complexity of coassociation-based cluster ensembles can be reduced to close to linear in N with minimal loss of clustering accuracy.
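
    To make the quadratic cost concrete, the sketch below builds a plain coassociation matrix, the structure that the CA-tree is meant to replace; it does not implement the CA-tree itself. Entry (i, j) is the fraction of base clusterings that place patterns i and j in the same cluster. The function name `coassociation` and the example labelings are assumptions.

```python
def coassociation(labelings):
    """Build an N x N coassociation matrix from base clusterings.

    labelings: list of label lists, one per base clustering, all of length N.
    Entry [i][j] is the fraction of clusterings in which i and j share a label.
    """
    n = len(labelings[0])
    m = len(labelings)
    matrix = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        # Two nested loops over N patterns: the quadratic cost noted above.
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0 / m
    return matrix


base = [[0, 0, 1, 1],
        [0, 0, 0, 1]]
ca = coassociation(base)
```

    Only the base cluster labels are needed, which matches the abstract's point that no access to the original data features is required.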

  13. Topology of foreign exchange markets using hierarchical structure methods

    Science.gov (United States)

    Naylor, Michael J.; Rose, Lawrence C.; Moyle, Brendan J.

    2007-08-01

    This paper uses two physics-derived hierarchical techniques, a minimal spanning tree and an ultrametric hierarchical tree, to extract a topological influence map for major currencies from the ultrametric distance matrix for 1995-2001. We find that these two techniques generate a defined and robust scale-free network with meaningful taxonomy. The topology is shown to be robust with respect to method and time horizon, and is stable during market crises. This topology, appropriately used, gives a useful guide to determining the underlying economic or regional causal relationships for individual currencies and to understanding the dynamics of exchange rate price determination as part of a complex network.
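
    A minimal spanning tree of this kind can be sketched as follows. The abstract does not state how distances are computed; the code assumes the transformation d = sqrt(2(1 - rho)) of a correlation rho, which is common in this literature, and uses Kruskal's algorithm. The function name `correlation_mst` and the placeholder series names and correlations are hypothetical.

```python
import math


def correlation_mst(names, corr):
    """Minimal spanning tree over a correlation matrix (Kruskal's algorithm).

    Assumed distance: d_ij = sqrt(2 * (1 - corr[i][j])).
    Returns a list of (name_i, name_j, distance) edges.
    """
    n = len(names)
    edges = sorted(
        (math.sqrt(2.0 * (1.0 - corr[i][j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((names[i], names[j], d))
    return tree


names = ["A", "B", "C"]  # placeholder series, not real currencies
corr = [[1.0, 0.9, 0.1],
        [0.9, 1.0, 0.2],
        [0.1, 0.2, 1.0]]
tree = correlation_mst(names, corr)
```

    The tree keeps N - 1 of the N(N - 1)/2 pairwise links, retaining for each node its strongest connection into the rest of the network.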

  14. MAP-Based Underdetermined Blind Source Separation of Convolutive Mixtures by Hierarchical Clustering and ℓ1-Norm Minimization

    Directory of Open Access Journals (Sweden)

    Kellermann Walter

    2007-01-01

    Full Text Available We address the problem of underdetermined BSS. While most previous approaches are designed for instantaneous mixtures, we propose a time-frequency-domain algorithm for convolutive mixtures. We adopt a two-step method based on a general maximum a posteriori (MAP) approach. In the first step, we estimate the mixing matrix based on hierarchical clustering, assuming that the source signals are sufficiently sparse. The algorithm works directly on the complex-valued data in the time-frequency domain and shows better convergence than algorithms based on self-organizing maps. The assumption of Laplacian priors for the source signals in the second step leads to an algorithm for estimating the source signals. It involves the ℓ1-norm minimization of complex numbers because of the use of the time-frequency-domain approach. We compare a combinatorial approach initially designed for real numbers with a second-order cone programming (SOCP) approach designed for complex numbers. We found that although the former approach is not theoretically justified for complex numbers, its results are comparable to, or even better than, the SOCP solution. The advantage is a lower computational cost for problems with low input/output dimensions.

  15. An Algorithm for Inspecting Self Check-in Airline Luggage Based on Hierarchical Clustering and Cube-fitting

    Directory of Open Access Journals (Sweden)

    Gao Qingji

    2014-04-01

    Full Text Available Airport passengers are required to place only one piece of baggage at a time on the self-service check-in so that it can be detected and identified successfully. To automatically count the pieces of baggage placed on the conveyor belt, this paper uses dual laser rangefinders to scan the outer contour of the luggage. An algorithm based on hierarchical clustering and cube-fitting is proposed to inspect the number and dimensions of airline luggage. First, the point cloud is projected onto the vertical direction; one-dimensional clustering then quickly yields the number and height of the pieces of luggage. Second, if the pieces cannot be distinguished in this way, the nearest-neighbour hierarchical clustering method is applied to divide the point cloud, which handles difficult cases such as crossing or overlapping pieces of baggage. Finally, the point cloud is projected onto the horizontal plane. By rotating the point cloud about its centre, its minimum bounding rectangle (MBR) is obtained, and the length and width of the luggage are obtained from the MBR. Many experiments in different cases have been carried out to verify the effectiveness of the algorithm.
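
    The one-dimensional clustering step can be illustrated with a gap-based split of the projected heights. This is a sketch under assumptions rather than the paper's procedure: the function name `cluster_1d`, the `gap` threshold, and the sample heights (in metres) are hypothetical.

```python
def cluster_1d(values, gap):
    """Split 1-D values into clusters wherever two consecutive
    sorted values differ by more than `gap`."""
    if not values:
        return []
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] > gap:
            clusters.append([v])  # large gap: start a new height group
        else:
            clusters[-1].append(v)
    return clusters


# Hypothetical top-surface heights from two pieces of luggage.
heights = [0.30, 0.31, 0.29, 0.55, 0.56, 0.54]
groups = cluster_1d(heights, gap=0.1)
count = len(groups)
```

    Splitting sorted values at large gaps is equivalent to single-linkage clustering in one dimension, so it needs only a sort and one pass. Pieces of similar height would fall into one group, which is the situation the abstract's second step addresses.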

  16. Evolutionary-Hierarchical Bases of the Formation of Cluster Model of Innovation Economic Development

    Directory of Open Access Journals (Sweden)

    Yuliya Vladimirovna Dubrovskaya

    2016-10-01

    Full Text Available The functioning of a modern economic system is based on the interaction of objects of different hierarchical levels. Thus, the problem of studying innovation processes while taking into account the mutual influence of the activities of these economic actors becomes important. The paper dwells on the evolutionary basis for the formation of models of innovation development on the basis of micro- and macroeconomic analysis. Most of the concepts recognize that, despite a large number of diverse models, the coordination of the relations between economic agents is of crucial importance for successful innovation development. According to the results of the evolutionary-hierarchical analysis, the authors reveal key phases of the development of forms of cooperation between business, science and government in the domestic economy. This has become the starting point of the conception of the characteristics of the interaction in the cluster models of innovation development of the economy. Considerable expectations for improvement of the national innovation system are connected with the development of cluster and network structures. The main objective of government authorities is the formation of mechanisms and institutions that will foster cooperation between members of the clusters. The article explains that clusters cannot become factors in the growth of the national economy without being an effective tool for interaction between the actors of the regional innovation systems.

  17. Hierarchical Star Formation in Turbulent Media: Evidence from Young Star Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Grasha, K.; Calzetti, D. [Astronomy Department, University of Massachusetts, Amherst, MA 01003 (United States); Elmegreen, B. G. [IBM Research Division, T.J. Watson Research Center, Yorktown Heights, NY (United States); Adamo, A.; Messa, M. [Department of Astronomy, The Oskar Klein Centre, Stockholm University, Stockholm (Sweden); Aloisi, A.; Bright, S. N.; Lee, J. C.; Ryon, J. E.; Ubeda, L. [Space Telescope Science Institute, Baltimore, MD (United States); Cook, D. O. [California Institute of Technology, 1200 East California Boulevard, Pasadena, CA (United States); Dale, D. A. [Department of Physics and Astronomy, University of Wyoming, Laramie, WY (United States); Fumagalli, M. [Institute for Computational Cosmology and Centre for Extragalactic Astronomy, Department of Physics, Durham University, Durham (United Kingdom); Gallagher III, J. S. [Department of Astronomy, University of Wisconsin–Madison, Madison, WI (United States); Gouliermis, D. A. [Zentrum für Astronomie der Universität Heidelberg, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, D-69120 Heidelberg (Germany); Grebel, E. K. [Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12-14, D-69120, Heidelberg (Germany); Kahre, L. [Department of Astronomy, New Mexico State University, Las Cruces, NM (United States); Kim, H. [Gemini Observatory, La Serena (Chile); Krumholz, M. R., E-mail: kgrasha@astro.umass.edu [Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611 (Australia)

    2017-06-10

    We present an analysis of the positions and ages of young star clusters in eight local galaxies to investigate the connection between the age difference and separation of cluster pairs. We find that star clusters do not form uniformly but instead are distributed so that the age difference increases with the cluster pair separation to the 0.25–0.6 power, and that the maximum size over which star formation is physically correlated ranges from ∼200 pc to ∼1 kpc. The observed trends between age difference and separation suggest that cluster formation is hierarchical both in space and time: clusters that are close to each other are more similar in age than clusters born further apart. The temporal correlations between stellar aggregates have slopes that are consistent with predictions of turbulence acting as the primary driver of star formation. The velocity associated with the maximum size is proportional to the galaxy’s shear, suggesting that the galactic environment influences the maximum size of the star-forming structures.

  18. Analysis of the effects of the global financial crisis on the Turkish economy, using hierarchical methods

    Science.gov (United States)

    Kantar, Ersin; Keskin, Mustafa; Deviren, Bayram

    2012-04-01

    We have analyzed the topology of 50 important Turkish companies for the period 2006-2010 using the concept of hierarchical methods (the minimal spanning tree (MST) and hierarchical tree (HT)). We investigated the statistical reliability of links between companies in the MST by using the bootstrap technique. We also used the average linkage cluster analysis (ALCA) technique to observe the cluster structures much better. The MST and HT are known as useful tools to perceive and detect global structure, taxonomy, and hierarchy in financial data. We obtained four clusters of companies according to their proximity. We also observed that the Banks and Holdings cluster always forms in the centre of the MSTs for the periods 2006-2007, 2008, and 2009-2010. The clusters match nicely with their common production activities or their strong interrelationship. The effects of the Automobile sector increased after the global financial crisis due to the temporary incentives provided by the Turkish government. We find that Turkish companies were not very affected by the global financial crisis.
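
    As a rough illustration of how an MST is built from company correlations, the sketch below applies Kruskal's algorithm to a correlation matrix. The distance d = sqrt(2(1 - rho)) is an assumption (the abstract does not state which metric was used, although this one is common in this literature), and the four company labels and correlation values are invented for the example.

```python
import math

def correlation_mst(labels, corr):
    """Kruskal's algorithm on distances d = sqrt(2 * (1 - rho)).
    The distance definition is an assumption, not taken from the abstract."""
    n = len(labels)
    edges = sorted(
        (math.sqrt(2.0 * (1.0 - corr[i][j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it belongs to the MST
            parent[ri] = rj
            tree.append((labels[i], labels[j]))
    return tree

# Hypothetical correlations between four made-up companies.
labels = ["BankA", "BankB", "AutoA", "AutoB"]
corr = [
    [1.0, 0.9, 0.3, 0.2],
    [0.9, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
tree = correlation_mst(labels, corr)
```

    In this toy case the two banks and the two automobile firms each link first, and a single bank-to-automobile edge connects the groups.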

  19. A supplier selection using a hybrid grey based hierarchical clustering and artificial bee colony

    Directory of Open Access Journals (Sweden)

    Farshad Faezy Razi

    2014-06-01

    Full Text Available Selecting the most suitable provider, or combination of providers, and the related outsourcing problem are among the most important strategies in logistics and supply chain management. In this paper, the selection of an optimal combination of suppliers in inventory and supply chain management is studied and analyzed via a multiple attribute decision making approach, data mining and evolutionary optimization algorithms. For supplier selection in the supply chain, suppliers are first clustered by hierarchical clustering according to the studied indexes. Then, according to its cluster, each supplier is evaluated through Grey Relational Analysis. Then the combination of suppliers’ Pareto optimal rank and costs is obtained using the Artificial Bee Colony meta-heuristic algorithm. A case study is conducted to better describe the new algorithm for selecting multiple suppliers.

  20. Hierarchical Clustering of Large Databases and Classification of Antibiotics at High Noise Levels

    Directory of Open Access Journals (Sweden)

    Alexander V. Yarkov

    2008-12-01

    Full Text Available A new algorithm for divisive hierarchical clustering of chemical compounds based on 2D structural fragments is suggested. The algorithm is deterministic: given a random ordering of the input, it will always give the same clustering, and it can process a database of up to 2 million records on a standard PC. The algorithm was used for classification of 1,183 antibiotics mixed with 999,994 random chemical structures. The similarity threshold at which the best separation of active and non-active compounds took place was estimated as 0.6. 85.7% of the antibiotics were successfully classified at this threshold with 0.4% of inaccurate compounds. An .sdf file was created with the probe molecules for clustering of external databases.

  1. How frequently do clusters occur in hierarchical clustering analysis? A graph theoretical approach to studying ties in proximity.

    Science.gov (United States)

    Leal, Wilmer; Llanos, Eugenio J; Restrepo, Guillermo; Suárez, Carlos F; Patarroyo, Manuel Elkin

    2016-01-01

    Hierarchical cluster analysis (HCA) is a widely used classificatory technique in many areas of scientific knowledge. Applications usually yield a dendrogram from an HCA run over a given data set, using a grouping algorithm and a similarity measure. However, even when such parameters are fixed, ties in proximity (i.e. two equidistant clusters from a third one) may produce several different dendrograms, having different possible clustering patterns (different classifications). This situation is usually disregarded and conclusions are based on a single result, leading to questions concerning the permanence of clusters in all the resulting dendrograms; this happens, for example, when using HCA for grouping molecular descriptors to select the less similar ones in QSAR studies. Representing dendrograms in graph theoretical terms allowed us to introduce four measures of cluster frequency in a canonical way, and use them to calculate cluster frequencies over the set of all possible dendrograms, taking all ties in proximity into account. A toy example of well separated clusters was used, as well as a set of 1666 molecular descriptors calculated for a group of molecules having hepatotoxic activity to show how our functions may be used for studying the effect of ties in HCA analysis. Such functions were not restricted to the tie case; the possibility of using them to derive cluster stability measurements on arbitrary sets of dendrograms having the same leaves is discussed, e.g. dendrograms from variations of HCA parameters. It was found that ties occurred frequently, some yielding tens of thousands of dendrograms, even for small data sets. Our approach was able to detect trends in clustering patterns by offering a simple way of measuring their frequency, which is often very low. This would imply that inferences and models based on descriptor classifications (e.g. QSAR) are likely to be biased, thereby requiring an assessment of their reliability. Moreover, any

  2. Solvothermal synthesis and thermoelectric properties of indium telluride nanostring-cluster hierarchical structures

    Directory of Open Access Journals (Sweden)

    Zhang Haiqian

    2011-01-01

    Full Text Available A simple solvothermal approach has been developed to successfully synthesize n-type α-In2Te3 thermoelectric nanomaterials. The nanostring-cluster hierarchical structures were prepared using In(NO3)3 and Na2TeO3 as the reactants in a mixed solvent of ethylenediamine and ethylene glycol at 200°C for 24 h. A diffusion-limited reaction mechanism was proposed to explain the formation of the hierarchical structures. The Seebeck coefficient of the bulk pellet pressed by the obtained samples exhibits 43% enhancement over that of the corresponding thin film at room temperature. The electrical conductivity of the bulk pellet is one to four orders of magnitude higher than that of the corresponding thin film or p-type bulk sample. The synthetic route can be applied to obtain other low-dimensional semiconducting telluride nanostructures. PACS: 65.80.-g, 68.35.bg, 68.35.bt

  3. Solvothermal synthesis and thermoelectric properties of indium telluride nanostring-cluster hierarchical structures

    Science.gov (United States)

    Tai, Guo'an; Miao, Chunyang; Wang, Yubo; Bai, Yunrui; Zhang, Haiqian; Guo, Wanlin

    2011-12-01

    A simple solvothermal approach has been developed to successfully synthesize n-type α-In2Te3 thermoelectric nanomaterials. The nanostring-cluster hierarchical structures were prepared using In(NO3)3 and Na2TeO3 as the reactants in a mixed solvent of ethylenediamine and ethylene glycol at 200°C for 24 h. A diffusion-limited reaction mechanism was proposed to explain the formation of the hierarchical structures. The Seebeck coefficient of the bulk pellet pressed by the obtained samples exhibits 43% enhancement over that of the corresponding thin film at room temperature. The electrical conductivity of the bulk pellet is one to four orders of magnitude higher than that of the corresponding thin film or p-type bulk sample. The synthetic route can be applied to obtain other low-dimensional semiconducting telluride nanostructures. PACS: 65.80.-g, 68.35.bg, 68.35.bt

  4. Objects Classification for Mobile Robots Using Hierarchic Selective Search Method

    Directory of Open Access Journals (Sweden)

    Xu Cheng

    2016-01-01

    Full Text Available For determining the category of an image captured by a mobile robot for intelligent applications, classification with the bag-of-words model has proved effective on near-duplicate/planar images. When it comes to images from mobile robots with complex backgrounds, does it still work well? In this paper, based on an improved merging criterion, a method named hierarchical selective search is proposed; it hierarchically extracts complementary features to form a combined, environment-adaptable similarity measurement for segmentation, resulting in a small, high-quality set of regions. Those regions, rather than the whole image, are then used for classification. As a result, classification accuracy improves and the bag-of-words model still works well on classification for mobile robots. The experiments on hierarchical selective search show that it performs better than selective search on two task datasets for mobile robots. The experiments on classification show that samples from regions are better than the original whole images. The advantage of fewer, higher-quality object regions from hierarchical selective search is more prominent for special mobile-robot tasks with scarce data.

  5. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    Science.gov (United States)

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogenous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogenous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and were comprised of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.

  6. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Full Text Available Background: The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.

  7. COMPOSITE METHOD OF RELIABILITY RESEARCH FOR HIERARCHICAL MULTILAYER ROUTING SYSTEMS

    Directory of Open Access Journals (Sweden)

    R. B. Tregubov

    2016-09-01

    Full Text Available The paper deals with the idea of a research method for hierarchical multilayer routing systems. The method is a composition of methods from graph theory, reliability theory, probability theory, etc. These methods are applied to the solution of various particular analysis and optimization tasks and are systemically connected and coordinated with each other through a uniform set-theoretic representation of the object of research. The hierarchical multilayer routing systems are considered as infrastructure facilities (gas and oil pipelines, automobile and railway networks, systems of power supply and communication) that distribute material resources, energy or information with the use of hierarchically nested routing functions. For illustration, the theoretical constructions are considered using the example of determining the probability of the up state of a specific infocommunication system. The author showed the possibility of constructively combining a graph representation of the structure of the object of research with a logical-probabilistic method for analysing its reliability indices through a uniform set-theoretic representation of its elements and the processes proceeding in them.

  8. 3D NEAREST NEIGHBOUR SEARCH USING A CLUSTERED HIERARCHICAL TREE STRUCTURE

    Directory of Open Access Journals (Sweden)

    A. Suhaibah

    2016-06-01

    Full Text Available Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are in strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since the business of franchising and chain stores in urban areas runs within high-rise multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify surrounding information, such as the level at which a franchise unit will be located or whether the unit is at the best level for visibility. One of the analyses commonly used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  9. Relation between financial market structure and the real economy: comparison between clustering methods.

    Science.gov (United States)

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

    2015-01-01

    We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].
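
    One way to score how well a clustering retrieves a benchmark partition such as an industrial sector classification is the adjusted Rand index. The abstract above does not name the measure the authors used, so the choice here is an assumption, and the sector labels and cluster assignments are invented toy data.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items:
    1.0 for identical partitions, about 0 for chance-level agreement."""
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_joint - expected) / (max_index - expected)

# Hypothetical benchmark sectors for six stocks and two candidate clusterings.
sectors = ["fin", "fin", "fin", "tech", "tech", "tech"]
perfect = [0, 0, 0, 1, 1, 1]   # same grouping, different label names
partial = [0, 0, 1, 1, 2, 2]   # splits and mixes the sectors

score_perfect = adjusted_rand_index(sectors, perfect)
score_partial = adjusted_rand_index(sectors, partial)
```

    Because the index depends only on which items share a group, relabelling the clusters leaves the score unchanged, which is what makes it usable against an external classification.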

  10. CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection

    Science.gov (United States)

    Ao, Sio-Iong

    More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In the chapter, hierarchical clustering algorithms have been proposed for efficient tag SNP selection.

  11. Comparison of the incremental and hierarchical methods for crystalline neon.

    Science.gov (United States)

    Nolan, S J; Bygrave, P J; Allan, N L; Manby, F R

    2010-02-24

    We present a critical comparison of the incremental and hierarchical methods for the evaluation of the static cohesive energy of crystalline neon. Both of these schemes make it possible to apply the methods of molecular electronic structure theory to crystalline solids, offering a systematically improvable alternative to density functional theory. Results from both methods are compared with previous theoretical and experimental studies of solid neon and potential sources of error are discussed. We explore the similarities of the two methods and demonstrate how they may be used in tandem to study crystalline solids.

  12. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis.

    Science.gov (United States)

    Huang, Desheng; Guan, Peng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-25

    The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987-1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.

  13. Hierarchical Agglomerative Clustering Schemes for Energy-Efficiency in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Taleb Tariq

    2017-06-01

    Full Text Available Extending the lifetime of wireless sensor networks (WSNs) while delivering the expected level of service remains a hot research topic. Clustering has been identified in the literature as one of the primary means to save communication energy. In this paper, we argue that hierarchical agglomerative clustering (HAC) provides a suitable foundation for designing highly energy-efficient communication protocols for WSNs. To this end, we study a new mechanism for selecting cluster heads (CHs) based both on the physical location of the sensors and their residual energy. Furthermore, we study different patterns of communication between the CHs and the base station depending on the possible transmission ranges and the ability of the sensors to act as traffic relays. Simulation results show that our proposed clustering and communication schemes outperform well-known existing approaches by comfortable margins. In particular, network lifetime is increased by more than 60% compared to LEACH and HEED, and by more than 30% compared to K-means clustering.
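
    The idea of choosing cluster heads from both sensor location and residual energy can be sketched as follows. Everything specific here is an assumption for illustration rather than the paper's scheme: single-linkage merging as the HAC variant, the scoring rule (energy divided by one plus distance to the cluster centroid), the function names, and the toy sensor data.

```python
import math

def single_linkage_clusters(positions, k):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest nearest-member gap until k clusters remain (linkage assumed)."""
    clusters = [[i] for i in range(len(positions))]

    def gap(a, b):
        return min(math.dist(positions[i], positions[j]) for i in a for j in b)

    while len(clusters) > k:
        _, p, q = min(
            (gap(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] += clusters.pop(q)
    return clusters

def pick_cluster_heads(positions, energy, k):
    """Per cluster, pick the sensor with the best hypothetical score:
    high residual energy and short distance to the cluster centroid."""
    heads = []
    for members in single_linkage_clusters(positions, k):
        cx = sum(positions[i][0] for i in members) / len(members)
        cy = sum(positions[i][1] for i in members) / len(members)

        def score(i):
            return energy[i] / (1.0 + math.dist(positions[i], (cx, cy)))

        heads.append(max(members, key=score))
    return heads

# Toy network: two groups of three sensors with made-up residual energies.
positions = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
energy = [0.9, 0.5, 0.4, 0.2, 0.8, 0.3]
heads = pick_cluster_heads(positions, energy, k=2)
```

    In the second group the most central sensor has little energy left, so the score hands the head role to a less central but better-charged neighbour.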

  14. Efficient visible light photocatalytic NOx removal with cationic Ag clusters-grafted (BiO)2CO3 hierarchical superstructures

    Energy Technology Data Exchange (ETDEWEB)

    Feng, Xin [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Wendong [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Deng, Hua [State Key Joint Laboratory of Environment Simulation and Pollution Control, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085 (China); Ni, Zilin [Department of Scientific Research Management, Chongqing Normal University, Chongqing 401331 (China); Dong, Fan, E-mail: dfctbu@126.com [Chongqing Key Laboratory of Catalysis and Functional Organic Molecules, College of Environment and Resources, Engineering Research Center for Waste Oil Recovery Technology and Equipment of Ministry of Education, College of Environment and Resources, Chongqing Technology and Business University, Chongqing 40067 (China); Zhang, Yuxin, E-mail: zhangyuxin@cqu.edu.cn [College of Materials Science and Engineering, National Key Laboratory of Fundamental Science of Micro/Nano-Devices and System Technology, Chongqing University, Chongqing 400044 (China)

    2017-01-15

    Graphical abstract: The cationic Ag clusters-grafted (BiO)2CO3 hierarchical superstructures exhibit highly enhanced visible light photocatalytic air purification through an interfacial charge transfer process induced by Ag clusters. - Highlights: • Microstructural optimization and surface cluster-grafting were combined for the first time. • Cationic Ag clusters were grafted on the surface of (BiO)2CO3 superstructures. • The Ag clusters-grafted BHS displayed enhanced visible light photocatalysis. • Direct interfacial charge transfer (IFCT) from BHS to Ag clusters was proposed. • The charge transfer process and the dominant reactive species were revealed. - Abstract: A facile method was developed to graft cationic Ag clusters on the surface of (BiO)2CO3 hierarchical superstructures (BHS) to improve their visible light activity. Significantly, the resultant Ag clusters-grafted BHS displayed a highly enhanced visible light photocatalytic performance for NOx removal due to the direct interfacial charge transfer (IFCT) from BHS to Ag clusters. The chemical and coordination state of the cationic Ag clusters was determined with extended X-ray absorption fine structure (EXAFS), and a theoretical structure model was proposed for these unique Ag clusters. The charge transfer process and the dominant reactive species (·OH) were revealed on the basis of electron spin resonance (ESR) trapping. A new photocatalysis mechanism of Ag clusters-grafted BHS under visible light involving the IFCT process was uncovered. In addition, the cationic Ag clusters-grafted BHS also demonstrated high photochemical and structural stability under repeated photocatalysis runs. The perspective of enhancing photocatalysis through a combination of microstructural optimization and IFCT could provide a new avenue for developing efficient visible light photocatalysts.

  15. Method for implementation of recursive hierarchical segmentation on parallel computers

    Science.gov (United States)

    Tilton, James C. (Inventor)

    2005-01-01

    A method, computer readable storage, and apparatus for implementing a recursive hierarchical segmentation algorithm on a parallel computing platform. The method includes setting a bottom level of recursion that defines where a recursive division of an image into sections stops dividing, and setting an intermediate level of recursion where the recursive division changes from a parallel implementation into a serial implementation. The segmentation algorithm is implemented according to the set levels. The method can also include setting a convergence check level of recursion with which the first level of recursion communicates when performing a convergence check.

  16. A hierarchical method for molecular docking using cloud computing.

    Science.gov (United States)

    Kang, Ling; Guo, Quan; Wang, Xicheng

    2012-11-01

    Discovering small molecules that interact with protein targets will be a key part of future drug discovery efforts. Molecular docking of drug-like molecules is likely to be valuable in this field; however, the great number of such molecules makes the potential size of this task enormous. In this paper, a method to screen small molecular databases using cloud computing is proposed. This method is called the hierarchical method for molecular docking and can be completed in a relatively short period of time. In this method, the optimization of molecular docking is divided into two subproblems based on the different effects on the protein-ligand interaction energy. An adaptive genetic algorithm is developed to solve the optimization problem and a new docking program (FlexGAsDock) based on the hierarchical docking method has been developed. The implementation of docking on a cloud computing platform is then discussed. The docking results show that this method can be conveniently used for the efficient molecular design of drugs. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Chromosomal Regions in Prostatic Carcinomas Studied by Comparative Genomic Hybridization, Hierarchical Cluster Analysis and Self-Organizing Feature Maps

    Directory of Open Access Journals (Sweden)

    Torsten Mattfeldt

    2002-01-01

    Full Text Available Comparative genomic hybridization (CGH) is an established genetic method which enables a genome‐wide survey of chromosomal imbalances. For each chromosome region, one obtains the information whether there is a loss or gain of genetic material, or whether there is no change at that place. Therefore, large amounts of data quickly accumulate which must be put into a logical order. Cluster analysis can be used to assign individual cases (samples) to different clusters of cases, which are similar and where each cluster may be related to a different tumour biology. Another approach consists in a clustering of chromosomal regions by rewriting the original data matrix, where the cases are written as rows and the chromosomal regions as columns, in a transposed form. In this paper we applied hierarchical cluster analysis as well as two implementations of self‐organizing feature maps as classical and neuronal tools for cluster analysis of CGH data from prostatic carcinomas to such transposed data sets. Self‐organizing maps are artificial neural networks with the capability to form clusters on the basis of an unsupervised learning rule. We studied a group of 48 cases of incidental carcinomas, a tumour category which has not been evaluated by CGH before. In addition we studied a group of 50 cases of pT2N0‐tumours and a group of 20 pT3N0‐carcinomas. The results show in all case groups three clusters of chromosomal regions, which are (i) normal or minimally affected by losses and gains, (ii) regions with many losses and few gains and (iii) regions with many gains and few losses. Moreover, for the pT2N0‐ and pT3N0‐groups, it could be shown that the regions 6q, 8p and 13q all lay in the same cluster (associated with losses), and that the regions 9q and 20q belonged to the same cluster (associated with gains). For the incidental cancers such clear correlations could not be demonstrated.

  18. Analysis of the genetic divergence of soybean lines through hierarchical and Tocher optimization methods.

    Science.gov (United States)

    Cantelli, D A V; Hamawaki, O T; Rocha, M R; Nogueira, A P O; Hamawaki, R L; Sousa, L B; Hamawaki, C D L

    2016-10-05

    This study aimed to evaluate the clustering pattern consistency of soybean (Glycine max) lines, using seven different clustering methods. Our aim was to evaluate the best method for the identification of promising genotypes to obtain segregating populations. We used 51 soybean lines of generations F5 and F6, originating from different hybridizations and backcrosses obtained from the soybean breeding program of Universidade Federal de Uberlândia, in addition to three controls (Emgopa 302, BRSGO Luziânia, and MG/BR46 Conquista). We evaluated the following agronomic traits: number of days to flowering, number of days to maturity, height of the plant at maturity, insertion height of the first pod, grain yield, and weight of 100 seeds. The data were subjected to analyses of variance, followed by estimation of an average Euclidean distance matrix as the measure of dissimilarity. Subsequently, clusters were formed using the Tocher method and dendrograms were constructed using the hierarchical methods simple connection (nearest neighbor), complete connection (most distant neighbor), Ward, median, and average within-cluster connection. The nearest neighbor method presented the largest number of genotypes in group I and showed the greatest similarity with the Tocher optimization method. The joint use of these two methodologies allows for differentiation of the most genetically distant genotypes that may constitute the optimal parents in a breeding program.
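
    As a rough illustration of the nearest-neighbor and most-distant-neighbor linkages named above, the following minimal sketch merges clusters on an average Euclidean distance. The trait vectors are hypothetical, the distance definition (Euclidean distance scaled by the number of traits) is one common convention and an assumption here, and the naive merge loop is not the software the authors used.

```python
import math


def average_euclidean(a, b):
    """Euclidean distance scaled by the number of traits (assumed definition)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def agglomerate(points, k, linkage=min):
    """Merge the two closest clusters repeatedly until k clusters remain.

    linkage=min gives single linkage (nearest neighbor);
    linkage=max gives complete linkage (most distant neighbor).
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(
                    average_euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)


# Hypothetical standardised trait vectors for five lines.
lines = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9)]
groups_single = agglomerate(lines, k=2, linkage=min)
groups_complete = agglomerate(lines, k=2, linkage=max)
```

    On well-separated data such as this the two linkages agree; they tend to differ when groups are elongated or connected by intermediate points, which is where comparing methods becomes informative.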

  19. Hierarchical Matrices Method and Its Application in Electromagnetic Integral Equations

    Directory of Open Access Journals (Sweden)

    Han Guo

    2012-01-01

    Full Text Available Hierarchical (H-) matrices method is a general mathematical framework providing a highly compact representation and efficient numerical arithmetic. When applied in integral-equation- (IE-) based computational electromagnetics, H-matrices can be regarded as a fast algorithm; therefore, both the CPU time and memory requirement are reduced significantly. Its kernel independent feature also makes it suitable for any kind of integral equation. To solve H-matrices system, Krylov iteration methods can be employed with appropriate preconditioners, and direct solvers based on the hierarchical structure of H-matrices are also available along with high efficiency and accuracy, which is a unique advantage compared to other fast algorithms. In this paper, a novel sparse approximate inverse (SAI) preconditioner in multilevel fashion is proposed to accelerate the convergence rate of Krylov iterations for solving H-matrices system in electromagnetic applications, and a group of parallel fast direct solvers are developed for dealing with multiple right-hand-side cases. Finally, numerical experiments are given to demonstrate the advantages of the proposed multilevel preconditioner compared to conventional “single level” preconditioners and the practicability of the fast direct solvers for arbitrary complex structures.

  20. Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

    Science.gov (United States)

    Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

    2016-01-01

    Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969

  1. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

    Science.gov (United States)

    Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

    2015-10-01

    Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.

  2. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    Science.gov (United States)

    Ellefsen, Karl J.; Smith, David

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
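
    The recursive two-way partitioning described above can be sketched as follows. This is only a structural illustration: a simple one-dimensional least-squares split stands in for the Bayesian finite mixture model (an assumption made for brevity), the concentrations are hypothetical, and the manual checking of each solution against geologic knowledge is not modelled.

```python
def best_split(values):
    """Split 1-D values into two groups minimising within-group squared error.

    A simple stand-in for the two-component mixture model in the abstract.
    """
    def sse(group):
        mean = sum(group) / len(group)
        return sum((v - mean) ** 2 for v in group)

    ordered = sorted(values)
    cut = min(
        range(1, len(ordered)),
        key=lambda i: sse(ordered[:i]) + sse(ordered[i:]),
    )
    return ordered[:cut], ordered[cut:]


def build_hierarchy(values, depth):
    """Recursively bipartition values, returning a nested dict of clusters."""
    node = {"members": sorted(values), "children": []}
    if depth > 0 and len(values) >= 2:
        left, right = best_split(values)
        node["children"] = [
            build_hierarchy(left, depth - 1),
            build_hierarchy(right, depth - 1),
        ]
    return node


# Hypothetical element concentrations (arbitrary units).
tree = build_hierarchy([1.0, 1.2, 5.0, 5.3, 9.0, 9.1], depth=2)
```

    Each level of the resulting tree corresponds to one round of partitioning, so deeper levels describe structure at finer scales.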

  3. Parallel iterative solvers and preconditioners using approximate hierarchical methods

    Energy Technology Data Exchange (ETDEWEB)

    Grama, A.; Kumar, V.; Sameh, A. [Univ. of Minnesota, Minneapolis, MN (United States)]

    1996-12-31

    In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders of magnitude in solution time. We also develop fast and parallelizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on a truncated Green's function. Experimental results on a 256 processor Cray T3D are presented.

  4. An Improved Pearson’s Correlation Proximity-Based Hierarchical Clustering for Mining Biological Association between Genes

    Directory of Open Access Journals (Sweden)

    P. M. Booma

    2014-01-01

    Full Text Available Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process is not efficiently addressed. To monitor higher rates of expression levels between genes, a hierarchical clustering model is proposed, in which the biological association between genes is measured simultaneously using an improved Pearson's correlation as the proximity measure (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.
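
    For orientation, a correlation-based proximity between expression profiles can be computed as below. This sketch uses the ordinary Pearson coefficient, not the improved variant proposed in the paper (whose details the abstract does not give), and the gene names and profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_distance(x, y):
    """Map correlation to a dissimilarity between 0 and 2."""
    return 1.0 - pearson(x, y)


# Hypothetical expression profiles over four conditions.
genes = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
d_12 = correlation_distance(genes["g1"], genes["g2"])
d_13 = correlation_distance(genes["g1"], genes["g3"])
```

    Co-regulated profiles such as g1 and g2 get a distance near 0 regardless of scale, while anti-correlated profiles such as g1 and g3 get a distance near 2; a matrix of such distances is what an average-linkage procedure would then consume.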

  5. A Hierarchical Classification Method for Breast Tumor Detection

    Directory of Open Access Journals (Sweden)

    Mojtaba Mohammadpoor

    2016-12-01

    Full Text Available Introduction: Breast cancer is the second leading cause of mortality among women. Early detection can enhance the chance of survival. Screening systems such as mammography cannot perfectly differentiate between patients and healthy individuals. Computer-aided diagnosis can help physicians make a more accurate diagnosis. Materials and Methods: Given the importance of separating normal and abnormal cases in screening systems, a hierarchical classification system is defined in this paper. The proposed system includes two Adaptive Boosting (AdaBoost) classifiers. The first classifier separates the candidate images into two groups, normal and abnormal. The second classifier is applied to the abnormal group from the previous stage and divides it into benign and malignant categories. The proposed algorithm is evaluated by applying it to the publicly available Mammographic Image Analysis Society (MIAS) dataset. 288 images of the database are used, including 208 normal and 80 abnormal images. Of the abnormal images, 47 showed a benign lesion and 33 a malignant lesion. Results: Applying the proposed algorithm to the MIAS database indicates its advantage compared to previous methods. A major improvement occurred in the first classification stage. Specificity, sensitivity, and accuracy of the first classifier are obtained as 100%, 95.83%, and 97.91%, respectively. These values are calculated as 75% in the second stage. Conclusion: A hierarchical classification method for breast cancer detection is developed in this paper. Given the importance of separating normal and abnormal cases in screening systems, the first classifier is devoted to separating normal and tumorous cases. Experimental results on the available database show that the performance of this step is adequately high (100% specificity). The second layer is designed to detect tumor type. The accuracy obtained in the second layer is 75%.
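
    The specificity, sensitivity, and accuracy figures quoted for each stage follow from a confusion matrix in the usual way. The sketch below shows that calculation; the counts are hypothetical and are not taken from the paper, which does not report its confusion matrices in the abstract.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: abnormal cases flagged / missed.
    tn/fp: normal cases cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical first-stage counts (normal vs. abnormal), for illustration only.
sensitivity, specificity, accuracy = screening_metrics(tp=18, fn=2, tn=45, fp=5)
```

    In a two-stage design only the cases flagged as abnormal reach the second classifier, so first-stage sensitivity bounds how many tumours the second stage can ever see.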

  6. Visual cluster analysis and pattern recognition methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  7. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  8. Semiconductor hierarchically structured flower-like clusters for dye-sensitized solar cells with nearly 100% charge collection efficiency.

    Science.gov (United States)

    Xin, Xukai; Liu, Hsiang-Yu; Ye, Meidan; Lin, Zhiqun

    2013-11-21

    By combining the ease of producing ZnO nanoflowers with the advantageous chemical stability of TiO2, hierarchically structured hollow TiO2 flower-like clusters were produced via chemical bath deposition (CBD) of ZnO nanoflowers, followed by their conversion into TiO2 flower-like clusters in the presence of TiO2 precursors. The effects of ZnO precursor concentration, precursor amount, and reaction time on the formation of ZnO nanoflowers were systematically explored. Dye-sensitized solar cells fabricated by utilizing these hierarchically structured ZnO and TiO2 flower clusters exhibited power conversion efficiencies of 1.16% and 2.73%, respectively, under 100 mW cm⁻² illumination. The intensity modulated photocurrent/photovoltage spectroscopy (IMPS/IMVS) studies suggested that flower-like structures had a fast electron transit time and that their charge collection efficiency was nearly 100%.

  9. An Adaptive Method for Mining Hierarchical Spatial Co-location Patterns

    Directory of Open Access Journals (Sweden)

    CAI Jiannan

    2016-04-01

    Full Text Available Mining spatial co-location patterns plays a key role in spatial data mining. Spatial co-location patterns refer to subsets of features whose objects are frequently located in close geographic proximity. Due to spatial heterogeneity, spatial co-location patterns are usually not the same across geographic space. However, existing methods are mainly designed to discover global spatial co-location patterns, and not suitable for detecting regional spatial co-location patterns. On that account, an adaptive method for mining hierarchical spatial co-location patterns is proposed in this paper. Firstly, global spatial co-location patterns are detected and other non-prevalent co-location patterns are identified as candidate regional co-location patterns. Then, for each candidate pattern, adaptive spatial clustering method is used to delineate localities of that pattern in the study area, and participation ratio is utilized to measure the prevalence of the candidate co-location pattern. Finally, an overlap operation is developed to deduce localities of (k+1)-size co-location patterns from localities of k-size co-location patterns. Experiments on both simulated and real-life datasets show that the proposed method is effective for detecting hierarchical spatial co-location patterns.
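
    The participation ratio mentioned above has a standard definition in the co-location literature: the fraction of a feature's instances that take part in at least one instance of the pattern, with the minimum over features often called the participation index. The sketch below computes both for a size-2 pattern; the fixed neighbourhood radius, the feature labels, and the coordinates are illustrative assumptions, and the paper's adaptive clustering step is not reproduced.

```python
import math


def participation(instances, pattern, radius):
    """Participation ratios and index for a size-2 co-location pattern.

    instances: list of (feature, x, y) tuples.
    pattern:   pair of feature labels, e.g. ("A", "B").
    radius:    neighbourhood distance (a fixed value is assumed here).
    """
    first, second = pattern
    pts_first = [(x, y) for f, x, y in instances if f == first]
    pts_second = [(x, y) for f, x, y in instances if f == second]

    def ratio(own, other):
        # Share of `own` points with at least one `other` point within radius.
        hits = sum(
            any(math.dist(p, q) <= radius for q in other) for p in own
        )
        return hits / len(own)

    ratios = {
        first: ratio(pts_first, pts_second),
        second: ratio(pts_second, pts_first),
    }
    return ratios, min(ratios.values())


# Hypothetical point data for two feature types.
data = [
    ("A", 0.0, 0.0), ("A", 1.0, 0.0), ("A", 10.0, 10.0),
    ("B", 0.0, 0.5), ("B", 5.0, 5.0),
]
ratios, index = participation(data, ("A", "B"), radius=1.0)
```

    A pattern is usually called prevalent when the index exceeds a chosen threshold; evaluating it within a locality rather than the whole study area is what allows a regional pattern to surface.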

  10. A hierarchical network modeling method for railway tunnels safety assessment

    Science.gov (United States)

    Zhou, Jin; Xu, Weixiang; Guo, Xin; Liu, Xumin

    2017-02-01

    Using network theory to model risk-related knowledge on accidents is regarded as potentially very helpful in risk management. A large amount of defect detection data for railway tunnels is collected in autumn every year in China. It is extremely important to discover the regularities hidden in this database. In this paper, based on network theories and by using data mining techniques, a new method is proposed for mining risk-related regularities to support risk management in railway tunnel projects. A hierarchical network (HN) model which takes into account the tunnel structures, tunnel defects, potential failures and accidents is established. An improved Apriori algorithm is designed to rapidly and effectively mine correlations between tunnel structures and tunnel defects. Then an algorithm is presented to mine the risk-related regularities table (RRT) from the frequent patterns. Finally, a safety assessment method is proposed that considers both actual defects and the possible risks of defects obtained from the RRT. This method can not only generate quantitative risk results but also reveal the key defects and critical risks of defects. This paper is a further development of accident causation network modeling methods and can provide guidance for specific maintenance measures.

  11. How fast is mass segregation happening in hierarchically formed embedded star clusters?

    Science.gov (United States)

    Domínguez, R.; Fellhauer, M.; Blaña, M.; Farias, J. P.; Dabringhausen, J.

    2017-11-01

    We investigate the evolution of mass segregation in initially substructured young embedded star clusters with two different background potentials mimicking the gas. Our clusters are initially in virial or subvirial global states and have different initial distributions for the most massive stars: randomly placed, initially mass segregated or even inversely segregated. By means of N-body simulation, we follow their evolution for 5 Myr. We measure the mass segregation using the minimum spanning tree method ΛMSR and an equivalent restricted method. Despite this variety of different initial conditions, we find that our stellar distributions almost always settle very fast into a mass segregated and more spherical configuration, suggesting that once we see a spherical or nearly spherical embedded star cluster, we can be sure it is mass segregated no matter what the real initial conditions were. We, furthermore, report under which circumstances this process can be more rapid or delayed, respectively.
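
    The minimum spanning tree measure ΛMSR is commonly defined as the mean MST length of random sets of N stars divided by the MST length of the N most massive stars, so values above 1 indicate mass segregation. The following minimal sketch assumes that definition, works in two dimensions for brevity, and uses a hypothetical toy cluster; the function names and sampling parameters are illustrative and not taken from the paper.

```python
import math
import random


def mst_length(points):
    """Total edge length of the minimum spanning tree (Prim's algorithm)."""
    if len(points) < 2:
        return 0.0
    # Cheapest known connection from the growing tree to each outside point.
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    total = 0.0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for i in best:
            best[i] = min(best[i], math.dist(points[nxt], points[i]))
    return total


def mass_segregation_ratio(stars, n_massive, n_samples=50, seed=0):
    """Mean random-set MST length divided by the massive-star MST length.

    stars: list of (mass, x, y) tuples.
    """
    rng = random.Random(seed)
    positions = [(x, y) for _, x, y in stars]
    heaviest = sorted(stars, reverse=True)[:n_massive]
    l_massive = mst_length([(x, y) for _, x, y in heaviest])
    l_random = sum(
        mst_length(rng.sample(positions, n_massive)) for _ in range(n_samples)
    ) / n_samples
    return l_random / l_massive


# Hypothetical cluster: 25 low-mass stars on a grid, 3 massive stars packed
# close together near the centre.
stars = [(1.0, float(x), float(y)) for x in range(5) for y in range(5)]
stars += [(10.0, 2.5, 2.5), (10.0, 2.51, 2.5), (10.0, 2.5, 2.51)]
ratio = mass_segregation_ratio(stars, n_massive=3)
```

    In practice the spread of the random-set lengths is also reported, so that a ratio above 1 can be judged against sampling noise.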

  12. CHIMERA: Top-down model for hierarchical, overlapping and directed cluster structures in directed and weighted complex networks

    Science.gov (United States)

    Franke, R.

    2016-11-01

    In many networks discovered in biology, medicine, neuroscience and other disciplines special properties like a certain degree distribution and hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy and interactions on a mesoscale. It is not trivial choosing an appropriate detection algorithm because there are multiple network, cluster and algorithmic properties to be considered. Edges can be weighted and/or directed, clusters overlap or build a hierarchy in several ways. Algorithms differ not only in runtime, memory requirements but also in allowed network and cluster properties. They are based on a specific definition of what a cluster is, too. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. But what is left is a precise sandbox model to build hierarchical, overlapping and directed clusters for undirected or directed, binary or weighted complex random networks on basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis) which will be introduced and described here for the first time.

  13. Applying Hierarchical Task Analysis Method to Discovery Layer Evaluation

    Directory of Open Access Journals (Sweden)

    Marlen Promann

    2015-03-01

    Full Text Available Libraries are implementing discovery layers to offer better user experiences. While usability tests have been helpful in evaluating the success or failure of implementing discovery layers in the library context, the focus has remained on their relative interface benefits over the traditional federated search. The informal site- and context-specific usability tests have offered little to test the rigor of the discovery layers against the user goals, motivations and workflow they have been designed to support. This study proposes hierarchical task analysis (HTA) as an important complementary evaluation method to usability testing of discovery layers. Relevant literature is reviewed for the discovery layers and the HTA method. As no previous application of HTA to the evaluation of discovery layers was found, this paper presents the application of HTA as an expert-based and workflow-centered (e.g. retrieving a relevant book or a journal article) method of evaluating discovery layers. Purdue University’s Primo by Ex Libris was used to map eleven use cases as HTA charts. Nielsen’s Goal Composition theory was used as an analytical framework to evaluate the goal charts from two perspectives: (a) users’ physical interactions (i.e. clicks), and (b) users’ cognitive steps (i.e. decision points for what to do next). A brief comparison of HTA and usability test findings is offered by way of conclusion.

  14. Local Approximation and Hierarchical Methods for Stochastic Optimization

    Science.gov (United States)

    Cheng, Bolong

    In this thesis, we present local and hierarchical approximation methods for two classes of stochastic optimization problems: optimal learning and Markov decision processes. For the optimal learning problem class, we introduce a locally linear model with radial basis functions for estimating the posterior mean of the unknown objective function. The method uses a compact representation of the function which avoids storing the entire history, as is typically required by nonparametric methods. We derive a knowledge gradient policy with the locally parametric model, which maximizes the expected value of information. We show the policy is asymptotically optimal in theory, and experimental work suggests that the method can reliably find the optimal solution on a range of test functions. For the Markov decision processes problem class, we are motivated by an application where we want to co-optimize a battery for multiple revenue streams, in particular energy arbitrage and frequency regulation. The nature of this problem requires the battery to make charging and discharging decisions at different time scales while accounting for stochastic information such as load demand, electricity prices, and regulation signals. Computing the exact optimal policy becomes intractable due to the large state space and the number of time steps. We propose two methods to circumvent the computational bottleneck. First, we propose a nested MDP model that structures the co-optimization problem into smaller sub-problems with reduced state space. This new model allows us to understand how the battery behaves down to the two-second dynamics (that of the frequency regulation market). Second, we introduce a low-rank value function approximation for backward dynamic programming. This new method only requires computing the exact value function for a small subset of the state space and approximates the entire value function via low-rank matrix completion. 
We test these methods on historical price data from the

  15. A Technique of Two-Stage Clustering Applied to Environmental and Civil Engineering and Related Methods of Citation Analysis.

    Science.gov (United States)

    Miyamoto, S.; Nakayama, K.

    1983-01-01

    A method of two-stage clustering of literature based on citation frequency is applied to 5,065 articles from 57 journals in environmental and civil engineering. Results of related methods of citation analysis (hierarchical graph, clustering of journals, multidimensional scaling) applied to the same set of articles are compared. Ten references are…

  16. Using hierarchical clustering of secreted protein families to classify and rank candidate effectors of rust fungi.

    Science.gov (United States)

    Saunders, Diane G O; Win, Joe; Cano, Liliana M; Szabo, Les J; Kamoun, Sophien; Raffaele, Sylvain

    2012-01-01

    Rust fungi are obligate biotrophic pathogens that cause considerable damage on crop plants. Puccinia graminis f. sp. tritici, the causal agent of wheat stem rust, and Melampsora larici-populina, the poplar leaf rust pathogen, have strong deleterious impacts on wheat and poplar wood production, respectively. Filamentous pathogens such as rust fungi secrete molecules called disease effectors that act as modulators of host cell physiology and can suppress or trigger host immunity. Current knowledge on effectors from other filamentous plant pathogens can be exploited for the characterisation of effectors in the genome of recently sequenced rust fungi. We designed a comprehensive in silico analysis pipeline to identify the putative effector repertoire from the genome of two plant pathogenic rust fungi. The pipeline is based on the observation that known effector proteins from filamentous pathogens have at least one of the following properties: (i) contain a secretion signal, (ii) are encoded by in planta induced genes, (iii) have similarity to haustorial proteins, (iv) are small and cysteine rich, (v) contain a known effector motif or a nuclear localization signal, (vi) are encoded by genes with long intergenic regions, (vii) contain internal repeats, and (viii) do not contain PFAM domains, except those associated with pathogenicity. We used Markov clustering and hierarchical clustering to classify protein families of rust pathogens and rank them according to their likelihood of being effectors. Using this approach, we identified eight families of candidate effectors that we consider of high value for functional characterization. This study revealed a diverse set of candidate effectors, including families of haustorial expressed secreted proteins and small cysteine-rich proteins. This comprehensive classification of candidate effectors from these devastating rust pathogens is an initial step towards probing plant germplasm for novel resistance components.

  17. Using hierarchical clustering of secreted protein families to classify and rank candidate effectors of rust fungi.

    Directory of Open Access Journals (Sweden)

    Diane G O Saunders

    Full Text Available Rust fungi are obligate biotrophic pathogens that cause considerable damage on crop plants. Puccinia graminis f. sp. tritici, the causal agent of wheat stem rust, and Melampsora larici-populina, the poplar leaf rust pathogen, have strong deleterious impacts on wheat and poplar wood production, respectively. Filamentous pathogens such as rust fungi secrete molecules called disease effectors that act as modulators of host cell physiology and can suppress or trigger host immunity. Current knowledge on effectors from other filamentous plant pathogens can be exploited for the characterisation of effectors in the genome of recently sequenced rust fungi. We designed a comprehensive in silico analysis pipeline to identify the putative effector repertoire from the genome of two plant pathogenic rust fungi. The pipeline is based on the observation that known effector proteins from filamentous pathogens have at least one of the following properties: (i) contain a secretion signal, (ii) are encoded by in planta induced genes, (iii) have similarity to haustorial proteins, (iv) are small and cysteine rich, (v) contain a known effector motif or a nuclear localization signal, (vi) are encoded by genes with long intergenic regions, (vii) contain internal repeats, and (viii) do not contain PFAM domains, except those associated with pathogenicity. We used Markov clustering and hierarchical clustering to classify protein families of rust pathogens and rank them according to their likelihood of being effectors. Using this approach, we identified eight families of candidate effectors that we consider of high value for functional characterization. This study revealed a diverse set of candidate effectors, including families of haustorial expressed secreted proteins and small cysteine-rich proteins. This comprehensive classification of candidate effectors from these devastating rust pathogens is an initial step towards probing plant germplasm for novel resistance components.

  18. [Hierarchical clustering analysis to detect associations between clinical and pathological features of gastric tumors and hypermethylation of suppressor genes].

    Science.gov (United States)

    Zavala G, Luis; Luengo J, Víctor; Ossandón C, Francisco; Riquelme S, Erick; Backhouse E, Claudia; Palma V, Mariana; Argandoña C, Jorge; Cumsille, Miguel Angel; Corvalán R, Alejandro

    2007-01-01

    Methylation is an inactivation mechanism for tumor suppressor genes that can have important clinical implications. The aim was to analyze the methylation status of 11 tumor suppressor genes in pathological samples of diffuse gastric cancer. Eighty-three patients with diffuse gastric cancer, with information about survival and infection with Epstein Barr virus, were studied. DNA was extracted from pathological slides and the methylation status of genes p14, p15, p16, APC, p73, FHIT, E-cadherin, SEMA3B, BRCA-1, MINT-2 and MGMT was studied using sodium bisulphite modification and polymerase chain reaction. Results were grouped according to the methylation index or hierarchical clustering (TIGR MultiExperiment Viewer). Three genes had a high frequency of methylation (FHIT, BRCA1, APC), four had an intermediate frequency (p15, MGMT, p14, MINT2) and four had a low frequency (p16, p73, E-cadherin, SEMA3B). The methylation index had no association with clinical or pathological features of tumors or patient survival. Hierarchical clustering generated two clusters. One grouped clinical and pathological features with FHIT, BRCA1, and APC and the other grouped the other eight genes and Epstein Barr virus infection. Two significant associations were found: between APC and survival, and between p16/p14 and Epstein Barr virus infection. Hierarchical clustering is a tool that identifies associations between clinical and pathological features of tumors and methylation of tumor suppressor genes.

  19. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Lately, voice classification has been found useful in phone monitoring and in classifying speakers’ gender, ethnicity and emotional states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warping, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.

  20. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2004-01-01

    Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experiment data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noises, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchic clustering method, the k-mean clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).

  1. Quantum Monte Carlo methods and lithium cluster properties. [Atomic clusters]

    Energy Technology Data Exchange (ETDEWEB)

    Owen, R.K.

    1990-12-01

    Properties of small lithium clusters with sizes ranging from n = 1 to 5 atoms were investigated using quantum Monte Carlo (QMC) methods. Cluster geometries were found from complete active space self consistent field (CASSCF) calculations. A detailed development of the QMC method leading to the variational QMC (V-QMC) and diffusion QMC (D-QMC) methods is shown. The many-body aspect of electron correlation is introduced into the QMC importance sampling electron-electron correlation functions by using density dependent parameters, which are shown to increase the amount of correlation energy obtained in V-QMC calculations. A detailed analysis of D-QMC time-step bias is made and the bias is found to be at least linear with respect to the time-step. The D-QMC calculations determined the lithium cluster ionization potentials to be 0.1982(14) (0.1981), 0.1895(9) (0.1874(4)), 0.1530(34) (0.1599(73)), 0.1664(37) (0.1724(110)), 0.1613(43) (0.1675(110)) Hartrees for lithium clusters n = 1 through 5, respectively, in good agreement with the experimental results shown in brackets. Also, the binding energies per atom were computed to be 0.0177(8) (0.0203(12)), 0.0188(10) (0.0220(21)), 0.0247(8) (0.0310(12)), 0.0253(8) (0.0351(8)) Hartrees for lithium clusters n = 2 through 5, respectively. The lithium cluster one-electron density is shown to have charge concentrations corresponding to nonnuclear attractors. The overall shape of the electronic charge density also bears a remarkable similarity to the anisotropic harmonic oscillator model shape for the given number of valence electrons.

  2. Typing of unknown microorganisms based on quantitative analysis of fatty acids by mass spectrometry and hierarchical clustering

    Energy Technology Data Exchange (ETDEWEB)

    Li Tingting; Dai Ling; Li Lun; Hu Xuejiao; Dong Linjie; Li Jianjian; Salim, Sule Khalfan; Fu Jieying [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China); Zhong Hongying, E-mail: hyzhong@mail.ccnu.edu.cn [Key Laboratory of Pesticides and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, Hubei 430079 (China)

    2011-01-17

    Rapid identification of unknown microorganisms of clinical and agricultural importance is not only critical for accurate diagnosis of infections but also essential for appropriate and prompt treatment. We describe here a rapid method for microorganism typing based on quantitative analysis of fatty acids by the iFAT approach (Isotope-coded Fatty Acid Transmethylation). In this work, lyophilized cell lysates were directly mixed with 0.5 M NaOH solution in d3-methanol and n-hexane. After 1 min of ultrasonication, the top n-hexane layer was combined with a mixture of standard d0-methanol derived fatty acid methyl esters of known concentration. Measurement of intensity ratios of d3/d0 labeled fragment ion and molecular ion pairs at the corresponding target fatty acids provides a quantitative basis for hierarchical clustering. In the resultant dendrogram, the Euclidean distance between unknown species and known species quantitatively reveals their differences or shared similarities in fatty acid related pathways. It is of particular interest to apply this method for typing fungal species because, compared with bacteria and animals, fungi have distinct lipid biosynthetic pathways that have been targeted by many drugs and fungicides. The proposed method does not depend on the availability of genome or proteome databases. Therefore, it can be applied to a broad range of unknown microorganisms or mutant species.

  3. A hierarchical method for whole-brain connectivity-based parcellation.

    Science.gov (United States)

    Moreno-Dominguez, David; Anwander, Alfred; Knösche, Thomas R

    2014-10-01

    In modern neuroscience there is general agreement that brain function relies on networks and that connectivity is therefore of paramount importance for brain function. Accordingly, the delineation of functional brain areas on the basis of diffusion magnetic resonance imaging (dMRI) and tractography may lead to highly relevant brain maps. Existing methods typically aim to find a predefined number of areas and/or are limited to small regions of grey matter. However, it is in general not likely that a single parcellation dividing the brain into a finite number of areas is an adequate representation of the function-anatomical organization of the brain. In this work, we propose hierarchical clustering as a solution to overcome these limitations and achieve whole-brain parcellation. We demonstrate that this method encodes the information of the underlying structure at all granularity levels in a hierarchical tree or dendrogram. We develop an optimal tree building and processing pipeline that reduces the complexity of the tree with minimal information loss. We show how these trees can be used to compare the similarity structure of different subjects or recordings and how to extract parcellations from them. Our novel approach yields a more exhaustive representation of the real underlying structure and successfully tackles the challenge of whole-brain parcellation. Copyright © 2014 Wiley Periodicals, Inc.

  4. Discovering hierarchical structure in normal relational data

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten

    2014-01-01

    Hierarchical clustering is a widely used tool for structuring and visualizing complex data using similarity. Traditionally, hierarchical clustering is based on local heuristics that do not explicitly provide assessment of the statistical saliency of the extracted hierarchy. We propose a non-parametric generative model for hierarchical clustering of similarity based on multifurcating Gibbs fragmentation trees. This allows us to infer and display the posterior distribution of hierarchical structures that comply with the data. We demonstrate the utility of our method on synthetic data and data of functional...

  5. Comparing the performance of biomedical clustering methods

    DEFF Research Database (Denmark)

    Wiwie, Christian; Baumbach, Jan; Röttger, Richard

    2015-01-01

    expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. ... This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art....

  6. Grinding Wheel Condition Monitoring with Hidden Markov Model-Based Clustering Methods

    Energy Technology Data Exchange (ETDEWEB)

    Liao, T. W. [Louisiana State University; Hua, G [Louisiana State University; Qu, Jun [ORNL; Blau, Peter Julian [ORNL

    2006-01-01

    The hidden Markov model (HMM) is well known for sequence modeling and has been used for condition monitoring. However, HMM-based clustering methods have been developed only recently. This article proposes an HMM-based clustering method for monitoring the condition of grinding wheels used in grinding operations. The proposed method first extracts features from signals based on discrete wavelet decomposition using a moving window approach. It then generates a distance (dissimilarity) matrix using HMM. Based on this distance matrix, several hierarchical and partitioning-based clustering algorithms are applied to obtain clustering results. The proposed methodology was tested with feature sequences extracted from acoustic emission signals. The results show that clustering accuracy is dependent upon cutting condition. A higher material removal rate seems to produce more discriminatory signals/features than a lower material removal rate. The effects of window size, wavelet decomposition level, wavelet basis, clustering algorithm, and data normalization were also studied.

  7. A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Xiaowei Li

    2017-01-01

    Full Text Available A large number of studies have demonstrated that major depressive disorder (MDD) is characterized by alterations in brain functional connections, which are also identifiable during the brain’s “resting-state.” However, in current studies, the approach of constructing functional connectivity is often biased by the choice of the threshold. Besides, more attention has been paid to the number and length of links in brain networks, and the clustering partitioning of nodes has remained unclear. Therefore, minimum spanning tree (MST) analysis and hierarchical clustering were used for depression for the first time in this study. Resting-state electroencephalogram (EEG) sources were assessed from 15 healthy and 23 major depressive subjects. Then the coherence, MST, and the hierarchical clustering were obtained. In the theta band, coherence analysis showed that the EEG coherence of the MDD patients was significantly higher than that of the healthy controls, especially in the left temporal region. The MST results indicated a higher leaf fraction in the depressed group. Compared with the normal group, the major depressive patients lost clustering in frontal regions. Our findings suggested that there was a stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions for MDD controls.
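
    As a rough illustration of the MST step described in this abstract, the sketch below builds a spanning tree from a small, made-up coherence matrix using Kruskal's algorithm and reports the leaf fraction. The coherence values are hypothetical, converting coherence to a distance as 1 - coherence is an assumption, and the leaf fraction is taken here as the number of degree-1 nodes divided by the number of tree edges (N - 1), which is one common convention; the abstract does not state the authors' exact definition.

```python
def minimum_spanning_tree(dist):
    """Kruskal's algorithm on a symmetric distance matrix (list of lists)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it cannot form a cycle
            parent[ri] = rj
            tree.append((i, j, w))
            if len(tree) == n - 1:
                break
    return tree


def leaf_fraction(tree, n):
    """Leaves (degree-1 nodes) divided by the number of tree edges (assumed convention)."""
    degree = [0] * n
    for i, j, _ in tree:
        degree[i] += 1
        degree[j] += 1
    leaves = sum(1 for d in degree if d == 1)
    return leaves / (n - 1)


# Hypothetical coherence between four EEG sources (illustrative values only).
coherence = [
    [1.0, 0.9, 0.8, 0.7],
    [0.9, 1.0, 0.3, 0.2],
    [0.8, 0.3, 1.0, 0.1],
    [0.7, 0.2, 0.1, 1.0],
]
# Assumption: strong coherence corresponds to a short distance.
distance = [[1.0 - c for c in row] for row in coherence]
mst = minimum_spanning_tree(distance)
lf = leaf_fraction(mst, len(coherence))
print(mst, lf)
```

    In this toy matrix one source is strongly coupled to all others, so the tree is star-like and every other node is a leaf; a chain-like tree would give a much lower leaf fraction.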

  8. Using Hierarchical Clustering in Order to Increase Efficiency of Self-Organizing Feature Map to Identify Hydrological Homogeneous Regions for Flood Estimation

    Directory of Open Access Journals (Sweden)

    F. Farsadnia

    2017-01-01

    Full Text Available Introduction: Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among the conventional methods to assess hydrological homogeneous regions. Recently, the Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation of its output map. Therefore, SOM is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature Map and Ward hierarchical clustering method to determine the hydrologic homogeneous regions in North and Razavi Khorasan provinces. Materials and Methods: SOM approximates the probability density function of input data through an unsupervised learning algorithm, and is not only an effective method for clustering, but also for the visualization and abstraction of complex data. The algorithm has properties of neighborhood preservation and local resolution of the input space proportional to the data distribution. A SOM consists of two layers: an input layer formed by a set of nodes and an output layer formed by nodes arranged in a two-dimensional grid. In this study we used SOM for visualization and clustering of watersheds based on physiographical data in North and Razavi Khorasan provinces. In the next step, SOM weight vectors were used to classify the units by Ward’s agglomerative hierarchical clustering (Ward) method. Ward’s algorithm is a frequently used technique for regionalization studies in hydrology and climatology. It is based on the assumption that if two clusters are merged, the resulting loss of information, or change in the value of the objective function, will depend only on the relationship between the two merged clusters and not on the relationships with any other clusters. After the formation of clusters by SOM and Ward, the most frequently applied tests of regional homogeneity based on the theory of L
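
    The property of Ward's method mentioned in this abstract, that the cost of a merge depends only on the two clusters being merged, can be checked with a minimal sketch. It assumes the usual within-cluster sum-of-squares objective, for which the merge cost is n_a * n_b / (n_a + n_b) times the squared distance between the two centroids. The point coordinates are made up and stand in for SOM weight vectors.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def sum_of_squares(points):
    """Within-cluster sum of squared distances to the centroid."""
    c = centroid(points)
    return sum(sum((p[k] - c[k]) ** 2 for k in range(len(c))) for p in points)


def ward_merge_cost(a, b):
    """Increase in total within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


# Hypothetical 2-D weight vectors for two small clusters.
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[4.0, 3.0], [6.0, 3.0]]

cost = ward_merge_cost(cluster_a, cluster_b)
# Direct check: objective after the merge minus objective before it.
increase = (
    sum_of_squares(cluster_a + cluster_b)
    - sum_of_squares(cluster_a)
    - sum_of_squares(cluster_b)
)
print(cost, increase)
```

    Both quantities agree, and neither involves any cluster other than the two being merged, which is why an agglomerative procedure can evaluate candidate merges pair by pair.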

  9. Multimorbidity and health-related quality of life (HRQoL) in a nationally representative population sample: implications of count versus cluster method for defining multimorbidity on HRQoL

    National Research Council Canada - National Science Library

    Wang, Lili; Palmer, Andrew J; Cocker, Fiona; Sanderson, Kristy

    2017-01-01

    ...). The HRQoL scores were measured using the Assessment of Quality of Life (AQoL-4D) instrument. The simple count (2+ & 3+ conditions) and hierarchical cluster methods were used to define/identify clusters of multimorbidity...

  10. "Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping out and Hierarchical Cluster Analysis"

    Directory of Open Access Journals (Sweden)

    Alex J. Bowers

    2010-05-01

    Full Text Available School personnel currently lack an effective method to pattern and visually interpret disaggregated achievement data collected on students as a means to help inform decision making. This study, through the examination of longitudinal K-12 teacher assigned grading histories for entire cohorts of students from a school district (n=188), demonstrates a novel application of hierarchical cluster analysis and pattern visualization in which all data points collected on every student in a cohort can be patterned, visualized and interpreted to aid in data driven decision making by teachers and administrators. Additionally, as a proof-of-concept study, overall schooling outcomes, such as student dropout or taking a college entrance exam, are identified from the data patterns and compared to past methods of dropout identification as one example of the usefulness of the method. Hierarchical cluster analysis correctly identified over 80% of the students who dropped out using the entire student grade history patterns from either K-12 or K-8.

  11. The polarizable embedding coupled cluster method

    DEFF Research Database (Denmark)

    Sneskov, Kristian; Schwabe, Tobias; Kongsted, Jacob

    2011-01-01

    We formulate a new combined quantum mechanics/molecular mechanics (QM/MM) method based on a self-consistent polarizable embedding (PE) scheme. For the description of the QM region, we apply the popular coupled cluster (CC) method detailing the inclusion of electrostatic and polarization effects...

  12. Simultaneous Determination of Four Triterpenoid Saponins in Aralia elata Leaves by HPLC-ELSD Combined with Hierarchical Clustering Analysis.

    Science.gov (United States)

    Sun, Yichun; Li, Baimei; Lin, Xiaoting; Xue, Juan; Wang, Zhibin; Zhang, Hongwei; Jiang, Hai; Wang, Qiuhong; Kuang, Haixue

    2017-05-01

    Aralia elata leaves are known to have several biological activities, including anti-arrhythmia, antitumor, anti-inflammatory, anti-fatigue, antimicrobial and antiviral effects. Our previous study found that triterpenoid saponins from the leaves of A. elata had antitumor effects. Quantification of the triterpenoids is important for the quality control of A. elata leaves. The objective was to establish high-performance liquid chromatography coupled with evaporative light scattering detection (HPLC-ELSD) for the simultaneous determination of four major triterpenoid saponins, including Aralia-saponin IV, Aralia-saponin VI, 3-O-β-d-glucopyranosyl-(1 → 3)-β-d-glucopyranosyl-(1 → 3)-β-d-glucopyranosyl oleanolic acid 28-O-β-d-glucopyranoside (Aralia-saponin TTP) and Aralia-saponin V. The separation was carried out efficiently on a Dikma Diamonsil C18 column (4.6 mm × 250 mm, 5 μm) with gradient elution consisting of acetonitrile and water. All calibration curves showed good linear regression (R² > 0.9996) within the ranges of tested concentrations. This validated method was applied to determine the contents of the four major triterpenoid saponins in 53 samples from different regions of northeast China. Hierarchical clustering analysis was used for the first time to classify and differentiate Aralia elata leaves. The method developed was successfully applied to analyse four major triterpenoid saponins in Aralia elata leaves, which is helpful for quality control of the herb. Copyright © 2017 John Wiley & Sons, Ltd.

  13. METHOD OF CONSTRUCTION OF GENETIC DATA CLUSTERS

    Directory of Open Access Journals (Sweden)

    N. A. Novoselova

    2016-01-01

    Full Text Available The paper presents a method of construction of genetic data clusters (functional modules) using randomized matrices. To build the functional modules, the eigenvalues of the gene profile correlation matrix are selected and analysed. The principal components corresponding to the eigenvalues that are significantly different from those obtained for a randomly generated correlation matrix are used for the analysis. Each selected principal component forms a gene cluster. In a comparative experiment with analogous methods, the proposed method shows advantages in allocating statistically significant clusters of different sizes, in filtering non-informative genes, and in extracting biologically interpretable functional modules matching the real data structure.

  14. Dynamic Hierarchical Energy-Efficient Method Based on Combinatorial Optimization for Wireless Sensor Networks.

    Science.gov (United States)

    Chang, Yuchao; Tang, Hongying; Cheng, Yongbo; Zhao, Qin; Li, Baoqing; Yuan, Xiaobing

    2017-07-19

    Routing protocols based on topology control are significantly important for improving network longevity in wireless sensor networks (WSNs). Traditionally, some WSN routing protocols distribute uneven network traffic load to sensor nodes, which is not optimal for improving network longevity. Unlike conventional WSN routing protocols, we propose a dynamic hierarchical protocol based on combinatorial optimization (DHCO) to balance energy consumption of sensor nodes and to improve WSN longevity. For each sensor node, the DHCO algorithm obtains the optimal route by establishing a feasible routing set instead of selecting the cluster head or the next hop node. The process of obtaining the optimal route can be formulated as a combinatorial optimization problem. Specifically, the DHCO algorithm is carried out by the following procedures. It employs a hierarchy-based connection mechanism to construct a hierarchical network structure in which each sensor node is assigned to a special hierarchical subset; it utilizes the combinatorial optimization theory to establish the feasible routing set for each sensor node, and takes advantage of the maximum-minimum criterion to obtain their optimal routes to the base station. Various results of simulation experiments show the effectiveness and superiority of the DHCO algorithm in comparison with state-of-the-art WSN routing algorithms, including low-energy adaptive clustering hierarchy (LEACH), hybrid energy-efficient distributed clustering (HEED), genetic protocol-based self-organizing network clustering (GASONeC), and double cost function-based routing (DCFR) algorithms.

  15. Dynamic Hierarchical Energy-Efficient Method Based on Combinatorial Optimization for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yuchao Chang

    2017-07-01

    Full Text Available Routing protocols based on topology control are significantly important for improving network longevity in wireless sensor networks (WSNs). Traditionally, some WSN routing protocols distribute uneven network traffic load to sensor nodes, which is not optimal for improving network longevity. Unlike conventional WSN routing protocols, we propose a dynamic hierarchical protocol based on combinatorial optimization (DHCO) to balance energy consumption of sensor nodes and to improve WSN longevity. For each sensor node, the DHCO algorithm obtains the optimal route by establishing a feasible routing set instead of selecting the cluster head or the next hop node. The process of obtaining the optimal route can be formulated as a combinatorial optimization problem. Specifically, the DHCO algorithm is carried out by the following procedures. It employs a hierarchy-based connection mechanism to construct a hierarchical network structure in which each sensor node is assigned to a special hierarchical subset; it utilizes the combinatorial optimization theory to establish the feasible routing set for each sensor node, and takes advantage of the maximum–minimum criterion to obtain their optimal routes to the base station. Various results of simulation experiments show the effectiveness and superiority of the DHCO algorithm in comparison with state-of-the-art WSN routing algorithms, including low-energy adaptive clustering hierarchy (LEACH), hybrid energy-efficient distributed clustering (HEED), genetic protocol-based self-organizing network clustering (GASONeC), and double cost function-based routing (DCFR) algorithms.

  16. Message clusterization method based on archive transformation

    Directory of Open Access Journals (Sweden)

    Олексій Олександрович Сірий

    2015-06-01

    Full Text Available This article presents a method for identifying the parameters of a text and classifying them by means of archiving (compression). Using the direct relationship between entropy and archiving with the LZ77 and Huffman algorithms, the characteristics of a text are identified; these help to determine its language, style and authorship, and to cluster data files by topic relevance.
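
    A minimal sketch of the general idea behind compression-based text comparison follows. It is not the article's algorithm: it uses the normalized compression distance (NCD), a standard measure chosen here as an assumption, together with Python's zlib module, whose DEFLATE format combines LZ77 matching with Huffman coding. The sample texts are invented.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the DEFLATE-compressed data at maximum compression level."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x helps to compress y."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical messages: two in English prose, one made of digits.
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = b"the lazy dog sleeps while the quick brown fox runs " * 20
text_c = b"3141592653589793 2718281828459045 " * 20

for name, other in (("a-a", text_a), ("a-b", text_b), ("a-c", text_c)):
    print(name, round(ncd(text_a, other), 3))
```

    A pairwise matrix of such distances could then be passed to any hierarchical clustering routine to group messages; that pairing is a design choice of this sketch, not something the abstract specifies.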

  17. Prediction and characterization of P-glycoprotein substrates potentially bound to different sites by emerging chemical pattern and hierarchical cluster analysis.

    Science.gov (United States)

    Pan, Xianchao; Mei, Hu; Qu, Sujun; Huang, Shuheng; Sun, Jiaying; Yang, Li; Chen, Hua

    2016-04-11

    P-glycoprotein (P-gp), an ATP-binding cassette (ABC) multidrug transporter, can actively transport a broad spectrum of chemically diverse substrates out of cells and is heavily involved in multidrug resistance (MDR) in tumors. So far, the multiple specific binding sites remain a major obstacle in developing an efficient prediction method for P-gp substrates. Herein, emerging chemical pattern (ECP) combined with hierarchical cluster analysis was utilized to predict P-gp substrates as well as their potential binding sites. An optimal ECP model using only 3 descriptors was established with prediction accuracies of 0.80, 0.81 and 0.74 for 803 training samples, 120 test samples, and 179 independent validation samples, respectively. Hierarchical cluster analysis (HCA) of the ECPs of P-gp substrates derived 2 distinct ECP groups (ECPGs). Interestingly, HCA of the P-gp substrates based on ECP similarities also showed 2 distinct classes, which happened to be dominated by the 2 ECPGs, respectively. In the light of available experimental evidence and molecular docking results, the 2 distinct ECPGs were shown to be closely related to the binding profiles of R- and H-site substrates, respectively. The present study demonstrates, for the first time, a successful ECP model, which can not only accurately predict P-gp substrates, but also identify their potential substrate-binding sites. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Recent progress in the direct synthesis of hierarchical zeolites: synthetic strategies and characterization methods

    KAUST Repository

    Liu, Zhaohui

    2017-06-16

    Hierarchically structured zeolites combine the merits of microporous zeolites and mesoporous materials to offer enhanced molecular diffusion and mass transfer without compromising the inherent catalytic activities and selectivity of zeolites. This short review gives an introduction to the synthesis strategies for hierarchically structured zeolites with emphasis on the latest progress in the route of ‘direct synthesis’ using various templates. Several characterization methods that allow us to evaluate the ‘quality’ of complex porous structures are also introduced. At the end of this review, an outlook is given to discuss some critical issues and challenges regarding the development of novel hierarchically structured zeolites as well as their applications.

  19. Multilook SAR Image Segmentation with an Unknown Number of Clusters Using a Gamma Mixture Model and Hierarchical Clustering.

    Science.gov (United States)

    Zhao, Quanhua; Li, Xiaoli; Li, Yu

    2017-05-12

    This paper presents a novel multilook SAR image segmentation algorithm with an unknown number of clusters. Firstly, the marginal probability distribution for a given SAR image is defined by a Gamma mixture model (GaMM), in which the number of components corresponds to the number of homogeneous regions to be segmented and the spatial relationship among neighboring pixels is characterized by a Markov Random Field (MRF) defined by the weighting coefficients of components in GaMM. During the algorithm iteration procedure, the number of clusters is gradually reduced by merging two components until it is equal to one. For each fixed number of clusters, the parameters of GaMM are estimated and the optimal segmentation result corresponding to that number is obtained by maximizing the marginal probability. Finally, the number of clusters with minimum global energy, defined as the negative logarithm of marginal probability, is taken as the expected number of homogeneous regions to be segmented, and the corresponding segmentation result is considered the final optimal one. The experimental results from the proposed and comparison algorithms for simulated and real multilook SAR images show that the proposed algorithm can find the real number of clusters and obtain more accurate segmentation results simultaneously.

  20. Hierarchical cluster analysis applied to workers' exposures in fiberglass insulation manufacturing.

    Science.gov (United States)

    Wu, J D; Milton, D K; Hammond, S K; Spear, R C

    1999-01-01

    The objectives of this study were to explore the application of cluster analysis to the characterization of multiple exposures in industrial hygiene practice and to compare exposure groupings based on the result from cluster analysis with those based on non-measurement-based approaches commonly used in epidemiology. Cluster analysis was performed for 37 workers simultaneously exposed to three agents (endotoxin, phenolic compounds and formaldehyde) in fiberglass insulation manufacturing. Different clustering algorithms, including complete-linkage (or farthest-neighbor), single-linkage (or nearest-neighbor), group-average and model-based clustering approaches, were used to construct the tree structures from which clusters can be formed. Differences were observed between the exposure clusters constructed by these different clustering algorithms. When contrasting the exposure classification based on tree structures with that based on non-measurement-based information, the results indicate that the exposure clusters identified from the tree structures had little in common with the classification results from either the traditional exposure zone or the work group classification approach. In terms of defining homogeneous exposure groups or from the standpoint of health risk, some toxicological normalization in the components of the exposure vector appears to be required in order to form meaningful exposure groupings from cluster analysis. Finally, it remains important to see if the lack of correspondence between exposure groups based on epidemiological classification and measurement data is a peculiarity of the data or a more general problem in multivariate exposure analysis.
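
    The observation that different linkage rules can produce different clusters from the same data can be reproduced with a short, naive agglomerative sketch. The five one-dimensional "exposure" values are made up for illustration, and the function names are hypothetical; the model-based approach mentioned in the abstract is not covered.

```python
import math
from itertools import combinations


def cluster_distance(a, b, linkage):
    """Distance between two clusters under the chosen linkage rule."""
    dists = [math.dist(p, q) for p in a for q in b]
    if linkage == "single":    # nearest neighbor
        return min(dists)
    if linkage == "complete":  # farthest neighbor
        return max(dists)
    if linkage == "average":   # group average
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, k, linkage):
    """Merge the closest pair of clusters repeatedly until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], linkage),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical exposure measurements with gradually widening gaps.
points = [(0.0,), (1.0,), (2.4,), (4.5,), (7.5,)]

for linkage in ("single", "complete", "average"):
    print(linkage, agglomerate(points, 3, linkage))
```

    With these values, single and average linkage chain the third point onto the first two, whereas complete linkage pairs it with the fourth point instead, so the three-cluster solutions differ even on this tiny data set.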

  1. Controllable synthesis of hierarchical strontium molybdate by sonochemical method

    Energy Technology Data Exchange (ETDEWEB)

    Jiang, Wanquan; Zhu, Wei [Department of Chemistry, University of Science and Technology of China (USTC), Hefei 230026 (China); Peng, Chao; Yang, Fan; Xuan, Shouhu; Gong, Xinglong [CAS Key Laboratory of Mechanical Behavior and Design of Materials, Department of Modern Mechanics, USTC, Hefei 230027 (China)

    2012-09-15

    Large-scale chrysanthemum-like strontium molybdate (SrMoO4) with hierarchical structure has been successfully synthesized via a facile and fast ultrasound irradiation approach at room temperature. By varying the experimental conditions, SrMoO4 with different morphologies, such as spindles, peanuts, spheres, and rods, can be obtained. The products are characterized by X-ray diffraction (XRD), scanning electron microscopy (SEM), transmission electron microscopy (TEM), and selected-area electron diffraction (SAED). The influencing parameters, including concentration, pH value, and surfactants, have been investigated. A possible growth mechanism is proposed and the shape evolution of the products is characterized. The as-prepared chrysanthemum-like SrMoO4 particles are used as the precursor for electrorheological fluid and their electrorheological property is investigated. (Copyright 2012 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)

  2. A method for identifying hierarchical sub-networks / modules and weighting network links based on their similarity in sub-network / module affiliation

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2016-06-01

    Full Text Available Some networks, including biological networks, consist of hierarchical sub-networks / modules. Based on my previous study, the present study proposes a method for both identifying hierarchical sub-networks / modules and weighting network links. It is based on cluster analysis in which between-node similarity in sets of adjacent nodes is used. The algorithm produces two matrices, linkWeightMat and linkClusterIDs. Two links with both the same weight in linkWeightMat and the same cluster ID in linkClusterIDs belong to the same sub-network / module. Two links with the same weight in linkWeightMat but different cluster IDs in linkClusterIDs belong to two sub-networks / modules at the same hierarchical level. However, a link with a unique cluster ID in linkClusterIDs does not belong to any sub-network / module. A sub-network / module with a greater weight is more strongly connected. Matlab code for the algorithm is presented.

  3. Biomolecule-assisted hydrothermal synthesis and self-assembly of Bi2Te3 nanostring-cluster hierarchical structure.

    Science.gov (United States)

    Mi, Jian-Li; Lock, Nina; Sun, Ting; Christensen, Mogens; Søndergaard, Martin; Hald, Peter; Hng, Huey H; Ma, Jan; Iversen, Bo B

    2010-05-25

    A simple biomolecule-assisted hydrothermal approach has been developed for the fabrication of Bi2Te3 thermoelectric nanomaterials. The product has a nanostring-cluster hierarchical structure which is composed of ordered and aligned platelet-like crystals. The platelets are approximately 100 nm in diameter and only approximately 10 nm thick even though a high reaction temperature of 220 °C and a long reaction time of 24 h were applied to prepare the sample. The growth of the Bi2Te3 hierarchical structure appears to be a self-assembly process. Initially, Te nanorods are formed using alginic acid as both reductant and template. Subsequently, Bi2Te3 grows in a certain direction on the surface of the Te rods, resulting in the nanostring structure. The nanostrings further recombine side-by-side with each other to achieve the ordered nanostring clusters. The particle size and morphology can be controlled by adjusting the concentration of NaOH, which plays a crucial role in the formation mechanism of Bi2Te3. An even smaller polycrystalline Bi2Te3 superstructure composed of polycrystalline nanorods with some nanoplatelets attached to the nanorods is achieved at lower NaOH concentration. The room temperature thermoelectric properties have been evaluated with an average Seebeck coefficient of -172 µV K⁻¹, an electrical resistivity of 1.97 × 10⁻³ Ω m, and a thermal conductivity of 0.29 W m⁻¹ K⁻¹.

  4. Fuzzy Clustering - Principles, Methods and Examples

    DEFF Research Database (Denmark)

    Kroszynski, Uri; Zhou, Jianjun

    1998-01-01

    One of the most remarkable advances in the field of identification and control of systems (in particular mechanical systems) whose behaviour cannot be described by means of the usual mathematical models has been achieved by the application of methods of fuzzy theory. In the framework of a study ...... of the methods. The examples were solved by hand and served as a test bench for exploration of the MATLAB capabilities included in the Fuzzy Control Toolbox. The fuzzy clustering methods described include Fuzzy c-means (FCM), Fuzzy c-lines (FCL) and Fuzzy c-elliptotypes (FCE)....
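
    As a rough illustration of the first method listed, the standard Fuzzy c-means iteration alternates a membership update and a center update. The sketch below is a plain-Python version for one-dimensional data with fuzzifier m = 2; the function name, data and starting centers are assumptions made for illustration and do not come from the report or from any MATLAB toolbox.

```python
def fuzzy_c_means(xs, centers, m=2.0, iters=50):
    """Standard FCM for 1-D data. Returns (centers, memberships).

    memberships[k][i] is the degree to which point k belongs to cluster i,
    as computed in the final iteration; each row sums to 1.
    """
    p = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: closer centers get higher membership.
        u = []
        for x in xs:
            d = [max(abs(x - c), 1e-12) for c in centers]  # avoid division by zero
            u.append([1.0 / sum((d[i] / d[j]) ** p for j in range(len(centers)))
                      for i in range(len(centers))])
        # Center update: mean of the points weighted by membership**m.
        centers = [sum(u[k][i] ** m * xs[k] for k in range(len(xs))) /
                   sum(u[k][i] ** m for k in range(len(xs)))
                   for i in range(len(centers))]
    return centers, u

# Hypothetical data: two groups, around 1.0 and around 5.0.
xs = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, u = fuzzy_c_means(xs, centers=[0.0, 6.0])
print([round(c, 2) for c in centers])
print([round(v, 3) for v in u[0]])  # memberships of the first point
```

    Unlike hard clustering, every point keeps a graded membership in every cluster, which is what distinguishes the fuzzy variants named in the abstract.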

  5. Applying Clustering Methods in Drawing Maps of Science: Case Study of the Map For Urban Management Science

    Directory of Open Access Journals (Sweden)

    Mohammad Abuei Ardakan

    2010-04-01

    Full Text Available The present paper offers a basic introduction to data clustering and demonstrates the application of clustering methods in drawing maps of science. Approaches to the classification and clustering of information are briefly discussed. Their application to the visualization of conceptual information and the drawing of science maps is illustrated by reviewing similar research in this field. By implementing an agglomerative hierarchical clustering algorithm based on the complete-link method, the map for urban management science, an emerging interdisciplinary scientific field, is analyzed and reviewed.

  6. Recent advances in coupled-cluster methods

    CERN Document Server

    Bartlett, Rodney J

    1997-01-01

    Today, coupled-cluster (CC) theory has emerged as the most accurate, widely applicable approach for the correlation problem in molecules. Furthermore, the correct scaling of the energy and wavefunction with size (i.e. extensivity) recommends it for studies of polymers and crystals as well as molecules. CC methods have also paid dividends for nuclei, and for certain strongly correlated systems of interest in field theory. In order for CC methods to have achieved this distinction, it has been necessary to formulate new, theoretical approaches for the treatment of a variety of essential quantities

  7. Energy flow in plate assembles by hierarchical version of finite element method

    DEFF Research Database (Denmark)

    Wachulec, Marcin; Kirkegaard, Poul Henning

    the finite element method has been used to study the energy flow. The finite element method proved its usefulness despite the computational expense. Therefore studies have been conducted in order to simplify and reduce the computations required. Among others, the use of hierarchical version of finite element...... method has been proposed. In this paper a modified hierarchical version of the finite element method is used for modelling of energy flow in plate assemblies. The formulation includes a description of in-plane forces so that plates lying in different planes can be modelled. Two examples considered are: L......-corner of two rectangular plates and an I-shaped plate girder made of five plates. Energy distribution among plates due to harmonic load is studied and the comparison of performance between the hierarchical and standard finite element formulations is presented....

  8. Clustering of hydrological data: a review of methods for runoff predictions in ungauged basins

    Science.gov (United States)

    Dogulu, Nilay; Kentel, Elcin

    2017-04-01

    A large body of research has addressed the challenge of hydrological predictions in ungauged basins, driven by the Prediction in Ungauged Basins (PUB) initiative of the International Association of Hydrological Sciences (IAHS). Transfer of hydrological information (e.g. model parameters, flow signatures) from gauged to ungauged catchments, often referred to as "regionalization", is the main objective and benefits from identification of hydrologically homogeneous regions. Within this context, indirect representation of hydrologic similarity for ungauged catchments, which is not a straightforward task due to the absence of streamflow measurements and insufficient knowledge of hydrologic behavior, has been explored in the literature. To this end, clustering methods have been widely adopted. While most studies employ hard clustering techniques such as hierarchical (divisive or agglomerative) clustering, there have been more recent attempts that take advantage of fuzzy set theory (fuzzy clustering) and nonlinear methods (e.g. self-organizing maps). The relevant research findings from this fundamental task of hydrologic sciences have revealed the value of different clustering methods for improved understanding of catchment hydrology. However, despite these advancements, there still remain challenges, and also opportunities, for research on clustering for regionalization purposes. The present work provides an overview of clustering techniques and their applications in hydrology with a focus on regionalization for the PUB problem. Identifying their advantages and disadvantages, we discuss the potential of innovative clustering methods and reflect on future challenges in view of the research objectives of the PUB initiative.

  9. The hierarchical teaching method exploration for curriculum design of photoelectric discipline

    Science.gov (United States)

    Gong, Huaping; Liang, Pei; Jin, Yongxing; Xu, Sunan; Zhang, Yan

    2017-08-01

    This paper introduces an exploration of the hierarchical teaching method for the curriculum design of the photoelectric discipline. In response to the basic problems that are widespread in current teaching of the curriculum design practical course, new suggestions are discussed regarding teaching contents, experimental schemes, instruction modes and assessment methods. The curriculum design practical course should be kept up to date with professional hot spots. By combining large-class instruction and group instruction, a hierarchical teaching mode is established, which implements layered training across a wide range for all students. With these efforts, the teaching method of the curriculum design practical course can be improved.

  10. Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods.

    Science.gov (United States)

    Notaro, Marco; Schubach, Max; Robinson, Peter N; Valentini, Giorgio

    2017-10-12

    The prediction of human gene-abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene-disease associations has been widely investigated, the related problem of gene-phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a "flat" learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
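
    The consistency requirement described above (the true path rule: a gene annotated with a term is also annotated with all of that term's ancestors) can be illustrated with a minimal top-down correction of flat scores. This is only a sketch of the general idea under simple assumptions, not the authors' ensemble algorithms or their R package; the term names and scores are hypothetical.

```python
def enforce_true_path(flat_scores, parents):
    """Cap each term's score by the corrected scores of its parents.

    flat_scores: {term: score from a flat classifier}
    parents:     {term: list of parent terms in the ontology DAG}
    After correction no term scores higher than any of its ancestors.
    """
    corrected = {}

    def score(term):
        if term not in corrected:
            parent_scores = [score(p) for p in parents.get(term, [])]
            corrected[term] = min([flat_scores[term]] + parent_scores)
        return corrected[term]

    for term in flat_scores:
        score(term)
    return corrected

# Hypothetical four-term DAG: A is the root, D has two parents (B and C).
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
flat = {"A": 0.6, "B": 0.9, "C": 0.4, "D": 0.8}
print(enforce_true_path(flat, parents))
# {'A': 0.6, 'B': 0.6, 'C': 0.4, 'D': 0.4}
```

    The flat scores for B and D were inconsistent with their ancestors; after the top-down pass every child score is bounded by its parents, whichever flat learner produced the inputs.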

  11. Using LC and Hierarchical Cluster Analysis as Tools to Distinguish Timbó Collections into Two Deguelia Species: A Contribution to Chemotaxonomy.

    Science.gov (United States)

    da Costa, Danielle; E Silva, Consuelo; Pinheiro, Aline; Frommenwiler, Débora; Arruda, Mara; Guilhon, Giselle; Alves, Cláudio; Arruda, Alberto; Da Silva, Milton

    2016-04-30

    The species Deguelia utilis and Deguelia rufescens var. urucu, popularly known as "timbó," have been used for many years as rotenone sources in insecticide formulations. In this work, a method was developed and validated using a high-performance liquid chromatography-photodiode array (HPLC-PDA) system, and results were analyzed using hierarchical cluster analysis (HCA). By quantifying the major rotenoids of these species, it was possible to establish a linear relation between them. The ratio between the concentrations of rotenone and deguelin for D. utilis is approximately 1:0.8, respectively, while for D. rufescens var. urucu it is 2:1. These results may help to distinguish these species contributing to their taxonomic identification.

  12. Testing perturbative results with non-perturbative methods for the Hierarchical model

    OpenAIRE

    Meurice, Y.; Oktay, M. B.

    2000-01-01

    We present non-perturbative methods to calculate accurately the renormalized quantities for Dyson's Hierarchical Model. We apply this method and calculate the critical exponent gamma with 12 and 4 significant digits in the high and low temperature phases, respectively. We report accurate values for universal ratios of amplitudes and preliminary results concerning the comparison with perturbative results.

  13. Using hierarchical cluster models to systematically identify groups of jobs with similar occupational questionnaire response patterns to assist rule-based expert exposure assessment in population-based studies.

    Science.gov (United States)

    Friesen, Melissa C; Shortreed, Susan M; Wheeler, David C; Burstyn, Igor; Vermeulen, Roel; Pronk, Anjoeka; Colt, Joanne S; Baris, Dalsu; Karagas, Margaret R; Schwenn, Molly; Johnson, Alison; Armenti, Karla R; Silverman, Debra T; Yu, Kai

    2015-05-01

    Rule-based expert exposure assessment based on questionnaire response patterns in population-based studies improves the transparency of the decisions. The number of unique response patterns, however, can be nearly equal to the number of jobs. An expert may reduce the number of patterns that need assessment using expert opinion, but each expert may identify different patterns of responses that identify an exposure scenario. Here, hierarchical clustering methods are proposed as a systematic data reduction step to reproducibly identify similar questionnaire response patterns prior to obtaining expert estimates. As a proof-of-concept, we used hierarchical clustering methods to identify groups of jobs (clusters) with similar responses to diesel exhaust-related questions and then evaluated whether the jobs within a cluster had similar (previously assessed) estimates of occupational diesel exhaust exposure. Using the New England Bladder Cancer Study as a case study, we applied hierarchical cluster models to the diesel-related variables extracted from the occupational history and job- and industry-specific questionnaires (modules). Cluster models were separately developed for two subsets: (i) 5395 jobs with ≥1 variable extracted from the occupational history indicating a potential diesel exposure scenario, but without a module with diesel-related questions; and (ii) 5929 jobs with both occupational history and module responses to diesel-relevant questions. For each subset, we varied the numbers of clusters extracted from the cluster tree developed for each model from 100 to 1000 groups of jobs. Using previously made estimates of the probability (ordinal), intensity (µg m⁻³ respirable elemental carbon), and frequency (hours per week) of occupational exposure to diesel exhaust, we examined the similarity of the exposure estimates for jobs within the same cluster in two ways. First, the clusters' homogeneity (defined as >75% with the same estimate) was examined compared

  14. Hierarchical outranking methods for multi-criteria decision aiding

    OpenAIRE

    Del Vasto Terrientes, Luis Miguel

    2015-01-01

    Multi-Criteria Decision Aiding (MCDA) methods support complex decision making involving multiple and conflictive criteria. MCDA distinguishes two main approaches to deal with this type of problems: utility-based and outranking methods, each with its own strengths and weaknesses. Outranking methods are based on social choice models combined with Artificial Intelligence techniques (such as the management of categorical data or uncertainty). They are recognized as providing tools for a realisti...

  15. A novel 3D constellation-masked method for physical security in hierarchical OFDMA system.

    Science.gov (United States)

    Zhang, Lijia; Liu, Bo; Xin, Xiangjun; Liu, Deming

    2013-07-01

    This paper proposes a novel 3D constellation-masked method to ensure the physical security in hierarchical optical orthogonal frequency division multiplexing access (OFDMA) system. The 3D constellation masking is executed on the two levels of hierarchical modulation and among different OFDM subcarriers, which is realized by the masking vectors. The Lorenz chaotic model is adopted for the generation of masking vectors in the proposed scheme. A 9.85 Gb/s encrypted hierarchical QAM OFDM signal is successfully demonstrated in the experiment. The performance of illegal optical network unit (ONU) with different masking vectors is also investigated. The proposed method is demonstrated to be secure and efficient against the commonly known attacks in the experiment.

  16. Hierarchical Control of Nitrite Respiration by Transcription Factors Encoded within Mobile Gene Clusters of Thermus thermophilus.

    Science.gov (United States)

    Alvarez, Laura; Quintáns, Nieves G; Blesa, Alba; Baquedano, Ignacio; Mencía, Mario; Bricio, Carlos; Berenguer, José

    2017-12-01

    Denitrification in Thermus thermophilus is encoded by the nitrate respiration conjugative element (NCE) and nitrite and nitric oxide respiration (nic) gene clusters. A tight coordination of each cluster's expression is required to maximize anaerobic growth, and to avoid toxicity by intermediates, especially nitric oxides (NO). Here, we study the control of the nitrite reductases (Nir) and NO reductases (Nor) upon horizontal acquisition of the NCE and nic clusters by a formerly aerobic host. Expression of the nic promoters PnirS, PnirJ, and PnorC, depends on the oxygen sensor DnrS and on the DnrT protein, both NCE-encoded. NsrR, a nic-encoded transcription factor with an iron-sulfur cluster, is also involved in Nir and Nor control. Deletion of nsrR decreased PnorC and PnirJ transcription, and activated PnirS under denitrification conditions, exhibiting a dual regulatory role never described before for members of the NsrR family. On the basis of these results, a regulatory hierarchy is proposed, in which under anoxia, there is a pre-activation of the nic promoters by DnrS and DnrT, and then NsrR leads to Nor induction and Nir repression, likely as a second stage of regulation that would require NO detection, thus avoiding accumulation of toxic levels of NO. The whole system appears to work in remarkable coordination to function only when the relevant nitrogen species are present inside the cell.

  17. Hierarchical cluster analysis of labour market regulations and population health: a taxonomy of low- and middle-income countries.

    Science.gov (United States)

    Muntaner, Carles; Chung, Haejoo; Benach, Joan; Ng, Edwin

    2012-04-18

    An important contribution of the social determinants of health perspective has been to inquire about non-medical determinants of population health. Among these, labour market regulations are of vital significance. In this study, we investigate the labour market regulations among low- and middle-income countries (LMICs) and propose a labour market taxonomy to further understand population health in a global context. Using Gross National Product per capita, we classify 113 countries into either low-income (n = 71) or middle-income (n = 42) strata. Principal component analysis of three standardized indicators of labour market inequality and poverty is used to construct 2 factor scores. Factor score reliability is evaluated with Cronbach's alpha. Using these scores, we conduct a hierarchical cluster analysis to produce a labour market taxonomy, conduct zero-order correlations, and create box plots to test their associations with adult mortality, healthy life expectancy, infant mortality, maternal mortality, neonatal mortality, under-5 mortality, and years of life lost to communicable and non-communicable diseases. Labour market and health data are retrieved from the International Labour Organization's Key Indicators of Labour Markets and World Health Organization's Statistical Information System. Six labour market clusters emerged: Residual (n = 16), Emerging (n = 16), Informal (n = 10), Post-Communist (n = 18), Less Successful Informal (n = 22), and Insecure (n = 31). Primary findings indicate: (i) labour market poverty and population health is correlated in both LMICs; (ii) association between labour market inequality and health indicators is significant only in low-income countries; (iii) Emerging (e.g., East Asian and Eastern European countries) and Insecure (e.g., sub-Saharan African nations) clusters are the most advantaged and disadvantaged, respectively, with the remaining clusters experiencing levels of population health consistent with their labour market

  18. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    Science.gov (United States)

    2016-01-05

    data, is it equally important to analyze the prediction power of a statistical model if it is going to be used for forecasting purposes. Prediction...Poisson Bayesian Kernel Methods for Modeling Count Data, Computational Statistics and Data Analysis (04 2016) TOTAL: 1 Books Number of Manuscripts...factors into the assessment of a rehabilitation project. Conclusions Bayesian kernel methods are powerful tools in forecasting data. These models make

  19. Single pass kernel k-means clustering method

    Indian Academy of Sciences (India)

    Abstract. In unsupervised classification, kernel k-means clustering method has been shown to perform better than conventional k-means clustering method in identifying non-isotropic clusters in a data set. The space and time requirements of this method are O(n²), where n is the data set size. Because of this quadratic time ...
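
    The quadratic cost mentioned in the abstract comes from the n × n kernel (Gram) matrix that conventional kernel k-means needs. The sketch below shows that conventional method, not the single-pass variant proposed in the paper; the RBF kernel, the `gamma` value, the data and the initial labels are all assumptions made for illustration.

```python
import math

def kernel_kmeans(points, labels, k, gamma=1.0, iters=20):
    """Conventional kernel k-means with an RBF kernel.

    Distances to cluster means are computed in feature space using only
    kernel values: ||phi(x_i) - mu_c||^2 =
        K[i][i] - 2 * mean_j K[i][j] + mean_{j,l} K[j][l],  j, l in cluster c.
    """
    n = len(points)
    # Full n x n Gram matrix: this is the O(n^2) space and time cost.
    K = [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
          for q in points] for p in points]
    for _ in range(iters):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance formula, one value per cluster.
        within = [sum(K[j][l] for j in mem for l in mem) / len(mem) ** 2 if mem else 0.0
                  for mem in members]
        new = []
        for i in range(n):
            dists = [(K[i][i] - 2.0 * sum(K[i][j] for j in mem) / len(mem) + within[c])
                     if mem else float("inf")
                     for c, mem in enumerate(members)]
            new.append(dists.index(min(dists)))
        if new == labels:  # assignments stable: converged
            break
        labels = new
    return labels

# Hypothetical data: two well-separated groups and a deliberately poor start.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kernel_kmeans(pts, [0, 1, 0, 1, 0, 1], k=2, gamma=0.1))
# [0, 0, 0, 1, 1, 1]
```

    Every iteration touches the whole Gram matrix, which is why methods that avoid holding it in memory are of interest for large n.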

  20. A hierarchical voltage control method for multi-terminal AC/DC distribution system

    Science.gov (United States)

    Ma, Zhoujun; Zhu, Hong; Zhou, Dahong; Wang, Chunning; Tang, Renquan; Xu, Honghua

    2017-08-01

    A hierarchical control system is proposed in this paper to control the voltage of a multi-terminal AC/DC distribution system. The hierarchical control system consists of a PCC voltage control system, a DG voltage control system and a voltage regulator control system. The functions of the three systems are to control the voltage of the DC distribution network, the AC bus voltage and the area voltage. A method is proposed to deal with the whole control system. A case study indicates that when the voltage fluctuates, the three layers of the power flow control system run in an orderly manner and can maintain voltage stability.

  1. METHODS AND MODELS OF HIERARCHIZATION OF THE TOURIST ATTRACTIONS. STUDY CASE: NEAMȚ COUNTY

    Directory of Open Access Journals (Sweden)

    CEHAN Alexandra

    2015-06-01

    Full Text Available The aim of the present study is to emphasise the utility of hierarchization in the field of tourism, a utility demonstrated through the creation of a tourist attractiveness index based on both quantitative and qualitative features. This index, besides determining the hierarchical position of each tourist attraction, proves useful for pointing out the most important tourist areas of Neamț County; these results are obtained through data collection and analysis and through the creation of primary indices. The outcome of this study is, therefore, a generally applicable instrument for any tourist hierarchization approach, whose efficiency is discussed at the end by comparing the values obtained for each territorial unit of the county through the use of this instrument with the values assigned to the same units by the Spatial Planning of National Territory. In this way, the advantages this method of hierarchization brings to the process of evaluating tourism potential, as well as its faults, are highlighted.

  2. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  3. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    Full Text Available The aim of the present article is to show the possibility of using the methods of cluster analysis in the classification of stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The manner of sorting stocks of finished products by clusters is described using a practical example. The resulting clusters are incorporated into the draft layout of the distribution warehouse.

  4. Hierarchical formation of Westerlund 1: a collapsing cluster with no primordial mass segregation?

    Science.gov (United States)

    Gennaro, Mario; Goodwin, Simon P.; Parker, Richard J.; Allison, Richard J.; Brandner, Wolfgang

    2017-12-01

    We examine the level of substructure and mass segregation in the massive, young cluster Westerlund 1. We find that it is relatively smooth, with little or no mass segregation, but with the massive stars in regions of significantly higher than average surface density. While an expanding or bouncing-back scenario for the evolution of Westerlund 1 cannot be ruled out, we argue that the most natural model to explain these observations is one in which Westerlund 1 formed with no primordial mass segregation and at a similar or larger size than we now observe.

  5. A hierarchical method for discrete structural topology design problems with local stress and displacement constraints

    DEFF Research Database (Denmark)

    Stolpe, Mathias; Stidsen, Thomas K.

    2007-01-01

    of minimizing the weight of a structure subject to displacement and local design-dependent stress constraints. The method iteratively treats a sequence of problems of increasing size of the same type as the original problem. The problems are defined on a design mesh which is initially coarse......In this paper, we present a hierarchical optimization method for finding feasible true 0-1 solutions to finite-element-based topology design problems. The topology design problems are initially modelled as non-convex mixed 0-1 programs. The hierarchical optimization method is applied to the problem...... and then successively refined as needed. At each level of design mesh refinement, a neighbourhood optimization method is used to treat the problem considered. The non-convex topology design problems are equivalently reformulated as convex all-quadratic mixed 0-1 programs. This reformulation enables the use of methods...

  6. A hierarchical method for structural topology design problems with local stress and displacement constraints

    DEFF Research Database (Denmark)

    Stolpe, Mathias; Stidsen, Thomas K.

    2005-01-01

    of minimizing the weight of a structure subject to displacement and local design-dependent stress constraints. The method iteratively solves a sequence of problems of increasing size of the same type as the original problem. The problems are defined on a design mesh which is initially coarse......In this paper we present a hierarchical optimization method for finding feasible true 0-1 solutions to finite element based topology design problems. The topology design problems are initially modeled as non-convex mixed 0-1 programs. The hierarchical optimization method is applied to the problem...... and then successively refined as needed. At each level of design mesh refinement, a neighborhood optimization method is used to solve the problem considered. The non-convex topology design problems are equivalently reformulated as convex all-quadratic mixed 0-1 programs. This reformulation enables the use of methods...

  7. Ultrathin mesoporous Co3O4 nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    Science.gov (United States)

    Wu, Shengming; Xia, Tian; Wang, Jingping; Lu, Feifei; Xu, Chunbo; Zhang, Xianfa; Huo, Lihua; Zhao, Hui

    2017-06-01

    Herein, ultrathin mesoporous Co3O4 nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a thermolysis treatment at 600 °C in air. The products consist of cluster-like Co3O4 microarchitectures, which are assembled from numerous ultrathin mesoporous Co3O4 nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g⁻¹ at a current density of 100 mA g⁻¹ after 100 cycles. Even at 2 A g⁻¹, a stable capacity as high as 507 mAh g⁻¹ can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous characteristic makes lithium-ion transfer easier at the interface between the active electrode and the electrolyte.

  8. A proximity-based graph clustering method for the identification and application of transcription factor clusters.

    Science.gov (United States)

    Spadafore, Maxwell; Najarian, Kayvan; Boyle, Alan P

    2017-11-29

    Transcription factors (TFs) form a complex regulatory network within the cell that is crucial to cell functioning and human health. While methods to establish where a TF binds to DNA are well established, these methods provide no information describing how TFs interact with one another when they do bind. TFs tend to bind the genome in clusters, and current methods to identify these clusters are either limited in scope, unable to detect relationships beyond motif similarity, or not applied to TF-TF interactions. Here, we present a proximity-based graph clustering approach to identify TF clusters using either ChIP-seq or motif search data. We use TF co-occurrence to construct a filtered, normalized adjacency matrix and use the Markov Clustering Algorithm to partition the graph while maintaining TF-cluster and cluster-cluster interactions. We then apply our graph structure beyond clustering, using it to increase the accuracy of motif-based TFBS searching for an example TF. We show that our method produces small, manageable clusters that encapsulate many known, experimentally validated transcription factor interactions and that our method is capable of capturing interactions that motif similarity methods might miss. Our graph structure is able to significantly increase the accuracy of motif TFBS searching, demonstrating that the TF-TF connections within the graph correlate with biological TF-TF interactions. The interactions identified by our method correspond to biological reality and allow for fast exploration of TF clustering and regulatory dynamics.
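
    The Markov Clustering Algorithm named above alternates two matrix operations on a column-stochastic matrix: expansion (matrix squaring, which spreads random-walk flow) and inflation (element-wise powering, which strengthens strong flows and weakens weak ones). The sketch below is a minimal plain-Python version on a toy graph; the graph, the inflation value and the readout threshold are assumptions, and the authors' filtering and normalization of the TF co-occurrence matrix are not reproduced.

```python
def markov_cluster(adj, inflation=2.0, iters=50):
    """Minimal MCL on a symmetric 0/1 adjacency matrix (list of lists)."""
    n = len(adj)

    def normalize(M):
        # Make every column sum to 1 (column-stochastic).
        sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
        return [[M[i][j] / sums[j] for j in range(n)] for i in range(n)]

    # Add self-loops, as is customary for MCL, then normalize.
    M = normalize([[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                   for i in range(n)])
    for _ in range(iters):
        # Expansion: square the matrix.
        M = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power followed by renormalization.
        M = normalize([[v ** inflation for v in row] for row in M])
    # Each row that still carries flow is an attractor; its nonzero
    # columns are the nodes of one cluster.
    clusters = {frozenset(j for j in range(n) if row[j] > 1e-6) for row in M}
    return sorted(sorted(c) for c in clusters if c)

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1.0
print(markov_cluster(adj))  # [[0, 1, 2], [3, 4, 5]]
```

    No cluster count is supplied; the granularity is governed by the inflation parameter, with larger values tending to give smaller clusters.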

  9. A Temperature Sensor Clustering Method for Thermal Error Modeling of Heavy Milling Machine Tools

    Directory of Open Access Journals (Sweden)

    Fengchun Li

    2017-01-01

    Full Text Available A clustering method is an effective way to select the proper temperature sensor location for thermal error modeling of machine tools. In this paper, a new temperature sensor clustering method is proposed. By analyzing the characteristics of the temperature of the sensors in a heavy floor-type milling machine tool, an indicator involving both the Euclidean distance and the correlation coefficient was proposed to reflect the differences between temperature sensors, and the indicator was expressed by a distance matrix to be used for hierarchical clustering. Then, the weight coefficient in the distance matrix and the number of the clusters (groups) were optimized by a genetic algorithm (GA), and the fitness function of the GA was also rebuilt by establishing the thermal error model at one rotation speed, then deriving its accuracy at two different rotation speeds with a temperature disturbance. Thus, the parameters for clustering, as well as the final selection of the temperature sensors, were derived. Finally, the method proposed in this paper was verified on a machine tool. According to the selected temperature sensors, a thermal error model of the machine tool was established and used to predict the thermal error. The results indicate that the selected temperature sensors can accurately predict thermal error at different rotation speeds, and the proposed temperature sensor clustering method for sensor selection is expected to be used for the thermal error modeling for other machine tools.
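
    The idea of a distance matrix that mixes Euclidean distance and correlation, followed by hierarchical clustering, can be sketched as below. The abstract does not give the exact indicator, so the formula used here (a weighted sum of the scaled Euclidean distance and 1 - |r|) is an assumption, as are all function names and the toy readings. The genetic-algorithm tuning of the weight and of the number of groups is not shown; both are simply passed in.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long series (assumes non-constant series)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def combined_distance_matrix(series, weight):
    """Assumed indicator: weight * scaled Euclidean + (1 - weight) * (1 - |r|)."""
    n = len(series)
    eu = [[math.dist(series[i], series[j]) for j in range(n)] for i in range(n)]
    max_eu = max(max(row) for row in eu) or 1.0   # scale Euclidean part to [0, 1]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                corr_part = 1.0 - abs(pearson(series[i], series[j]))
                dist[i][j] = weight * eu[i][j] / max_eu + (1.0 - weight) * corr_part
    return dist

def agglomerate(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical readings from four sensors at four time steps.
readings = [
    [20.0, 21.0, 22.0, 23.0],
    [20.1, 21.1, 22.1, 23.1],
    [40.0, 38.0, 36.0, 34.0],
    [40.2, 38.1, 36.1, 34.0],
]
groups = agglomerate(combined_distance_matrix(readings, weight=0.5), n_clusters=2)
print(groups)
```

    One sensor per group could then be kept as a model input; how that representative is chosen is not stated in the abstract.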

  10. Integrated management of thesis using clustering method

    Science.gov (United States)

    Astuti, Indah Fitri; Cahyadi, Dedy

    2017-02-01

    A thesis is one of the major requirements for students pursuing a bachelor's degree. Finishing the thesis involves a long process that includes consultation, writing the manuscript, carrying out the chosen method, seminar scheduling, searching for references, and appraisal by the board of mentors and examiners. Unfortunately, most students find it hard to match all the lecturers' free time so that they can sit together in a seminar room to examine the thesis. Therefore, the seminar scheduling process should be the top priority to be solved. A manual mechanism for this task no longer fulfills the need. People on campus, including students, staff, and lecturers, demand a system in which all the stakeholders can interact with each other and manage the thesis process without timetable conflicts. A branch of computer science named Management Information System (MIS) could be a breakthrough in dealing with thesis management. This research applies a method called clustering to distinguish certain categories using mathematical formulas. A system is then developed along with the method to create a well-managed tool providing main facilities such as seminar scheduling, the consultation and review process, thesis approval, the assessment process, and a reliable database of theses. The database plays an important role for present and future purposes.

  11. Symptom Clusters in Advanced Cancer Patients: An Empirical Comparison of Statistical Methods and the Impact on Quality of Life.

    Science.gov (United States)

    Dong, Skye T; Costa, Daniel S J; Butow, Phyllis N; Lovell, Melanie R; Agar, Meera; Velikova, Galina; Teckle, Paulos; Tong, Allison; Tebbutt, Niall C; Clarke, Stephen J; van der Hoek, Kim; King, Madeleine T; Fayers, Peter M

    2016-01-01

    Symptom clusters in advanced cancer can influence patient outcomes. There is large heterogeneity in the methods used to identify symptom clusters. The objectives were to investigate the consistency of symptom cluster composition in advanced cancer patients using different statistical methodologies for all patients across five primary cancer sites, and to examine which clusters predict functional status, a global assessment of health and global quality of life. Principal component analysis and exploratory factor analysis (with different rotation and factor selection methods) and hierarchical cluster analysis (with different linkage and similarity measures) were used on a data set of 1562 advanced cancer patients who completed the European Organization for the Research and Treatment of Cancer Quality of Life Questionnaire-Core 30. Four clusters consistently formed for many of the methods and cancer sites: tense-worry-irritable-depressed (emotional cluster), fatigue-pain, nausea-vomiting, and concentration-memory (cognitive cluster). The emotional cluster was a stronger predictor of overall quality of life than the other clusters. Fatigue-pain was a stronger predictor of overall health than the other clusters. The cognitive cluster and fatigue-pain predicted physical functioning, role functioning, and social functioning. The four identified symptom clusters were consistent across statistical methods and cancer types, although there were some noteworthy differences. Statistical derivation of symptom clusters is in need of greater methodological guidance. A psychosocial pathway in the management of symptom clusters may improve quality of life. Biological mechanisms underpinning symptom clusters need to be delineated by future research. A framework for evidence-based screening, assessment, treatment, and follow-up of symptom clusters in advanced cancer is essential. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  12. Hierarchical Clustering Analysis of Reading Aloud Data: A New Technique for Evaluating the Performance of Computational Models

    Directory of Open Access Journals (Sweden)

    Serje Robidoux

    2014-03-01

    Full Text Available DRC (Coltheart et al., 2001) and CDP++ (Perry, Zorzi, & Ziegler, 2010) are two of the most successful models of reading aloud. These models differ primarily in how their sublexical systems convert letter strings into phonological codes. DRC adopts a set of grapheme-to-phoneme conversion rules (GPCs) while CDP++ uses a simple trained network that has been exposed to a combination of rules and the spellings and pronunciations of known words. Thus far the debate between fixed rules and learned associations has largely emphasized reaction time experiments, error rates in dyslexias, and item-level variance from large-scale databases. Recently, Pritchard, Coltheart, Palethorpe, and Castles (2012) examined the models’ nonword reading in a new way. They compared responses produced by the models to those produced by 45 skilled readers. Their item-by-item analysis is informative, but leaves open some questions that can be addressed with a different technique. Using hierarchical clustering techniques, we first examined the subject data to identify whether there are classes of subjects that are similar to each other in their overall response profiles. We found that there are indeed two groups of subjects that differ in their pronunciations for certain consonant clusters. We also tested the possibility that CDP++ is modeling one set of subjects well, while DRC is modeling a different set of subjects. We found that CDP++ does not fit any human reader’s response pattern very well, while DRC fits the human readers as well as or better than any other reader.

  13. Fuzzy Clustering Methods and their Application to Fuzzy Modeling

    DEFF Research Database (Denmark)

    Kroszynski, Uri; Zhou, Jianjun

    1999-01-01

    prediction of outputs. This article presents an overview of some of the most popular clustering methods, namely Fuzzy Cluster-Means (FCM) and its generalizations to Fuzzy C-Lines and Elliptotypes. The algorithms for computing cluster centers and principal directions from a training data-set are described...
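
    For orientation, the standard Fuzzy C-Means iteration (alternating membership and center updates) can be sketched as below. This is a generic textbook form, not code from the article; the function name, the explicit initial centers and the toy points are assumptions, and the Fuzzy C-Lines and Elliptotypes generalizations are not covered.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, iterations=100, tol=1e-8):
    """Generic FCM sketch. `m` > 1 is the fuzzifier; `centers` are initial guesses."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if 0.0 in d:
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                power = 2.0 / (m - 1.0)
                row = [1.0 / sum((di / dj) ** power for dj in d) for di in d]
            memberships.append(row)
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(len(centers)):
            w = [u[i] ** m for u in memberships]
            total = sum(w)
            new_centers.append([sum(wk * p[k] for wk, p in zip(w, points)) / total
                                for k in range(dims)])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships

# Two well-separated toy groups in the plane.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centers, u = fuzzy_c_means(data, centers=[(0, 0), (10, 10)])
print(final_centers)
```

    Each row of `u` sums to 1, so a point between two groups is shared between them rather than forced into one, which is what makes the resulting partitions usable as fuzzy rule antecedents.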

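  14. Peringkasan Tweet Berdasarkan Trending Topic Twitter Dengan Pembobotan TF-IDF dan Single Linkage AngglomerativeHierarchical Clustering [Tweet Summarization Based on Twitter Trending Topics Using TF-IDF Weighting and Single Linkage Agglomerative Hierarchical Clustering]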

    Directory of Open Access Journals (Sweden)

    Annisa Annisa

    2016-10-01

    Full Text Available Trending topics is a feature provided by Twitter that shows what is being widely discussed by users at a particular time. A trending topic takes the form of a hashtag and can be selected by clicking. However, the number of tweets for each trending topic can be very large, so it is difficult to read all of the content. To make a topic easier to read, a small number of tweets can be selected to represent its main ideas. In this study, we applied Agglomerative Single Linkage Hierarchical Clustering after first calculating the TF-IDF value of each word. We used 100 trending topics, each consisting of 50 tweets in Indonesian. For testing, we provided 30 trending topics, each consisting of 2 to 9 sub-topics. As a result, each trending topic can be summarized into a shorter text containing 2 to 9 tweets. We were able to summarize 1 trending topic exactly as a human expert summarized it. The remaining topics corresponded only partially with the human expert's summaries.
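
    The combination of TF-IDF weighting and single-linkage agglomerative clustering can be sketched as follows. The whitespace tokenizer, the cosine distance, the function names and the example tweets are all assumptions made for illustration; the study's preprocessing and its rule for picking one summary tweet per cluster are not described in the abstract and are not reproduced.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors as dicts; assumes every document has at least one token."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))   # document frequency
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)

def single_linkage(vectors, n_clusters):
    """Merge the two clusters whose closest members are nearest, until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        _, i, j = min(
            (min(cosine_distance(vectors[a], vectors[b])
                 for a in clusters[i] for b in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical tweets from one trending topic with two sub-topics.
tweets = [
    "flood in jakarta today",
    "flood in jakarta worsens",
    "national team wins big",
    "national team wins again",
]
print(single_linkage(tfidf_vectors(tweets), n_clusters=2))  # [[0, 1], [2, 3]]
```

    Single linkage is prone to chaining, where clusters grow through a sequence of close neighbours, which may be one reason short, noisy tweets are hard to separate cleanly.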

  15. Coresets vs clustering: comparison of methods for redundancy reduction in very large white matter fiber sets

    Science.gov (United States)

    Alexandroni, Guy; Zimmerman Moreno, Gali; Sochen, Nir; Greenspan, Hayit

    2016-03-01

    Recent advances in Diffusion Weighted Magnetic Resonance Imaging (DW-MRI) of white matter in conjunction with improved tractography produce impressive reconstructions of White Matter (WM) pathways. These pathways (fiber sets) often contain hundreds of thousands of fibers, or more. In order to make fiber based analysis more practical, the fiber set needs to be preprocessed to eliminate redundancies and to keep only essential representative fibers. In this paper we demonstrate and compare two distinctive frameworks for selecting this reduced set of fibers. The first framework entails pre-clustering the fibers using k-means, followed by Hierarchical Clustering and replacing each cluster with one representative. For the second clustering stage seven distance metrics were evaluated. The second framework is based on an efficient geometric approximation paradigm named coresets. Coresets present a new approach to optimization and have had great success, especially in tasks requiring large computation time and/or memory. We propose a modified version of the coresets algorithm, Density Coreset. It is used for extracting the main fibers from dense datasets, leaving a small set that represents the main structures and connectivity of the brain. A novel approach, based on a 3D indicator structure, is used for comparing the frameworks. This comparison was applied to High Angular Resolution Diffusion Imaging (HARDI) scans of 4 healthy individuals. We show that, among the clustering based methods, cosine distance gives the best performance. In comparing the clustering schemes with coresets, the Density Coreset method achieves the best performance.

  16. An image segmentation method based on network clustering model

    Science.gov (United States)

    Jiao, Yang; Wu, Jianshe; Jiao, Licheng

    2018-01-01

    Network clustering phenomena are ubiquitous in nature and human society. In this paper, a method involving a network clustering model is proposed for mass segmentation in mammograms. First, the watershed transform is used to divide an image into regions, and features of the image are computed. Then a graph is constructed from the obtained regions and features. The network clustering model is applied to realize clustering of nodes in the graph. Compared with two classic methods, the algorithm based on the network clustering model performs more effectively in experiments.

  17. A new anisotropic mesh adaptation method based upon hierarchical a posteriori error estimates

    Science.gov (United States)

    Huang, Weizhang; Kamenski, Lennard; Lang, Jens

    2010-03-01

    A new anisotropic mesh adaptation strategy for finite element solution of elliptic differential equations is presented. It generates anisotropic adaptive meshes as quasi-uniform ones in some metric space, with the metric tensor being computed based on hierarchical a posteriori error estimates. A global hierarchical error estimate is employed in this study to obtain reliable directional information of the solution. Instead of solving the global error problem exactly, which is costly in general, we solve it iteratively using the symmetric Gauß-Seidel method. Numerical results show that a few GS iterations are sufficient for obtaining a reasonably good approximation to the error for use in anisotropic mesh adaptation. The new method is compared with several strategies using local error estimators or recovered Hessians. Numerical results are presented for a selection of test examples and a mathematical model for heat conduction in a thermal battery with large orthotropic jumps in the material coefficients.

  18. Progressive clustering based method for protein function prediction.

    Science.gov (United States)

    Saini, Ashish; Hou, Jingyu

    2013-02-01

    In recent years, significant effort has been given to predicting protein functions from protein interaction data generated from high throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and the prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, the clustering itself is a dynamic process and the function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature of clustering is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.

  19. A Latent Variable Clustering Method for Wireless Sensor Networks

    DEFF Research Database (Denmark)

    Vasilev, Vladislav; Iliev, Georgi; Poulkov, Vladimir

    2016-01-01

    In this paper we derive a clustering method based on the Hidden Conditional Random Field (HCRF) model in order to maximize the performance of a wireless sensor. Our novel approach to clustering in this paper is in the application of an index invariant graph that we defined in a previous work...... obtain by running simulations of a time dynamic sensor network. The proposed method outperforms existing clustering methods, such as the Girvan-Newman algorithm, Karger's algorithm and the Spectral Clustering method, in terms of packet acceptance probability and delay....

  20. SIA's asymmetric rules approximation to hierarchical clustering in Learning Analytics: mathematical issues

    OpenAIRE

    Pazmiño, R. A.; García-Peñalvo, F. J.; Conde, M. Á.

    2017-01-01

    Bichsel proposes an analytics maturity model used to evaluate progress in the use of academic and learning analytics. There are positive results, but most institutions are below the 80% level. Most institutions also scored low for data analytics tools, reporting, and expertise. In addition, one task with the methods of Data Mining and Learning Analytics is to analyze them (precision, accuracy, sensitivity, coherence, fitness measures, cosine, confidence, lift, similarity weight...

  1. A simulation study of three methods for detecting disease clusters

    Directory of Open Access Journals (Sweden)

    Samuelsen Sven O

    2006-04-01

    Full Text Available Abstract. Background: Cluster detection is an important part of spatial epidemiology because it can help identify environmental factors associated with disease and thus guide investigation of the aetiology of diseases. In this article we study three methods suitable for detecting local spatial clusters: (1) a spatial scan statistic (SaTScan), (2) generalized additive models (GAM) and (3) Bayesian disease mapping (BYM). We conducted a simulation study to compare the methods. Seven geographic clusters with different shapes were initially chosen as high-risk areas. Different scenarios for the magnitude of the relative risk of these areas as compared to the normal risk areas were considered. For each scenario the performance of the methods was assessed in terms of the sensitivity, specificity, and percentage correctly classified for each cluster. Results: The performance depends on the relative risk, but all methods are in general suitable for identifying clusters with a relative risk larger than 1.5. However, it is difficult to detect clusters with lower relative risks. The GAM approach had the highest sensitivity, but relatively low specificity, leading to an overestimation of the cluster area. Both the BYM and the SaTScan methods work well. Clusters with irregular shapes are more difficult to detect than more circular clusters. Conclusion: Based on our simulations we conclude that the methods differ in their ability to detect spatial clusters. Different aspects should be considered for appropriate choice of method, such as the size and shape of the assumed spatial clusters and the relative importance of sensitivity and specificity. In general, the BYM method seems preferable for local cluster detection with relatively high relative risks whereas the SaTScan method appears preferable for lower relative risks. The GAM method needs to be tuned (using cross-validation) to get satisfactory results.
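
    The three performance measures named in this abstract can be computed per cluster as in the sketch below. The function name, the representation of areas as integer IDs and the example sets are assumptions; the simulation design itself is not reproduced.

```python
def cluster_detection_metrics(true_cluster, detected, all_areas):
    """Sensitivity, specificity and fraction correctly classified for one cluster.

    Assumes the true cluster is neither empty nor the whole study region,
    so neither denominator is zero.
    """
    true_cluster, detected, all_areas = set(true_cluster), set(detected), set(all_areas)
    tp = len(true_cluster & detected)              # high-risk areas that were flagged
    fn = len(true_cluster - detected)              # high-risk areas that were missed
    fp = len(detected - true_cluster)              # normal-risk areas wrongly flagged
    tn = len(all_areas - true_cluster - detected)  # normal-risk areas left unflagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "correct": (tp + tn) / len(all_areas),
    }

# Hypothetical region of 20 areas with a true cluster of 5 areas.
# The detected set misses one true area and adds two normal-risk areas.
metrics = cluster_detection_metrics(
    true_cluster=range(5), detected={0, 1, 2, 3, 5, 6}, all_areas=range(20))
print(metrics)
```

    A method that flags too large an area raises sensitivity at the cost of specificity, which is the pattern the abstract reports for the GAM approach.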

  2. Quantitative and Chemical Fingerprint Analysis for the Quality Evaluation of Receptaculum Nelumbinis by RP-HPLC Coupled with Hierarchical Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Jin-Zhong Wu

    2013-01-01

    Full Text Available A simple and reliable method of high-performance liquid chromatography with photodiode array detection (HPLC-DAD) was developed to evaluate the quality of Receptaculum Nelumbinis (dried receptacle of Nelumbo nucifera) through establishing chromatographic fingerprint and simultaneous determination of five flavonol glycosides, including hyperoside, isoquercitrin, quercetin-3-O-β-d-glucuronide, isorhamnetin-3-O-β-d-galactoside and syringetin-3-O-β-d-glucoside. In quantitative analysis, the five components showed good regression (R > 0.9998) within linear ranges, and their recoveries were in the range of 98.31%–100.32%. In the chromatographic fingerprint, twelve peaks were selected as the characteristic peaks to assess the similarities of different samples collected from different origins in China according to the State Food and Drug Administration (SFDA) requirements. Furthermore, hierarchical cluster analysis (HCA) was also applied to evaluate the variation of chemical components among different sources of Receptaculum Nelumbinis in China. This study indicated that the combination of quantitative and chromatographic fingerprint analysis can be readily utilized as a quality control method for Receptaculum Nelumbinis and its related traditional Chinese medicinal preparations.

  3. Simultaneous determination of 19 flavonoids in commercial trollflowers by using high-performance liquid chromatography and classification of samples by hierarchical clustering analysis.

    Science.gov (United States)

    Song, Zhiling; Hashi, Yuki; Sun, Hongyang; Liang, Yi; Lan, Yuexiang; Wang, Hong; Chen, Shizhong

    2013-12-01

    The flowers of Trollius species, named Jin Lianhua in Chinese, are widely used traditional Chinese herbs with vital biological activity that have been used for several decades in China to treat upper respiratory infections, pharyngitis, tonsillitis, and bronchitis. We developed a rapid and reliable method for simultaneous quantitative analysis of 19 flavonoids in trollflowers by using high-performance liquid chromatography (HPLC). Chromatography was performed on an Inertsil ODS-3 C18 column, with gradient elution using methanol-acetonitrile-water with 0.02% (v/v) formic acid. Content determination was used to evaluate the quality of commercial trollflowers from different regions in China, while three Trollius species (Trollius chinensis Bunge, Trollius ledebouri Reichb, Trollius buddae Schipcz) were explicitly distinguished by using hierarchical clustering analysis. The linearity, precision, accuracy, limit of detection, and limit of quantification were validated for the quantification method, which proved sensitive, accurate and reproducible, indicating that the proposed approach is applicable to the routine analysis and quality control of trollflowers. © 2013.

  4. HILIC-UPLC-MS/MS combined with hierarchical clustering analysis to rapidly analyze and evaluate nucleobases and nucleosides in Ginkgo biloba leaves.

    Science.gov (United States)

    Yao, Xin; Zhou, Guisheng; Tang, Yuping; Guo, Sheng; Qian, Dawei; Duan, Jin-Ao

    2015-02-01

    Ginkgo biloba leaf extract has been widely used in dietary supplements and more recently in some foods and beverages. In addition to the well-known flavonol glycosides and terpene lactones, G. biloba leaves are also rich in nucleobases and nucleosides. To determine the content of nucleobases and nucleosides in G. biloba leaves at trace levels, a reliable method has been established by using hydrophilic interaction ultra performance liquid chromatography coupled with triple-quadrupole tandem mass spectrometry (HILIC-UPLC-TQ-MS/MS) working in multiple reaction monitoring mode. Eleven nucleobases and nucleosides were simultaneously determined in seven min. The proposed method was fully validated in terms of linearity, sensitivity, and repeatability, as well as recovery. Furthermore, hierarchical clustering analysis (HCA) was performed to evaluate and classify the samples according to the contents of the eleven chemical constituents. The established approach could be helpful for evaluation of the potential values as dietary supplements and the quality control of G. biloba leaves, which might also be utilized for the investigation of other medicinal herbs containing nucleobases and nucleosides. Copyright © 2014 John Wiley & Sons, Ltd.

  5. Visual cluster analysis and pattern recognition template and methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    1999-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  6. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Full Text Available Abstract. Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback

  7. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks, is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results...... show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on terrorism domain email dataset and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based...

  8. A graph clustering method for community detection in complex networks

    Science.gov (United States)

    Zhou, HongFang; Li, Jin; Li, JunHuai; Zhang, FaCun; Cui, YingAn

    2017-03-01

    Information mining from complex networks by identifying communities is an important problem in a number of research fields, including the social sciences, biology, physics and medicine. First, two concepts are introduced, Attracting Degree and Recommending Degree. Second, a graph clustering method, referred to as AR-Cluster, is presented for detecting community structures in complex networks. Third, a novel collaborative similarity measure is adopted to calculate node similarities. In the AR-Cluster method, vertices are grouped together based on calculated similarity under a K-Medoids framework. Extensive experimental results on two real datasets show the effectiveness of AR-Cluster.

  9. Improving local clustering based top-L link prediction methods via asymmetric link clustering information

    Science.gov (United States)

    Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan

    2018-02-01

    Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis, and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. Clustering information plays an important role in solving the link prediction problem. In the previous literature, the node clustering coefficient appears frequently in link prediction methods. However, the node clustering coefficient is limited in describing the role of a common neighbor in different local networks, because it cannot distinguish the different clustering abilities of a node with respect to different node pairs. In this paper, we shift our focus from nodes to links, and propose the concept of the asymmetric link clustering (ALC) coefficient. Further, we improve three node clustering based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node clustering based methods, achieving especially remarkable improvements on food web, hamster friendship and Internet networks. Besides, compared with other methods, the performance of ALC-based methods is very stable in both globalized and personalized top-L link prediction tasks.
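
    The node-clustering baseline that this abstract sets out to improve can be sketched as follows: each candidate pair is scored by summing the clustering coefficients of its common neighbors, and the top-L pairs are returned. This is a generic illustration with assumed function names and an assumed toy graph. The paper's asymmetric link clustering (ALC) coefficient is not defined in the abstract and is deliberately not reproduced here.

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))

def node_clustering_score(graph, x, y):
    """Sum of clustering coefficients over the common neighbors of x and y."""
    return sum(clustering_coefficient(graph, z) for z in graph[x] & graph[y])

def top_l_predictions(graph, l):
    """Rank all currently unlinked pairs and return the best l as (score, x, y)."""
    candidates = [(node_clustering_score(graph, x, y), x, y)
                  for x, y in combinations(sorted(graph), 2)
                  if y not in graph[x]]
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    return candidates[:l]

# Hypothetical undirected graph stored as adjacency sets.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(top_l_predictions(graph, 1))
```

    Note that a common neighbor contributes the same coefficient to every pair it connects, which is the limitation the abstract attributes to node-based clustering information.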

  10. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    Directory of Open Access Journals (Sweden)

    Lee Yun-Shien

    2008-03-01

    Full Text Available Abstract. Background: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results: This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purpose) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion: We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, but also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

  11. A hierarchical detection method in external communication for self-driving vehicles based on TDMA

    Science.gov (United States)

    Al-ani, Muzhir Shaban; McDonald-Maier, Klaus

    2018-01-01

    Security is considered a major challenge for self-driving and semi self-driving vehicles. These vehicles depend heavily on communications to predict and sense their external environment used in their motion. They use a type of ad hoc network termed Vehicular ad hoc networks (VANETs). Unfortunately, VANETs are potentially exposed to many attacks at the network and application levels. This paper proposes a new intrusion detection system to protect the communication system of self-driving cars, utilising a combination of hierarchical models based on clusters and log parameters. This security system is designed to detect Sybil and Wormhole attacks in highway usage scenarios. It is based on clusters, utilising Time Division Multiple Access (TDMA) to overcome some of the obstacles of VANETs such as high density, high mobility and bandwidth limitations in exchanging messages. This makes the security system more efficient, accurate and capable of real time detection and quick in identification of malicious behaviour in VANETs. In this scheme, each vehicle log calculates and stores different parameter values after receiving the cooperative awareness messages from nearby vehicles. The vehicles exchange their log data and determine the difference between the parameters, which is utilised to detect Sybil attacks and Wormhole attacks. To realize an efficient and effective intrusion detection system, we use the well-known network simulator (ns-2) to verify the performance of the security system. Simulation results indicate that the security system can achieve high detection rates and effectively detect anomalies with a low rate of false alarms. PMID:29315302

  12. A hierarchical detection method in external communication for self-driving vehicles based on TDMA.

    Science.gov (United States)

    Alheeti, Khattab M Ali; Al-Ani, Muzhir Shaban; McDonald-Maier, Klaus

    2018-01-01

    Security is considered a major challenge for self-driving and semi self-driving vehicles. These vehicles depend heavily on communications to predict and sense the external environment in which they move. They use a type of ad hoc network termed vehicular ad hoc networks (VANETs). Unfortunately, VANETs are potentially exposed to many attacks at the network and application levels. This paper proposes a new intrusion detection system to protect the communication system of self-driving cars, utilising a combination of hierarchical models based on clusters and log parameters. This security system is designed to detect Sybil and Wormhole attacks in highway usage scenarios. It is based on clusters, utilising Time Division Multiple Access (TDMA) to overcome some of the obstacles of VANETs such as high density, high mobility and bandwidth limitations in exchanging messages. This makes the security system more efficient, accurate and capable of real-time detection and quick identification of malicious behaviour in VANETs. In this scheme, each vehicle log calculates and stores different parameter values after receiving the cooperative awareness messages from nearby vehicles. The vehicles exchange their log data and determine the difference between the parameters, which is utilised to detect Sybil attacks and Wormhole attacks. To realize an efficient and effective intrusion detection system, we use the well-known network simulator (ns-2) to verify the performance of the security system. Simulation results indicate that the security system can achieve high detection rates and effectively detect anomalies with a low rate of false alarms.

  13. A Novel Data Hierarchical Fusion Method for Gas Turbine Engine Performance Fault Diagnosis

    Directory of Open Access Journals (Sweden)

    Feng Lu

    2016-10-01

    Full Text Available Gas path fault diagnosis involves the effective utilization of condition-based sensor signals along the engine gas path to accurately identify engine performance failure. The rapid development of information processing technology has led to the use of multiple-source information fusion for fault diagnostics. Numerous efforts have been devoted to developing data-based fusion methods, such as neural network fusion, while little research has focused on fusion architecture or the fusion of different kinds of methods. In this paper, a data hierarchical fusion using improved weighted Dempster–Shafer evidence theory (WDS) is proposed, and the integration of data-based and model-based methods is presented for engine gas-path fault diagnosis. For the purpose of simplifying learning machine typology, a recursive reduced kernel based extreme learning machine (RR-KELM) is developed to produce the fault probability, which is considered as the data-based evidence. Meanwhile, the model-based evidence is obtained using a particle filter-fuzzy logic algorithm (PF-FL) through engine health estimation and component fault location at the feature level. The outputs of the two evidences are integrated using WDS evidence theory at the decision level to reach a final recognition decision on the gas-path fault pattern. The characteristics and advantages of the two evidences are analyzed and used as guidelines for the data hierarchical fusion framework. Our goal is for the proposed methodology to provide much better gas-path fault diagnosis performance than relying solely on a data-based or model-based method. The hierarchical fusion framework is evaluated in terms of fault diagnosis accuracy and robustness through a case study involving a fault mode dataset of a turbofan engine generated by a general gas turbine simulation. These applications confirm the effectiveness and usefulness of the proposed approach.

  14. Progeny Clustering: A Method to Identify Biological Phenotypes

    Science.gov (United States)

    Hu, Chenyue W.; Kornblau, Steven M.; Slater, John H.; Qutub, Amina A.

    2015-01-01

    Estimating the optimal number of clusters is a major challenge in applying cluster analysis to any type of dataset, especially to biomedical datasets, which are high-dimensional and complex. Here, we introduce an improved method, Progeny Clustering, which is stability-based and exceptionally efficient in computing, to find the ideal number of clusters. The algorithm employs a novel Progeny Sampling method to reconstruct cluster identity, a co-occurrence probability matrix to assess the clustering stability, and a set of reference datasets to overcome inherent biases in the algorithm and data space. Our method was shown to be successful and robust when applied to two synthetic datasets (datasets of two dimensions and ten dimensions containing eight dimensions of pure noise), two standard biological datasets (the Iris dataset and Rat CNS dataset) and two biological datasets (a cell phenotype dataset and an acute myeloid leukemia (AML) reverse phase protein array (RPPA) dataset). Progeny Clustering outperformed some popular clustering evaluation methods in the ten-dimensional synthetic dataset as well as in the cell phenotype dataset, and it was the only method that successfully discovered clinically meaningful patient groupings in the AML RPPA dataset. PMID:26267476
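
    The co-occurrence probability matrix mentioned in this abstract can be illustrated with a small sketch. It is not the authors' Progeny Sampling procedure: it only assumes that several clustering runs over the same samples are already available (the `labelings` below are made up) and records, for each pair of samples, the fraction of runs in which the pair shares a cluster.

```python
def cooccurrence_matrix(labelings):
    """Fraction of runs in which samples i and j received the same label.

    `labelings` is a list of runs; each run is a list of cluster labels,
    one per sample. Label values need not match across runs.
    """
    n_samples = len(labelings[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for labels in labelings:
        for i in range(n_samples):
            for j in range(n_samples):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    n_runs = len(labelings)
    return [[c / n_runs for c in row] for row in counts]


# Hypothetical labels from three clustering runs over four samples.
labelings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
matrix = cooccurrence_matrix(labelings)
```

    Entries close to 0 or 1 indicate pairs that are consistently separated or consistently grouped; intermediate values indicate unstable assignments.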

  15. Open-Source Sequence Clustering Methods Improve the State Of the Art.

    Science.gov (United States)

    Kopylova, Evguenia; Navas-Molina, Jose A; Mercier, Céline; Xu, Zhenjiang Zech; Mahé, Frédéric; He, Yan; Zhou, Hong-Wei; Rognes, Torbjørn; Caporaso, J Gregory; Knight, Rob

    2016-01-01

    Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH's most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release. IMPORTANCE Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014, http

  16. Analysis of genetic diversity in banana cultivars (Musa cvs.) from the South of Oman using AFLP markers and classification by phylogenetic, hierarchical clustering and principal component analyses.

    Science.gov (United States)

    Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar

    2010-05-01

    Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented.

  17. Analysis of genetic diversity in banana cultivars (Musa cvs.) from the South of Oman using AFLP markers and classification by phylogenetic, hierarchical clustering and principal component analyses*

    Science.gov (United States)

    Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar

    2010-01-01

    Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211

  18. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with $M_{\min} = 2.0$. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  19. Application of the cluster variation method to interstitial solid solutions

    NARCIS (Netherlands)

    Pekelharing, M.I.

    2008-01-01

    A thermodynamic model for interstitial alloys, based on the Cluster Variation Method (CVM), has been developed, capable of incorporating short range ordering (SRO), long range ordering (LRO), and the mutual interaction between the host and the interstitial sublattices. The obtained cluster-based

  20. Close relation of large cell carcinoma to adenocarcinoma by hierarchical cluster analysis: implications for histologic typing of lung cancer on biopsies.

    Science.gov (United States)

    Hammer, Stephan H; Prall, Friedrich

    2015-09-01

    Determining histologic types of lung cancer on biopsies can be difficult. This study addresses the role of immunohistochemistry in histologic typing, using a tissue microarray (TMA) as "model biopsies," and presents a classification generated by an unsupervised hierarchical cluster analysis. A TMA was made from resection specimens of a consecutive series of 165 lung tumors. In a "tissue-spot review" with hematoxylin and eosin sections all the large cell carcinomas (N=22) were assigned to the noncommittal class of non-small cell lung cancer (NSCLC), as were an additional 37 tumors of defined histologic types. Adenocarcinomas and squamous cell carcinomas included with these NSCLC could be diagnosed by immunohistochemistry with antibodies against TTF-1, Napsin A, cytokeratin (CK)7, p40, p63, and CK5/6 with moderate to good sensitivities and specificities. Unsupervised hierarchical clustering was done with these data and additional high-molecular-weight cytokeratins, CD56, synaptophysin, and chromogranin immunohistochemistry. This delineated separate clusters for adenocarcinomas, large cell carcinomas, neuroendocrine tumors, and squamous cell carcinomas. Notably, adenocarcinoma and large cell carcinoma clusters were closely related and clearly set off from the squamous cell carcinoma cluster. As would be expected for a clinically well-staged series CDX2, GATA3, estrogen, and progesterone receptor immunohistochemistry remained negative in the vast majority of the tumors and, if positive, were restricted to very few cells. These results, the clustering data in particular, underpin the pragmatic recommendation canvassed with the IASLC/ATS/ERS classification of lung cancers that adenocarcinoma-type molecular studies should include NSCLC with a nonsquamous cell carcinoma immunophenotype.

  1. Hierarchical remote data possession checking method based on massive cloud files

    Directory of Open Access Journals (Sweden)

    Ma Haifeng

    2017-06-01

    Full Text Available Cloud storage service enables users to migrate their data and applications to the cloud, which removes the need for local data maintenance and brings great convenience to the users. But in cloud storage, the storage servers may not be fully trustworthy. How to verify the integrity of cloud data with lower overhead for users has become a problem of increasing concern. Many remote data integrity protection methods have been proposed, but these methods authenticate cloud files one by one when verifying multiple files. Therefore, the computation and communication overhead are still high. To address this problem, a hierarchical remote data possession checking (H-RDPC) method is proposed, which can provide efficient and secure remote data integrity protection and can support dynamic data operations. This paper gives the algorithm descriptions, security, and false negative rate analysis of H-RDPC. The security analysis and experimental performance evaluation results show that the proposed H-RDPC is efficient and reliable in verifying massive cloud files, and it has 32–81% improvement in performance compared with RDPC.

  2. Initialization independent clustering with actively self-training method.

    Science.gov (United States)

    Nie, Feiping; Xu, Dong; Li, Xuelong

    2012-02-01

    The results of traditional clustering methods are usually unreliable because there is no guidance from data labels, whereas class labels can be predicted more reliably by semisupervised learning if the labels of part of the data are given. In this paper, we propose an actively self-training clustering method, in which samples are actively selected as a training set to minimize an estimated Bayes error, and semisupervised learning is then used to perform clustering. Traditional graph-based semisupervised learning methods are not convenient for estimating the Bayes error; we develop a specific regularization framework on graphs to perform semisupervised learning, in which the Bayes error can be effectively estimated. In addition, the proposed clustering algorithm can be readily applied in a semisupervised setting with partial class labels. Experimental results on toy data and real-world data sets demonstrate the effectiveness of the proposed clustering method in both the unsupervised and the semisupervised settings. It is worth noting that the proposed clustering method is free of initialization, while traditional clustering methods are usually dependent on initialization.

  3. Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation

    KAUST Repository

    Yokota, Rio

    2018-01-03

    There has been a large increase in the amount of work on hierarchical low-rank approximation methods, where the interest is shared by multiple communities that previously did not intersect. The objective of this article is two-fold: to provide a thorough review of the recent advancements in this field from both analytical and algebraic perspectives, and to present a comparative benchmark of two highly optimized implementations of contrasting methods for some simple yet representative test cases. The first half of this paper has the form of a survey paper, to achieve the former objective. We categorize the recent advances in this field from the perspective of compute-memory tradeoff, which has not been considered in much detail in this area. Benchmark tests reveal that there is a large difference in the memory consumption and performance between the different methods.

  4. Robust Optimization Design for Turbine Blade-Tip Radial Running Clearance using Hierarchically Response Surface Method

    Science.gov (United States)

    Zhiying, Chen; Ping, Zhou

    2017-11-01

    Considering the computational precision and efficiency of robust optimization for complex mechanical assembly relationships such as turbine blade-tip radial running clearance, a hierarchical response surface robust optimization algorithm is proposed. The distributed collaborative response surface method is used to generate an assembly system level approximation model of the overall parameters and blade-tip clearance, and then a set of samples of design parameters and objective response mean and/or standard deviation is generated by using the system approximation model and the design of experiment method. Finally, a new response surface approximation model is constructed from those samples, and this approximation model is used for the robust optimization process. The analysis results demonstrate that the proposed method can dramatically reduce the computational cost while ensuring computational precision. The presented research offers an effective way for the robust optimization design of turbine blade-tip radial running clearance.

  5. Hierarchical photocatalysts.

    Science.gov (United States)

    Li, Xin; Yu, Jiaguo; Jaroniec, Mietek

    2016-05-07

    As a green and sustainable technology, semiconductor-based heterogeneous photocatalysis has received much attention in the last few decades because it has potential to solve both energy and environmental problems. To achieve efficient photocatalysts, various hierarchical semiconductors have been designed and fabricated at the micro/nanometer scale in recent years. This review presents a critical appraisal of fabrication methods, growth mechanisms and applications of advanced hierarchical photocatalysts. Especially, the different synthesis strategies such as two-step templating, in situ template-sacrificial dissolution, self-templating method, in situ template-free assembly, chemically induced self-transformation and post-synthesis treatment are highlighted. Finally, some important applications including photocatalytic degradation of pollutants, photocatalytic H2 production and photocatalytic CO2 reduction are reviewed. A thorough assessment of the progress made in photocatalysis may open new opportunities in designing highly effective hierarchical photocatalysts for advanced applications ranging from thermal catalysis, separation and purification processes to solar cells.

  6. A graph-based clustering method applied to protein sequences.

    Science.gov (United States)

    Mishra, Pooja; Pandey, Paras Nath

    2011-01-01

    The number of amino acid sequences is increasing very rapidly in protein databases such as Swiss-Prot, UniProt, PIR and others, but the structures of only some amino acid sequences are found in the Protein Data Bank. Thus, an important problem in genomics is automatically clustering homologous protein sequences when only sequence information is available. Here, we use graph theoretic techniques for clustering amino acid sequences. A similarity graph is defined and clusters in that graph correspond to connected subgraphs. Cluster analysis seeks grouping of amino acid sequences into subsets based on distance or similarity score between pairs of sequences. Our goal is to find disjoint subsets, called clusters, such that two criteria are satisfied: homogeneity: sequences in the same cluster are highly similar to each other; and separation: sequences in different clusters have low similarity to each other. We tested our method on several subsets of the SCOP (Structural Classification of Proteins) database, a gold standard for protein structure classification. The results show that for a given set of proteins the number of clusters we obtained is close to the number of superfamilies in that set; there are fewer singletons; and the method correctly groups most remote homologs.
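
    The idea of reading clusters off a similarity graph can be sketched as follows. This is only one simple interpretation, not the authors' algorithm: it assumes that an edge joins two sequences whose similarity reaches a cut-off and that each connected component is one cluster. The `toy_similarity` score, the `threshold` value and the example sequences are all hypothetical stand-ins for a real alignment-based score and real data.

```python
from collections import defaultdict


def cluster_by_similarity(items, similarity, threshold):
    """Return the connected components of a thresholded similarity graph."""
    # Build an undirected graph: an edge joins two items whose
    # similarity score is at least `threshold`.
    neighbours = defaultdict(set)
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if similarity(items[i], items[j]) >= threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)

    # Depth-first search: each connected component becomes one cluster.
    seen = set()
    clusters = []
    for start in range(len(items)):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(items[node])
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


def toy_similarity(a, b):
    """Toy score: fraction of identical positions (not an alignment score)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


sequences = ["MKVLA", "MKVLG", "MRVLG", "TTPQS", "TTPQA", "GGGGG"]
clusters = cluster_by_similarity(sequences, toy_similarity, threshold=0.8)
```

    In this toy run "MKVLA" and "MRVLG" fall below the cut-off with respect to each other, yet they land in the same cluster because both are linked to "MKVLG"; this chaining is what lets a component-based approach group more distant relatives, and it is also why the cut-off has to be chosen with care.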

  7. Coherence-based Time Series Clustering for Brain Connectivity Visualization

    KAUST Repository

    Euan, Carolina

    2017-11-19

    We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by

  8. A hierarchical layout design method based on rubber band potential-energy descending

    Directory of Open Access Journals (Sweden)

    Ou Cheng Yi

    2016-01-01

    Full Text Available The strip packing problem is an important sub-problem of the cutting stock problem. Its application domains include sheet metal, ship making, wood, furniture, garment, shoes and glass. In this paper, a hierarchical layout design method based on rubber band potential-energy descending is proposed. The basic concept of the rubber band enclosing model is described in detail. We divide the layout process into three stages: an initial layout stage, a rubber band enclosing stage and a local adjustment stage. In each stage, the most efficient strategies are employed to further improve the layout solution. Computational results show that the proposed method performed better than the GLSHA algorithm in utilization for three out of nine instances.

  9. Comparison Of Keyword Based Clustering Of Web Documents By Using Openstack 4j And By Traditional Method

    Directory of Open Access Journals (Sweden)

    Shiza Anand

    2015-08-01

    Full Text Available The number of hypertext documents on the World Wide Web is increasing continuously, so clustering methods are required to group documents into cluster repositories according to the similarity between them. Various clustering methods exist, such as hierarchical-based, K-means, fuzzy-logic-based and centroid-based methods. These keyword-based clustering methods take a large amount of time to create containers and put documents in their respective containers. These traditional methods use the file-handling techniques of different programming languages to create repositories and transfer web documents into these containers. In contrast, the openstack4j SDK is a new technique for creating containers and moving web documents into them according to their similarity in much less time than the traditional methods. Another benefit of this technique is that the SDK understands and reads all types of files, such as jpg, html, pdf and doc. This paper compares the time required for clustering documents using openstack4j and using traditional methods, and suggests that search engines adopt this technique for clustering so that they return results to user queries in less time.

  10. A novel generic optimization method for irrigation scheduling under multiple objectives and multiple hierarchical layers in a canal network

    Science.gov (United States)

    Delgoda, Dilini; Malano, Hector; Saleem, Syed K.; Halgamuge, Malka N.

    2017-07-01

    This research proposes a novel generic method for irrigation scheduling in a canal network to optimize multiple objectives related to canal scheduling (e.g. maximizing water supply and minimizing imbalance of water distribution) within multiple hierarchical layers (e.g. the layers consisting of the main canal, distributaries) while utilizing traditional canal scheduling methods. It is based on modularizing the optimization process. The method is theoretically capable of optimizing an unlimited number of user-defined objectives within an unlimited number of hierarchical layers and only limited by resource availability (e.g. maximum canal capacity and water limitations) in the network. It allows flexible decision-making through quantification of the mutual effects of optimizing conflicting objectives and is adaptable to available multi-objective evolutionary algorithms. The method's application is demonstrated using a hypothetical canal network example with six objectives and three hierarchical layers, and a real scenario with four objectives and two layers.

  11. Correlation of Volatile Compounds and Sensory Attributes of Chinese Traditional Sweet Fermented Flour Pastes Using Hierarchical Cluster Analysis and Partial Least Squares-Discriminant Analysis

    Directory of Open Access Journals (Sweden)

    Meigui Huang

    2017-01-01

    Full Text Available The aroma compositions and sensory attributes of various traditional Chinese sweet fermented flour pastes (SFFPs), and the correlations between them, were investigated. SFFPs, including LEEJ, LEEH, and XH6, showed high overall acceptance scores of 8.00, 8.21, and 7.50, respectively. Ninety-six volatile compounds were detected using solid-phase microextraction gas chromatography mass spectrometry. Hierarchical cluster analysis grouped SFFPs into three clusters according to their concentrations and compositions of volatile components. Partial least squares-discriminant analysis showed that volatile compounds, including ethyl phenylacetate, 5-methyl furfural, amyl cinnamal, ethyl myristate, decyl aldehyde, 1-phenylethyl acetate, 1-octen-3-ol, 3-buten-2-ol, butanoic acid, and caproaldehyde, were highly negatively correlated with saltiness, sourness, and bitterness, while they were positively correlated with sweetness, umami, richness, and acceptance. The obvious correlation between flavor profiles and sensory attributes could help online monitoring of SFFPs’ flavor quality during production.
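
    Hierarchical cluster analysis that ends in a fixed number of groups, as described in this abstract, can be sketched in a few lines. The abstract does not state the distance or linkage used, so the sketch assumes Euclidean distance with average linkage, and the two-component "profiles" are invented numbers rather than measured concentrations.

```python
import math
from itertools import combinations


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    # Start with every sample in its own cluster (clusters hold sample indices).
    clusters = [[i] for i in range(len(points))]

    def average_distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        # a < b, so merging into a and deleting b keeps indices valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical two-component volatile profiles for six samples.
profiles = [(1.0, 0.9), (1.1, 1.0), (5.0, 5.2), (5.1, 4.9), (9.0, 1.0), (9.2, 1.1)]
groups = agglomerative(profiles, n_clusters=3)
```

    Recording the order of the merges, rather than stopping at three clusters, would give the information needed to draw a dendrogram.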

  12. Linking landscape characteristics to local grizzly bear abundance using multiple detection methods in a hierarchical model

    Science.gov (United States)

    Graves, T.A.; Kendall, Katherine C.; Royle, J. Andrew; Stetz, J.B.; Macleod, A.C.

    2011-01-01

    Few studies link habitat to grizzly bear Ursus arctos abundance and these have not accounted for the variation in detection or spatial autocorrelation. We collected and genotyped bear hair in and around Glacier National Park in northwestern Montana during the summer of 2000. We developed a hierarchical Markov chain Monte Carlo model that extends the existing occupancy and count models by accounting for (1) spatially explicit variables that we hypothesized might influence abundance; (2) separate sub-models of detection probability for two distinct sampling methods (hair traps and rub trees) targeting different segments of the population; (3) covariates to explain variation in each sub-model of detection; (4) a conditional autoregressive term to account for spatial autocorrelation; (5) weights to identify most important variables. Road density and per cent mesic habitat best explained variation in female grizzly bear abundance; spatial autocorrelation was not supported. More female bears were predicted in places with lower road density and with more mesic habitat. Detection rates of females increased with rub tree sampling effort. Road density best explained variation in male grizzly bear abundance and spatial autocorrelation was supported. More male bears were predicted in areas of low road density. Detection rates of males increased with rub tree and hair trap sampling effort and decreased over the sampling period. We provide a new method to (1) incorporate multiple detection methods into hierarchical models of abundance; (2) determine whether spatial autocorrelation should be included in final models. Our results suggest that the influence of landscape variables is consistent between habitat selection and abundance in this system.

  13. A new method to prepare colloids of size-controlled clusters from a matrix assembly cluster source

    Science.gov (United States)

    Cai, Rongsheng; Jian, Nan; Murphy, Shane; Bauer, Karl; Palmer, Richard E.

    2017-05-01

    A new method for the production of colloidal suspensions of physically deposited clusters is demonstrated. A cluster source has been used to deposit size-controlled clusters onto water-soluble polymer films, which are then dissolved to produce colloidal suspensions of clusters encapsulated with polymer molecules. This process has been demonstrated using different cluster materials (Au and Ag) and polymers (polyvinylpyrrolidone, polyvinyl alcohol, and polyethylene glycol). Scanning transmission electron microscopy of the clusters before and after colloidal dispersion confirms that the polymers act as stabilizing agents. We propose that this method is suitable for the production of biocompatible colloids of ultraprecise clusters.

  14. Macroscopic Rock Texture Image Classification Using a Hierarchical Neuro-Fuzzy Class Method

    Directory of Open Access Journals (Sweden)

    Laercio B. Gonçalves

    2010-01-01

    Full Text Available We used a Hierarchical Neuro-Fuzzy Class Method based on binary space partitioning (NFHB-Class Method) for macroscopic rock texture classification. The relevance of this study is in helping geologists in the diagnosis and planning of oil reservoir exploration. The proposed method is capable of generating its own decision structure, with automatic extraction of fuzzy rules. These rules are linguistically interpretable, thus explaining the obtained data structure. The presented image classification for macroscopic rocks is based on texture descriptors, such as spatial variation coefficient, Hurst coefficient, entropy, and cooccurrence matrix. Four rock classes have been evaluated by the NFHB-Class Method: gneiss (two subclasses), basalt (four subclasses), diabase (five subclasses), and rhyolite (five subclasses). These four rock classes are of great interest in the evaluation of oil boreholes, which is considered a complex task by geologists. We present a computer method to solve this problem. In order to evaluate system performance, we used 50 RGB images for each rock class and subclass, thus producing a total of 800 images. For all rock classes, the NFHB-Class Method achieved a percentage of correct hits over 73%. The proposed method converged for all tests presented in the case study.
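
    Of the texture descriptors listed in this abstract, entropy is the simplest to illustrate. The abstract does not define it, so the sketch below assumes the Shannon entropy of the grey-level histogram; the pixel values are made up, and the function name `histogram_entropy` is hypothetical.

```python
import math
from collections import Counter


def histogram_entropy(pixels):
    """Shannon entropy (in bits) of the grey-level histogram of `pixels`."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical flattened grey-level patches.
flat_patch = [7, 7, 7, 7]      # a single grey level
two_level_patch = [0, 0, 1, 1]  # two equally frequent levels
varied_patch = [0, 1, 2, 3]     # four equally frequent levels

entropies = [histogram_entropy(p) for p in (flat_patch, two_level_patch, varied_patch)]
```

    A uniform patch has zero entropy, and the value grows as grey levels become more varied, which is why entropy can help separate smooth from coarse textures.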

  15. Maritime clusters productivity and competitiveness evaluation methods: Systematic approach

    Directory of Open Access Journals (Sweden)

    Viederytė Rasa

    2014-01-01

    Full Text Available Many scientists underline the importance of clusters as agglomerated industries working toward the same purpose with joint resources and potential. This article analyses the two basic assumptions that lead organizations to cluster: productivity and competitiveness. In practice, many of the methods used to evaluate these assumptions in maritime clusters are applied without a systematic approach: some focus on port efficiency, others describe the growth dynamics of resource quantities or infrastructure parameters, and some even treat productivity and competitiveness as the same assumption. This article presents a systematic analysis of methods for evaluating the productivity and competitiveness of maritime clusters, covering the variables and parameters most often used in evaluating these assumptions.

  16. A Hierarchical Approach Using Machine Learning Methods in Solar Photovoltaic Energy Production Forecasting

    Directory of Open Access Journals (Sweden)

    Zhaoxuan Li

    2016-01-01

    Full Text Available We evaluate and compare two common methods, artificial neural networks (ANN) and support vector regression (SVR), for predicting energy production from a solar photovoltaic (PV) system in Florida 15 min, 1 h and 24 h ahead of time. A hierarchical approach is proposed based on the machine learning algorithms tested. The production data used in this work corresponds to 15 min averaged power measurements collected from 2014. The accuracy of the model is determined by computing error statistics such as mean bias error (MBE), mean absolute error (MAE), root mean square error (RMSE), relative MBE (rMBE), mean percentage error (MPE) and relative RMSE (rRMSE). This work provides findings on how forecasts from individual inverters will improve the total solar power generation forecast of the PV system.
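    The error statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the sign convention (forecast minus measurement) and the normalization of the relative metrics by the mean measured value are assumptions, and MPE is left out because its definition varies between sources.

```python
import math

def error_statistics(observed, predicted):
    """Return MBE, MAE, RMSE and relative forms for two paired series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    # Assumed sign convention: error = forecast - measurement.
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {
        "MBE": mbe,
        "MAE": mae,
        "RMSE": rmse,
        # Assumed normalization: divide by the mean measured value.
        "rMBE": mbe / mean_obs,
        "rRMSE": rmse / mean_obs,
    }

# Hypothetical 15 min power values (arbitrary units).
stats = error_statistics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

    MBE shows whether a forecast is biased high or low, while MAE and RMSE measure the size of the errors regardless of sign.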

  17. Methods for sample size determination in cluster randomized trials.

    Science.gov (United States)

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-06-01

    The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. © The Author 2015. Published by Oxford University Press on behalf of the International Epidemiological Association.
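    The simplest approach described above, inflating an individually randomized sample size by a design effect, can be sketched as below. The sketch assumes equal cluster sizes and the standard design effect 1 + (m - 1) x ICC; the function names and the example numbers are illustrative and not taken from the paper.

```python
import math

def design_effect(cluster_size, icc):
    """Simple design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def crt_sample_size(n_individual, cluster_size, icc):
    """Inflate a per-arm sample size computed under individual randomization.

    Returns (clusters per arm, participants per arm), rounding up to whole clusters.
    """
    n_inflated = n_individual * design_effect(cluster_size, icc)
    clusters = math.ceil(n_inflated / cluster_size)
    return clusters, clusters * cluster_size

# Hypothetical inputs: 200 participants per arm under individual
# randomization, clusters of 20, intracluster correlation of 0.05.
clusters_per_arm, participants_per_arm = crt_sample_size(200, 20, 0.05)
```

    Variable cluster sizes, attrition or baseline covariates call for the adjusted formulae that the paper surveys; this sketch does not cover them.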

  18. Clustering and training set selection methods for improving the accuracy of quantitative laser induced breakdown spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Ryan B., E-mail: randerson@astro.cornell.edu [Cornell University Department of Astronomy, 406 Space Sciences Building, Ithaca, NY 14853 (United States); Bell, James F., E-mail: Jim.Bell@asu.edu [Arizona State University School of Earth and Space Exploration, Bldg.: INTDS-A, Room: 115B, Box 871404, Tempe, AZ 85287 (United States); Wiens, Roger C., E-mail: rwiens@lanl.gov [Los Alamos National Laboratory, P.O. Box 1663 MS J565, Los Alamos, NM 87545 (United States); Morris, Richard V., E-mail: richard.v.morris@nasa.gov [NASA Johnson Space Center, 2101 NASA Parkway, Houston, TX 77058 (United States); Clegg, Samuel M., E-mail: sclegg@lanl.gov [Los Alamos National Laboratory, P.O. Box 1663 MS J565, Los Alamos, NM 87545 (United States)

    2012-04-15

    We investigated five clustering and training set selection methods to improve the accuracy of quantitative chemical analysis of geologic samples by laser induced breakdown spectroscopy (LIBS) using partial least squares (PLS) regression. The LIBS spectra were previously acquired for 195 rock slabs and 31 pressed powder geostandards under 7 Torr CO2 at a stand-off distance of 7 m at 17 mJ per pulse to simulate the operational conditions of the ChemCam LIBS instrument on the Mars Science Laboratory Curiosity rover. The clustering and training set selection methods, which do not require prior knowledge of the chemical composition of the test-set samples, are based on grouping similar spectra and selecting appropriate training spectra for the partial least squares (PLS2) model. These methods were: (1) hierarchical clustering of the full set of training spectra and selection of a subset for use in training; (2) k-means clustering of all spectra and generation of PLS2 models based on the training samples within each cluster; (3) iterative use of PLS2 to predict sample composition and k-means clustering of the predicted compositions to subdivide the groups of spectra; (4) soft independent modeling of class analogy (SIMCA) classification of spectra, and generation of PLS2 models based on the training samples within each class; (5) use of Bayesian information criteria (BIC) to determine an optimal number of clusters and generation of PLS2 models based on the training samples within each cluster. The iterative method and the k-means method using 5 clusters showed the best performance, improving the absolute quadrature root mean squared error (RMSE) by approximately 3 wt.%. The statistical significance of these improvements was approximately 85%. Our results show that although clustering methods can modestly improve results, a large and diverse training set is the most reliable way to improve the accuracy of quantitative LIBS. In particular, additional sulfate standards and

  19. Intensity-based hierarchical Bayes method improves testing for differentially expressed genes in microarray experiments

    Directory of Open Access Journals (Sweden)

    Wesselkamper Scott C

    2006-12-01

    Full Text Available Abstract Background The small sample sizes often used for microarray experiments result in poor estimates of variance if each gene is considered independently. Yet accurately estimating variability of gene expression measurements in microarray experiments is essential for correctly identifying differentially expressed genes. Several recently developed methods for testing differential expression of genes utilize hierarchical Bayesian models to "pool" information from multiple genes. We have developed a statistical testing procedure that further improves upon current methods by incorporating the well-documented relationship between the absolute gene expression level and the variance of gene expression measurements into the general empirical Bayes framework. Results We present a novel Bayesian moderated-T, which we show to perform favorably in simulations, with two real, dual-channel microarray experiments and in two controlled single-channel experiments. In simulations, the new method achieved greater power while correctly estimating the true proportion of false positives, and in the analysis of two publicly-available "spike-in" experiments, the new method performed favorably compared to all tested alternatives. We also applied our method to two experimental datasets and discuss the additional biological insights as revealed by our method in contrast to the others. The R-source code for implementing our algorithm is freely available at http://eh3.uc.edu/ibmt. Conclusion We use a Bayesian hierarchical normal model to define a novel Intensity-Based Moderated T-statistic (IBMT). The method is completely data-dependent using empirical Bayes philosophy to estimate hyperparameters, and thus does not require specification of any free parameters. IBMT has the strength of balancing two important factors in the analysis of microarray data: the degree of independence of variances relative to the degree of identity (i.e. t-tests vs. equal variance assumption
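    The idea of "pooling" variance information across genes can be illustrated with the generic empirical Bayes moderated t-statistic, in which each gene's sample variance is shrunk toward a prior variance. The sketch below is not the IBMT implementation: the prior variance and prior degrees of freedom are passed in as given numbers, whereas IBMT, per the abstract, estimates hyperparameters from the data and ties the variance to expression intensity. All names and values are hypothetical.

```python
import math

def moderated_t(mean_diff, sample_var, df, prior_var, prior_df, unscaled_var):
    """Generic moderated t: shrink the gene-wise variance toward a prior variance."""
    # Posterior variance is a degrees-of-freedom weighted average.
    shrunk_var = (prior_df * prior_var + df * sample_var) / (prior_df + df)
    return mean_diff / math.sqrt(shrunk_var * unscaled_var)

# Hypothetical gene: two groups of 3 arrays, so df = 4 and the unscaled
# variance of the mean difference is 1/3 + 1/3.
unscaled = 1.0 / 3.0 + 1.0 / 3.0
ordinary_t = 1.0 / math.sqrt(0.01 * unscaled)  # uses the gene's own tiny variance
mod_t = moderated_t(1.0, 0.01, 4, prior_var=0.25, prior_df=4, unscaled_var=unscaled)
```

    A gene whose sample variance is small by chance gets a very large ordinary t-statistic; shrinking toward the prior variance tempers it, which is the balance between gene-wise t-tests and an equal-variance assumption that the abstract refers to.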

  20. A geostatistics-informed hierarchical sensitivity analysis method for complex groundwater flow and transport modeling

    Science.gov (United States)

    Dai, Heng; Chen, Xingyuan; Ye, Ming; Song, Xuehang; Zachara, John M.

    2017-05-01

    Sensitivity analysis is an important tool for development and improvement of mathematical models, especially for complex systems with a high dimension of spatially correlated parameters. Variance-based global sensitivity analysis has gained popularity because it can quantify the relative contribution of uncertainty from different sources. However, its computational cost increases dramatically with the complexity of the considered model and the dimension of model parameters. In this study, we developed a new sensitivity analysis method that integrates the concept of variance-based method with a hierarchical uncertainty quantification framework. Different uncertain inputs are grouped and organized into a multilayer framework based on their characteristics and dependency relationships to reduce the dimensionality of the sensitivity analysis. A set of new sensitivity indices are defined for the grouped inputs using the variance decomposition method. Using this methodology, we identified the most important uncertainty source for a dynamic groundwater flow and solute transport model at the Department of Energy (DOE) Hanford site. The results indicate that boundary conditions and permeability field contribute the most uncertainty to the simulated head field and tracer plume, respectively. The relative contribution from each source varied spatially and temporally. By using a geostatistical approach to reduce the number of realizations needed for the sensitivity analysis, the computational cost of implementing the developed method was reduced to a practically manageable level. The developed sensitivity analysis method is generally applicable to a wide range of hydrologic and environmental problems that deal with high-dimensional spatially distributed input variables.
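    A first-order variance-based index for a group of inputs is Var(E[Y | X_group]) / Var(Y). The sketch below computes it exactly by enumerating a toy model with independent, equally likely discrete inputs. It is only an illustration of the variance decomposition, not the authors' hierarchical framework or their geostatistical sampling; the model, input names and levels are hypothetical.

```python
from collections import defaultdict
from itertools import product

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def group_sensitivity(model, levels, group):
    """First-order index Var(E[Y | X_group]) / Var(Y) for a group of inputs.

    `levels` lists the equally likely values of each independent input;
    `group` is a tuple of input positions treated as one uncertainty source.
    """
    outputs_by_group_value = defaultdict(list)
    all_outputs = []
    for point in product(*levels):  # full factorial enumeration
        y = model(*point)
        all_outputs.append(y)
        outputs_by_group_value[tuple(point[i] for i in group)].append(y)
    conditional_means = [sum(ys) / len(ys) for ys in outputs_by_group_value.values()]
    return variance(conditional_means) / variance(all_outputs)

def toy_model(boundary, perm_a, perm_b):
    """Hypothetical response: one boundary input and two permeability inputs."""
    return 2.0 * boundary + perm_a + perm_b

levels = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
s_boundary = group_sensitivity(toy_model, levels, group=(0,))
s_permeability = group_sensitivity(toy_model, levels, group=(1, 2))
```

    Grouping the two permeability inputs yields a single index for that source, which is how grouping reduces the number of indices to estimate.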

  1. Clustering Dycom

    KAUST Repository

    Minku, Leandro L.

    2017-10-06

    Background: Software Effort Estimation (SEE) can be formulated as an online learning problem, where new projects are completed over time and may become available for training. In this scenario, a Cross-Company (CC) SEE approach called Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving the high cost of collecting such training projects. However, Dycom relies on splitting CC projects into different subsets in order to create its CC models. Such splitting can have a significant impact on Dycom's predictive performance. Aims: This paper investigates whether clustering methods can be used to help find good CC splits for Dycom. Method: Dycom is extended to use clustering methods for creating the CC subsets. Three different clustering methods are investigated, namely Hierarchical Clustering, K-Means, and Expectation-Maximisation. Clustering Dycom is compared against the original Dycom with CC subsets of different sizes, based on four SEE databases. A baseline WC model is also included in the analysis. Results: Clustering Dycom with K-Means can potentially help to split the CC projects, managing to achieve similar or better predictive performance than Dycom. However, K-Means still requires the number of CC subsets to be pre-defined, and a poor choice can negatively affect predictive performance. EM enables Dycom to automatically set the number of CC subsets while still maintaining or improving predictive performance with respect to the baseline WC model. Clustering Dycom with Hierarchical Clustering did not offer significant advantage in terms of predictive performance. Conclusion: Clustering methods can be an effective way to automatically generate Dycom's CC subsets.

  2. Image Registration Using Single Cluster PHD Methods

    Science.gov (United States)

    Campbell, M.; Schlangen, I.; Delande, E.; Clark, D.

    Cadets in the Department of Physics at the United States Air Force Academy are using the technique of slitless spectroscopy to analyze the spectra from geostationary satellites during glint season. The equinox periods of the year are particularly favorable for earth-based observers to detect specular reflections off satellites (glints), which have been observed in the past using broadband photometry techniques. Three seasons of glints were observed and analyzed for multiple satellites, as measured across the visible spectrum using a diffraction grating on the Academy’s 16-inch, f/8.2 telescope. It is clear from the results that the glint maximum wavelength decreases relative to the time periods before and after the glint, and that the spectral reflectance during the glint is less like a blackbody. These results are consistent with the presumption that solar panels are the predominant source of specular reflection. The glint spectra are also quantitatively compared to different blackbody curves and the solar spectrum by means of absolute differences and standard deviations. Our initial analysis appears to indicate a potential method of determining relative power capacity.

  3. Adaptive finite element method for fractional differential equations using hierarchical matrices

    Science.gov (United States)

    Zhao, Xuan; Hu, Xiaozhe; Cai, Wei; Karniadakis, George Em

    2017-10-01

    A robust and fast solver for fractional differential equations (FDEs) involving the Riesz fractional derivative is developed using an adaptive finite element method on non-uniform meshes. It is based on the utilization of hierarchical matrices ($\mathcal{H}$-matrices) for the representation of the stiffness matrix resulting from the finite element discretization of the FDEs. We employ a geometric multigrid method for the solution of the algebraic system of equations. We combine it with an adaptive algorithm based on a posteriori error estimation to deal with general-type singularities arising in the solution of the FDEs. Through various test examples we demonstrate the efficiency of the method and the high accuracy of the numerical solution even in the presence of singularities. The proposed technique has been verified effectively through fundamental examples including Riesz and left/right Riemann-Liouville fractional derivatives and, furthermore, it can be readily extended to more general fractional differential equations with different boundary conditions and low-order terms. To the best of our knowledge, there are currently no other methods for FDEs that resolve singularities accurately at linear complexity as the one we propose here.

  4. A hierarchical fingerprint alignment method and its application to fuzzy vault

    Science.gov (United States)

    Li, Peng; Yang, Xin; Zang, Yali; Cao, Kai; Tian, Jie

    2010-04-01

    Fuzzy vault is a practical and promising scheme, which can protect biometric templates and perform secure key management simultaneously. Aligning the query sample and the template sample in the encrypted domain remains a challenging task in the fingerprint-based fuzzy vault scheme. To some extent, all the existing fingerprint aligning methods in the encrypted domain have their own drawbacks, e.g., insufficient alignment accuracy or information leakage caused by publishing helper data. In this paper, a novel fingerprint aligning method is proposed, which integrates the fingerprint reference points and their neighboring region of interest (ROI) in a hierarchical manner. The concept of mutual information (MI) from information theory is used to assess the degree of coincidence of two fingerprints after alignment. The novel alignment method is applied to a fingerprint-based fuzzy vault implementation. To limit information leakage, the orientation features of fingerprint minutiae are discarded and another distinguishing local feature, inter-minutiae ridge count, replaces the minutiae orientation in the implementation of the fingerprint-based fuzzy vault. An experiment on FVC2002 DB2a shows the merit of the proposed alignment method and the promising performance of the proposed fingerprint-based fuzzy vault implementation.

  5. Hierarchical XP

    OpenAIRE

    Jacobi, Carsten; Rumpe, Bernhard

    2014-01-01

    XP is a light-weight methodology suited particularly for small-sized teams that develop software which has only vague or rapidly changing requirements. The discipline of systems engineering knows it as approach of incremental system change or also of "muddling through". In this paper, we introduce three well known methods of reorganizing companies, namely, the holistic approach, the incremental approach, and the hierarchical approach. We show similarities between software engineering methods ...

  6. Vinayaka : A Semi-Supervised Projected Clustering Method Using Differential Evolution

    OpenAIRE

    Satish Gajawada; Durga Toshniwal

    2012-01-01

    Differential Evolution (DE) is an algorithm for evolutionary optimization. Clustering problems have been solved by using DE based clustering methods but these methods may fail to find clusters hidden in subspaces of high dimensional datasets. Subspace and projected clustering methods have been proposed in literature to find subspace clusters that are present in subspaces of dataset. In this paper we propose VINAYAKA, a semi-supervised projected clustering method based on DE. In this method DE opt...

  7. Agent-based method for distributed clustering of textual information

    Science.gov (United States)

    Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN

    2010-09-28

    A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
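    The routing step described above (score a new document vector against existing clusters, then store it under the best match or start a new cluster) can be sketched as below. The abstract does not name the similarity measure or the decision rule, so cosine similarity, the 0.8 threshold and every identifier here are assumptions made for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route_document(doc_vector, clusters, threshold=0.8):
    """Add doc_vector to the most similar cluster, or open a new one.

    `clusters` is a list of lists of document vectors and is modified in place.
    Returns the index of the cluster that received the document.
    """
    best_index, best_score = None, -1.0
    for index, members in enumerate(clusters):
        # Each cluster reports its best similarity to the new document.
        score = max(cosine_similarity(doc_vector, m) for m in members)
        if score > best_score:
            best_index, best_score = index, score
    if best_index is None or best_score < threshold:
        clusters.append([doc_vector])  # no cluster is similar enough
        return len(clusters) - 1
    clusters[best_index].append(doc_vector)
    return best_index

clusters = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]]
first = route_document([0.9, 0.1, 0.0], clusters)   # close to cluster 0
second = route_document([0.0, 0.0, 1.0], clusters)  # unlike both clusters
```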

  8. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping separate observations according to their degree of similarity. The review defines the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.
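    As a concrete illustration of a hierarchical algorithm, the sketch below performs agglomerative clustering with single linkage and Euclidean distance, merging the two closest clusters until a requested number remains. The linkage rule, the stopping criterion and the sample observations are assumptions chosen for brevity; the review itself does not prescribe them.

```python
def single_linkage(points, num_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns clusters as sorted lists of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(distance(a, b) for a in c1 for b in c2)

    while len(clusters) > num_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical two-parameter observations.
observations = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]
groups = single_linkage(observations, num_clusters=3)
```

    Recording the order of merges instead of stopping at a fixed count would give the dendrogram usually shown for hierarchical methods.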

  9. A general strategy to determine the congruence between a hierarchical and a non-hierarchical classification

    Directory of Open Access Journals (Sweden)

    Marín Ignacio

    2007-11-01

    Full Text Available Abstract Background Classification procedures are widely used in phylogenetic inference, the analysis of expression profiles, the study of biological networks, etc. Many algorithms have been proposed to establish the similarity between two different classifications of the same elements. However, methods to determine significant coincidences between hierarchical and non-hierarchical partitions are still poorly developed, in spite of the fact that the search for such coincidences is implicit in many analyses of massive data. Results We describe a novel strategy to compare a hierarchical and a dichotomic non-hierarchical classification of elements, in order to find clusters in a hierarchical tree in which elements of a given "flat" partition are overrepresented. The key improvement of our strategy with respect to previous methods is the use of permutation analyses of ranked clusters to determine whether regions of the dendrograms present a significant enrichment. We show that this method is more sensitive than previously developed strategies and how it can be applied to several real cases, including microarray and interactome data. In particular, we use it to compare a hierarchical representation of the yeast mitochondrial interactome and a catalogue of known mitochondrial protein complexes, demonstrating a high level of congruence between those two classifications. We also discuss extensions of this method to other cases which are conceptually related. Conclusion Our method is highly sensitive and outperforms previously described strategies. A PERL script that implements it is available at http://www.uv.es/~genomica/treetracker.
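    The core question, whether elements of one class of a "flat" partition are overrepresented in a cluster of the tree, can be illustrated with a simple permutation test. The sketch below tests a single cluster by reshuffling class labels over all elements. It is a simplified stand-in and not the ranked-cluster procedure of the paper or its PERL script; the function name, the number of permutations and the example sets are assumptions.

```python
import random

def enrichment_p_value(cluster, marked, universe, permutations=2000, seed=0):
    """One-sided permutation p-value for over-representation of `marked` in `cluster`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cluster, marked = set(cluster), set(marked)
    universe = list(universe)
    observed = len(cluster & marked)
    hits = 0
    for _ in range(permutations):
        # Reassign the "marked" label to a random subset of the same size.
        shuffled_marked = set(rng.sample(universe, len(marked)))
        if len(cluster & shuffled_marked) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (permutations + 1)

elements = range(20)
# A tree cluster that coincides with the marked class, and one that avoids it.
p_enriched = enrichment_p_value(range(5), range(5), elements)
p_absent = enrichment_p_value(range(5), range(10, 15), elements)
```

    Repeating such a test over every node of a dendrogram requires a correction for multiple comparisons, which this sketch leaves out.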

  10. A general strategy to determine the congruence between a hierarchical and a non-hierarchical classification.

    Science.gov (United States)

    Marco, Antonio; Marín, Ignacio

    2007-11-15

    Classification procedures are widely used in phylogenetic inference, the analysis of expression profiles, the study of biological networks, etc. Many algorithms have been proposed to establish the similarity between two different classifications of the same elements. However, methods to determine significant coincidences between hierarchical and non-hierarchical partitions are still poorly developed, in spite of the fact that the search for such coincidences is implicit in many analyses of massive data. We describe a novel strategy to compare a hierarchical and a dichotomic non-hierarchical classification of elements, in order to find clusters in a hierarchical tree in which elements of a given "flat" partition are overrepresented. The key improvement of our strategy with respect to previous methods is the use of permutation analyses of ranked clusters to determine whether regions of the dendrograms present a significant enrichment. We show that this method is more sensitive than previously developed strategies and how it can be applied to several real cases, including microarray and interactome data. In particular, we use it to compare a hierarchical representation of the yeast mitochondrial interactome and a catalogue of known mitochondrial protein complexes, demonstrating a high level of congruence between those two classifications. We also discuss extensions of this method to other cases which are conceptually related. Our method is highly sensitive and outperforms previously described strategies. A PERL script that implements it is available at http://www.uv.es/~genomica/treetracker.

  11. A Novel Co-Templating Method for Hierarchical Mesoporous Alumina Monoliths Replica

    Science.gov (United States)

    Tang, Shaokun; Cui, Xili; Gu, Ling; Zhou, Hu; Zhang, Xiangwen

    2013-09-01

    In this paper, hierarchical meso/macroporous aluminas were obtained by using nonionic block copolymer EO106PO70EO106(F127)/agarose hydrogel as cotemplates. The hierarchical structure was confirmed by SEM, TEM and small-angle X-ray diffraction. The results showed that Al2O3 exhibited a hierarchical structure with interconnected replicable macropores reproduced by agarose scaffold and ordered mesopores constructed by F127 with uniform size. The template employed here is easy to prepare, degradable and reproducible, indicating the agarose xerogel as a promising candidate for the fabrication of porous metal oxides.

  12. Dynamic and Quantitative Method of Analyzing Service Consistency Evolution Based on Extended Hierarchical Finite State Automata

    Directory of Open Access Journals (Sweden)

    Linjun Fan

    2014-01-01

    Full Text Available This paper is concerned with the dynamic evolution analysis and quantitative measurement of primary factors that cause service inconsistency in service-oriented distributed simulation applications (SODSA). Traditional methods are mostly qualitative and empirical, and they do not consider the dynamic disturbances among factors in service’s evolution behaviors such as producing, publishing, calling, and maintenance. Moreover, SODSA are rapidly evolving in terms of large-scale, reusable, compositional, pervasive, and flexible features, which presents difficulties in the usage of traditional analysis methods. To resolve these problems, a novel dynamic evolution model extended hierarchical service-finite state automata (EHS-FSA) is constructed based on finite state automata (FSA), which formally depict overall changing processes of service consistency states. And also the service consistency evolution algorithms (SCEAs) based on EHS-FSA are developed to quantitatively assess these impact factors. Experimental results show that the bad reusability (17.93% on average) is the biggest influential factor, the noncomposition of atomic services (13.12%) is the second biggest one, and the service version’s confusion (1.2%) is the smallest one. Compared with previous qualitative analysis, SCEAs present good effectiveness and feasibility. This research can guide the engineers of service consistency technologies toward obtaining a higher level of consistency in SODSA.

  13. Strong convergence with a modified iterative projection method for hierarchical fixed point problems and variational inequalities

    Directory of Open Access Journals (Sweden)

    Ibrahim Karahan

    2016-04-01

    Full Text Available Let $C$ be a nonempty closed convex subset of a real Hilbert space $H$. Let $\{T_n\}: C \to H$ be a sequence of nearly nonexpansive mappings such that $\mathcal{F} := \bigcap_{i=1}^{\infty} F(T_i) \neq \emptyset$. Let $V: C \to H$ be a Lipschitzian mapping and $F: C \to H$ be an $L$-Lipschitzian and strongly monotone operator. This paper deals with a modified iterative projection method for approximating a solution of the hierarchical fixed point problem. It is shown that under certain appropriate assumptions on the operators and parameters, the modified iterative sequence $\{x_n\}$ converges strongly to $x^{*} \in \mathcal{F}$, which is also the unique solution of an associated variational inequality over $\mathcal{F}$. As a special case, this projection method can be used to find the minimum norm solution of the above variational inequality; namely, the unique solution $x^{*}$ to the quadratic minimization problem $x^{*} = \arg\min_{x \in \mathcal{F}} \|x\|^{2}$. The results here improve and extend some recent corresponding results of other authors.

  14. Dynamic and quantitative method of analyzing service consistency evolution based on extended hierarchical finite state automata.

    Science.gov (United States)

    Fan, Linjun; Tang, Jun; Ling, Yunxiang; Li, Benxian

    2014-01-01

    This paper is concerned with the dynamic evolution analysis and quantitative measurement of primary factors that cause service inconsistency in service-oriented distributed simulation applications (SODSA). Traditional methods are mostly qualitative and empirical, and they do not consider the dynamic disturbances among factors in service's evolution behaviors such as producing, publishing, calling, and maintenance. Moreover, SODSA are rapidly evolving in terms of large-scale, reusable, compositional, pervasive, and flexible features, which presents difficulties in the usage of traditional analysis methods. To resolve these problems, a novel dynamic evolution model extended hierarchical service-finite state automata (EHS-FSA) is constructed based on finite state automata (FSA), which formally depict overall changing processes of service consistency states. And also the service consistency evolution algorithms (SCEAs) based on EHS-FSA are developed to quantitatively assess these impact factors. Experimental results show that the bad reusability (17.93% on average) is the biggest influential factor, the noncomposition of atomic services (13.12%) is the second biggest one, and the service version's confusion (1.2%) is the smallest one. Compared with previous qualitative analysis, SCEAs present good effectiveness and feasibility. This research can guide the engineers of service consistency technologies toward obtaining a higher level of consistency in SODSA.

  15. Application of principal component and hierarchical cluster analyses in the classification of Serbian bottled waters and a comparison with waters from some European countries

    Directory of Open Access Journals (Sweden)

    Cvejanov Jelena Đ.

    2017-01-01

    Full Text Available The contents of major ions in bottled waters were analyzed by principal component (PCA) and hierarchical cluster (HCA) analysis in order to investigate if these techniques could provide the information necessary for classifications of the water brands marketed in Serbia. Data on the contents of Ca2+, Mg2+, Na+, K+, Cl-, SO42-, HCO3- and total dissolved solids (TDS) of 33 bottled waters was used as the input data set. The waters were separated into three main clusters according to their levels of TDS, Na+ and HCO3-; sub-clustering revealed a group of soft waters with the lowest total hardness. Based on the determined chemical parameters, the Serbian waters were further compared with available literature data on bottled waters from some other European countries. To the best of our knowledge, this is the first report applying chemometric classification of bottled waters from different European countries, thereby representing a unique attempt in contrast to previous studies reporting the results primarily on a country-to-country scale. The diverse character of Serbian bottled waters was demonstrated as well as the usefulness of PCA and HCA in the fast classification of the water brands based on their main chemical parameters. [Project of the Serbian Ministry of Education, Science and Technological Development, Grant no. 172050]

  16. Statistical mechanics of self-gravitating system: Cluster expansion method

    Science.gov (United States)

    Iguchi, O.; Kurokawa, T.; Morikawa, M.; Nakamichi, A.; Sota, Y.; Tatekawa, T.; Maeda, K.-I.

    1999-09-01

    We study statistical mechanics of the self-gravitating system applying the cluster expansion method developed in solid state physics. By summing infinite series of diagrams, we derive a complex free energy whose imaginary part is related to the relaxation time of the system, and a two-point correlation function.

  17. Quantum Monte Carlo methods and lithium cluster properties

    Energy Technology Data Exchange (ETDEWEB)

    Owen, Richard Kent [Univ. of California, Berkeley, CA (United States)

    1990-12-01

    Properties of small lithium clusters with sizes ranging from n = 1 to 5 atoms were investigated using quantum Monte Carlo (QMC) methods. Cluster geometries were found from complete active space self consistent field (CASSCF) calculations. A detailed development of the QMC method leading to the variational QMC (V-QMC) and diffusion QMC (D-QMC) methods is shown. The many-body aspect of electron correlation is introduced into the QMC importance sampling electron-electron correlation functions by using density dependent parameters, which are shown to increase the amount of correlation energy obtained in V-QMC calculations. A detailed analysis of D-QMC time-step bias is made, and the bias is found to be at least linear with respect to the time-step. The D-QMC calculations determined the lithium cluster ionization potentials to be 0.1982(14) [0.1981], 0.1895(9) [0.1874(4)], 0.1530(34) [0.1599(73)], 0.1664(37) [0.1724(110)], 0.1613(43) [0.1675(110)] Hartrees for lithium clusters n = 1 through 5, respectively, in good agreement with the experimental results shown in brackets. Also, the binding energy per atom was computed to be 0.0177(8) [0.0203(12)], 0.0188(10) [0.0220(21)], 0.0247(8) [0.0310(12)], 0.0253(8) [0.0351(8)] Hartrees for lithium clusters n = 2 through 5, respectively. The lithium cluster one-electron density is shown to have charge concentrations corresponding to nonnuclear attractors. The overall shape of the electronic charge density also bears a remarkable similarity to the anisotropic harmonic oscillator model shape for the given number of valence electrons.

  18. A Hierarchical Approach Using Machine Learning Methods in Solar Photovoltaic Energy Production Forecasting

    National Research Council Canada - National Science Library

    Zhaoxuan Li; SM Mahbobur Rahman; Rolando Vega; Bing Dong

    2016-01-01

    .... A hierarchical approach is proposed based on the machine learning algorithms tested. The production data used in this work corresponds to 15 min averaged power measurements collected from 2014...

  19. A Probabilistic Embedding Clustering Method for Urban Structure Detection

    Science.gov (United States)

    Lin, X.; Li, H.; Zhang, Y.; Gao, L.; Zhao, L.; Deng, M.

    2017-09-01

    Urban structure detection is a basic task in urban geography. Clustering is a core technology for detecting patterns of urban spatial structure, urban functional regions, and so on. In the big data era, diverse urban sensing datasets recording information such as human behaviour and human social activity suffer from the complexity of high dimension and high noise. Unfortunately, state-of-the-art clustering methods do not handle high dimension and high noise concurrently. In this paper, a probabilistic embedding clustering method is proposed. Firstly, we propose a Probabilistic Embedding Model (PEM) to find latent features in high dimensional urban sensing data by "learning" via a probabilistic model. With latent features, we can capture the essential features hidden in high dimensional data, known as patterns; with the probabilistic model, we can also reduce the uncertainty caused by high noise. Secondly, by tuning the parameters, our model can discover two kinds of urban structure, homophily and structural equivalence, that is, communities with intensive interaction or with the same roles in the urban structure. We evaluated the performance of our model by conducting experiments on real-world data; experiments with real data from Shanghai (China) showed that our method can discover both kinds of urban structure in urban space.

  20. A PROBABILISTIC EMBEDDING CLUSTERING METHOD FOR URBAN STRUCTURE DETECTION

    Directory of Open Access Journals (Sweden)

    X. Lin

    2017-09-01

    Full Text Available Urban structure detection is a basic task in urban geography. Clustering is a core technology for detecting patterns of urban spatial structure, urban functional regions, and so on. In the big data era, diverse urban sensing datasets recording information such as human behaviour and human social activity suffer from the complexity of high dimension and high noise. Unfortunately, state-of-the-art clustering methods do not handle high dimension and high noise concurrently. In this paper, a probabilistic embedding clustering method is proposed. Firstly, we propose a Probabilistic Embedding Model (PEM) to find latent features in high dimensional urban sensing data by "learning" via a probabilistic model. With latent features, we can capture the essential features hidden in high dimensional data, known as patterns; with the probabilistic model, we can also reduce the uncertainty caused by high noise. Secondly, by tuning the parameters, our model can discover two kinds of urban structure, homophily and structural equivalence, that is, communities with intensive interaction or with the same roles in the urban structure. We evaluated the performance of our model by conducting experiments on real-world data; experiments with real data from Shanghai (China) showed that our method can discover both kinds of urban structure in urban space.

  1. A hierarchical updating method for finite element model of airbag buffer system under landing impact

    Directory of Open Access Journals (Sweden)

    He Huan

    2015-12-01

    Full Text Available In this paper, we propose an impact finite element (FE) model for an airbag landing buffer system. First, an impact FE model is formulated for a typical airbag landing buffer system. We use the independence of the structure FE model from the full impact FE model to develop a hierarchical updating scheme for the recovery module FE model and the airbag system FE model. Second, we define impact responses at key points to compare the computational and experimental results and to resolve the inconsistency between the experimental data sampling frequency and experimental triggering. To determine the typical characteristics of the impact dynamics response of the airbag landing buffer system, we present impact response confidence factors (IRCFs) to evaluate how consistent the computational and experimental results are. An error function is defined between the experimental and computational results at key points of the impact response (KPIR) to serve as a modified objective function. A radial basis function (RBF) is introduced to construct a surrogate model of the objective function in terms of the updating variables, thereby converting the FE model updating problem into a solvable optimization problem. Finally, the developed method is validated using an experimental and computational study of the impact dynamics of a classic airbag landing buffer system.

  2. Method of preparing size-selected metal clusters

    Science.gov (United States)

    Elam, Jeffrey W.; Pellin, Michael J.; Stair, Peter C.

    2010-05-11

    The invention provides a method for depositing catalytic clusters on a surface, the method comprising confining the surface to a controlled atmosphere; contacting the surface with catalyst containing vapor for a first period of time; removing the vapor from the controlled atmosphere; and contacting the surface with a reducing agent for a second period of time so as to produce catalyst-containing nucleation sites.

  3. Credit networks and systemic risk of Chinese local financing platforms: Too central or too big to fail? Based on different credit correlations using hierarchical methods

    Science.gov (United States)

    He, Fang; Chen, Xi

    2016-11-01

    The accelerating accumulation and risk concentration of Chinese local financing platform debts have attracted wide attention throughout the world. Due to the network of financial exposures among institutions, the failure of several platforms or regions of systemic importance will probably trigger systemic risk and destabilize the financial system. However, the complex network of credit relationships among Chinese local financing platforms at the state level remains unknown. To fill this gap, we present the first complex network and hierarchical cluster analysis of the credit market of Chinese local financing platforms, using a "bottom up" method based on firm-level data. Based on the balance-sheet channel, we analyzed the topology and taxonomy by applying the analysis paradigm of subdominant ultrametric space to empirical data from 2013. We chose to extract the network of co-financed financing platforms in order to evaluate the effect of risk contagion from platforms to the banking system. We used a new credit similarity measure, combining connectivity and size, to extract minimal spanning trees (MSTs) and hierarchical trees (HTs). We found that: (1) the degree distributions of the credit correlation backbone structure of Chinese local financing platforms are fat tailed, and the structure is unstable with respect to targeted failures; (2) the backbone is highly hierarchical, and largely explained by geographic region; (3) the credit correlation backbone structure based on connectivity and size is significantly heterogeneous; (4) key platforms and regions of systemic importance, and the contagion paths of systemic risk, are obtained, which contribute to preventing systemic and regional risk in Chinese local financing platforms and to preserving financial stability under the framework of macroprudential supervision. Our credit similarity measure provides a means of recognizing "systemically important" institutions and regions
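
    To make the MST and hierarchical-tree construction mentioned in this record more concrete, here is a minimal sketch assuming a small, made-up symmetric distance matrix. The function name `mst_and_ultrametric` and the distances are hypothetical, and the authors' credit similarity measure is not reproduced. The sketch builds a minimal spanning tree with Kruskal's algorithm and, as a by-product, the subdominant ultrametric distance, which equals the single-linkage merge height of each pair.

```python
# Kruskal's MST plus the subdominant ultrametric (single-linkage heights).


def mst_and_ultrametric(d):
    """d: symmetric distance matrix (list of lists) with a zero diagonal.

    Returns (mst, u): mst is a list of (i, j, weight) edges in the order they
    were added; u[i][j] is the largest edge weight on the MST path between
    i and j, i.e. the subdominant ultrametric distance.
    """
    n = len(d)
    edges = sorted((d[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    members = {i: [i] for i in range(n)}  # component root -> member nodes
    root = list(range(n))                 # node -> component root
    mst = []
    u = [[0.0] * n for _ in range(n)]
    for w, i, j in edges:
        ri, rj = root[i], root[j]
        if ri == rj:
            continue  # edge would close a cycle
        mst.append((i, j, w))
        # Every pair split across the two components first meets at height w.
        for a in members[ri]:
            for b in members[rj]:
                u[a][b] = u[b][a] = w
        for b in members[rj]:
            root[b] = ri
        members[ri].extend(members.pop(rj))
    return mst, u


# Hypothetical distances between four institutions (not from the paper).
d = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]
mst, u = mst_and_ultrametric(d)
print(mst)
print(u)
```

    Reading the merge heights in `u` from smallest to largest gives the hierarchical tree: pairs that join at a low height sit in the same tight branch.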

  4. A novel method for a multi-level hierarchical composite with brick-and-mortar structure.

    Science.gov (United States)

    Brandt, Kristina; Wolff, Michael F H; Salikov, Vitalij; Heinrich, Stefan; Schneider, Gerold A

    2013-01-01

    The fascination with hierarchically structured hard tissues such as enamel or nacre arises from their unique structure-properties relationship. Over recent decades this has repeatedly motivated the synthesis of composites mimicking the brick-and-mortar structure of nacre. However, there is still a lack of synthetic engineering materials displaying a true hierarchical structure. Here, we present a novel multi-step processing route for anisotropic 2-level hierarchical composites that combines different coating techniques on different length scales. It comprises polymer-encapsulated ceramic particles as building blocks for the first level, followed by spouted bed spray granulation for a second level, and finally directional hot pressing to anisotropically consolidate the composite. The microstructure achieved reveals a brick-and-mortar hierarchical structure with distinct, though not yet optimized, mechanical properties on each level. It opens up a completely new processing route for the synthesis of multi-level hierarchically structured composites, giving prospects for multi-functional structure-properties relationships.

  5. Application of a hierarchical enzyme classification method reveals the role of gut microbiome in human metabolism

    Science.gov (United States)

    2015-01-01

    , cofactors and vitamins. Conclusions: The ECemble method is able to hierarchically assign high quality enzyme annotations to genomic and metagenomic data. This study demonstrated the real application of ECemble to understand the indispensable role played by microbe-encoded enzymes in the healthy functioning of human metabolic systems. PMID:26099921

  6. Critérios de formação de carteiras de ativos por meio de Hierarchical Clusters (Criteria for forming asset portfolios using hierarchical clusters)

    Directory of Open Access Journals (Sweden)

    Pierre Lucena

    2010-04-01

    Full Text Available The main objective of this article is to present and test a multivariate statistical tool in financial models. This methodology, known as cluster analysis, separates observations into groups with their own characteristics, in contrast to the traditional methodology, which only orders observations by quantiles. The tool was applied to 213 stocks traded on the São Paulo Stock Exchange (Bovespa), separating the groups by size and book-to-market. The new portfolios were then applied to the Fama and French (1996) model, comparing the results of portfolio formation by quantile and by cluster analysis. Better results were found with the second methodology. The authors conclude that cluster analysis may be more suitable because it tends to form more homogeneous groups, and that its application is useful for portfolio formation and for financial theory.

  7. Investigating the provenance of iron artifacts of the Royal Iron Factory of Sao Joao de Ipanema by hierarchical cluster analysis of EDS microanalyses of slag inclusions

    Energy Technology Data Exchange (ETDEWEB)

    Mamani-Calcina, Elmer Antonio; Landgraf, Fernando Jose Gomes; Azevedo, Cesar Roberto de Farias, E-mail: c.azevedo@usp.br [Universidade de Sao Paulo (USP), Sao Paulo, SP (Brazil). Escola Politecnica. Departmento de Engenharia Metalurgica e de Materiais

    2017-01-15

    Microstructural characterization techniques, including EDX (Energy Dispersive X-ray Analysis) microanalyses, were used to investigate the slag inclusions in the microstructure of ferrous artifacts of the Royal Iron Factory of Sao Joao de Ipanema (first steel plant of Brazil, XIX century), the D. Pedro II Bridge (located in Bahia, assembled in the XIX century and produced in Scotland) and the archaeological sites of Sao Miguel de Missoes (Rio Grande do Sul, Brazil, production site of iron artifacts, XVIII century) and Afonso Sardinha (Sao Paulo, Brazil, production site of iron artifacts, XVI century). The microanalysis results for the main microconstituents of the slag inclusions were investigated by hierarchical cluster analysis, and the dendrogram built from the microanalysis results for the wüstite phase (using as critical variables the contents of MnO, MgO, Al₂O₃, V₂O₅ and TiO₂) allowed the identification of four clusters, which successfully represented the samples of the four investigated sites (Ipanema, Sardinha, Missoes and Bahia). Finally, the comparatively low volumetric fraction of slag inclusions in the samples from Ipanema (∼1%) suggested the existence of technological expertise in the iron-making process at the Royal Iron Factory of Sao Joao de Ipanema. (author)

  8. Advanced cluster methods for correlated-electron systems

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, Andre

    2015-04-27

    In this thesis, quantum cluster methods are used to calculate electronic properties of correlated-electron systems. A special focus lies in the determination of the ground state properties of a 3/4 filled triangular lattice within the one-band Hubbard model. At this filling, the electronic density of states exhibits a so-called van Hove singularity and the Fermi surface becomes perfectly nested, causing an instability towards a variety of spin-density-wave (SDW) and superconducting states. While chiral d+id-wave superconductivity has been proposed as the ground state in the weak coupling limit, the situation towards strong interactions is unclear. Additionally, quantum cluster methods are used here to investigate the interplay of Coulomb interactions and symmetry-breaking mechanisms within the nematic phase of iron-pnictide superconductors. The transition from a tetragonal to an orthorhombic phase is accompanied by a significant change in electronic properties, while long-range magnetic order is not yet established. The driving force of this transition may be not only phonons but also magnetic or orbital fluctuations. The signatures of these scenarios are studied with quantum cluster methods to identify the most important effects. Here, cluster perturbation theory (CPT) and its variational extension, the variational cluster approach (VCA), are used to treat the respective systems on a level beyond mean-field theory. Short-range correlations are incorporated numerically exactly by exact diagonalization (ED). In the VCA, long-range interactions are included by variational optimization of a fictitious symmetry-breaking field based on a self-energy functional approach. Due to limitations of ED, cluster sizes are limited to a small number of degrees of freedom. For the 3/4 filled triangular lattice, the VCA is performed for different cluster symmetries. A strong symmetry dependence and finite-size effects make a comparison of the results from different clusters difficult

  9. A Hierarchical Bayesian M/EEG Imaging Method Correcting for Incomplete Spatio-Temporal Priors

    DEFF Research Database (Denmark)

    Stahlhut, Carsten; Attias, Hagai T.; Sekihara, Kensuke

    2013-01-01

    In this paper we present a hierarchical Bayesian model, to tackle the highly ill-posed problem that follows with MEG and EEG source imaging. Our model promotes spatiotemporal patterns through the use of both spatial and temporal basis functions. While in contrast to most previous spatio-temporal ...

  10. Exploring function prediction in protein interaction networks via clustering methods.

    Science.gov (United States)

    Trivodaliev, Kire; Bogojeska, Aleksandra; Kocarev, Ljupco

    2014-01-01

    Complex networks have recently become the focus of research in many fields. Their structure reveals crucial information about the nodes and how they connect and share information. In our work we analyze protein interaction networks as complex networks for their functional modular structure and later use that information in the functional annotation of proteins within the network. We propose several graph representations for the protein interaction network, each having a different level of complexity and a different degree of inclusion of the annotation information within the graph. We aim to explore the benefits and drawbacks of these proposed graphs when they are used in the function prediction process via clustering methods. For this cluster-based prediction, we adopt well established approaches for cluster detection in complex networks, using recent representative algorithms that have proven efficient for the task at hand. The experiments are performed using a purified and reliable Saccharomyces cerevisiae protein interaction network, which is then used to generate the different graph representations. Each of the graph representations is then analysed in combination with each of the clustering algorithms, which have been modified where necessary and implemented to fit the specific graph. We evaluate the results with regard to biological validity and function prediction performance. Our results indicate that the novel ways of presenting the complex graph improve the prediction process, although the computational complexity should be taken into account when deciding on a particular approach.

  11. Exploring function prediction in protein interaction networks via clustering methods.

    Directory of Open Access Journals (Sweden)

    Kire Trivodaliev

    Full Text Available Complex networks have recently become the focus of research in many fields. Their structure reveals crucial information about the nodes and how they connect and share information. In our work we analyze protein interaction networks as complex networks for their functional modular structure and later use that information in the functional annotation of proteins within the network. We propose several graph representations for the protein interaction network, each having a different level of complexity and a different degree of inclusion of the annotation information within the graph. We aim to explore the benefits and drawbacks of these proposed graphs when they are used in the function prediction process via clustering methods. For this cluster-based prediction, we adopt well established approaches for cluster detection in complex networks, using recent representative algorithms that have proven efficient for the task at hand. The experiments are performed using a purified and reliable Saccharomyces cerevisiae protein interaction network, which is then used to generate the different graph representations. Each of the graph representations is then analysed in combination with each of the clustering algorithms, which have been modified where necessary and implemented to fit the specific graph. We evaluate the results with regard to biological validity and function prediction performance. Our results indicate that the novel ways of presenting the complex graph improve the prediction process, although the computational complexity should be taken into account when deciding on a particular approach.

  12. Translationally-invariant coupled-cluster method for finite systems

    Energy Technology Data Exchange (ETDEWEB)

    Guardiola, R.; Moliner, I. [Valencia Univ., Burjassot (Spain). Dept. de Fisica Atomica Molecular i Nuclear; Navarro, J.; Portesi, M. [IFIC (Centre Mixt CSIC -Universitat de Valencia), Avda. Dr. Moliner 50, E-46.100 Burjassot (Spain)

    1998-01-12

    The translational invariant formulation of the coupled-cluster method is presented here at the complete SUB(2) level for a system of nucleons treated as bosons. The correlation amplitudes are solutions of a non-linear coupled system of equations. These equations have been solved for light and medium systems, considering the central but still semi-realistic nucleon-nucleon S3 interaction. (orig.). 16 refs.

  13. Global vs. Localized Search: A Comparison of Database Selection Methods in a Hierarchical Environment.

    Science.gov (United States)

    Conrad, Jack G.; Claussen, Joanne Smestad; Yang, Changwen

    2002-01-01

    Compares standard global information retrieval searching with more localized techniques to address the database selection problem that users often have when searching for the most relevant database, based on experiences with the Westlaw Directory. Findings indicate that a browse plus search approach in a hierarchical environment produces the most…

  14. An accessible method for implementing hierarchical models with spatio-temporal abundance data

    Science.gov (United States)

    Ross, Beth E.; Hooten, Melvin B.; Koons, David N.

    2012-01-01

    A common goal in ecology and wildlife management is to determine the causes of variation in population dynamics over long periods of time and across large spatial scales. Many assumptions must nevertheless be overcome to make appropriate inference about spatio-temporal variation in population dynamics, such as autocorrelation among data points, excess zeros, and observation error in count data. To address these issues, many scientists and statisticians have recommended the use of Bayesian hierarchical models. Unfortunately, hierarchical statistical models remain somewhat difficult to use because of the necessary quantitative background needed to implement them, or because of the computational demands of using Markov Chain Monte Carlo algorithms to estimate parameters. Fortunately, new tools have recently been developed that make it more feasible for wildlife biologists to fit sophisticated hierarchical Bayesian models (i.e., Integrated Nested Laplace Approximation, ‘INLA’). We present a case study using two important game species in North America, the lesser and greater scaup, to demonstrate how INLA can be used to estimate the parameters in a hierarchical model that decouples observation error from process variation, and accounts for unknown sources of excess zeros as well as spatial and temporal dependence in the data. Ultimately, our goal was to make unbiased inference about spatial variation in population trends over time.

  15. An accessible method for implementing hierarchical models with spatio-temporal abundance data.

    Directory of Open Access Journals (Sweden)

    Beth E Ross

    Full Text Available A common goal in ecology and wildlife management is to determine the causes of variation in population dynamics over long periods of time and across large spatial scales. Many assumptions must nevertheless be overcome to make appropriate inference about spatio-temporal variation in population dynamics, such as autocorrelation among data points, excess zeros, and observation error in count data. To address these issues, many scientists and statisticians have recommended the use of Bayesian hierarchical models. Unfortunately, hierarchical statistical models remain somewhat difficult to use because of the necessary quantitative background needed to implement them, or because of the computational demands of using Markov Chain Monte Carlo algorithms to estimate parameters. Fortunately, new tools have recently been developed that make it more feasible for wildlife biologists to fit sophisticated hierarchical Bayesian models (i.e., Integrated Nested Laplace Approximation, 'INLA'). We present a case study using two important game species in North America, the lesser and greater scaup, to demonstrate how INLA can be used to estimate the parameters in a hierarchical model that decouples observation error from process variation, and accounts for unknown sources of excess zeros as well as spatial and temporal dependence in the data. Ultimately, our goal was to make unbiased inference about spatial variation in population trends over time.

  16. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  17. Segmentation of MRI Volume Data Based on Clustering Method

    Directory of Open Access Journals (Sweden)

    Ji Dongsheng

    2016-01-01

    Full Text Available Here we analyze the difficulties of segmenting left ventricle MR images without tag lines, and propose an algorithm for automatic segmentation of the left ventricle (LV) internal and external profiles. To this end, we propose an Incomplete K-means and Category Optimization (IKCO) method. Initially, using the Hough transform to automatically locate the initial contour of the LV, the algorithm uses a simple approach to complete data subsampling and initial center determination. Next, according to the clustering rules, the proposed algorithm completes the MR image segmentation. Finally, the algorithm uses a category optimization method to improve the segmentation results. Experiments show that the algorithm provides good segmentation results.

  18. Ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters as high rate capability and long life anode materials for lithium-ion batteries

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Shengming [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xia, Tian, E-mail: xiatian@hlju.edu.cn [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Wang, Jingping [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Lu, Feifei [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Xu, Chunbo [Key Laboratory of Superlight Material and Surface Technology, Ministry of Education, College of Materials Science and Chemical Engineering, Harbin Engineering University, Heilongjiang, Harbin 150001 (China); Zhang, Xianfa; Huo, Lihua [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China); Zhao, Hui, E-mail: zhaohui98@yahoo.com [Key Laboratory of Functional Inorganic Materials Chemistry, Ministry of Education, School of Chemistry, Chemical Engineering and Materials, Heilongjiang University, Heilongjiang, Harbin 150080 (China)

    2017-06-01

    Graphical abstract: Ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment. When tested as anode materials for LIBs, UMCN-HCs achieve high reversible capacity, good long cycling life, and rate capability. - Highlights: • UMCN-HCs show high capacity, excellent stability, and good rate capability. • UMCN-HCs retain a capacity of 1067 mAh g⁻¹ after 100 cycles at 100 mA g⁻¹. • UMCN-HCs deliver a capacity of 507 mAh g⁻¹ after 500 cycles at 2 A g⁻¹. - Abstract: Herein, ultrathin mesoporous Co₃O₄ nanosheets-constructed hierarchical clusters (UMCN-HCs) have been successfully synthesized via a facile hydrothermal method followed by a subsequent thermolysis treatment at 600 °C in air. The products consist of cluster-like Co₃O₄ microarchitectures, which are assembled from numerous ultrathin mesoporous Co₃O₄ nanosheets. When tested as anode materials for lithium-ion batteries, UMCN-HCs deliver a high reversible capacity of 1067 mAh g⁻¹ at a current density of 100 mA g⁻¹ after 100 cycles. Even at 2 A g⁻¹, a stable capacity as high as 507 mAh g⁻¹ can be achieved after 500 cycles. The high reversible capacity, excellent cycling stability, and good rate capability of UMCN-HCs may be attributed to their mesoporous sheet-like nanostructure. The sheet-layered structure of UMCN-HCs may buffer the volume change during the lithiation-delithiation process, and the mesoporous characteristic makes lithium-ion transfer easier at the interface between the active electrode and the electrolyte.

  19. Hydrothermal synthesis and photoluminescent properties of hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower-like clusters

    Science.gov (United States)

    Amurisana, Bao.; Zhiqiang, Song.; Haschaolu, O.; Yi, Chen; Tegus, O.

    2018-02-01

    3D hierarchical GdPO4·H2O:Ln3+ (Ln3+ = Eu3+, Ce3+, Tb3+) flower clusters were successfully prepared on a glass slide substrate by a simple, economical hydrothermal process with the assistance of disodium ethylenediaminetetraacetic acid (Na2H2L, where L4- = (CH2COO)2N(CH2)2N(CH2COO)24-). In this process, Na2H2L was used as both a chelating agent and a structure-director. The hierarchical flower clusters have an average diameter of 7-12 μm and are composed of well-aligned microrods. The influence of the molar ratio of Na2H2L/Gd3+ and of reaction time on the morphology was systematically studied. A possible crystal growth and formation mechanism for the hierarchical flower clusters is proposed based on the evolution of morphology as a function of reaction time. The self-assembled GdPO4·H2O:Ln3+ superstructures exhibit strong orange-red (Eu3+, 5D0 → 7F1), green (Tb3+, 5D4 → 7F5) and near ultraviolet emissions (Ce3+, 5d → 7F5/2) under ultraviolet excitation, respectively. This study may provide a new channel for building hierarchically superstructured oxide micro/nanomaterials with optical and new properties.

  20. Data Clustering

    Science.gov (United States)

    Wagstaff, Kiri L.

    2012-03-01

    particular application involves considerations of the kind of data being analyzed, algorithm runtime efficiency, and how much prior knowledge is available about the problem domain, which can dictate the nature of clusters sought. Fundamentally, the clustering method and its representations of clusters carries with it a definition of what a cluster is, and it is important that this be aligned with the analysis goals for the problem at hand. In this chapter, I emphasize this point by identifying for each algorithm the cluster representation as a model, m_j , even for algorithms that are not typically thought of as creating a “model.” This chapter surveys a basic collection of clustering methods useful to any practitioner who is interested in applying clustering to a new data set. The algorithms include k-means (Section 25.2), EM (Section 25.3), agglomerative (Section 25.4), and spectral (Section 25.5) clustering, with side mentions of variants such as kernel k-means and divisive clustering. The chapter also discusses each algorithm’s strengths and limitations and provides pointers to additional in-depth reading for each subject. Section 25.6 discusses methods for incorporating domain knowledge into the clustering process. This chapter concludes with a brief survey of interesting applications of clustering methods to astronomy data (Section 25.7). The chapter begins with k-means because it is both generally accessible and so widely used that understanding it can be considered a necessary prerequisite for further work in the field. EM can be viewed as a more sophisticated version of k-means that uses a generative model for each cluster and probabilistic item assignments. Agglomerative clustering is the most basic form of hierarchical clustering and provides a basis for further exploration of algorithms in that vein. Spectral clustering permits a departure from feature-vector-based clustering and can operate on data sets instead represented as affinity, or similarity
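
    The abstract above singles out k-means as the entry point and describes each algorithm through its cluster model m_j. A minimal sketch of k-means (Lloyd's algorithm) follows, assuming Euclidean distance and a model m_j that is simply the cluster mean; the function names and the toy data are illustrative and are not taken from the chapter.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate item assignment and model (mean) update.

    Returns (models, assignment) where models[j] plays the role of m_j.
    """
    rng = random.Random(seed)
    models = rng.sample(points, k)  # assumption: initialise from k random items
    assignment = None
    for _ in range(iters):
        # Assignment step: each item goes to its nearest model.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, models[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no item changed cluster
        assignment = new_assignment
        # Update step: each model becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old model if a cluster goes empty
                models[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return models, assignment


# Hypothetical toy data: two well-separated groups in the plane.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
models, labels = kmeans(data, k=2)
```

    The hard assignment step is what EM, as described in the abstract, would replace with probabilistic item assignments.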

  1. Method of Parallel-Hierarchical Network Self-Training and its Application for Pattern Classification and Recognition

    Directory of Open Access Journals (Sweden)

    TIMCHENKO, L.

    2012-11-01

    Full Text Available Propositions necessary for development of parallel-hierarchical (PH) network training methods are discussed in this article. Unlike known artificial neural network structures, where non-normalized (absolute) similarity criteria are used for comparison, the suggested structure uses a normalized criterion. Based on the analysis of training rules, a conclusion is made that application of two supervised ("with a teacher") training methods is optimal for PH network training: error correction-based training and memory-based training. Mathematical models of training and a combined method of PH network training for recognition of static and dynamic patterns are developed.

  2. Calculation of correlated initial state in the hierarchical equations of motion method using an imaginary time path integral approach.

    Science.gov (United States)

    Song, Linze; Shi, Qiang

    2015-11-21

    Based on recent findings in the hierarchical equations of motion (HEOM) for correlated initial state [Y. Tanimura, J. Chem. Phys. 141, 044114 (2014)], we propose a new stochastic method to obtain the initial conditions for the real time HEOM propagation, which can be used further to calculate the equilibrium correlation functions and symmetrized correlation functions. The new method is derived through stochastic unraveling of the imaginary time influence functional, where a set of stochastic imaginary time HEOM are obtained. The validity of the new method is demonstrated using numerical examples including the spin-Boson model, and the Holstein model with undamped harmonic oscillator modes.

  3. Analysis of protein profiles using fuzzy clustering methods

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Ukendt, Sujatha; Rai, Lavanya

    The tissue protein profiles of healthy volunteers and volunteers with cervical cancer were recorded using High Performance Liquid Chromatography combined with Laser Induced Fluorescence technique (HPLC-LIF) developed in our lab. We analyzed the protein profile data using different...... clustering methods for their classification followed by various validation measures. The clustering algorithms used for the study were K-means, K-medoid, Fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The results presented in this study conclude that the protein profiles of tissue...... samples recorded by using the HPLC-LIF system and the data analyzed by clustering algorithms quite successfully classify them as belonging to normal and malignant conditions....

  4. A scanning method for detecting clustering pattern of both attribute and structure in social networks

    Science.gov (United States)

    Wang, Tai-Chi; Phoa, Frederick Kin Hing

    2016-03-01

    Community/cluster is one of the most important features in social networks. Many cluster detection methods were proposed to identify such an important pattern, but few were able to identify the statistical significance of the clusters by considering the likelihood of network structure and its attributes. Based on the definition of clustering, we propose a scanning method, originated from analyzing spatial data, for identifying clusters in social networks. Since the properties of network data are more complicated than those of spatial data, we verify our method's feasibility via simulation studies. The results show that the detection powers are affected by cluster sizes and connection probabilities. According to our simulation results, the detection accuracy of structure clusters and both structure and attribute clusters detected by our proposed method is better than that of other methods in most of our simulation cases. In addition, we apply our proposed method to some empirical data to identify statistically significant clusters.

  5. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi

    2013-09-01

    Full Text Available One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149-1:2010). Recent literature demonstrated how image analysis could return highly accurate measures of the dimensions defined for each individual particle, and could promote a new method depending on the geometrical shape to determine the chip size in a more accurate way. A sample of wood chips (8 litres) was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm); the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors) and size descriptors (area, perimeter, Feret diameters, eccentricity) was applied to observe the chips distribution. The UPGMA algorithm was applied on Euclidean distance. The obtained dendrogram shows a group separation consistent with the original three sieving fractions. A comparison has been made between the traditional sieve and clustering results. This preliminary result shows how the image analysis-based method has a high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chips size characterization. An improvement of the results is expected by using supervised multivariate methods that utilize known class memberships. The main objective of the future activities will be to shift the analysis from a 2-dimensional method to a 3-dimensional acquisition process.
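
    To illustrate the UPGMA step on Euclidean distance mentioned above, here is a small sketch in plain Python. It assumes each chip is already reduced to a numeric descriptor vector; the vectors below are made-up placeholders, not measurements from the study, and no feature scaling is applied.

```python
import math
from itertools import combinations


def upgma(vectors):
    """Average-linkage (UPGMA) agglomeration on Euclidean distance.

    Returns the merge history as (members_a, members_b, height) tuples,
    which is the information a dendrogram is drawn from.
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []

    def avg_dist(a, b):
        # Unweighted mean of all pairwise distances between the two clusters.
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((sorted(a), sorted(b), avg_dist(a, b)))
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return merges


# Hypothetical descriptor vectors (for example area and perimeter) for five chips.
chips = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
history = upgma(chips)
```

    Cutting the merge history at a chosen height yields the groups that would be compared with the sieving fractions.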

  6. Clustering of resting state networks.

    Directory of Open Access Journals (Sweden)

    Megan H Lee

    Full Text Available The goal of the study was to demonstrate a hierarchical structure of resting state activity in the healthy brain using a data-driven clustering algorithm. The fuzzy-c-means clustering algorithm was applied to resting state fMRI data in cortical and subcortical gray matter from two groups acquired separately, one of 17 healthy individuals and the second of 21 healthy individuals. Different numbers of clusters and different starting conditions were used. A cluster dispersion measure determined the optimal numbers of clusters. An inner product metric provided a measure of similarity between different clusters. The two cluster result found the task-negative and task-positive systems. The cluster dispersion measure was minimized with seven and eleven clusters. Each of the clusters in the seven and eleven cluster result was associated with either the task-negative or task-positive system. Applying the algorithm to find seven clusters recovered previously described resting state networks, including the default mode network, frontoparietal control network, ventral and dorsal attention networks, somatomotor, visual, and language networks. The language and ventral attention networks had significant subcortical involvement. This parcellation was consistently found in a large majority of algorithm runs under different conditions and was robust to different methods of initialization. The clustering of resting state activity using different optimal numbers of clusters identified resting state networks comparable to previously obtained results. This work reinforces the observation that resting state networks are hierarchically organized.
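
    The fuzzy-c-means algorithm named in this abstract can be sketched as below. This is the textbook update with Euclidean distance and fuzzifier m = 2, which is an assumption; the study's distance measure, dispersion criterion and data are not reproduced. The starting centers are passed in explicitly, so different starting conditions can be tried.

```python
import math


def fuzzy_c_means(points, init_centers, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means; returns (centers, memberships).

    memberships[j][i] is the degree to which point j belongs to cluster i,
    and each row sums to 1.
    """
    centers = [list(c) for c in init_centers]
    c, dims = len(centers), len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: inverse-distance weighting with exponent 2/(m-1).
        memberships = []
        for p in points:
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            memberships.append([
                1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for i in range(c)
            ])
        # Center update: mean of the points weighted by membership**m.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in memberships]
            total = sum(w)
            new_centers.append([
                sum(wj * p[dim] for wj, p in zip(w, points)) / total
                for dim in range(dims)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical toy data standing in for per-voxel feature vectors.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, U = fuzzy_c_means(data, init_centers=[(0.0, 0.0), (10.0, 10.0)])
```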

  7. Micromechanics of hierarchical materials

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon, Jr.

    2012-01-01

    A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites...... with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them......, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...

  8. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry combined with multidimensional scaling, binary hierarchical cluster tree and selected diagnostic masses improves species identification of Neolithic keratin sequences from furs of the Tyrolean Iceman Oetzi.

    Science.gov (United States)

    Hollemeyer, Klaus; Altmeyer, Wolfgang; Heinzle, Elmar; Pitra, Christian

    2012-08-30

    The identification of fur origins from the 5300-year-old Tyrolean Iceman's accoutrement is not yet complete, although definite identification is essential for the socio-cultural context of his epoch. Neither have all potential samples been identified so far, nor has a consensus been reached on the species identified using the classical methods. Archaeological hair often lacks analyzable hair scale patterns in microscopic analyses and polymerase chain reaction (PCR)-based techniques are often inapplicable due to the lack of amplifiable ancient DNA. To overcome these drawbacks, a matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) method was used exclusively based on hair keratins. Thirteen fur specimens from his accoutrement were analyzed after tryptic digest of native hair. Peptide mass fingerprints (pmfs) from ancient samples and from reference species mostly occurring in the Alpine surroundings at his lifetime were compared to each other using multidimensional scaling and binary hierarchical cluster tree analysis. Both statistical methods highly reflect spectral similarities among pmfs as close zoological relationships. While multidimensional scaling was useful to discriminate specimens on the zoological order level, binary hierarchical cluster tree reached the family or subfamily level. Additionally, the presence and/or absence of order, family and/or species-specific diagnostic masses in their pmfs allowed the identification of mammals mostly down to single species level. Red deer was found in his shoe vamp, goat in the leggings, cattle in his shoe sole and at his quiver's closing flap as well as sheep and chamois in his coat. Canid species, like grey wolf, domestic dog or European red fox, were discovered in his leggings for the first time, but could not be differentiated to species level. This is widening the spectrum of processed fur-bearing species to at least one member of the Canidae family. His fur cap was

  9. Geology of the Riacho do Pontal iron oxide copper-gold (IOCG) prospect, Bahia, Brazil: hydrothermal alteration approached via hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Sérgio Roberto Bacelar Huhn

    Full Text Available The Riacho do Pontal prospect is situated on the border between the Borborema Province and the São Francisco Craton, in Bahia state. It comprises rocks polydeformed during the Neoproterozoic. The prospect area includes migmatites and gneissic rocks intruded by several syn- to post-tectonic granites. Structural analysis indicates a strong relationship between the development of ductile to brittle-ductile shear zones and associated hydrothermalism. The main tracts of high-strain rate are represented by the Riacho do Pontal (north) and Macururé (south) shear zones. Several copper occurrences have been mapped within the Riacho do Pontal prospect along secondary shear zones. In these areas, the gneissic rocks were affected by intense hydrothermal alteration. Hierarchical cluster analysis permitted the identification of the main hydrothermal mineral associations present in these rocks, which resulted from potassic (biotite) and sodic-calcic (amphibole-albite) alteration, in addition to silicification and iron alteration (hematite). These hydrothermal alteration types are similar to those typically found in iron oxide copper-gold deposits developed at intermediate crustal levels. Hematite-quartz-albite-chalcopyrite-pyrite hydrothermal breccias host the highest-grade copper ore (chalcopyrite-pyrite-chalcocite) zones. The spatial relationship between copper deposits and shear zones improves the metallogenic potential for copper of the Borborema Province and has important implications for mineral exploration in the region.

  10. Hierarchical cluster analysis and chemical characterisation of Myrtus communis L. essential oil from Yemen region and its antimicrobial, antioxidant and anti-colorectal adenocarcinoma properties.

    Science.gov (United States)

    Anwar, Sirajudheen; Crouch, Rebecca A; Awadh Ali, Nasser A; Al-Fatimi, Mohamed A; Setzer, William N; Wessjohann, Ludger

    2017-09-01

    The hydrodistilled essential oil obtained from the dried leaves of Myrtus communis, collected in Yemen, was analysed by GC-MS. Forty-one compounds were identified, representing 96.3% of the total oil. The major constituents of the essential oil were oxygenated monoterpenoids (87.1%), linalool (29.1%), 1,8-cineole (18.4%), α-terpineol (10.8%), geraniol (7.3%) and linalyl acetate (7.4%). The essential oil was assessed for its antimicrobial activity using a disc diffusion assay and showed moderate to potent antibacterial and antifungal activities, mainly against Bacillus subtilis, Staphylococcus aureus and Candida albicans. The oil moderately reduced the diphenylpicrylhydrazyl radical (IC50 = 4.2 μL/mL or 4.1 mg/mL). In vitro cytotoxicity evaluation against HT29 (human colonic adenocarcinoma cells) showed that the essential oil exhibited a moderate antitumor effect with IC50 of 110 ± 4 μg/mL. Hierarchical cluster analysis of M. communis was carried out based on the chemical compositions of 99 samples reported in the literature, including the Yemeni sample.

  11. The outbreak of SARS mirrored by bibliometric mapping: Combining bibliographic coupling with the complete link cluster method

    Directory of Open Access Journals (Sweden)

    Bo Jarneving

    2007-01-01

    Full Text Available In this study a novel method of science mapping is presented which combines bibliographic coupling, as a measure of document-document similarity, with an agglomerative hierarchical cluster method. The focus in this study is on the mapping of so-called ‘core documents’, a concept presented first in 1995 by Glänzel and Czerwon. The term ‘core document’ denotes documents that have a central position in the research front in terms of many and strong bibliographic coupling links. The identification and mapping of core documents usually requires a large multidisciplinary research setting and in this study the 2003 volume of the Science Citation Index was applied. From this database, a sub-set of core documents reporting on the outbreak of SARS in 2002 was chosen for the demonstration of the application of this mapping method. It was demonstrated that the method, in this case, successfully identified interpretable research themes and that iterative clustering on two subsequent levels of cluster agglomeration may provide useful and current information.
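
    The combination described above (bibliographic coupling as similarity, complete-link agglomeration) can be sketched as follows. The cosine-style normalisation of the shared-reference count and the stopping threshold are assumptions made for the example; the study's own normalisation and cut-off are not stated in the abstract, and the documents below are invented.

```python
import math
from itertools import combinations


def coupling_strength(refs_a, refs_b):
    """Shared cited references, normalised by the reference-list sizes."""
    shared = len(refs_a & refs_b)
    if shared == 0:
        return 0.0
    return shared / math.sqrt(len(refs_a) * len(refs_b))


def complete_link(docs, threshold):
    """Agglomerate documents; a cluster pair is as similar as its weakest link."""
    clusters = [[name] for name in docs]

    def sim(a, b):
        return min(coupling_strength(docs[x], docs[y]) for x in a for y in b)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: sim(*pair))
        if sim(a, b) < threshold:
            break  # no remaining pair is coupled strongly enough
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return clusters


# Hypothetical documents mapped to the sets of references they cite.
documents = {
    "d1": {"r1", "r2", "r3"},
    "d2": {"r1", "r2", "r4"},
    "d3": {"r8", "r9"},
    "d4": {"r8", "r9", "r10"},
}
themes = complete_link(documents, threshold=0.5)
```

    Running the same routine again on cluster-level reference sets would correspond to a second level of agglomeration.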

  12. Hierarchical MnO2 nanosheets synthesized via electrodeposition-hydrothermal method for supercapacitor electrodes

    Science.gov (United States)

    Zheng, Dongdong; Qiang, Yujie; Xu, Shenying; Li, Wenpo; Yu, Shanshan; Zhang, Shengtao

    2017-02-01

    Metal oxides have emerged as an important class of supercapacitor electrode materials. Herein, we report hierarchical MnO2 nanosheets prepared on indium tin oxide (ITO)-coated glass substrates via a hybrid two-step protocol, including a cathodic electrodeposition technique and a hydrothermal process. The samples are characterized by X-ray diffraction (XRD), X-ray photoelectron spectroscopy (XPS), scanning electron microscope (SEM) with energy dispersive X-ray spectroscopy (EDX), and transmission electron microscope (TEM). SEM and TEM images show that the as-synthesized MnO2 nanosheets are hierarchical and porous, which could increase the active surface and shorten paths for fast ion diffusion. The results of nitrogen adsorption-desorption analysis indicate that the BET surface area of the MnO2 nanosheets is 53.031 m2 g-1. Furthermore, the electrochemical properties of the MnO2 are elucidated by cyclic voltammograms (CV), galvanostatic charge-discharge (GCD) tests, and electrochemical impedance spectroscopy (EIS) in 0.1 M Na2SO4 electrolyte. The electrochemical results demonstrate that the as-grown MnO2 nanosheet exhibits an excellent specific capacitance of 335 F g-1 at 0.5 A g-1 when it is applied as a potential electrode material for an electrochemical supercapacitor. Additionally, the MnO2 nanosheet electrode also presents high rate capability and good cycling stability with 91.8% retention after 1000 cycles. These excellent properties indicate that the hierarchical MnO2 nanosheets are a potential electrode material for electrochemical supercapacitors.

  13. Co3O4–ZnO hierarchical nanostructures by electrospinning and hydrothermal methods

    DEFF Research Database (Denmark)

    Kanjwal, Muzafar Ahmed; Sheikh, Faheem A.; Barakat, Nasser A.M.

    2011-01-01

    A new hierarchical nanostructure that consists of cobalt oxide (Co3O4) and zinc oxide (ZnO) was produced by the electrospinning process followed by a hydrothermal technique. First, electrospinning of a colloidal solution that consisted of zinc nanoparticles, cobalt acetate tetrahydrate and poly......, containing ZnO nanoparticles (ZnNPs), were then exploited as seeds to produce ZnO nanobranches using a specific hydrothermal technique. Scanning electron microscopy (SEM), and transmission electron microscopy (TEM) were employed to characterize the as-spun nanofibers and the calcined product. X-ray powder...

  14. A study of MORT logical tree and Tripod Beta methods in event occurrence causality analysis using hierarchical model

    Directory of Open Access Journals (Sweden)

    F. Alizadeh

    2015-01-01

    Full Text Available Introduction: The purpose of this study was to compare MORT and Tripod Beta methods, using a hierarchical model, in order to choose the best technique to analyze an event in an organization. Material and Method: In this study, a critical event was selected and the causes of the event were identified, employing MORT and Tripod Beta capabilities. Following the identification of the event causes, the aforementioned techniques were weighted and compared considering selected criteria and the analytic hierarchy process (AHP). Result: Relative weights of the selected criteria were calculated. The ability to identify the event causes with the weight of 0.315 had the greatest weight. The event analysis cost (0.24), required time to analyze the event (0.146), technical experts (0.125), training for implementation (0.24), and availability of the analytical software (0.07) had obtained the subsequent weights, respectively. Conclusion: Analytic hierarchy process is an efficient and practical method to prioritize the choices considering the study objectives and criteria. As a scientific method, the analytic hierarchy process helps experts in decision-making. Considering the selected criteria, findings in this study showed that Tripod Beta technique (with a weight of 0.563) is superior to MORT technique (with a weight of 0.437).
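
    As a rough illustration of how AHP turns pairwise judgements into weights and then into an overall ranking of alternatives, consider the sketch below. It uses the row geometric-mean approximation of the priority vector, which is one common choice and not necessarily the one used in the study; all matrices and numbers are hypothetical and do not reproduce the reported weights.

```python
import math


def ahp_priorities(pairwise):
    """Priority vector from a reciprocal pairwise-comparison matrix.

    Uses normalised row geometric means (an approximation of the
    principal eigenvector).
    """
    gm = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(gm)
    return [g / total for g in gm]


def overall_scores(criteria_weights, local_priorities):
    """Weighted sum over criteria; local_priorities[c][a] scores alternative a."""
    n_alt = len(local_priorities[0])
    return [
        sum(w * lp[a] for w, lp in zip(criteria_weights, local_priorities))
        for a in range(n_alt)
    ]


# Hypothetical example: criterion 1 is judged 3 times as important as criterion 2.
weights = ahp_priorities([[1.0, 3.0], [1.0 / 3.0, 1.0]])
# Hypothetical local priorities of two techniques under each criterion.
scores = overall_scores(weights, [[0.6, 0.4], [0.2, 0.8]])
```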

  15. A multidisciplinary coupling relationship coordination algorithm using the hierarchical control methods of complex systems and its application in multidisciplinary design optimization

    OpenAIRE

    Rong Yuan; Haiqing Li

    2016-01-01

    Because of the increasing complexity in engineering systems, multidisciplinary design optimization has attracted increasing attention. High computational expense and organizational complexity are two main challenges of multidisciplinary design optimization. To address these challenges, the hierarchical control method of complex systems is developed in this study. The hierarchical control method is a powerful approach that has been widely used in the control and coordination of large-scale complex...

  16. A quantitative method for clustering size distributions of elements

    Science.gov (United States)

    Dillner, Ann M.; Schauer, James J.; Christensen, William F.; Cass, Glen R.

    A quantitative method was developed to group similarly shaped size distributions of particle-phase elements in order to ascertain sources of the elements. This method was developed and applied using data from two sites in Houston, TX; one site surrounded by refineries, chemical plants and vehicular and commercial shipping traffic, and the other site, 25 miles inland surrounded by residences, light industrial facilities and vehicular traffic. Twenty-four hour size-segregated (0.056fluid catalytic cracking unit catalysts, fuel oil burning, a coal-fired power plant, and high-temperature metal working. The clustered elements were generally attributed to different sources at the two sites during each sampling day indicating the diversity of local sources that impact heavy metals concentrations in the region.

  17. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  18. A Hierarchical Multi-Temporal InSAR Method for Increasing the Spatial Density of Deformation Measurements

    Directory of Open Access Journals (Sweden)

    Tao Li

    2014-04-01

    Full Text Available Point-like targets are useful in providing surface deformation with the time series of synthetic aperture radar (SAR) images using the multi-temporal interferometric synthetic aperture radar (MTInSAR) methodology. However, the spatial density of point-like targets is low, especially in non-urban areas. In this paper, a hierarchical MTInSAR method is proposed to increase the spatial density of deformation measurements by tracking both the point-like targets and the distributed targets with the temporal steadiness of radar backscattering. To efficiently reduce error propagation, the deformation rates on point-like targets with lower amplitude dispersion index values are first estimated using a least squares estimator and a region growing method. Afterwards, the distributed targets are identified using the amplitude dispersion index and a Pearson correlation coefficient through a multi-level processing strategy. Meanwhile, the deformation rates on distributed targets are estimated during the multi-level processing. The proposed MTInSAR method has been tested for subsidence detection over a suburban area located in Tianjin, China using 40 high-resolution TerraSAR-X images acquired between 2009 and 2010, and validated using the ground-based leveling measurements. The experiment results indicate that the spatial density of deformation measurements can be increased by about 250% and that subsidence accuracy can reach the millimeter level by using the hierarchical MTInSAR method.
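
    The two statistics named in this abstract can be computed per pixel as sketched below. The amplitude dispersion index is taken as the temporal standard deviation of amplitude divided by its temporal mean (population standard deviation is an assumption here), and the Pearson coefficient is computed between two amplitude time series. How the paper combines and thresholds them across levels is not given in the abstract, so no thresholds are suggested; the sample series are invented.

```python
from statistics import mean, pstdev


def amplitude_dispersion_index(amplitudes):
    """Temporal standard deviation of amplitude divided by its temporal mean."""
    return pstdev(amplitudes) / mean(amplitudes)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical amplitude time series for two neighbouring pixels.
pixel_a = [1.0, 3.0, 1.0, 3.0]
pixel_b = [2.0, 6.0, 2.0, 6.0]
adi_a = amplitude_dispersion_index(pixel_a)
rho_ab = pearson(pixel_a, pixel_b)
```

    A low index indicates a temporally steady scatterer, which is why pixels with lower values are processed first in the description above.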

  19. Expanding Comparative Literature into Comparative Sciences Clusters with Neutrosophy and Quad-stage Method

    Directory of Open Access Journals (Sweden)

    Fu Yuhua

    2016-08-01

    Full Text Available By using Neutrosophy and Quad-stage Method, the expansions of comparative literature include: comparative social sciences clusters, comparative natural sciences clusters, comparative interdisciplinary sciences clusters, and so on. Among them, comparative social sciences clusters include: comparative literature, comparative history, comparative philosophy, and so on; comparative natural sciences clusters include: comparative mathematics, comparative physics, comparative chemistry, comparative medicine, comparative biology, and so on.

  20. Application of clustering methods: Regularized Markov clustering (R-MCL) for analyzing dengue virus similarity

    Science.gov (United States)

    Lestari, D.; Raharjo, D.; Bustamam, A.; Abdillah, B.; Widhianto, W.

    2017-07-01

    Dengue virus consists of 10 different constituent proteins and is classified into 4 major serotypes (DEN 1 - DEN 4). This study was designed to cluster 30 protein sequences of dengue virus taken from the Virus Pathogen Database and Analysis Resource (VIPR) using the Regularized Markov Clustering (R-MCL) algorithm and then analyze the result. Using Python 3.4, the R-MCL algorithm produces 8 clusters, with more than one centroid in several clusters. The number of centroids shows the density level of interaction. Protein interactions that are connected in a tissue form a complex protein that serves as a specific biological process unit. The analysis of the result shows that R-MCL clustering produces clusters of the dengue virus family based on the similar roles of their constituent proteins, regardless of serotype.
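
    A compact sketch of the R-MCL iteration is given below, following the usual description of the algorithm: the flow matrix is multiplied by the fixed column-stochastic graph matrix (the "regularize" step that replaces the M x M expansion of plain MCL) and then inflated. The adjacency matrix, the inflation value of 2 and the single-attractor cluster read-out are assumptions for illustration; the study's similarity graph built from the protein sequences is not reproduced.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (all-zero columns are left as zeros)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Dense square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def r_mcl(adjacency, inflation=2.0, iters=50):
    """Return one attractor index per node; equal indices mean same cluster."""
    n = len(adjacency)
    # Add self-loops, then build the canonical transition matrix M_G.
    with_loops = [[1.0 if i == j else float(adjacency[i][j]) for j in range(n)]
                  for i in range(n)]
    m_g = normalize_columns(with_loops)
    m = [row[:] for row in m_g]
    for _ in range(iters):
        m = matmul(m, m_g)  # regularize: M := M * M_G
        m = normalize_columns([[x ** inflation for x in row] for row in m])  # inflate
    # Each node (column) is assigned to the row holding most of its flow.
    return [max(range(n), key=lambda i: m[i][j]) for j in range(n)]


# Hypothetical similarity graph: two disconnected triangles.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = r_mcl(graph)
```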

  1. Hierarchical automated clustering of cloud point set by ellipsoidal skeleton: application to organ geometric modeling from CT-scan images

    Science.gov (United States)

    Banegas, Frederic; Michelucci, Dominique; Roelens, Marc; Jaeger, Marc

    1999-05-01

    We present a robust method for automatically constructing an ellipsoidal skeleton (e-skeleton) from a set of 3D points taken from NMR or TDM images. To ensure steadiness and accuracy, all points of the objects are taken into account, including the inner ones, which is different from the existing techniques. This skeleton will be essentially useful for object characterization, for comparisons between various measurements and as a basis for deformable models. It also provides good initial guess for surface reconstruction algorithms. On output of the entire process, we obtain an analytical description of the chosen entity, semantically zoomable (local features only or reconstructed surfaces), with any level of detail (LOD) by discretization step control in voxel or polygon format. This capability allows us to handle objects at interactive frame rates once the e-skeleton is computed. Each e-skeleton is stored as a multiscale CSG implicit tree.

  2. Recursive expectation-maximization clustering: A method for identifying buffering mechanisms composed of phenomic modules

    Science.gov (United States)

    Guo, Jingyu; Tian, Dehua; McKinney, Brett A.; Hartman, John L.

    2010-06-01

    of physiological homeostasis. To develop the method, 297 gene deletion strains were selected based on gene-drug interactions with hydroxyurea, an inhibitor of ribonucleotide reductase enzyme activity, which is critical for DNA synthesis. To partition the gene functions, these 297 deletion strains were challenged with growth inhibitory drugs known to target different genes and cellular pathways. Q-HTCP-derived growth curves were used to quantify all gene interactions, and the data were used to test the performance of REMc. Fundamental advantages of REMc include objective assessment of total number of clusters and assignment to each cluster a log-likelihood value, which can be considered an indicator of statistical quality of clusters. To assess the biological quality of clusters, we developed a method called gene ontology information divergence z-score (GOid_z). GOid_z summarizes total enrichment of GO attributes within individual clusters. Using these and other criteria, we compared the performance of REMc to hierarchical and K-means clustering. The main conclusion is that REMc provides distinct efficiencies for mining Q-HTCP data. It facilitates identification of phenomic modules, which contribute to buffering mechanisms that underlie cellular homeostasis and the regulation of phenotypic expression.

  3. SPSS TwoStep Cluster - a first evaluation

    OpenAIRE

    Bacher, Johann; Wenzig, Knut; Vogler, Melanie

    2004-01-01

    "SPSS 11.5 and later releases offer a two step clustering method. According to the authors' knowledge the procedure has not been used in the social sciences until now. This situation is surprising: The widely used clustering algorithms, k-means clustering and agglomerative hierarchical techniques, suffer from well known problems, whereas SPSS TwoStep clustering promises to solve at least some of these problems. In particular, mixed type attributes can be handled and the number of clusters is ...

  4. A model-based clustering method to detect infectious disease transmission outbreaks from sequence variation.

    Science.gov (United States)

    McCloskey, Rosemary M; Poon, Art F Y

    2017-11-01

    Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis-where individuals are sampled sooner post-infection-rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85%) and specificity (91%) than the nonparametric methods. When we applied these clustering methods to published sequences from a study of HIV-1 genetic clusters in Seattle, USA, we found that the MMPP method categorized about half (46%) as many individuals to clusters compared to the other methods. Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using MMPP, but not by other methods. We determined that the computing time for the MMPP method scaled linearly with the size of trees, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer. 
This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where it is critical to

  5. Spotlight: assembly of protein complexes by integrating graph clustering methods.

    Science.gov (United States)

    Chin, Chia-Hao; Chen, Shu-Hwa; Chen, Chun-Yu; Hsiung, Chao A; Ho, Chin-Wen; Ko, Ming-Tat; Lin, Chung-Yen

    2013-04-10

    As is generally assumed, clusters in protein-protein interaction (PPI) networks perform specific, crucial functions in biological systems. Various network community detection methods have been developed to exploit PPI networks in order to identify protein complexes and functional modules. Due to the potential role of various regulatory modes in biological networks, a single method may just apply a single graph property and neglect communities highlighted by other network properties. This work presents a novel integration method to capture protein modules/protein complexes by multiple network features detected by different algorithms. The integration method is further implemented in a web-based platform with a highly effective interactive network analyzer. Conventionally adopted methods with different perspectives on network community detection (e.g., CPM, FastGreedy, HUNTER, MCL, LE, SpinGlass, and WalkTrap) are also executed simultaneously. Analytical results indicate that the proposed method performs better than the conventional ones. The proposed approach can capture the transcription and RNA splicing machineries from the yeast protein network. Meanwhile, proteins that are highly associated with each other, yet not described in both machineries are also identified. In sum, a protein that is closely connected to components of a known module or a complex in the network view implies the functional association among them. Importantly, our method can detect these unique network features, thus facilitating efforts to discover unknown components of functional modules/protein complexes. Spotlight is freely accessible at http://hub.iis.sinica.edu.tw/spotlight. Video clips for a quick view of usage are available in the website online help page. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.

  6. Application of two-way hierarchical cluster analysis for the identification of similarities between the individual lipid fractions of Lucilia sericata.

    Science.gov (United States)

    Gołębiowski, Marek; Sosnowska, Anita; Puzyn, Tomasz; Boguś, Mieczysława I; Wieloch, Wioletta; Włóka, Emilia; Stepnowski, Piotr

    2014-05-01

    The composition of the cuticular and internal lipids of larvae and pupae of Lucilia sericata was studied using chromatographic techniques. The lipids from both stages of L. sericata had similar free fatty acid (FFA) profiles and also contained alcohols and cholesterol. The range of the number of C-atoms detected for these classes of compounds was to some extent similar in larvae and pupae, but the relative amounts of each class differed between stages. Saturated as well as unsaturated FFAs with even and odd numbered C-atom chains were present in both cuticular and internal lipids. The alcohol fractions of L. sericata were represented by free, straight-chain primary alcohols containing an even number of C-atoms. The lipid composition of male and female L. sericata adults and the hydrocarbon composition of all stages of L. sericata had previously been analyzed. To have a full overview of the lipid composition and to identify similarities or dissimilarities between the individual lipid fractions in this insect species, two-way hierarchical cluster analysis (HCA) was performed using also the data from these previous publications. The content of FFA 18:1 (n-9) was noticed to be very high in the cuticular fractions of larvae and pupae as well as in all internal fractions (male, female, larvae, and pupae) and low in the cuticular fractions of male and female imago. The contents of FFAs 16:0 and 16:1 (n-9), cholesterol, and the n-alkanes n-C31, n-C29, n-C27, n-C25, and n-C23 varied between particular fractions, whereas the amounts of other compounds were similar in all fractions. Copyright © 2014 Verlag Helvetica Chimica Acta AG, Zürich.

  7. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  8. Hierarchical porous carbon derived from Allium cepa for supercapacitors through direct carbonization method with the assist of calcium acetate

    KAUST Repository

    Xu, Jinhui

    2017-11-02

    In this paper, a direct carbonization method was used to prepare porous carbon from Allium cepa for supercapacitor applications. In this method, calcium acetate was used to assist the carbonization process. Scanning electron microscopy (SEM) and the N2 adsorption/desorption method were used to characterize the morphology, Brunauer-Emmett-Teller (BET) specific surface area and pore size distribution of the porous carbon derived from Allium cepa (onion-derived porous carbon, OPC). OPC has a hierarchical porous structure with a high specific surface area and a relatively high specific capacitance. It possesses a specific surface area of 533.5 m²/g and a specific capacitance of 133.5 F/g at a scan rate of 5 mV/s.

  9. The initial conditions of observed star clusters - I. Method description and validation

    Science.gov (United States)

    Pijloo, J. T.; Portegies Zwart, S. F.; Alexander, P. E. R.; Gieles, M.; Larsen, S. S.; Groot, P. J.; Devecchi, B.

    2015-10-01

    We have coupled a fast, parametrized star cluster evolution code to a Markov Chain Monte Carlo code to determine the distribution of probable initial conditions of observed star clusters, which may serve as a starting point for future N-body calculations. In this paper, we validate our method by applying it to a set of star clusters which have been studied in detail numerically with N-body simulations and Monte Carlo methods: the Galactic globular clusters M4, 47 Tucanae, NGC 6397, M22, ω Centauri, Palomar 14 and Palomar 4, the Galactic open cluster M67, and the M31 globular cluster G1. For each cluster, we derive a distribution of initial conditions that, after evolution up to the cluster's current age, evolves to the currently observed conditions. We find that there is a connection between the morphology of the distribution of initial conditions and the dynamical age of a cluster and that a degeneracy in the initial half-mass radius towards small radii is present for clusters that have undergone a core collapse during their evolution. We find that the results of our method are in agreement with N-body and Monte Carlo studies for the majority of clusters. We conclude that our method is able to find reliable posteriors for the determined initial mass and half-mass radius for observed star clusters, and thus forms a suitable starting point for modelling an observed cluster's evolution.

  10. Clustering: a neural network approach.

    Science.gov (United States)

    Du, K-L

    2010-01-01

    Clustering is a fundamental data analysis method. It is widely used for pattern recognition, feature extraction, vector quantization (VQ), image segmentation, function approximation, and data mining. As an unsupervised classification technique, clustering identifies some inherent structures present in a set of objects based on a similarity measure. Clustering methods can be based on statistical model identification (McLachlan & Basford, 1988) or competitive learning. In this paper, we give a comprehensive overview of competitive learning based clustering methods. Importance is attached to a number of competitive learning based clustering neural networks such as the self-organizing map (SOM), the learning vector quantization (LVQ), the neural gas, and the ART model, and clustering algorithms such as the C-means, mountain/subtractive clustering, and fuzzy C-means (FCM) algorithms. Associated topics such as the under-utilization problem, fuzzy clustering, robust clustering, clustering based on non-Euclidean distance measures, supervised clustering, hierarchical clustering as well as cluster validity are also described. Two examples are given to demonstrate the use of the clustering methods.
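
    The competitive learning idea surveyed in this record can be illustrated with a minimal winner-take-all sketch. This is not code from the paper: the function name `competitive_learning`, the learning rate, the epoch count and the toy samples are all assumptions chosen for illustration. For each sample, only the nearest prototype is moved a small step towards that sample.

```python
def competitive_learning(samples, prototypes, lr=0.1, epochs=20):
    """Winner-take-all competitive learning (illustrative sketch only).

    For every sample the closest prototype (the "winner") is moved a
    fraction `lr` of the way towards the sample; the others stay put.
    """
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x in samples:
            # Index of the prototype with the smallest squared distance to x.
            winner = min(
                range(len(protos)),
                key=lambda k: sum((xi - wi) ** 2 for xi, wi in zip(x, protos[k])),
            )
            # Move only the winner towards the sample.
            protos[winner] = [wi + lr * (xi - wi) for xi, wi in zip(x, protos[winner])]
    return protos


# Hypothetical toy data: two well-separated groups of 2-D points.
samples = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
protos = competitive_learning(samples, prototypes=[(1.0, 1.0), (4.0, 4.0)])
print(protos)
```

    With these toy values each prototype drifts towards the centre of the group it keeps winning, which is the basic mechanism that methods such as SOM and LVQ extend.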

  11. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

    Directory of Open Access Journals (Sweden)

    Cooper James B

    2010-03-01

    Full Text Available Abstract Background Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.

  12. Chelating agent-free, vapor-assisted crystallization method to synthesize hierarchical microporous/mesoporous MIL-125 (Ti).

    Science.gov (United States)

    McNamara, Nicholas D; Hicks, Jason C

    2015-03-11

    Titanium-based microporous heterogeneous catalysts are widely studied but are often limited by the accessibility of reactants to active sites. Metal-organic frameworks (MOFs), such as MIL-125 (Ti), exhibit enhanced surface areas due to their high intrinsic microporosity, but the pore diameters of most microporous MOFs are often too small to allow for the diffusion of larger reactants (>7 Å) relevant to petroleum and biomass upgrading. In this work, hierarchical microporous MIL-125 exhibiting significantly enhanced interparticle mesoporosity was synthesized using a chelating-free, vapor-assisted crystallization method. The resulting hierarchical MOF was examined as an active catalyst for the oxidation of dibenzothiophene (DBT) with tert-butyl hydroperoxide and outperformed the solely microporous analogue. This was attributed to greater access of the substrate to surface active sites, as the pores in the microporous analogues were of inadequate size to accommodate DBT. Moreover, thiophene adsorption studies suggested the mesoporous MOF contained larger amounts of unsaturated metal sites that could enhance the observed catalytic activity.

  13. Hierarchical design of an electro-hydraulic actuator based on robust LPV methods

    Science.gov (United States)

    Németh, Balázs; Varga, Balázs; Gáspár, Péter

    2015-08-01

    The paper proposes a hierarchical control design of an electro-hydraulic actuator, which is used to improve the roll stability of vehicles. The purpose of the control system is to generate a reference torque, which is required by the vehicle dynamic control. The control-oriented model of the actuator is formulated in two subsystems. The high-level hydromotor is described in a linear form, while the low-level spool valve is a polynomial system. These subsystems require different control strategies. At the high level, a linear parameter-varying control is used to guarantee performance specifications. At the low level, a control Lyapunov-function-based algorithm, which creates discrete control input values of the valve, is proposed. The interaction between the two subsystems is guaranteed by the spool displacement, which is the control input at the high level and must be tracked by the low-level control. The spool displacement has physical constraints, which must also be incorporated into the control design. The robust design of the high-level control incorporates the imprecision of the low-level control as an uncertainty of the system.

  14. Application of single-linkage clustering method in the analysis of ...

    African Journals Online (AJOL)

    Single-linkage is one of the methods in cluster analysis, which is used for determining natural groupings in multivariate data. Given a data set with one or more characteristics, the single-linkage system classifies the data into clusters so that they are as similar as possible within each cluster and as different as possible between ...
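
    As a rough illustration of the single-linkage rule described in this record, the following naive sketch repeatedly merges the two clusters whose closest members are nearest to each other. The function name `single_linkage`, the Euclidean distance and the toy points are assumptions made for the example and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def single_linkage(points, n_clusters):
    """Naive agglomerative single-linkage clustering (illustrative sketch).

    The distance between two clusters is the smallest Euclidean distance
    between any pair of their members. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist(points[i], points[j]) for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # a < b is guaranteed by combinations(), so deleting b keeps a valid.
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: three points near the origin, two far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
groups = single_linkage(points, 2)
print(groups)  # [[0, 1, 2], [3, 4]]
```

    The search over all cluster pairs is kept deliberately simple; it is only meant to show the merging rule and would be too slow for large data sets.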

  15. A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.

    Science.gov (United States)

    Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing

    2017-01-01

    An important objective of a wireless sensor network is to prolong the network life cycle, and topology control is of great significance for extending the network life cycle. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, first, a fuzzy clustering algorithm is used for the initial clustering of sensor nodes according to geographical locations, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and distance factors of the wireless sensor network. Finally, the cluster head nodes in the hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method achieved the purpose of reducing the mortality rate of nodes and extending the network life cycle.

  16. Hierarchical structure of Turkey's foreign trade

    Science.gov (United States)

    Kantar, Ersin; Deviren, Bayram; Keskin, Mustafa

    2011-10-01

    We examine the hierarchical structures of Turkey's foreign trade by analysing how the real prices of its commodity exports and imports move together over time. We obtain the topological properties among the countries based on Turkey's foreign trade during the 1996-2010 period by using hierarchical structure methods (the minimal spanning tree (MST) and the hierarchical tree (HT)). This period is divided into two subperiods, 1996-2002 and 2003-2010, in order to test various time windows and observe the temporal evolution. We apply bootstrap techniques to assess the statistical reliability of the links of the MSTs and HTs. We also use a clustering linkage procedure in order to observe the cluster structure more clearly. From the structural topologies of these trees, we identify different clusters of countries according to their geographical location and economic ties. Our results show that DE (Germany), UK (United Kingdom), FR (France), IT (Italy) and RU (Russia) are more important within the network, due to a tighter connection with other countries. We have also found that these countries play a significant role for Turkey's foreign trade and have important implications for the design of portfolio and investment strategies.
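
    A minimal spanning tree of the kind mentioned in this record can be sketched as follows. The abstract does not state how distances are derived, so the sketch assumes the transformation d = sqrt(2(1 - rho)) that is commonly used to turn correlation coefficients into distances, and it uses Kruskal's algorithm. The labels A to D and the correlation values are invented purely for illustration and are not the paper's data.

```python
from math import sqrt


def correlation_distance(rho):
    """Assumed mapping from a correlation coefficient to a distance."""
    return sqrt(2.0 * (1.0 - rho))


def minimal_spanning_tree(labels, corr):
    """Kruskal's algorithm on the complete graph of correlation distances."""
    n = len(labels)
    edges = sorted(
        (correlation_distance(corr[i][j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two separate components, so keep it
            parent[ri] = rj
            tree.append((labels[i], labels[j], d))
    return tree


# Hypothetical correlation matrix for four made-up series.
labels = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
tree = minimal_spanning_tree(labels, corr)
for a, b, d in tree:
    print(a, b, round(d, 3))
```

    In this toy input the strongly correlated pairs (A, B) and (C, D) are linked first and then joined by the B-C edge, which mirrors how clusters of closely tied items emerge from such a tree.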

  17. The Cluster Variation Method: A Primer for Neuroscientists

    Directory of Open Access Journals (Sweden)

    Alianna J. Maren

    2016-09-01

    Full Text Available Effective Brain–Computer Interfaces (BCIs) require that the time-varying activation patterns of 2-D neural ensembles be modelled. The cluster variation method (CVM) offers a means for the characterization of 2-D local pattern distributions. This paper provides neuroscientists and BCI researchers with a CVM tutorial that will help them to understand how the CVM statistical thermodynamics formulation can model 2-D pattern distributions expressing structural and functional dynamics in the brain. The premise is that local-in-time free energy minimization works alongside neural connectivity adaptation, supporting the development and stabilization of consistent stimulus-specific responsive activation patterns. The equilibrium distribution of local patterns, or configuration variables, is defined in terms of a single interaction enthalpy parameter (h) for the case of an equiprobable distribution of bistate (neural/neural ensemble) units. Thus, either one enthalpy parameter (or two, for the case of non-equiprobable distribution) yields equilibrium configuration variable values. Modeling 2-D neural activation distribution patterns with the representational layer of a computational engine, we can thus correlate variational free energy minimization with specific configuration variable distributions. The CVM triplet configuration variables also map well to the notion of an M = 3 functional motif. This paper addresses the special case of an equiprobable unit distribution, for which an analytic solution can be found.

  18. Unsupervised Image Steganalysis Method Using Self-Learning Ensemble Discriminant Clustering

    National Research Council Canada - National Science Library

    CAO, Bing; FENG, Guorui; YIN, Zhaoxia; FAN, Lingyan

    2017-01-01

    ... and detecting steganographic method. In this paper, we just attempt to process unsupervised learning problem and propose a detection model called self-learning ensemble discriminant clustering (SEDC...

  19. cluster

    Indian Academy of Sciences (India)

    electron transfer chains involved in a number of biological systems including respiration and photosynthesis. The most common iron–sulphur clusters found as active centres in iron–sulphur proteins are [Fe2S2], [Fe3S4] and [Fe4S4], in which Fe(III) ions are coordinated to cysteines from the peptide and are linked to each ...

  20. A clustering method of Chinese medicine prescriptions based on modified firefly algorithm.

    Science.gov (United States)

    Yuan, Feng; Liu, Hong; Chen, Shou-Qiang; Xu, Liang

    2016-12-01

    This paper aims to study the clustering method for Chinese medicine (CM) medical cases. The traditional K-means clustering algorithm has shortcomings when processing prescriptions from CM medical cases, such as the dependence of results on the selection of initial values and trapping in local optima. Therefore, a new clustering method based on the collaboration of the firefly algorithm and the simulated annealing algorithm was proposed. This algorithm dynamically determined the iteration of the firefly algorithm and the sampling of the simulated annealing algorithm by fitness changes, and increased the diversity of the swarm through expansion of the scope of the sudden jump, thereby effectively avoiding premature convergence. The results from confirmatory experiments for CM medical cases suggested that, compared with traditional K-means clustering algorithms, this method was greatly improved in individual diversity and in the obtained clustering results. The computing results from this method had a certain reference value for cluster analysis on CM prescriptions.

  1. Accounting hierarchical heterogeneity of rock during its working off by explosive methods

    Science.gov (United States)

    Hachay, Olga; Khachay, Oleg

    2017-04-01

    Information about the structure and state of a geological environment is obtained from geophysical data by interpreting them within the framework of a model that approximates the real environment, so the model must be selected from the class of physically and geologically reasonable ones. To describe a geological environment in the form of a rock massif with its natural and technogenic heterogeneity, a more adequate description is a discrete model in the form of a piecewise inhomogeneous block medium with embedded heterogeneities of lower rank than the block size. This nesting can be traced through several levels: as the scale of the study changes, the heterogeneities of lower rank appear as blocks for the heterogeneities of the next rank. A simple average of the measured geophysical parameters can lead to a distorted view of the structure of the environment and its evolution. The Institute of Geophysics, UB RAS has developed a hardware, methodological and interpretative system for studying the structure and state of a complex geological environment that is potentially unstable and able to rebuild its hierarchical structure under significant external influence. The basis of this system is a 3-D planshet electromagnetic induction technique in a frequency-geometrical variant, which relies on the one hand on an interpretation software system for 3-D alternating electromagnetic fields and on the other hand on a device for inductive research developed by Ph.D. A.I. Chelovechkov. With this technology, active monitoring of the structure and state of the rock massif inside mines of different material composition can be provided, and electromagnetic induction monitoring can be used to detect short-term precursors of strong dynamic phenomena.
There are developed algorithms for modeling of electromagnetic fields in hierarchic heterogeneous

  2. Cluster forest based fuzzy logic for massive data clustering

    Science.gov (United States)

    Lahmar, Ines; Ben Ayed, Abdelkarim; Ben Halima, Mohamed; Alimi, Adel M.

    2017-03-01

    This article focuses on developing an improved cluster ensemble method based on cluster forests. Cluster forests (CF) is a clustering approach inspired by Random Forests (RF) and applied to the clustering of massive data. It aggregates intermediate Fuzzy C-Means (FCM) clustering results via spectral clustering, since pseudo-clustering results are presented in the spectral space in order to classify these data sets in the multidimensional data space. One of the main advantages is the use of FCM, which builds fuzzy memberships for all partitions of the datasets owing to fuzzy logic, whereas classical algorithms such as K-means build only hard partitions. First, we improve the CF clustering algorithm by integrating FCM and compare it with other existing clustering methods. Second, we compare the K-means and FCM clustering methods with agglomerative hierarchical clustering (HAC) and other methods presented in the literature, using benchmark data from the UCI repository.
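
    The contrast drawn in this record between fuzzy memberships and hard partitions can be made concrete with a small sketch. It uses the textbook Fuzzy C-Means membership formula with fuzzifier m (an assumption, since the abstract gives no formula), fixed cluster centres, and 1-D toy values; the function names are hypothetical and this is not the CF algorithm itself.

```python
def fcm_memberships(x, centers, m=2.0):
    """Textbook FCM membership of a 1-D point x in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), where d_k is the
    distance from x to centre k. The memberships sum to 1.
    """
    dists = [abs(x - c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # x sits exactly on one or more centres: share full membership there.
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]


def hard_assignment(x, centers):
    """K-means style assignment: index of the single nearest centre."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


# Hypothetical centres and query point.
centers = [0.0, 4.0]
u = fcm_memberships(1.0, centers)
print(u)                              # graded membership in both clusters
print(hard_assignment(1.0, centers))  # a single cluster index
```

    For the point 1.0 the fuzzy rule gives a membership of about 0.9 in the first cluster and 0.1 in the second, while the hard rule reports only the first cluster.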

  3. Anharmonic effects in the quantum cluster equilibrium method

    Science.gov (United States)

    von Domaros, Michael; Perlt, Eva

    2017-03-01

    The well-established quantum cluster equilibrium (QCE) model provides a statistical thermodynamic framework to apply high-level ab initio calculations of finite cluster structures to macroscopic liquid phases using the partition function. So far, the harmonic approximation has been applied throughout the calculations. In this article, we apply an important correction in the evaluation of the one-particle partition function and account for anharmonicity. Therefore, we implemented an analytical approximation to the Morse partition function and the derivatives of its logarithm with respect to temperature, which are required for the evaluation of thermodynamic quantities. This anharmonic QCE approach has been applied to liquid hydrogen chloride and cluster distributions, and the molar volume, the volumetric thermal expansion coefficient, and the isobaric heat capacity have been calculated. An improved description for all properties is observed if anharmonic effects are considered.
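
    To give a feel for why anharmonicity changes the vibrational partition function, the sketch below compares the closed-form harmonic oscillator result with a direct sum over the bound levels of a Morse oscillator. This is not the analytical approximation implemented by the authors. It assumes the standard Morse level formula E_n = (n + 1/2) - xe (n + 1/2)^2 in units of the harmonic quantum, with energies measured from the bottom of the well; the anharmonicity constant `xe` and the reduced temperature `t` (thermal energy divided by the harmonic quantum) are illustrative values only.

```python
from math import exp, floor


def harmonic_q(t):
    """Harmonic oscillator partition function at reduced temperature t."""
    return exp(-0.5 / t) / (1.0 - exp(-1.0 / t))


def morse_q(t, xe):
    """Direct sum over the bound states of a Morse oscillator.

    Levels (in units of the harmonic quantum):
        E_n = (n + 1/2) - xe * (n + 1/2) ** 2
    Only levels up to n_max = floor(1 / (2 * xe) - 1/2) are bound.
    """
    n_max = floor(0.5 / xe - 0.5)
    return sum(
        exp(-((n + 0.5) - xe * (n + 0.5) ** 2) / t) for n in range(n_max + 1)
    )


# Illustrative values only: xe = 0.02 and reduced temperature t = 1.
print(harmonic_q(1.0))
print(morse_q(1.0, 0.02))
```

    Because every Morse level lies below its harmonic counterpart, the Morse sum comes out larger here, and the two agree closely when `xe` is made very small.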

  4. Research on a Hierarchical Dynamic Automatic Voltage Control System Based on the Discrete Event-Driven Method

    Directory of Open Access Journals (Sweden)

    Yong Min

    2013-06-01

    Full Text Available In this paper, concepts and methods of hybrid control systems are adopted to establish a hierarchical dynamic automatic voltage control (HD-AVC) system, realizing the dynamic voltage stability of power grids. An HD-AVC system model consisting of three layers is built based on the hybrid control method and discrete event-driven mechanism. In the Top Layer, discrete events are designed to drive the corresponding control block so as to avoid solving complex multiple objective functions; the power system’s characteristic matrix is formed and the minimum amplitude eigenvalue (MAE) is calculated through linearized differential-algebraic equations. MAE is applied to judge the system’s voltage stability and security and construct discrete events. The Middle Layer is responsible for management and operation, which is also driven by discrete events. Control values of the control buses are calculated based on the characteristics of power systems and the sensitivity method. Then control values generate control strategies through the interface block. In the Bottom Layer, various control devices receive and implement the control commands from the Middle Layer. In this way, a closed-loop power system voltage control is achieved. Computer simulations verify the validity and accuracy of the HD-AVC system, and verify that the proposed HD-AVC system is more effective than normal voltage control methods.

  5. Method for discovering relationships in data by dynamic quantum clustering

    Science.gov (United States)

    Weinstein, Marvin; Horn, David

    2014-10-28

    Data clustering is provided according to a dynamical framework based on quantum mechanical time evolution of states corresponding to data points. To expedite computations, we can approximate the time-dependent Hamiltonian formalism by a truncated calculation within a set of Gaussian wave-functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition and/or feature filtering.

  6. Correlation-based iterative clustering methods for time course data: The identification of temporal gene response modules for influenza infection in humans

    Directory of Open Access Journals (Sweden)

    Michelle Carey

    2016-10-01

    Full Text Available Many pragmatic clustering methods have been developed to group data vectors or objects into clusters so that the objects in one cluster are very similar and objects in different clusters are distinct based on some similarity measure. The availability of time course data has motivated researchers to develop methods, such as mixture and mixed-effects modelling approaches, that incorporate the temporal information contained in the shape of the trajectory of the data. However, there is still a need for the development of time-course clustering methods that can adequately deal with inhomogeneous clusters (some clusters are quite large and others are quite small). Here we propose two such methods, iterative hierarchical clustering (IHC) and iterative pairwise-correlation clustering (IPC). We evaluate and compare the proposed methods to the Markov Cluster Algorithm (MCL) and the generalised mixed-effects model (GMM) using simulation studies and an application to a time course gene expression data set from a study containing human subjects who were challenged by a live influenza virus. We identify four types of temporal gene response modules to influenza infection in humans, i.e., single-gene modules (SGM), small-size modules (SSM), medium-size modules (MSM) and large-size modules (LSM). The LSM contain genes that perform various fundamental biological functions that are consistent across subjects. The SSM and SGM contain genes that perform either different or similar biological functions that have complex temporal responses to the virus and are unique to each subject. We show that the temporal responses of the genes in the LSM have either simple patterns with a single peak or trough (a consequence of the transient stimuli) or sustained or state-transitioning patterns (pertaining to developmental cues), and that these modules can differentiate the severity of disease outcomes. Additionally, the size of gene response modules follows a power-law distribution with a consistent

  7. Clustering scientific publications based on citation relations: A systematic comparison of different methods

    CERN Document Server

    Šubelj, Lovro; Waltman, Ludo

    2015-01-01

    Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between di...

  8. A Hierarchical Reliability Control Method for a Space Manipulator Based on the Strategy of Autonomous Decision-Making

    Directory of Open Access Journals (Sweden)

    Xin Gao

    2016-01-01

    Full Text Available In order to maintain and enhance the operational reliability of a robotic manipulator deployed in space, an operational reliability system control method is presented in this paper. First, a method to divide factors affecting the operational reliability is proposed, which divides the operational reliability factors into task-related factors and cost-related factors. Then the models describing the relationships between the two kinds of factors and control variables are established. Based on this, a multivariable and multiconstraint optimization model is constructed. Second, a hierarchical system control model which incorporates the operational reliability factors is constructed. The control process of the space manipulator is divided into three layers: task planning, path planning, and motion control. Operational reliability related performance parameters are measured and used as the system’s feedback. Taking the factors affecting the operational reliability into consideration, the system can autonomously decide which control layer of the system should be optimized and how to optimize it using a control level adjustment decision module. The operational reliability factors affect these three control levels in the form of control variable constraints. Simulation results demonstrate that the proposed method can achieve a greater probability of meeting the task accuracy requirements, while extending the expected lifetime of the space manipulator.

  9. Hierarchical Modeling of Mastic Asphalt in Layered Road Structures Based on the Mori-Tanaka Method

    Directory of Open Access Journals (Sweden)

    Richard Valenta

    2012-01-01

    Full Text Available We present an application of the Mori-Tanaka micromechanical model for a description of the highly nonlinear behavior of asphalt mixtures. This method is expected to replace an expensive finite element-based fully-coupled multi-scale analysis while still providing useful information about local fields on the meso-scale that are not predictable by strictly macroscopic simulations. Drawing on our recent results from extensive experimental and numerical investigations, this paper concentrates on principal limitations of the Mori-Tanaka method, typical of all two-point averaging schemes, when applied to material systems prone to evolving highly localized deformation patterns such as a network of shear bands. The inability of the Mori-Tanaka method to properly capture the correct stress transfer between phases with increasing compliance of the matrix phase is remedied here by introducing a damage-like parameter into the local constitutive equation of reinforcements (stones) to control the amount of stress taken by this phase. A deficiency of the Mori-Tanaka method in the prediction of creep response is also mentioned, particularly in the light of large scale simulations. A comparison with the application of a macroscopic homogenized constitutive model for an asphalt mixture is also presented.

  10. Hierarchical multiscale modeling for flows in fractured media using generalized multiscale finite element method

    KAUST Repository

    Efendiev, Yalchin R.

    2015-06-05

    In this paper, we develop a multiscale finite element method for solving flows in fractured media. Our approach is based on the generalized multiscale finite element method (GMsFEM), where we represent the fracture effects on a coarse grid via multiscale basis functions. These multiscale basis functions are constructed in the offline stage via local spectral problems following GMsFEM. To represent the fractures on the fine grid, we consider two approaches, (1) the discrete fracture model (DFM) and (2) the embedded fracture model (EFM), and their combination. In DFM, the fractures are resolved via the fine grid, while in EFM the fracture and the fine grid block interaction is represented as a source term. In the proposed multiscale method, additional multiscale basis functions are used to represent the long fractures, while short-size fractures are collectively represented by a single basis function. The procedure is automatically done via local spectral problems. In this regard, our approach shares common concepts with several approaches proposed in the literature as we discuss. We would like to emphasize that our goal is not to compare DFM with EFM, but rather to develop a GMsFEM framework which uses these (DFM or EFM) fine-grid discretization techniques. Numerical results are presented, where we demonstrate how one can adaptively add basis functions in the regions of interest based on error indicators. We also discuss the use of randomized snapshots (Calo et al. Randomized oversampling for generalized multiscale finite element methods, 2014), which reduces the offline computational cost.

  11. Numerical Simulation of Bubble Cluster Induced Flow by Three-Dimensional Vortex-in-Cell Method.

    Science.gov (United States)

    Chen, Bin; Wang, Zhiwei; Uchiyama, Tomomi

    2014-08-01

    The behavior of air bubble clusters rising in water and the induced flow field are numerically studied using a three-dimensional two-way coupling algorithm based on a vortex-in-cell (VIC) method. In this method, vortex elements are convected in the Lagrangian frame and the liquid velocity field is solved from the Poisson equation of potential on the Eulerian grid. Two-way coupling is implemented by introducing a vorticity source term induced by the gradient of void fraction. Present simulation results are favorably compared with the measured results of bubble plume, which verifies the validity of the proposed VIC method. The rising of a single bubble cluster as well as two tandem bubble clusters are simulated. The mechanism of the aggregation effect in the rising process of bubble cluster is revealed and the transient processes of the generation, rising, strengthening, and separation of a vortex ring structure with bubble clusters are illustrated and analyzed in detail. Due to the aggregation, the average rising velocity increases with void fraction and is larger than the terminal rising velocity of single bubble. For the two tandem bubble cluster cases, the aggregation effect is stronger for smaller initial cluster distance, and both the strength of the induced vortex structure and the average bubble rising velocity are larger. For the 20 mm cluster distance case, the peak velocity of the lower cluster is about 2.7 times that of the terminal velocity of the single bubble and the peak average velocity of two clusters is about 2 times larger. While for the 30 mm cluster distance case, both the peak velocity of the lower cluster and two clusters are about 1.7 times that of the terminal velocity of the single bubble.

  12. Whole‐brain cortical parcellation: A hierarchical method based on dMRI tractography

    OpenAIRE

    Moreno-Dominguez, D.

    2014-01-01

    In modern neuroscience there is general agreement that brain function relies on networks and that connectivity is therefore of paramount importance for brain function. Accordingly, the delineation of functional brain areas on the basis of diffusion magnetic resonance imaging (dMRI) and tractography may lead to highly relevant brain maps. Existing methods typically aim to find a predefined number of areas and/or are limited to small regions of grey matter. However, it is in general not likely ...

  13. A hierarchical method for whole-brain connectivity-based parcellation

    OpenAIRE

    Moreno-Dominguez, D.; Anwander, A.; Knösche, T.

    2014-01-01

    In modern neuroscience there is general agreement that brain function relies on networks and that connectivity is therefore of paramount importance for brain function. Accordingly, the delineation of functional brain areas on the basis of diffusion magnetic resonance imaging (dMRI) and tractography may lead to highly relevant brain maps. Existing methods typically aim to find a predefined number of areas and/or are limited to small regions of grey matter. However, it is in general not likely ...

  14. Estimation of Mental Disorders Prevalence in High School Students Using Small Area Methods: A Hierarchical Bayesian Approach

    Directory of Open Access Journals (Sweden)

    Ali Reza Soltanian

    2016-08-01

    Full Text Available Background: Adolescence is one of the most important periods in the course of human evolution and the prevalence of mental disorders among adolescents in different regions of Iran, especially in southern Iran. Objectives: This study was conducted to determine the prevalence of mental disorders among high school students in Bushehr province, south of Iran. Methods: In this cross-sectional study, 286 high school students were recruited by multi-stage random sampling in Bushehr province in 2015. A general health questionnaire (GHQ-28) was used to assess mental disorders. The small area method, under the hierarchical Bayesian approach, was used to determine the prevalence of mental disorders and for data analysis. Results: From 286 questionnaires only 182 were completely filled out and evaluated (the response rate was 70.5%). Of the students, 58.79% and 41.21% were male and female, respectively. Of all students, the prevalence of mental disorders in Bushehr, Dayyer, Deylam, Kangan, Dashtestan, Tangestan, Genaveh, and Dashty were 0.48, 0.42, 0.45, 0.52, 0.41, 0.47, 0.42, and 0.43, respectively. Conclusions: Based on this study, the prevalence of mental disorders among adolescents was increasing in Bushehr Province counties. The lack of a national policy in this way is a serious obstacle to mental health and wellbeing access.

  15. Particle number projecting method for description of pairing effects in metal clusters

    Energy Technology Data Exchange (ETDEWEB)

    Kuzmenko, N. [Khlopin Radium Institute, St. Petersburg (Russian Federation); Nesterenko, V.; Pashkevich, V. [Joint Institute for Nuclear Research, Dubna (Russian Federation). Lab. of Theoretical Physics; Frauendorf, S. [Institut fuer Kern- und Hadronenphysik, Forschungszentrum Rossendorf, Dresden (Germany)

    1996-05-01

    The particle number projecting method for the description of pairing effects in metal clusters is proposed. In contrast with the Bardeen-Cooper-Schrieffer method (BCS) which does not conserve the particle number (thus not providing the necessary accuracy of calculations for small clusters) and has no solutions at sufficiently weak pairing, the projecting method can be applied to both small and large clusters with any pairing strength. As an example, the projection method is used to check the assertion on the pairing origin of the odd-even staggering (OES) in the ionization potentials (IP) of sodium clusters. Both effects of pairing and shape deformation are taken into account simultaneously. In general, the results obtained show that the existence of pairing in sodium clusters is doubtful.

  16. An empirical method to cluster objective nebulizer adherence data among adults with cystic fibrosis.

    Science.gov (United States)

    Hoo, Zhe H; Campbell, Michael J; Curley, Rachael; Wildman, Martin J

    2017-01-01

    The purpose of using preventative inhaled treatments in cystic fibrosis is to improve health outcomes. Therefore, understanding the relationship between adherence to treatment and health outcome is crucial. Temporal variability, as well as absolute magnitude of adherence affects health outcomes, and there is likely to be a threshold effect in the relationship between adherence and outcomes. We therefore propose a pragmatic algorithm-based clustering method of objective nebulizer adherence data to better understand this relationship, and potentially, to guide clinical decisions. This clustering method consists of three related steps. The first step is to split adherence data for the previous 12 months into four 3-monthly sections. The second step is to calculate mean adherence for each section and to score the section based on mean adherence. The third step is to aggregate the individual scores to determine the final cluster ("cluster 1" = very low adherence; "cluster 2" = low adherence; "cluster 3" = moderate adherence; "cluster 4" = high adherence), and taking into account adherence trend as represented by sequential individual scores. The individual scores should be displayed along with the final cluster for clinicians to fully understand the adherence data. We present three cases to illustrate the use of the proposed clustering method. This pragmatic clustering method can deal with adherence data of variable duration (ie, can be used even if 12 months' worth of data are unavailable) and can cluster adherence data in real time. Empirical support for some of the clustering parameters is not yet available, but the suggested classifications provide a structure to investigate parameters in future prospective datasets in which there are accurate measurements of nebulizer adherence and health outcomes.
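
    The three steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only: the abstract does not publish the scoring cut-offs or the exact aggregation rule, so the thresholds in `SCORE_THRESHOLDS`, the function names, and the "rounded mean of section scores" aggregation are all hypothetical assumptions, and the adherence-trend adjustment mentioned in the abstract is left out.

```python
from statistics import mean

# Hypothetical cut-offs in percent adherence; the abstract does not state them.
SCORE_THRESHOLDS = [(80.0, 4), (50.0, 3), (25.0, 2)]  # anything lower scores 1


def score_section(mean_adherence):
    """Map a section's mean adherence (0-100) to a score from 1 to 4."""
    for cutoff, score in SCORE_THRESHOLDS:
        if mean_adherence >= cutoff:
            return score
    return 1


def cluster_adherence(monthly_adherence, section_months=3):
    """Split monthly adherence into 3-month sections, score each, aggregate.

    Returns (section_scores, final_cluster). Works with fewer than 12 months
    of data; a short trailing section is scored on whatever months it has.
    """
    sections = [monthly_adherence[i:i + section_months]
                for i in range(0, len(monthly_adherence), section_months)]
    scores = [score_section(mean(section)) for section in sections]
    # Assumed aggregation: mean of section scores, rounded half up, kept in 1..4.
    final_cluster = min(4, max(1, int(mean(scores) + 0.5)))
    return scores, final_cluster
```

    Returning the per-section scores alongside the final cluster mirrors the abstract's recommendation that individual scores be displayed together with the final cluster.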

  17. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li

    2017-08-01

    Full Text Available The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance; data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensional reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our
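
    The first two steps of this pipeline (pairwise DTW distances collected into a distance matrix) can be sketched as follows. This is a textbook dynamic-programming DTW, not the authors' implementation; the function names and the representation of a trajectory as a list of `(x, y)` tuples are assumptions made for illustration, and the PCA and center-selection steps are not covered.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances with a zero diagonal."""
    k = len(trajectories)
    dm = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dm[i][j] = dm[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return dm
```

    Because DTW allows one point to align with several points on the other track, two vessels following the same route at different sampling rates get a distance near zero, which is the property that makes it attractive for AIS data.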

  18. Factorization and the Dressing Method for the Gel'fand-Dikii Hierarch

    CERN Document Server

    Sattinger, D H

    1998-01-01

    The isospectral flows of an $n^{th}$ order linear scalar differential operator $L$ under the hypothesis that it possesses a Baker-Akhiezer function were originally investigated by Segal and Wilson from the point of view of infinite dimensional Grassmannians, and the reduction of the KP hierarchy to the Gel'fand-Dikii hierarchy. The associated first order systems and their formal asymptotic solutions have a rich Lie algebraic structure which was investigated by Drinfeld and Sokolov. We investigate the matrix Riemann-Hilbert factorizations for these systems, and show that different factorizations lead respectively to the potential, modified, and ordinary Gel'fand-Dikii flows. Lie algebra decompositions (the Adler-Kostant-Symes method) are obtained for the modified and potential flows. For $n>3$ the appropriate factorization for the Gel'fand-Dikii flows is not a group factorization, as would be expected; yet a modification of the dressing method still works. A direct proof, based on a Fredholm determinant associate...

  19. Packaging Glass with a Hierarchically Nanostructured Surface: A Universal Method to Achieve Self-Cleaning Omnidirectional Solar Cells

    KAUST Repository

    Lin, Chin An

    2015-12-01

    Fused-silica packaging glass fabricated with a hierarchical structure by integrating small (ultrathin nanorods) and large (honeycomb nanowalls) structures was demonstrated with exceptional light-harvesting solar performance, which is attributed to the subwavelength feature of the nanorods and an efficient scattering ability of the honeycomb nanowalls. Si solar cells covered with the hierarchically structured packaging glass exhibit enhanced conversion efficiency by 5.2% at normal incidence, and the enhancement went up to 46% at the incident angle of 60°. The hierarchical structured packaging glass shows excellent self-cleaning characteristics: 98.8% of the efficiency is maintained after 6 weeks of outdoor exposure, indicating that the nanostructured surface effectively repels polluting dust/particles. The presented self-cleaning omnidirectional light-harvesting design using the hierarchical structured packaging glass is a potential universal scheme for practical solar applications.

  20. An extended affinity propagation clustering method based on different data density types.

    Science.gov (United States)

    Zhao, XiuLi; Xu, WeiXiang

    2015-01-01

    Affinity propagation (AP) algorithm, as a novel clustering method, does not require the users to specify the initial cluster centers in advance, which regards all data points as potential exemplars (cluster centers) equally and groups the clusters totally by the similar degree among the data points. But in many cases there exist some different intensive areas within the same data set, which means that the data set does not distribute homogeneously. In such situation the AP algorithm cannot group the data points into ideal clusters. In this paper, we proposed an extended AP clustering algorithm to deal with such a problem. There are two steps in our method: firstly the data set is partitioned into several data density types according to the nearest distances of each data point; and then the AP clustering method is, respectively, used to group the data points into clusters in each data density type. Two experiments are carried out to evaluate the performance of our algorithm: one utilizes an artificial data set and the other uses a real seismic data set. The experiment results show that groups are obtained more accurately by our algorithm than OPTICS and AP clustering algorithm itself.
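
    The first step of the two-step method, splitting the data set into density types by nearest-neighbor distance, can be sketched as below. The abstract does not say how the density types are delimited, so the single fixed `threshold` and the two-way dense/sparse split are hypothetical simplifications, as are the function names.

```python
import math


def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point (brute force)."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def partition_by_density(points, threshold):
    """Split points into (dense, sparse) groups by nearest-neighbor distance.

    A point counts as 'dense' when its nearest neighbor lies within
    `threshold`; the threshold is an assumed, user-chosen parameter.
    """
    nn = nearest_neighbor_distances(points)
    dense = [p for p, d in zip(points, nn) if d <= threshold]
    sparse = [p for p, d in zip(points, nn) if d > threshold]
    return dense, sparse
```

    In the second step each group would be clustered separately with affinity propagation, for example with an off-the-shelf implementation such as scikit-learn's `AffinityPropagation`, so that similarities are compared only among points of comparable density.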

  1. An Extended Affinity Propagation Clustering Method Based on Different Data Density Types

    Directory of Open Access Journals (Sweden)

    XiuLi Zhao

    2015-01-01

    Full Text Available Affinity propagation (AP) algorithm, as a novel clustering method, does not require the users to specify the initial cluster centers in advance, which regards all data points as potential exemplars (cluster centers) equally and groups the clusters totally by the similar degree among the data points. But in many cases there exist some different intensive areas within the same data set, which means that the data set does not distribute homogeneously. In such situation the AP algorithm cannot group the data points into ideal clusters. In this paper, we proposed an extended AP clustering algorithm to deal with such a problem. There are two steps in our method: firstly the data set is partitioned into several data density types according to the nearest distances of each data point; and then the AP clustering method is, respectively, used to group the data points into clusters in each data density type. Two experiments are carried out to evaluate the performance of our algorithm: one utilizes an artificial data set and the other uses a real seismic data set. The experiment results show that groups are obtained more accurately by our algorithm than OPTICS and AP clustering algorithm itself.

  2. Consensus of satellite cluster flight using an energy-matching optimal control method

    Science.gov (United States)

    Luo, Jianjun; Zhou, Liang; Zhang, Bo

    2017-11-01

    This paper presents an optimal control method for consensus of satellite cluster flight under a kind of energy matching condition. Firstly, the relation between energy matching and satellite periodically bounded relative motion is analyzed, and the satellite energy matching principle is applied to configure the initial conditions. Then, period-delayed errors are adopted as state variables to establish the period-delayed errors dynamics models of a single satellite and the cluster. Next, a novel satellite cluster feedback control protocol with coupling gain is designed, so that the satellite cluster periodically bounded relative motion consensus problem (period-delayed errors state consensus problem) is transformed to the stability of a set of matrices with the same low dimension. Based on the consensus region theory in the research of multi-agent system consensus issues, the coupling gain can be obtained to satisfy the requirement of consensus region and decouple the satellite cluster information topology and the feedback control gain matrix, which can be determined by the linear quadratic regulator (LQR) optimal method. This method can realize the consensus of satellite cluster period-delayed errors, leading to the consistency of semi-major axes (SMA) and the energy-matching of satellite cluster. The satellites can then exhibit globally coordinated cluster behavior. Finally, the feasibility and effectiveness of the present energy-matching optimal consensus for satellite cluster flight are verified through numerical simulations.

  3. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  4. Biomimetic hydrophobic surface fabricated by chemical etching method from hierarchically structured magnesium alloy substrate

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Yan; Yin, Xiaoming; Zhang, Jijia [Key Laboratory of Bionic Engineering (Ministry of Education), Jilin University, Changchun 130022 (China); Wang, Yaming [Institute for Advanced Ceramics, Harbin Institute of Technology, Harbin 150001 (China); Han, Zhiwu, E-mail: zwhan@jlu.edu.cn [Key Laboratory of Bionic Engineering (Ministry of Education), Jilin University, Changchun 130022 (China); Ren, Luquan [Key Laboratory of Bionic Engineering (Ministry of Education), Jilin University, Changchun 130022 (China)

    2013-09-01

    As one of the lightest metal materials, magnesium alloy plays an important role in industries such as automobiles, aircraft and electronic products. However, its application is hindered by its high chemical activity and susceptibility to corrosion. Here, inspired by typical plant surfaces with super-hydrophobic character, such as lotus leaves and red rose petals, a new hydrophobic surface is fabricated on magnesium alloy by a two-step methodology to improve its corrosion resistance. The samples are first processed by laser, then immersed and etched in aqueous AgNO₃ solutions at concentrations of 0.1 mol/L, 0.3 mol/L and 0.5 mol/L for different times of 15 s, 40 s and 60 s, respectively, and finally modified with DTS (CH₃(CH₂)₁₁Si(OCH₃)₃). The microstructure, chemical composition, wettability and anti-corrosion behavior are characterized by means of SEM, XPS, water contact angle measurement and electrochemical methods. Hydrophobic surfaces with a microscale crater-like and nanoscale flower-like binary structure are obtained. The surface contains the low-energy material after DTS treatment. The contact angle reaches up to 138.4 ± 2°; this hydrophobicity is related to both the micro–nano binary structure and the chemical composition. The results of electrochemical measurements show that the anti-corrosion property of magnesium alloy is improved. Furthermore, our research is expected to provide ideas inspired by nature for improving the anti-corrosion property of magnesium alloy, and the method can be easily extended to other metal materials.

  5. TWO-STAGE CHARACTER CLASSIFICATION : A COMBINED APPROACH OF CLUSTERING AND SUPPORT VECTOR CLASSIFIERS

    NARCIS (Netherlands)

    Vuurpijl, L.; Schomaker, L.

    2000-01-01

    This paper describes a two-stage classification method for (1) classification of isolated characters and (2) verification of the classification result. Character prototypes are generated using hierarchical clustering. For those prototypes known to sometimes produce wrong classification results, a

  6. Cluster Physics with Merging Galaxy Clusters

    Directory of Open Access Journals (Sweden)

    Sandor M. Molnar

    2016-02-01

    Full Text Available Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard LCDM model, where the total density is dominated by the cosmological constant ($\Lambda$) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.

  7. Tune Your Brown Clustering, Please

    DEFF Research Database (Denmark)

    Derczynski, Leon; Chester, Sean; Bøgh, Kenneth Sejdenfaden

    2015-01-01

    Brown clustering, an unsupervised hierarchical clustering technique based on ngram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly...

  8. Green method for producing hierarchically assembled pristine porous ZnO nanoparticles with narrow particle size distribution

    Energy Technology Data Exchange (ETDEWEB)

    Escobedo-Morales, A., E-mail: alejandro.escobedo@correo.buap.mx [Facultad de Ingeniería Química, Benemérita Universidad Autónoma de Puebla, C.P. 72570 Puebla, Pue. (Mexico); Téllez-Flores, D.; Ruiz Peralta, Ma. de Lourdes [Facultad de Ingeniería Química, Benemérita Universidad Autónoma de Puebla, C.P. 72570 Puebla, Pue. (Mexico); Garcia-Serrano, J.; Herrera-González, Ana M. [Centro de Investigaciones en Materiales y Metalurgia, Universidad Autónoma del Estado de Hidalgo, Carretera Pachuca Tulancingo Km 4.5, Pachuca, Hidalgo (Mexico); Rubio-Rosas, E. [Centro Universitario de Vinculación y Transferencia de Tecnología, Benemérita Universidad Autónoma de Puebla, C.P. 72570 Puebla, Pue. (Mexico); Sánchez-Mora, E. [Instituto de Física, Benemérita Universidad Autónoma de Puebla, Apdo. Postal J-48, 72570 Puebla, Pue. (Mexico); Olivares Xometl, O. [Facultad de Ingeniería Química, Benemérita Universidad Autónoma de Puebla, C.P. 72570 Puebla, Pue. (Mexico)

    2015-02-01

    A green method for producing pristine porous ZnO nanoparticles with narrow particle size distribution is reported. This method consists in synthesizing ZnO₂ nanopowders via a hydrothermal route using cheap and non-toxic reagents, and their subsequent thermal decomposition at low temperature under a non-protective atmosphere (air). The morphology, structural and optical properties of the obtained porous ZnO nanoparticles were studied by means of powder X-ray diffraction, scanning electron microscopy, transmission electron microscopy, Raman spectroscopy, and nitrogen adsorption–desorption measurements. It was found that after thermal decomposition of the ZnO₂ powders, pristine ZnO nanoparticles are obtained. These particles are round-shaped with narrow size distribution. A further analysis of the obtained ZnO nanoparticles reveals that they are hierarchical self-assemblies of primary ZnO particles. The agglomeration of these primary particles at the very early stage of the thermal decomposition of ZnO₂ powders gives the resulting ZnO nanoparticles their porous nature. The possibility of using the synthesized porous ZnO nanoparticles as photocatalysts has been evaluated on the degradation of rhodamine B dye. - Highlights: • A green synthesis method for obtaining porous ZnO nanoparticles is reported. • The obtained ZnO nanoparticles have narrow particle size distribution. • This method allows obtaining pristine ZnO nanoparticles avoiding unintentional doping. • A growth mechanism for the obtained porous ZnO nanoparticles is proposed.

  9. A geostatistics-informed hierarchical sensitivity analysis method for complex groundwater flow and transport modeling: GEOSTATISTICAL SENSITIVITY ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Heng [Pacific Northwest National Laboratory, Richland Washington USA; Chen, Xingyuan [Pacific Northwest National Laboratory, Richland Washington USA; Ye, Ming [Department of Scientific Computing, Florida State University, Tallahassee Florida USA; Song, Xuehang [Pacific Northwest National Laboratory, Richland Washington USA; Zachara, John M. [Pacific Northwest National Laboratory, Richland Washington USA

    2017-05-01

    Sensitivity analysis is an important tool for quantifying uncertainty in the outputs of mathematical models, especially for complex systems with a high dimension of spatially correlated parameters. Variance-based global sensitivity analysis has gained popularity because it can quantify the relative contribution of uncertainty from different sources. However, its computational cost increases dramatically with the complexity of the considered model and the dimension of model parameters. In this study we developed a hierarchical sensitivity analysis method that (1) constructs an uncertainty hierarchy by analyzing the input uncertainty sources, and (2) accounts for the spatial correlation among parameters at each level of the hierarchy using geostatistical tools. The contribution of the uncertainty source at each hierarchy level is measured by sensitivity indices calculated using the variance decomposition method. Using this methodology, we identified the most important uncertainty source for a dynamic groundwater flow and solute transport model at the Department of Energy (DOE) Hanford site. The results indicate that boundary conditions and permeability field contribute the most uncertainty to the simulated head field and tracer plume, respectively. The relative contribution from each source varied spatially and temporally as driven by the dynamic interaction between groundwater and river water at the site. By using a geostatistical approach to reduce the number of realizations needed for the sensitivity analysis, the computational cost of implementing the developed method was reduced to a practically manageable level. The developed sensitivity analysis method is generally applicable to a wide range of hydrologic and environmental problems that deal with high-dimensional spatially-distributed parameters.
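
    For orientation, the variance decomposition behind such sensitivity indices can be written in its standard first-order form. The abstract does not give the authors' hierarchical indices, so the symbols below ($Y$ for a model output, $X_i$ for one uncertainty source or group of parameters, $S_i$ for its index) are generic notation assumed here, not the paper's own definitions.

```latex
\[
\operatorname{Var}(Y)
  = \operatorname{Var}_{X_i}\!\big(\mathbb{E}[\,Y \mid X_i\,]\big)
  + \mathbb{E}_{X_i}\!\big[\operatorname{Var}(Y \mid X_i)\big],
\qquad
S_i = \frac{\operatorname{Var}_{X_i}\!\big(\mathbb{E}[\,Y \mid X_i\,]\big)}{\operatorname{Var}(Y)} .
\]
```

    The first term is the part of the output variance explained by $X_i$ alone, so $S_i$ lies between 0 and 1 and can be compared across sources such as boundary conditions and the permeability field.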

  10. A NEW METHOD TO QUANTIFY X-RAY SUBSTRUCTURES IN CLUSTERS OF GALAXIES

    Energy Technology Data Exchange (ETDEWEB)

    Andrade-Santos, Felipe; Lima Neto, Gastao B.; Lagana, Tatiana F. [Departamento de Astronomia, Instituto de Astronomia, Geofisica e Ciencias Atmosfericas, Universidade de Sao Paulo, Geofisica e Ciencias Atmosfericas, Rua do Matao 1226, Cidade Universitaria, 05508-090 Sao Paulo, SP (Brazil)

    2012-02-20

    We present a new method to quantify substructures in clusters of galaxies, based on the analysis of the intensity of structures. This analysis is done in a residual image that is the result of the subtraction of a surface brightness model, obtained by fitting a two-dimensional analytical model (β-model or Sérsic profile) with elliptical symmetry, from the X-ray image. Our method is applied to 34 clusters observed by the Chandra Space Telescope that are in the redshift range z in [0.02, 0.2] and have a signal-to-noise ratio (S/N) greater than 100. We present the calibration of the method and the relations between the substructure level and physical quantities, such as the mass, X-ray luminosity, temperature, and cluster redshift. We use our method to separate the clusters into two sub-samples of high- and low-substructure levels. We conclude, using Monte Carlo simulations, that the method recovers the true amount of substructure very well for clusters with small angular core radii (with respect to the whole image size) and good S/N observations. We find no evidence of correlation between the substructure level and physical properties of the clusters such as gas temperature, X-ray luminosity, and redshift; however, the analysis suggests a trend between the substructure level and cluster mass. The scaling relations for the two sub-samples (high- and low-substructure level clusters) are different (they present an offset, i.e., given a fixed mass or temperature, low-substructure clusters tend to be more X-ray luminous), which is an important result for cosmological tests using the mass-luminosity relation to obtain the cluster mass function, since they rely on the assumption that clusters do not present different scaling relations according to their dynamical state.

  11. Microparticles with hierarchical porosity

    Science.gov (United States)

    Petsev, Dimiter N; Atanassov, Plamen; Pylypenko, Svitlana; Carroll, Nick; Olson, Tim

    2012-12-18

    The present disclosure provides oxide microparticles with engineered hierarchical porosity and methods of manufacturing the same. Also described are structures that are formed by templating, impregnating, and/or precipitating the oxide microparticles and methods for forming the same. Suitable applications include catalysts, electrocatalysts, electrocatalyst support materials, capacitors, drug delivery systems, sensors and chromatography.

  12. An Efficient Hierarchical Multiscale Finite Element Method for Stokes Equations in Slowly Varying Media

    KAUST Repository

    Brown, Donald L.

    2013-01-01

    Direct numerical simulation (DNS) of fluid flow in porous media with many scales is often not feasible, and an effective or homogenized description is more desirable. To construct the homogenized equations, effective properties must be computed. Computation of effective properties for nonperiodic microstructures can be prohibitively expensive, as many local cell problems must be solved for different macroscopic points. In addition, the local problems may also be computationally expensive. When the microstructure varies slowly, we develop an efficient numerical method for two scales that achieves essentially the same accuracy as that for the full resolution solve of every local cell problem. In this method, we build a dense hierarchy of macroscopic grid points and a corresponding nested sequence of approximation spaces. Essentially, solutions computed in high accuracy approximation spaces at select points in the hierarchy are used as corrections for the error of the lower accuracy approximation spaces at nearby macroscopic points. We give a brief overview of slowly varying media and formal Stokes homogenization in such domains. We present a general outline of the algorithm and list reasonable and easily verifiable assumptions on the PDEs, geometry, and approximation spaces. With these assumptions, we achieve the same accuracy as the full solve. To demonstrate the elements of the proof of the error estimate, we use a hierarchy of macro-grid points in [0, 1]^2 and finite element (FE) approximation spaces in [0, 1]^2. We apply this algorithm to Stokes equations in a slowly varying porous medium where the microstructure is obtained from a reference periodic domain by a known smooth map. Using the arbitrary Lagrangian-Eulerian (ALE) formulation of the Stokes equations (cf. [G. P. Galdi and R. Rannacher, Fundamental Trends in Fluid-Structure Interaction, Contemporary Challenges in Mathematical Fluid Dynamics and Its Applications 1, World Scientific, Singapore, 2010]), we obtain

  13. Hierarchically rough, mechanically durable and superhydrophobic epoxy coatings through rapid evaporation spray method

    Energy Technology Data Exchange (ETDEWEB)

    Simovich, Tomer; Wu, Alex H.; Lamb, Robert N., E-mail: rnlamb@unimelb.edu.au

    2015-08-31

    A mechanically durable and scalable superhydrophobic coating was fabricated by combining the advantages of both bottom-up and top-down approaches into a one-pot, one-step application method. This is achieved by spray coating a solution consisting of silica nanoparticles, which are embedded within epoxy resin, onto a heated substrate to drive rapid solvent evaporation and curing simultaneously. By maintaining a high substrate temperature, the spray-delivered micrometer-sized droplets are rapidly cured onto the substrate on arrival to form surface microroughness, while, simultaneously, rapid solvent evaporation within each droplet results in the formation of a nanoporous structure. SEM, dual-beam FIB, and cross-sectional TEM/EDAX elemental mapping were used to confirm both the chemistry and the micro- and nano-porosity within the coating structure requisite for superhydrophobicity. The resultant coatings exhibit contact angles greater than 150° (153.8° ± 0.8°) and roll-off angles of 8° ± 2°, with a coating hardness of 6H on the pencil hardness scale, and a rating of 5 on an ASTM crosshatch test. - Highlights: • A highly superhydrophobic coating was fabricated utilizing epoxy and nanoparticles. • The coating was demonstrated to be very durable and abrasion resistant. • The fabrication involves a novel, scalable one-pot synthesis technique.

  14. Ensemble ROCK Methods and Ensemble SWFM Methods for Clustering of Cross Citrus Accessions Based on Mixed Numerical and Categorical Dataset

    Science.gov (United States)

    Alvionita; Sutikno; Suharsono, A.

    2017-03-01

    Cluster analysis is a multivariate analysis technique that reduces data by classifying it. Its main purpose is to classify the objects of observation into groups based on their characteristics. Cluster analysis is used not only for numerical or categorical data but has also been developed for mixed data. Several methods exist for analyzing mixed data, such as ensemble methods and Similarity Weight and Filter Methods (SWFM). There is a lot of research on these methods, but existing studies have not compared their performance. Therefore, this paper compares the performance of the clustering ensemble ROCK methods and ensemble SWFM methods. These methods are used to cluster cross citrus accessions based on characteristics of fruit and leaves that involve a mixture of numerical and categorical variables. The clustering method with the best performance is determined by the ratio of the standard deviation within groups (SW) to the standard deviation between groups (SB). The method with the best performance has the smallest ratio. The results show that the ensemble ROCK methods perform better than the ensemble SWFM methods.
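
    The SW/SB criterion described above can be sketched for a single numerical variable as follows. The exact definitions used in the paper are not given in the abstract; here SW is assumed to be the mean of the per-group standard deviations and SB the standard deviation of the group means, and the data values are made up.

```python
import statistics


def sw_sb_ratio(values, labels):
    """Ratio of within-group spread (SW) to between-group spread (SB); smaller is better."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    # Assumed definitions: SW = mean of group standard deviations,
    # SB = standard deviation of the group means.
    sw = statistics.fmean([statistics.pstdev(g) for g in groups.values()])
    sb = statistics.pstdev([statistics.fmean(g) for g in groups.values()])
    return sw / sb


# Hypothetical measurements and two candidate clusterings of the same objects.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
good_ratio = sw_sb_ratio(values, [0, 0, 0, 1, 1, 1])  # separates low from high values
bad_ratio = sw_sb_ratio(values, [0, 1, 0, 1, 0, 1])   # mixes low and high values
print(f"good: {good_ratio:.3f}, bad: {bad_ratio:.3f}")
```

    Under these assumptions the clustering that separates the two value ranges yields the much smaller ratio, which is the sense in which the smallest ratio marks the best-performing method.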

  15. Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

    Science.gov (United States)

    Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

    2016-01-01

    With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD produces the aggregate indicators of the virtual machines in the cloud, which become the input parameters of the predictive RBFNN. Also, this paper proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predicted periodic load value of nodes based on AHPGD and RBFNN optimized by HHGA, then calculates the corresponding weight values of nodes and updates them constantly. Meanwhile, it keeps the advantages and avoids the shortcomings of the static weighted round-robin algorithm.
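
    A minimal sketch of weighted round-robin scheduling driven by predicted load is shown below. The mapping from predicted load to weight (`weights_from_load`), the node names, and the use of the smooth weighted round-robin variant are assumptions for illustration; the abstract does not specify how the paper computes weights, and the AHPGD/RBFNN prediction step is replaced by fixed numbers.

```python
def weights_from_load(predicted_load):
    """Hypothetical mapping: nodes with lower predicted load (0..1) get larger integer weights."""
    return {node: max(1, round(10 * (1.0 - load))) for node, load in predicted_load.items()}


def weighted_round_robin(weights, n_requests):
    """Smooth weighted round-robin: spreads each node's share evenly over the schedule."""
    current = {node: 0 for node in weights}
    total = sum(weights.values())
    schedule = []
    for _ in range(n_requests):
        for node, weight in weights.items():
            current[node] += weight
        chosen = max(current, key=current.get)  # node with the largest running score
        current[chosen] -= total
        schedule.append(chosen)
    return schedule


# Stand-in for predicted periodic load values of three server nodes.
predicted = {"node-a": 0.2, "node-b": 0.5, "node-c": 0.8}
weights = weights_from_load(predicted)
schedule = weighted_round_robin(weights, sum(weights.values()))
print(weights)
print(schedule)
```

    In a dynamic scheme the weights would be recomputed whenever a new load prediction arrives, which is what distinguishes it from the static weighted round-robin algorithm.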

  16. Investigation of the cluster formation in lithium niobate crystals by computer modeling method

    Energy Technology Data Exchange (ETDEWEB)

    Voskresenskii, V. M.; Starodub, O. R., E-mail: ol-star@mail.ru; Sidorov, N. V.; Palatnikov, M. N. [Russian Academy of Sciences, Tananaev Institute of Chemistry and Technology of Rare Earth Elements and Mineral Raw Materials, Kola Science Centre (Russian Federation)

    2017-03-15

    The processes occurring upon the formation of energetically equilibrium oxygen-octahedral clusters in the ferroelectric phase of a stoichiometric lithium niobate (LiNbO3) crystal have been investigated by the computer modeling method within the semiclassical atomistic model. An energetically favorable cluster size (at which a structure similar to that of a congruent crystal is organized) is shown to exist. A stoichiometric cluster cannot exist because of the electroneutrality loss. The most energetically favorable cluster is that with a Li/Nb ratio of about 0.945, a value close to the lithium-to-niobium ratio for a congruent crystal.

  17. Communication: Improved pair approximations in local coupled-cluster methods

    Energy Technology Data Exchange (ETDEWEB)

    Schwilk, Max; Werner, Hans-Joachim [Institut für Theoretische Chemie, Universität Stuttgart, Pfaffenwaldring 55, D-70569 Stuttgart (Germany); Usvyat, Denis [Institute for Physical and Theoretical Chemistry, Universität Regensburg, Universitätsstrasse 31, D-93040 Regensburg (Germany)

    2015-03-28

    In local coupled cluster treatments the electron pairs can be classified according to the magnitude of their energy contributions or distances into strong, close, weak, and distant pairs. Different approximations are introduced for the latter three classes. In this communication, an improved simplified treatment of close and weak pairs is proposed, which is based on long-range cancellations of individually slowly decaying contributions in the amplitude equations. Benchmark calculations for correlation, reaction, and activation energies demonstrate that these approximations work extremely well, while pair approximations based on local second-order Møller-Plesset theory can lead to errors that are 1-2 orders of magnitude larger.

  18. A simple and fast method to determine the parameters for fuzzy c-means cluster analysis

    DEFF Research Database (Denmark)

    Schwämmle, Veit; Jensen, Ole Nørregaard

    2010-01-01

    MOTIVATION: Fuzzy c-means clustering is widely used to identify cluster structures in high-dimensional datasets, such as those obtained in DNA microarray and quantitative proteomics experiments. One of its main limitations is the lack of a computationally fast method to set optimal values...

  19. Hierarchical Network Design

    DEFF Research Database (Denmark)

    Thomadsen, Tommy

    2005-01-01

    design. The papers have all been submitted to journals and, except for two papers, are awaiting review. The papers are mostly concerned with optimal methods and, in a few cases, heuristics for designing hierarchical and ring networks. All papers develop bounds which are used in the optimal methods...... form the basis for a study of the design of hierarchical networks. The most important contribution of the thesis consists of seven papers, which are included in the appendix. The papers deal with the design of hierarchical networks and ring networks. The papers have all been submitted to scientific journals and are awaiting review, except......Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...

  20. PARTIAL TRAINING METHOD FOR HEURISTIC ALGORITHM OF POSSIBLE CLUSTERIZATION UNDER UNKNOWN NUMBER OF CLASSES

    Directory of Open Access Journals (Sweden)

    D. A. Viattchenin

    2009-01-01

    Full Text Available A method for constructing a subset of labeled objects which is used in a heuristic algorithm of possible clusterization with partial training is proposed in the paper. The method is based on data preprocessing by the heuristic algorithm of possible clusterization using a transitive closure of a fuzzy tolerance. Method efficiency is demonstrated by way of an illustrative example.

  1. Method for exploratory cluster analysis and visualisation of single-trial ERP ensembles.

    Science.gov (United States)

    Williams, N J; Nasuto, S J; Saddy, J D

    2015-07-30

    The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single-trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). After validating the pipeline on simulated data, we tested it on data from two experiments - a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership. Our analysis operates on denoised single-trials, the number of clusters is determined in a principled manner and the results are presented through an intuitive visualisation. Given the cluster structure in some experimental conditions, we suggest application of cluster analysis as a preliminary step before ensemble averaging. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Analysis of a Gibbs sampler method for model-based clustering of gene expression data.

    Science.gov (United States)

    Joshi, Anagha; Van de Peer, Yves; Michoel, Tom

    2008-01-15

    Over the last decade, a large variety of clustering algorithms have been developed to detect coregulatory relationships among genes from microarray gene expression data. Model-based clustering approaches have emerged as statistically well-grounded methods, but the properties of these algorithms when applied to large-scale data sets are not always well understood. An in-depth analysis can reveal important insights about the performance of the algorithm, the expected quality of the output clusters, and the possibilities for extracting more relevant information out of a particular data set. We have extended an existing algorithm for model-based clustering of genes to simultaneously cluster genes and conditions, and used three large compendia of gene expression data for Saccharomyces cerevisiae to analyze its properties. The algorithm uses a Bayesian approach and a Gibbs sampling procedure to iteratively update the cluster assignment of each gene and condition. For large-scale data sets, the posterior distribution is strongly peaked on a limited number of equiprobable clusterings. A GO annotation analysis shows that these local maxima are all biologically equally significant, and that simultaneously clustering genes and conditions performs better than only clustering genes and assuming independent conditions. A collection of distinct equivalent clusterings can be summarized as a weighted graph on the set of genes, from which we extract fuzzy, overlapping clusters using a graph spectral method. The cores of these fuzzy clusters contain tight sets of strongly coexpressed genes, while the overlaps exhibit relations between genes showing only partial coexpression. GaneSh, a Java package for coclustering, is available under the terms of the GNU General Public License from our website at http://bioinformatics.psb.ugent.be/software

  3. A two-stage method for microcalcification cluster segmentation in mammography by deformable models

    Energy Technology Data Exchange (ETDEWEB)

    Arikidis, N.; Kazantzi, A.; Skiadopoulos, S.; Karahaliou, A.; Costaridou, L., E-mail: costarid@upatras.gr [Department of Medical Physics, School of Medicine, University of Patras, Patras 26504 (Greece); Vassiou, K. [Department of Anatomy, School of Medicine, University of Thessaly, Larissa 41500 (Greece)

    2015-10-15

    Purpose: Segmentation of microcalcification (MC) clusters in x-ray mammography is a difficult task for radiologists. Accurate segmentation is a prerequisite for quantitative image analysis of MC clusters and subsequent feature extraction and classification in computer-aided diagnosis schemes. Methods: In this study, a two-stage semiautomated segmentation method of MC clusters is investigated. The first stage is targeted to accurate and time efficient segmentation of the majority of the particles of a MC cluster, by means of a level set method. The second stage is targeted to shape refinement of selected individual MCs, by means of an active contour model. Both methods are applied in the framework of a rich scale-space representation, provided by the wavelet transform at integer scales. Segmentation reliability of the proposed method in terms of inter and intraobserver agreements was evaluated in a case sample of 80 MC clusters originating from the digital database for screening mammography, corresponding to 4 morphology types (punctate: 22, fine linear branching: 16, pleomorphic: 18, and amorphous: 24) of MC clusters, assessing radiologists’ segmentations quantitatively by two distance metrics (Hausdorff distance, HDIST_cluster; average of minimum distance, AMINDIST_cluster) and the area overlap measure (AOM_cluster). The effect of the proposed segmentation method on MC cluster characterization accuracy was evaluated in a case sample of 162 pleomorphic MC clusters (72 malignant and 90 benign). Ten MC cluster features, targeted to capture morphologic properties of individual MCs in a cluster (area, major length, perimeter, compactness, and spread), were extracted and a correlation-based feature selection method yielded a feature subset to feed in a support vector machine classifier. Classification performance of the MC cluster features was estimated by means of the area under receiver operating characteristic curve (Az ± Standard Error) utilizing
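
    Two of the agreement measures named above can be sketched on toy pixel sets. The definitions used here are the common textbook ones (symmetric Hausdorff distance between point sets, and intersection over union as the area overlap measure); whether the paper's HDIST_cluster and AOM_cluster are computed exactly this way is an assumption, and the two 3x3 "segmentations" are invented.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def area_overlap(a, b):
    """Area overlap measure, taken here as |A intersect B| / |A union B| on pixel sets."""
    return len(a & b) / len(a | b)


# Two hypothetical segmentations: 3x3 pixel squares, the second shifted by one pixel in x.
seg_a = {(x, y) for x in range(0, 3) for y in range(3)}
seg_b = {(x, y) for x in range(1, 4) for y in range(3)}

hdist = hausdorff(seg_a, seg_b)
aom = area_overlap(seg_a, seg_b)
print(f"Hausdorff distance = {hdist}, area overlap = {aom}")
```

    Lower Hausdorff distance and higher overlap both indicate closer agreement between two observers' segmentations.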

  4. Heuristic methods using grasp, path relinking and variable neighborhood search for the clustered traveling salesman problem

    Directory of Open Access Journals (Sweden)

    Mário Mestria

    2013-08-01

    Full Text Available The Clustered Traveling Salesman Problem (CTSP) is a generalization of the Traveling Salesman Problem (TSP) in which the set of vertices is partitioned into disjoint clusters and the objective is to find a minimum cost Hamiltonian cycle such that the vertices of each cluster are visited contiguously. The CTSP is NP-hard and, in this context, we propose heuristic methods for the CTSP using GRASP, Path Relinking and Variable Neighborhood Descent (VND). The heuristic methods were tested using Euclidean instances with up to 2000 vertices and clusters varying between 4 and 150 vertices. The computational tests were performed to compare the performance of the heuristic methods with an exact algorithm using the Parallel CPLEX software. The computational results showed that the hybrid heuristic method using VND outperforms other heuristic methods.
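
    The CTSP constraint and objective described above can be made concrete with a small checker. This sketch only evaluates a given tour (cost of the closed cycle, and whether each cluster is visited contiguously); it does not implement GRASP, Path Relinking or VND, and the four-vertex instance is invented.

```python
import math


def tour_cost(tour, coords):
    """Euclidean length of the closed cycle that visits the vertices in tour order."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))


def clusters_contiguous(tour, cluster_of):
    """True if every cluster appears as one unbroken block along the cyclic tour."""
    n = len(tour)
    labels = [cluster_of[v] for v in tour]
    # Count label changes around the cycle; with k > 1 clusters, each cluster
    # forms a single block exactly when there are k changes.
    changes = sum(labels[i] != labels[(i + 1) % n] for i in range(n))
    k = len(set(labels))
    return changes == (k if k > 1 else 0)


# Hypothetical instance: corners of a unit square, two clusters of two vertices each.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
cluster_of = {0: "a", 1: "a", 2: "b", 3: "b"}

feasible = [0, 1, 2, 3]
infeasible = [0, 2, 1, 3]
print(clusters_contiguous(feasible, cluster_of), tour_cost(feasible, coords))
print(clusters_contiguous(infeasible, cluster_of), tour_cost(infeasible, coords))
```

    A heuristic for the CTSP would search over tours that pass the contiguity check while trying to reduce the cost.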

  5. The Case for a Hierarchical Cosmology

    Science.gov (United States)

    Vaucouleurs, G. de

    1970-01-01

    The development of modern theoretical cosmology is presented and some questionable assumptions of orthodox cosmology are pointed out. Suggests that recent observations indicate that hierarchical clustering is a basic factor in cosmology. The implications of hierarchical models of the universe are considered. Bibliography. (LC)

  6. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang

    2017-02-16

    Attributed graph clustering, also known as community detection on attributed graphs, has attracted much interest recently due to the ubiquity of attributed graphs in real life. Many algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  7. Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods.

    Science.gov (United States)

    Šubelj, Lovro; van Eck, Nees Jan; Waltman, Ludo

    2016-01-01

    Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.

  8. Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods.

    Directory of Open Access Journals (Sweden)

    Lovro Šubelj

    Full Text Available Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.

  9. Cluster membership probabilities from proper motions and multi-wavelength photometric catalogues. I. Method and application to the Pleiades cluster

    Science.gov (United States)

    Sarro, L. M.; Bouy, H.; Berihuete, A.; Bertin, E.; Moraux, E.; Bouvier, J.; Cuillandre, J.-C.; Barrado, D.; Solano, E.

    2014-03-01

    Context. With the advent of deep wide surveys, large photometric and astrometric catalogues of literally all nearby clusters and associations have been produced. The unprecedented accuracy and sensitivity of these data sets and their broad spatial, temporal and wavelength coverage make obsolete the classical membership selection methods that were based on a handful of colours and luminosities. We present a new technique designed to take full advantage of the high dimensionality (photometric, astrometric, temporal) of such a survey to derive self-consistent and robust membership probabilities of the Pleiades cluster. Aims: We aim at developing a methodology to infer membership probabilities to the Pleiades cluster from the DANCe multidimensional astro-photometric data set in a consistent way throughout the entire derivation. The determination of the membership probabilities has to be applicable to censored data and must incorporate the measurement uncertainties into the inference procedure. Methods: We use Bayes' theorem and a curvilinear forward model for the likelihood of the measurements of cluster members in the colour-magnitude space, to infer posterior membership probabilities. The distribution of the cluster members' proper motions and the distribution of contaminants in the full multidimensional astro-photometric space are modelled with a mixture-of-Gaussians likelihood. Results: We analyse several representation spaces composed of the proper motions plus a subset of the available magnitudes and colour indices. We select two prominent representation spaces composed of variables selected using feature relevance determination techniques based on Random Forests, and analyse the resulting samples of high probability candidates. We consistently find lists of high probability (p > 0.9975) candidates with ≈1000 sources, 4 to 5 times more than obtained in the most recent astro-photometric studies of the cluster. Conclusions: Multidimensional data sets require
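
    The role of Bayes' theorem in a membership calculation can be shown with a deliberately reduced sketch: one measured quantity, one Gaussian for cluster members and one for field contaminants. The paper's actual model is multidimensional and uses a curvilinear forward model; the component parameters, the prior, and the way the measurement error is folded in (added in quadrature to each component's width) are assumptions made for illustration, not values from the study.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def membership_probability(x, err, prior, cluster, field):
    """Posterior P(member | x) for a two-component Gaussian mixture.

    cluster and field are (mean, sigma) pairs; err is the measurement
    uncertainty, combined in quadrature with each component's intrinsic width.
    """
    c_mu, c_sigma = cluster
    f_mu, f_sigma = field
    like_cluster = normal_pdf(x, c_mu, math.hypot(c_sigma, err))
    like_field = normal_pdf(x, f_mu, math.hypot(f_sigma, err))
    numerator = prior * like_cluster
    return numerator / (numerator + (1.0 - prior) * like_field)


# Hypothetical one-dimensional "proper motion" model, arbitrary units.
cluster = (20.0, 1.0)   # narrow distribution of cluster members
field = (0.0, 10.0)     # broad distribution of contaminants
prior = 0.2             # assumed fraction of sources that are members

p_near = membership_probability(20.0, 0.0, prior, cluster, field)
p_far = membership_probability(0.0, 0.0, prior, cluster, field)
print(f"P(member | x=20) = {p_near:.3f}, P(member | x=0) = {p_far:.3g}")
```

    A source would be kept as a candidate when its posterior exceeds a chosen threshold, such as the p > 0.9975 cut quoted in the abstract.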

  10. Heuristic methods using variable neighborhood random local search for the clustered traveling salesman problem

    Directory of Open Access Journals (Sweden)

    Mário Mestria

    2014-11-01

    Full Text Available In this paper, we propose new heuristic methods for solving the Clustered Traveling Salesman Problem (CTSP). The CTSP is a generalization of the Traveling Salesman Problem (TSP) in which the set of vertices is partitioned into disjoint clusters and the objective is to find a minimum cost Hamiltonian cycle such that the vertices of each cluster are visited contiguously. We develop two Variable Neighborhood Random Descent heuristics with Iterated Local Search for solving the CTSP. The proposed heuristic methods were tested on instance types with data at different levels of granularity for the number of vertices and clusters. The computational results showed that the heuristic methods outperform recent existing methods in the literature and that they are competitive with an exact algorithm using the Parallel CPLEX software.

  11. Using the SaTScan method to detect local malaria clusters for guiding malaria control programmes

    Directory of Open Access Journals (Sweden)

    Kok Gerdalize

    2009-04-01

    Full Text Available Background: Mpumalanga Province, South Africa is a low malaria transmission area that is subject to malaria epidemics. SaTScan methodology was used by the malaria control programme to detect local malaria clusters to assist disease control planning. The third season for case cluster identification overlapped with the first season of implementing an outbreak identification and response system in the area. Methods: SaTScan™ software using the Kulldorff method of retrospective space-time permutation and the Bernoulli purely spatial model was used to identify malaria clusters using definitively confirmed individual cases in seven towns over three malaria seasons. Following passive case reporting at health facilities during the 2002 to 2005 seasons, active case detection was carried out in the communities; this assisted with determining the probable source of infection. The distribution and statistical significance of the clusters were explored by means of Monte Carlo replication of data sets under the null hypothesis with replications greater than 999 to ensure adequate power for defining clusters. Results and discussion: SaTScan detected five space-clusters and two space-time clusters during the study period. There was strong concordance between recognized local clustering of cases and outbreak declaration in specific towns. Both Albertsnek and Thambokulu reported malaria outbreaks in the same season as space-time clusters. This synergy may allow mutual validation of the two systems in confirming outbreaks demanding additional resources and cluster identification at local level to better target resources. Conclusion: Exploring the clustering of cases assisted with the planning of public health activities, including mobilizing health workers and resources. Where appropriate, additional indoor residual spraying, focal larviciding and health promotion activities were also carried out.
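
    The Monte Carlo significance step mentioned above follows a general pattern: compute a statistic on the observed data, recompute it on data sets simulated under the null hypothesis, and rank the observed value among them. The sketch below shows that pattern with 999 replications. It is not SaTScan's scan statistic: the test statistic (largest case count in any town), the uniform null model, and the case counts are all simplifying assumptions.

```python
import random


def monte_carlo_p_value(observed, simulated):
    """Rank-based p-value: share of replicates at least as extreme as the observed statistic."""
    exceed = sum(s >= observed for s in simulated)
    return (1 + exceed) / (1 + len(simulated))


def max_count(n_cases, n_towns, rng):
    """Simulate one null data set: each case falls in a town chosen uniformly at random."""
    counts = [0] * n_towns
    for _ in range(n_cases):
        counts[rng.randrange(n_towns)] += 1
    return max(counts)


# Hypothetical case counts for seven towns; the first town looks like a cluster.
rng = random.Random(42)
observed_counts = [30, 8, 7, 6, 7, 6, 6]
observed_stat = max(observed_counts)
simulated_stats = [
    max_count(sum(observed_counts), len(observed_counts), rng) for _ in range(999)
]
p_value = monte_carlo_p_value(observed_stat, simulated_stats)
print(f"observed max = {observed_stat}, p = {p_value}")
```

    With 999 replications the smallest attainable p-value is 1/1000, which is why replication counts of this form are commonly chosen.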

  12. Hierarchical Porous Structures

    Energy Technology Data Exchange (ETDEWEB)

    Grote, Christopher John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-07

    Materials design is often at the forefront of technological innovation. While there has always been a push to generate increasingly low density materials, such as aerogels or hydrogels, more recently the idea of bicontinuous structures has come more into play. This review will cover some of the methods and applications for generating both porous and hierarchically porous structures.

  13. A Spatial Division Clustering Method and Low Dimensional Feature Extraction Technique Based Indoor Positioning System

    Directory of Open Access Journals (Sweden)

    Yun Mo

    2014-01-01

    Full Text Available Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins, therefore clustering methods are widely applied as a solution. However, traditional clustering methods in positioning systems can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect.

  14. A spatial division clustering method and low dimensional feature extraction technique based indoor positioning system.

    Science.gov (United States)

    Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao

    2014-01-22

    Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins, therefore clustering methods are widely applied as a solution. However, traditional clustering methods in positioning systems can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect.

  15. WEIGHING GALAXY CLUSTERS WITH GAS. I. ON THE METHODS OF COMPUTING HYDROSTATIC MASS BIAS

    Energy Technology Data Exchange (ETDEWEB)

    Lau, Erwin T.; Nagai, Daisuke [Department of Physics, Yale University, New Haven, CT 06520 (United States); Nelson, Kaylea, E-mail: erwin.lau@yale.edu [Department of Astronomy, Yale University, New Haven, CT 06520 (United States)

    2013-11-10

    Mass estimates of galaxy clusters from X-ray and Sunyaev-Zel'dovich observations assume the intracluster gas is in hydrostatic equilibrium with their gravitational potential. However, since galaxy clusters are dynamically active objects whose dynamical states can deviate significantly from the equilibrium configuration, the departure from the hydrostatic equilibrium assumption is one of the largest sources of systematic uncertainties in cluster cosmology. In the literature there have been two methods for computing the hydrostatic mass bias based on the Euler and the modified Jeans equations, respectively, and there has been some confusion about the validity of these two methods. The word 'Jeans' was a misnomer, which incorrectly implies that the gas is collisionless. To avoid further confusion, we instead refer to these methods as 'summation' and 'averaging' methods respectively. In this work, we show that these two methods for computing the hydrostatic mass bias are equivalent by demonstrating that the equation used in the second method can be derived from taking spatial averages of the Euler equation. Specifically, we identify the correspondences of individual terms in these two methods mathematically and show that these correspondences are valid to within a few percent level using hydrodynamical simulations of galaxy cluster formation. In addition, we compute the mass bias associated with the acceleration of gas and show that its contribution is small in the virialized regions in the interior of galaxy clusters, but becomes non-negligible in the outskirts of massive galaxy clusters. We discuss future prospects of understanding and characterizing biases in the mass estimate of galaxy clusters using both hydrodynamical simulations and observations and their implications for cluster cosmology.
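
    For orientation, the hydrostatic equilibrium assumption mentioned in this abstract is usually written as the textbook relation below. It is not quoted from the paper: it assumes spherical symmetry, and the symbols (thermal gas pressure P, gas density ρ_gas, gravitational constant G) are introduced here for illustration only. The hydrostatic mass bias discussed in the abstract refers to the discrepancy between this estimate and the true enclosed mass.

```latex
% Textbook hydrostatic mass estimate under spherical symmetry.
% Symbols are illustrative and not taken from the record above.
\[
  M_{\mathrm{HSE}}(<r) \;=\; -\,\frac{r^{2}}{G\,\rho_{\mathrm{gas}}(r)}\,
  \frac{\mathrm{d}P(r)}{\mathrm{d}r}
\]
```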

  16. Analysis of cost data in a cluster-randomized, controlled trial: comparison of methods

    DEFF Research Database (Denmark)

    Sokolowski, Ineta; Ørnbøl, Eva; Rosendal, Marianne

    is commonly used for skewed distributions. For health care data, however, we need to recover the total cost in a given patient population. Thus, we focus on making inferences on population means. Furthermore, a problem of clustered data is added as data related to patients in primary care are organized...... in clusters of general practices. There have been suggestions to apply different methods, e.g., the non-parametric bootstrap, to highly skewed data from pragmatic randomized trials without clusters, but there is very little information about how to analyse skewed data from cluster-randomized trials. Many...... We consider health care data from a cluster-randomized intervention study in primary care to test whether the average health care costs among study patients differ between the two groups. The problem of analysing cost data is that most data are severely skewed. Median instead of mean...

  17. Social Influence on Information Technology Adoption and Sustained Use in Healthcare: A Hierarchical Bayesian Learning Method Analysis

    Science.gov (United States)

    Hao, Haijing

    2013-01-01

    Information technology adoption and diffusion is currently a significant challenge in the healthcare delivery setting. This thesis includes three papers that explore social influence on information technology adoption and sustained use in the healthcare delivery environment using conventional regression models and novel hierarchical Bayesian…

  18. Fabrication of hierarchical porous N-doping carbon membrane by using "confined nanospace deposition" method for supercapacitor

    Science.gov (United States)

    Wang, Guoxu; Liu, Meng; Du, Juan; Liu, Lei; Yu, Yifeng; Sha, Jitong; Chen, Aibing

    2018-03-01

    Membrane carbon materials with hierarchical porous architecture are attractive because they can provide more channels for ion transport and shorten the ion transport path. Herein, we develop a facile way based on "confined nanospace deposition" to fabricate N-doping three-dimensional hierarchical porous membrane carbon material (N-THPMC) via coating nickel nitrate, silicate oligomers and triblock copolymer P123 on the branches of commercial polyamide membrane (PAM). During high temperature treatment, the mesoporous silica layer and Ni species serve as a "confined nanospace" and catalyst respectively, which are indispensable elements for formation of the carbon framework, and the gas-phase carbon precursors which derive from the decomposition of PAM are deposited into the "confined nanospace", forming the carbon framework. The N-THPMC with hierarchical macro/meso/microporous structure, N-doping (2.9%) and large specific surface area (994 m² g⁻¹) well inherits the membrane morphology and hierarchical porous structure of PAM. The N-THPMC as electrode without binder exhibits a specific capacitance of 252 F g⁻¹ at the current density of 1 A g⁻¹ in 6 M KOH electrolyte and excellent cycling stability of 92.7% even after 5000 cycles.

  19. Comparison of Bayesian clustering and edge detection methods for inferring boundaries in landscape genetics

    Science.gov (United States)

    Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.

    2011-01-01

    Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.

  20. Unsupervised Learning of Structural Representation of Percussive Audio Using a Hierarchical Dirichlet Process Hidden Markov Model

    DEFF Research Database (Denmark)

    Antich, Jose Luis Diez; Paterna, Mattia; Marxer, Richard

    2016-01-01

    A method is proposed that extracts a structural representation of percussive audio in an unsupervised manner. It consists of two parts: 1) The input signal is segmented into blocks of approximately even duration, aligned to a metrical grid, using onset and timbre feature extraction, agglomerative...... single-linkage clustering, metrical regularity calculation and beat detection. 2) The approx. equal length blocks are clustered into k clusters and the resulting cluster sequence is modelled by transition probabilities between clusters. The Hierarchical Dirichlet Process Hidden Markov Model is employed...... to jointly estimate the optimal number of sound clusters, to cluster the blocks, and to estimate the transition probabilities between clusters. The result is a segmentation of the input into a sequence of symbols (typically corresponding to hits of hi-hat, snare, bass, cymbal, etc.) that can be evaluated...

  1. Identification of specific gait patterns in patients with cerebellar ataxia, spastic paraplegia, and Parkinson's disease: A non-hierarchical cluster analysis.

    Science.gov (United States)

    Serrao, Mariano; Chini, Giorgia; Bergantino, Matteo; Sarnari, Diego; Casali, Carlo; Conte, Carmela; Ranavolo, Alberto; Marcotulli, Christian; Rinaldi, Martina; Coppola, Gianluca; Bini, Fabiano; Pierelli, Francesco; Marinozzi, Franco

    2017-09-26

    Patients with degenerative neurological diseases such as cerebellar ataxia, spastic paraplegia, and Parkinson's disease often display progressive gait function decline that inexorably impacts their autonomy and quality of life. Therefore, considering the related social and economic costs, one of the most important areas of intervention in neurorehabilitation should be the treatment of gait abnormalities. This study aims to determine whether an entire dataset of gait parameters recorded in patients with degenerative neurological diseases can be clustered into homogeneous groups distinct from each other and from healthy subjects. Patients affected by three different types of primary degenerative neurological diseases were studied. These diseases were: i) cerebellar ataxia (28 patients), ii) hereditary spastic paraplegia (31 patients), and iii) Parkinson's disease (70 patients). Sixty-five gender-age-matched healthy subjects were enrolled as a control group. An optoelectronic motion analysis system was used to measure time-distance parameters and lower limb joint kinematics during gait in both patients and healthy controls. When clustering single parameters, step width and ankle joint range of motion (RoM) in the sagittal plane differentiated the cerebellar ataxia group from the other groups. When clustering sets of two, three, or four parameters, several pairs, triples, and quadruples of clusters differentiated the cerebellar ataxia group from the other groups. Interestingly, the ankle joint RoM parameter was present in 100% of the clusters and the step width in approximately 50% of clusters. In addition, in almost all clusters, patients with cerebellar ataxia showed the lowest ankle joint RoM and the largest step width values compared to healthy controls, patients with hereditary spastic paraplegia, and Parkinson's disease subjects. This study identified several clusters reflecting specific gait patterns in patients with degenerative neurological diseases.
In particular

  2. A new method to search for high-redshift clusters using photometric redshifts

    Energy Technology Data Exchange (ETDEWEB)

    Castignani, G.; Celotti, A. [SISSA, Via Bonomea 265, I-34136 Trieste (Italy); Chiaberge, M.; Norman, C., E-mail: castigna@sissa.it [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States)

    2014-09-10

    We describe a new method (Poisson probability method, PPM) to search for high-redshift galaxy clusters and groups by using photometric redshift information and galaxy number counts. The method relies on Poisson statistics and is primarily introduced to search for megaparsec-scale environments around a specific beacon. The PPM is tailored to both the properties of the FR I radio galaxies in the Chiaberge et al. sample, which are selected within the COSMOS survey, and to the specific data set used. We test the efficiency of our method of searching for cluster candidates against simulations. Two different approaches are adopted. (1) We use two z ∼ 1 X-ray detected cluster candidates found in the COSMOS survey and we shift them to higher redshift up to z = 2. We find that the PPM detects the cluster candidates up to z = 1.5, and it correctly estimates both the redshift and size of the two clusters. (2) We simulate spherically symmetric clusters of different size and richness, and we locate them at different redshifts (i.e., z = 1.0, 1.5, and 2.0) in the COSMOS field. We find that the PPM detects the simulated clusters within the considered redshift range with a statistical 1σ redshift accuracy of ∼0.05. The PPM is an efficient alternative method for high-redshift cluster searches that may also be applied to both present and future wide field surveys such as SDSS Stripe 82, LSST, and Euclid. Accurate photometric redshifts and a survey depth similar to or better than that of COSMOS (e.g., I < 25) are required.
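
    The abstract says the PPM relies on Poisson statistics applied to galaxy number counts but gives no formulas. The sketch below is therefore not the PPM itself. It only illustrates the generic building block such a method could use: the probability of counting at least a given number of galaxies in a region when the background expectation is known. The function name and parameters are hypothetical.

```python
import math


def poisson_tail(n_obs: int, mean: float) -> float:
    """Return P(N >= n_obs) for N ~ Poisson(mean).

    A small value means the observed count is unlikely to be a background
    fluctuation, i.e. the region looks overdense.
    """
    if mean < 0:
        raise ValueError("mean must be non-negative")
    if n_obs <= 0:
        return 1.0
    # Accumulate P(N = k) for k = 0 .. n_obs - 1, building each term
    # from the previous one to avoid large factorials.
    term = math.exp(-mean)
    cdf = term
    for k in range(1, n_obs):
        term *= mean / k
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical numbers: 12 galaxies counted where 4.0 are expected on average.
p_value = poisson_tail(12, 4.0)
```

    In a real search the expected count would have to come from the survey itself (for example, the mean field density in the same photometric redshift slice); that step is not shown here.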

  3. A Spatial Shape Constrained Clustering Method for Mammographic Mass Segmentation

    Directory of Open Access Journals (Sweden)

    Jian-Yong Lou

    2015-01-01

    error of 7.18% for well-defined masses (or 8.06% for ill-defined masses) was obtained by using DACF on MiniMIAS database, with 5.86% (or 5.55%) and 6.14% (or 5.27%) improvements as compared to the standard DA and fuzzy c-means methods.

  4. Adaptive cluster sampling: An efficient method for assessing inconspicuous species

    Science.gov (United States)

    Andrea M. Silletti; Joan Walker

    2003-01-01

    Restorationists typically evaluate the success of a project by estimating the population sizes of species that have been planted or seeded. Because total census is rarely feasible, they must rely on sampling methods for population estimates. However, traditional random sampling designs may be inefficient for species that, for one reason or another, are challenging to...

  5. A Comparison of Methods for Player Clustering via Behavioral Telemetry

    DEFF Research Database (Denmark)

    Drachen, Anders; Thurau, Christian; Sifa, Rafet

    2013-01-01

    can be exceptionally complex, with features recorded for a varying population of users over a temporal segment that can reach years in duration. Categorization of behaviors, whether through descriptive methods (e.g. segmentation) or unsupervised/supervised learning techniques, is valuable for finding...

  6. Atmospheric Cluster Dynamics Code: a flexible method for solution of the birth-death equations

    Science.gov (United States)

    McGrath, M. J.; Olenius, T.; Ortega, I. K.; Loukonen, V.; Paasonen, P.; Kurtén, T.; Kulmala, M.; Vehkamäki, H.

    2012-03-01

    The Atmospheric Cluster Dynamics Code (ACDC) is presented and explored. This program was created to study the first steps of atmospheric new particle formation by examining the formation of molecular clusters from atmospherically relevant molecules. The program models the cluster kinetics by explicit solution of the birth-death equations, using an efficient computer script for their generation and the MATLAB ode15s routine for their solution. Through the use of evaporation rate coefficients derived from formation free energies calculated by quantum chemical methods for clusters containing dimethylamine or ammonia and sulfuric acid, we have explored the effect of changing various parameters at atmospherically relevant monomer concentrations. We have included in our model clusters with 0-4 base molecules and 0-4 sulfuric acid molecules for which we have commensurable quantum chemical data. The tests demonstrate that large effects can be seen for even small changes in different parameters, due to the non-linearity of the system. In particular, changing the temperature had a significant impact on the steady-state concentrations of all clusters, while the boundary effects (allowing clusters to grow to sizes beyond the largest cluster that the code keeps track of, or forbidding such processes), coagulation sink terms, non-monomer collisions, sticking probabilities and monomer concentrations did not show as large effects under the conditions studied. Removal of coagulation sink terms prevented the system from reaching the steady state when all the initial cluster concentrations were set to the default value of 1 m⁻³, which is probably an effect caused by studying only relatively small cluster sizes.
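
    To make the phrase "birth-death equations" concrete, here is a deliberately simplified toy in Python. It is not ACDC: it tracks a single-component chain of cluster sizes, uses one size-independent collision coefficient and one evaporation coefficient (both hypothetical), holds the monomer concentration fixed, and integrates with plain forward Euler instead of the stiff MATLAB ode15s solver named in the abstract.

```python
def simulate_birth_death(n_max, c1, beta, gamma, dt, steps):
    """Integrate a toy birth-death chain with forward Euler.

    c[i] is the concentration of clusters containing i + 1 molecules.
    The monomer concentration c[0] is held fixed at c1.
    beta  : collision coefficient for monomer + cluster (assumed constant)
    gamma : evaporation coefficient for losing one monomer (assumed constant)
    """
    c = [c1] + [0.0] * (n_max - 1)
    for _ in range(steps):
        # Net flux from size i + 1 to size i + 2: growth minus evaporation.
        flux = [beta * c1 * c[i] - gamma * c[i + 1] for i in range(n_max - 1)]
        # Open boundary: clusters growing past the largest tracked size
        # leave the system.
        flux.append(beta * c1 * c[-1])
        for i in range(1, n_max):
            c[i] += dt * (flux[i - 1] - flux[i])
    return c


# Hypothetical, dimensionless parameters chosen only for illustration.
steady = simulate_birth_death(n_max=4, c1=1.0, beta=1.0, gamma=0.0,
                              dt=0.01, steps=5000)
```

    Forward Euler is only adequate here because the toy rates are mild; realistic evaporation rates span many orders of magnitude, which is why a stiff solver is the usual choice.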

  7. K-Line Patterns’ Predictive Power Analysis Using the Methods of Similarity Match and Clustering

    OpenAIRE

    Lv Tao; Yongtao Hao; Hao Yijie; Shen Chunfeng

    2017-01-01

    Stock price prediction based on K-line patterns is the essence of candlestick technical analysis. However, there are some disputes on whether the K-line patterns have predictive power in academia. To help resolve the debate, this paper uses the data mining methods of pattern recognition, pattern clustering, and pattern knowledge mining to research the predictive power of K-line patterns. The similarity match model and nearest neighbor-clustering algorithm are proposed for solving the problem ...

  8. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    OpenAIRE

    Sen Zhang; Yongquan Zhou

    2015-01-01

    One heuristic evolutionary algorithm recently proposed is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on Powell local optimization method, and we call it PGWO. PGWO algorithm significantly improves the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique. Hence, the PGWO could be applied in solving cluster...

  9. Intraclass Correlation Coefficients in Hierarchical Designs: Evaluation Using Latent Variable Modeling

    Science.gov (United States)

    Raykov, Tenko

    2011-01-01

    Interval estimation of intraclass correlation coefficients in hierarchical designs is discussed within a latent variable modeling framework. A method accomplishing this aim is outlined, which is applicable in two-level studies where participants (or generally lower-order units) are clustered within higher-order units. The procedure can also be…
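
    As a reminder of the quantity being estimated, the standard definition of the intraclass correlation coefficient in a two-level random-intercept model is given below. This is the textbook form, not an equation quoted from the article, and the notation (u_j for the higher-order unit effect, ε_ij for the lower-order residual) is introduced here for illustration.

```latex
% Two-level random-intercept model and the resulting ICC.
% Notation is illustrative and not taken from the record above.
\[
  y_{ij} = \mu + u_j + \varepsilon_{ij}, \qquad
  \operatorname{Var}(u_j) = \sigma_u^{2}, \quad
  \operatorname{Var}(\varepsilon_{ij}) = \sigma_\varepsilon^{2},
\]
\[
  \rho_{\mathrm{ICC}} = \frac{\sigma_u^{2}}{\sigma_u^{2} + \sigma_\varepsilon^{2}} .
\]
```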

  10. Morphologically tuned 3D/1D rutile TiO2 hierarchical hybrid microarchitectures engineered by one-step surfactant free hydrothermal method

    Science.gov (United States)

    Maria John, Maria Angelin Sinthiya; Ramamurthi, K.; Sethuraman, K.; Ramesh Babu, R.

    2017-05-01

    The present investigation reports on the surfactant-free hydrothermal synthesis of morphologically tuned hierarchical hybrid rutile titanium oxide (TiO2) microarchitectures, showing three-dimensional microflower structures and cook pine tree-like structures on the one-dimensional nanorods formed over TiO2 seed layer coated glass substrates by tuning the growth temperature. A TiO2 seed layer ∼100 nm thick was coated on the glass substrates employing the sol-gel spin coating method, and then rutile TiO2 microarchitectures were synthesized on the TiO2 seed layer by a one-step surfactant-free hydrothermal method. Deposited samples were characterized by X-ray diffraction, scanning electron microscopy, energy dispersive spectroscopy, UV-vis spectroscopy and photoluminescence spectroscopy techniques. The influence of the growth temperature on the crystallinity, morphology and optical properties, along with the growth mechanism to achieve hierarchical microarchitectures, was investigated. The present work revealed that the structural, morphological and optical properties of the TiO2 hierarchical microarchitectures strongly depend on the growth temperature. Further, we proposed a model for the possible morphological changes of rutile TiO2 microarchitectures as a function of growth temperature on the TiO2 seeded glass substrates.

  11. A multidisciplinary coupling relationship coordination algorithm using the hierarchical control methods of complex systems and its application in multidisciplinary design optimization

    Directory of Open Access Journals (Sweden)

    Rong Yuan

    2016-12-01

    Full Text Available Because of the increasing complexity in engineering systems, multidisciplinary design optimization has attracted increasing attention. High computational expense and organizational complexity are two main challenges of multidisciplinary design optimization. To address these challenges, the hierarchical control method of complex systems is developed in this study. The hierarchical control method is a powerful approach that has been utilized widely in the control and coordination of large-scale complex systems. Here, a hierarchical control method–based coupling relationship coordination algorithm is proposed to solve multidisciplinary design optimization problems. The coupling relationship coordination algorithm decouples the involved disciplines of a complex system and then optimizes each discipline objective at sub-system level. It can maintain the consistency of interaction information (or in other words, sharing design variables and coupling design variables) in different disciplines by introducing control parameters. The control parameters are assigned by the coordinator at system level. A mechanical structure multidisciplinary design optimization problem is solved to illustrate the details of the proposed approach.

  12. Pruning method for a cluster-based neural network

    Science.gov (United States)

    Ranney, Kenneth I.; Khatri, Hiralal; Nguyen, Lam H.; Sichina, Jeffrey

    2000-08-01

    Many radar automatic target detection (ATD) algorithms operate on a set of data statistics or features rather than on the raw radar sensor data. These features are selected based on their ability to separate target data samples from background clutter samples. The ATD algorithms often operate on the features through a set of parameters that must be determined from a set of training data that are statistically similar to the data set to be encountered in practice. The designer usually attempts to minimize the number of features used by the algorithm -- a process commonly referred to as pruning. This not only reduces the computational demands of the algorithm, but it also prevents overspecialization to the samples from the training data set. Thus, the algorithm will perform better on a set of test data samples it has not encountered during training. The Optimal Brain Surgeon (OBS) and Divergence Method provide two different approaches to pruning. We apply the two methods to a set of radar data features to determine a new, reduced set of features. We then evaluate the resulting feature sets and discuss the differences between the two methods.

  13. A semantics-based method for clustering of Chinese web search results

    Science.gov (United States)

    Zhang, Hui; Wang, Deqing; Wang, Li; Bi, Zhuming; Chen, Yong

    2014-01-01

    Information explosion is a critical challenge to the development of modern information systems. In particular, when the application of an information system is over the Internet, the amount of information over the web has been increasing exponentially and rapidly. Search engines, such as Google and Baidu, are essential tools for people to find information on the Internet. Valuable information, however, is still likely submerged in the ocean of search results from those tools. By clustering the results into different groups based on subjects automatically, a search engine with the clustering feature allows users to select the most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use the HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate snippets' semantic similarities. Finally, we improve the Chameleon algorithm to cluster snippets. Extensive experimental results have shown that the proposed algorithm outperforms the suffix tree clustering method and other traditional clustering methods.

  14. An application of the KNND method for detecting nearby open clusters based on Gaia-DR1

    Science.gov (United States)

    Gao, Xin-Hua

    2017-05-01

    This paper presents a preliminary test of the k-th nearest neighbor distance (KNND) method for detecting nearby open clusters based on Gaia-DR1. We select 38 386 nearby stars (Ber) open clusters), and obtain 57 reliable cluster members. Based on these cluster members, the distances to the Hyades and Coma Ber clusters are determined to be 46.0±0.2 and 83.5±0.3 pc, respectively. Our results demonstrate that the KNND method can be used to detect open clusters based on a large volume of astrometry data.
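
    The abstract names the k-th nearest neighbor distance (KNND) but does not describe how it is computed or thresholded. The brute-force sketch below shows only the basic quantity: for each point, the distance to its k-th nearest neighbour, which is small in dense regions such as cluster cores. The function name, the choice of Euclidean distance and the sample coordinates are assumptions for illustration, not the paper's procedure.

```python
import math


def kth_nn_distances(points, k):
    """Return, for each point, the distance to its k-th nearest neighbour.

    Brute force, O(n^2 log n); fine for a demonstration, too slow for a
    full astrometric catalogue, where a spatial index would be needed.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


# Hypothetical coordinates: three points close together and one far away.
sample = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (10.0, 10.0, 10.0)]
knnd = kth_nn_distances(sample, k=1)
```

    Points with an unusually small KNND compared with the field population would be the candidates for cluster membership; how that cut is chosen is left open here.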

  15. Robustness of serial clustering of extratropical cyclones to the choice of tracking method

    Directory of Open Access Journals (Sweden)

    Joaquim G. Pinto

    2016-07-01

    Full Text Available Cyclone clusters are a frequent synoptic feature in the Euro-Atlantic area. Recent studies have shown that serial clustering of cyclones generally occurs on both flanks and downstream regions of the North Atlantic storm track, while cyclones tend to occur more regularly on the western side of the North Atlantic basin near Newfoundland. This study explores the sensitivity of serial clustering to the choice of cyclone tracking method using cyclone track data from 15 methods derived from ERA-Interim data (1979–2010). Clustering is estimated by the dispersion (ratio of variance to mean) of winter [December – February (DJF)] cyclone passages near each grid point over the Euro-Atlantic area. The mean number of cyclone counts and their variance are compared between methods, revealing considerable differences, particularly for the latter. Results show that all different tracking methods qualitatively capture similar large-scale spatial patterns of underdispersion and overdispersion over the study region. The quantitative differences can primarily be attributed to the differences in the variance of cyclone counts between the methods. Nevertheless, overdispersion is statistically significant for almost all methods over parts of the eastern North Atlantic and Western Europe, and is therefore considered as a robust feature. The influence of the North Atlantic Oscillation (NAO) on cyclone clustering displays a similar pattern for all tracking methods, with one maximum near Iceland and another between the Azores and Iberia. The differences in variance between methods are not related to different sensitivities to the NAO, which can account for over 50% of the clustering in some regions. We conclude that the general features of underdispersion and overdispersion of extratropical cyclones over the North Atlantic and Western Europe are robust to the choice of tracking method. The same is true for the influence of the NAO on cyclone dispersion.
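
    The dispersion measure described in this abstract, the ratio of variance to mean of cyclone counts, can be sketched in a few lines. For a Poisson process the variance equals the mean, so a ratio above 1 indicates overdispersion (clustering) and a ratio below 1 indicates underdispersion (regular occurrence). The function name and the example counts are hypothetical, and the paper's exact estimator (for example, whether it subtracts 1 or how it treats each grid point) is not given in the abstract.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the mean of a list of counts."""
    if len(counts) < 2:
        raise ValueError("need at least two counts")
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; ratio is undefined")
    return variance(counts) / m


# Hypothetical winter cyclone counts at one grid point, one value per season.
regular_counts = [2, 2, 2, 2]      # identical every winter
clustered_counts = [0, 0, 0, 8]    # same mean, concentrated in one winter

regular = dispersion_ratio(regular_counts)
clustered = dispersion_ratio(clustered_counts)
```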

  16. Analysis of the Earth's magnetosphere states using the algorithm of adaptive construction of hierarchical neural network classifiers

    Science.gov (United States)

    Dolenko, Sergey; Svetlov, Vsevolod; Isaev, Igor; Myagkova, Irina

    2017-10-01

    This paper presents an analysis of the results of clusterization of the array of increases in the flux of relativistic electrons in the outer radiation belt of the Earth by two clustering algorithms. One of them is the algorithm for adaptive construction of hierarchical neural network classifiers developed by the authors, applied in clustering mode; the other one is the well-known k-means clusterization algorithm. The obtained clusters are analysed from the point of view of their possible matching to characteristic types of events, and the partitions obtained by both methods are compared with each other.

  17. Analysis of the Earth's magnetosphere states using the algorithm of adaptive construction of hierarchical neural network classifiers

    Directory of Open Access Journals (Sweden)

    Dolenko Sergey

    2017-01-01

    Full Text Available This paper presents an analysis of the results of clusterization of the array of increases in the flux of relativistic electrons in the outer radiation belt of the Earth by two clustering algorithms. One of them is the algorithm for adaptive construction of hierarchical neural network classifiers developed by the authors, applied in clustering mode; the other one is the well-known k-means clusterization algorithm. The obtained clusters are analysed from the point of view of their possible matching to characteristic types of events, and the partitions obtained by both methods are compared with each other.

  18. A New Method for the Detection of Galaxy Clusters in X-Ray Surveys

    Energy Technology Data Exchange (ETDEWEB)

    Piacentine, J.M.; Marshall, P.J.; Peterson, J.R.; Andersson, K.E.

    2005-01-01

    For many years the power of counting clusters of galaxies as a function of their mass has been recognized as a powerful cosmological probe; however, we are only now beginning to acquire data from dedicated surveys with sufficient sky coverage and sensitivity to measure the cluster population out to distances where the dark energy came to dominate the Universe’s evolution. One such survey uses the XMM X-ray telescope to scan a large area of sky, detecting the X-ray photons from the hot plasma that lies in the deep potential wells of massive clusters of galaxies. These clusters appear as extended (not point-like) objects, each providing just a few hundred photons in a typical observation. The detection of extended sources in such a low signal-to-noise situation is an important problem in astrophysics: we attempt to solve it by using as much prior information as possible, translating our experience with well-measured clusters to define a “template” cluster that can be varied and matched to the features seen in the XMM images. In this work we adapt an existing Monte Carlo analysis code for this problem. Two detection templates were defined and their suitability explored using simulated data; the method was then applied to a publicly available XMM observation of a “blank” field. Presented are the encouraging results of this series of experiments, suggesting that this approach continue to be developed for future cluster-identification endeavours.

  19. Application of AI methods in the clustering of architecture interior forms

    Directory of Open Access Journals (Sweden)

    Maryam Banaei

    2017-09-01

    Full Text Available Form or shape is one of the main aspects of architecture design. A gap exists in scientific studies on categorizing different architecture interior forms according to design. This paper presents a methodology for categorizing interior forms of built places. The main innovation of this study was to evaluate the architecture interior forms of real built places as a base for any analysis on form. We proposed a clustering method by selecting 343 images of living rooms from residential places according to their history and interior design style. We labeled all the images in AutoCAD software depending on form features. The labeling results showed that images had 1104 distinct form features, including sloped, vertical and horizontal linear solids, and edges. Regarding the high dimension of data, we used Graphical Clustering Toolkit software for clustering, which involved the use of correlation coefficients and internal similarity among clusters. The clustering analysis grouped all the images into 25 clusters with the highest internal and the lowest external similarities. The descriptive features of each cluster could show its formal characteristics.

  20. Appropriate statistical methods were infrequently used in cluster-randomized crossover trials.

    Science.gov (United States)

    Arnup, Sarah J; Forbes, Andrew B; Kahan, Brennan C; Morgan, Katy E; McKenzie, Joanne E

    2016-06-01

    To assess the design and statistical methods used in cluster-randomized crossover (CRXO) trials. We undertook a systematic review of CRXO trials. Searches of MEDLINE, EMBASE, and CINAHL Plus; and citation searches of CRXO methodological articles were conducted to December 2014. We extracted data on design characteristics and statistical methods for sample size, data analysis, and handling of missing data. Ninety-one trials including 139 end point analyses met the inclusion criteria. Trials had a median of nine clusters [interquartile range (IQR), 4-21] and median cluster-period size of 30 individuals (IQR, 14-77); 58 (69%) trials had two periods, and 27 trials (30%) included the same individuals in all periods. A rationale for the design was reported in only 25 trials (27%). A sample size justification was provided in 53 (58%) trials. Only nine (10%) trials accounted appropriately for the design in their sample size calculation. Ten of the 12 cluster-level analyses used a method that accounted for the clustering and multiple-period aspects of the design. In contrast, only 4 of the 127 individual-level analyses used a potentially appropriate method. There is a need for improved application of appropriate analysis and sample size methods, and reporting, in CRXO trials. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. An effective trust-based recommendation method using a novel graph clustering algorithm

    Science.gov (United States)

    Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin

    2015-10-01

    Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. However, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method that applies a novel graph clustering algorithm and also considers trust statements. In the proposed method, the problem space is first represented as a graph, and a sparsest-subgraph-finding algorithm is then applied to the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate user/item clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.

  2. System and Method for Outlier Detection via Estimating Clusters

    Science.gov (United States)

    Iverson, David J. (Inventor)

    2016-01-01

    An efficient method and system for real-time or offline analysis of multivariate sensor data for use in anomaly detection, fault detection, and system health monitoring is provided. Models automatically derived from training data, typically nominal system data acquired from sensors in normally operating conditions or from detailed simulations, are used to identify unusual, out of family data samples (outliers) that indicate possible system failure or degradation. Outliers are determined through analyzing a degree of deviation of current system behavior from the models formed from the nominal system data. The deviation of current system behavior is presented as an easy to interpret numerical score along with a measure of the relative contribution of each system parameter to any off-nominal deviation. The techniques described herein may also be used to "clean" the training data.

  3. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Sen Zhang

    2015-01-01

    Full Text Available One heuristic evolutionary algorithm recently proposed is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on the Powell local optimization method, which we call PGWO. The PGWO algorithm significantly improves on the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique. Hence, PGWO could be applied to solving clustering problems. In this study, the PGWO algorithm is first tested on seven benchmark functions. Second, the PGWO algorithm is used for data clustering on nine data sets. Compared to other state-of-the-art evolutionary algorithms, the results of the benchmark and data clustering tests demonstrate the superior performance of the PGWO algorithm.

  4. Identification of rural landscape classes through a GIS clustering method

    Directory of Open Access Journals (Sweden)

    Irene Diti

    2013-09-01

    Full Text Available The paper presents a methodology aimed at supporting the rural planning process. The analysis of the state of the art of local and regional policies focused on rural and suburban areas, and the study of the scientific literature in the field of spatial analysis methodologies, have allowed the definition of the basic concept of the research. The proposed method, developed in a GIS, is based on spatial metrics selected and defined to cover various agricultural, environmental, and socio-economic components. The specific goal of the proposed methodology is to identify homogeneous extra-urban areas through their objective characterization at different scales. Once areas with intermediate urban-rural characters have been identified, the analysis is then focused on the more detailed definition of periurban agricultural areas. The synthesis of the results of the analysis of the various landscape components is achieved through an original interpretative key which aims to quantify the potential impacts of rural areas on the urban system. This paper presents the general framework of the methodology and some of the main results of its first implementation through an Italian case study.

  5. Three-dimensional sea-urchin-like hierarchical TiO₂ microspheres synthesized by a one-pot hydrothermal method and their enhanced photocatalytic activity

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Yi, E-mail: zhouyihn@163.com [Department of Chemical and Biological Engineering, Changsha University of Science and Technology, Hunan, 410114 (China); Huang, Yan; Li, Dang; He, Wenhong [Department of Chemical and Biological Engineering, Changsha University of Science and Technology, Hunan, 410114 (China)

    2013-07-15

    Graphical abstract: SEM images of the samples synthesized at different hydrothermal temperatures for 8 h: (a) 75; (b) 100; (c) 120; and (d) 140 °C, followed by calcination at 450 °C for 2 h. Highlights: ► Effects of calcination temperature on the phase transformation were studied. ► Effects of hydrothermal temperature and time on the morphology growth were studied. ► A two-stage reaction mechanism for the formation was presented. ► The photocatalytic activity was evaluated under sunlight irradiation. ► Effects of calcination temperature on the photocatalytic activity were studied. - Abstract: Novel three-dimensional sea-urchin-like hierarchical TiO₂ superstructures were synthesized on a Ti plate in a mixture of H₂O₂ and NaOH aqueous solution by a facile one-pot hydrothermal method at a low temperature, followed by protonation and calcination. The results of a series of electron microscopy characterizations suggested that the hierarchical TiO₂ superstructures consisted of numerous one-dimensional nanostructures. The microspheres were approximately 2–4 μm in diameter, and the one-dimensional TiO₂ nanostructures were up to 600–700 nm long. A two-stage reaction mechanism, i.e., initial growth and then assembly, was proposed for the formation of these architectures. The three-dimensional sea-urchin-like hierarchical TiO₂ microstructures showed excellent photocatalytic activity for the degradation of Rhodamine B aqueous solution under sunlight irradiation, which was attributed to the special three-dimensional hierarchical superstructure and the increased number of surface active sites. This novel superstructure has promising use in practical aqueous purification.

  6. A method for determining the radius of an open cluster from stellar proper motions

    Science.gov (United States)

    Sánchez, Néstor; Alfaro, Emilio J.; López-Martínez, Fátima

    2018-01-01

    We propose a method for calculating the radius of an open cluster in an objective way from an astrometric catalogue containing, at least, positions and proper motions. It uses the minimum spanning tree (hereinafter MST) in the proper motion space to discriminate cluster stars from field stars and it quantifies the strength of the cluster-field separation by means of a statistical parameter defined for the first time in this paper. This is done for a range of different sampling radii from where the cluster radius is obtained as the size at which the best cluster-field separation is achieved. The novelty of this strategy is that the cluster radius is obtained independently of how its stars are spatially distributed. We test the reliability and robustness of the method with both simulated and real data from a well-studied open cluster (NGC 188), and apply it to UCAC4 data for five other open clusters with different catalogued radius values. NGC 188, NGC 1647, NGC 6603 and Ruprecht 155 yielded unambiguous radius values of 15.2 ± 1.8, 29.4 ± 3.4, 4.2 ± 1.7 and 7.0 ± 0.3 arcmin, respectively. ASCC 19 and Collinder 471 showed more than one possible solution but it is not possible to know whether this is due to the involved uncertainties or to the presence of complex patterns in their proper motion distributions, something that could be inherent to the physical object or due to the way in which the catalogue was sampled.
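
    The building block this abstract relies on, a minimum spanning tree in proper-motion space, can be sketched in Python as follows. This is only an illustration under stated assumptions: the proper-motion values are invented, `mst_edges` is a hypothetical helper name, and the statistical parameter the authors define to quantify the cluster-field separation is not given in the abstract and is not reproduced here. The sketch only shows that stars sharing a common motion end up joined by the shortest tree edges.

```python
from math import dist

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (length, i, j) tuples, one per tree edge.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest links of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

# Hypothetical proper motions (mas/yr): four co-moving stars, then three field stars.
proper_motions = [(-2.3, -0.9), (-2.4, -1.0), (-2.2, -1.0), (-2.3, -1.1),
                  (5.0, 3.0), (-8.0, 6.0), (1.0, -7.0)]

edges = sorted(mst_edges(proper_motions))
for length, i, j in edges:
    print(round(length, 3), i, j)
```

    In this toy sample the three shortest edges link the four co-moving stars, while every edge that reaches a field star is much longer.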

  7. Fast Search the Density Peaks and Clustering Method for Check-in Data

    Directory of Open Access Journals (Sweden)

    LIU Meng

    2017-04-01

    Full Text Available Check-in data obtained from a Location-based Social Network (LBSN) is a kind of crowd-sourced geographic data that reveals the daily activities of urban residents. Different check-in behaviors with the same check-in location produce the phenomenon of location duplication because of the location candidate function in LBSN systems. Current density-based spatial clustering algorithms have the following problems: ① difficulty in finding density peak points; ② clustering errors caused by check-in point objects with duplicate positions. To solve these problems, we proposed a fast search density peaks and clustering method for check-in data, based on clustering by fast search and find of density peaks (CFSFDP). First, position repetition frequency was introduced and calculated to quantify the duplication of check-in positions. Second, a new type of point feature was constructed by adding the position repetition frequency to the original check-in position data, and it was used as the study object to search for density peaks. Finally, a clustering algorithm based on density peak points was constructed, in which density connectivity was taken into account to ensure the continuity and integrity of density clusters. Taking check-in data obtained from Sina Microblog as an example, an experiment was designed and implemented. The results demonstrate: ① The clustering method can effectively avoid the problem that an outlier location object with high repeatability is chosen as a peak, and it also shows excellent spatial adaptability when compared with check-in data from other areas. ② The extracted density peak points can not only represent the center of a hot zone but also reflect its concentration trend, which can help to explore the dynamic change of the hot zone.
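
    The two quantities at the heart of density-peak clustering (CFSFDP), a local density rho and a distance delta to the nearest denser point, can be sketched in Python as below. This is a simplified illustration, not the authors' method: weighting each distinct position by its repetition frequency is an assumed way of letting that frequency enter the density, the coordinates and cutoff are invented, and the density-connectivity step described in the abstract is omitted.

```python
from collections import Counter
from math import dist

def density_peaks(checkins, cutoff):
    """Return (positions, rho, delta) for the distinct check-in positions.

    rho   : repetition-weighted number of check-ins within `cutoff`
    delta : distance to the nearest position with a higher rho
            (for a densest position, the largest distance to any other)
    """
    freq = Counter(checkins)  # position repetition frequency
    pts = list(freq)
    rho = [sum(freq[q] for q in pts if dist(p, q) <= cutoff) for p in pts]
    delta = []
    for i, p in enumerate(pts):
        higher = [dist(p, q) for j, q in enumerate(pts) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            delta.append(max((dist(p, q) for q in pts if q != p), default=0.0))
    return pts, rho, delta

# Hypothetical check-in coordinates; repeated tuples are duplicate positions.
checkins = [(0, 0), (0, 0), (0, 0), (0, 1), (1, 0),
            (10, 10), (10, 10), (10, 11), (11, 10), (5, 5)]

pts, rho, delta = density_peaks(checkins, cutoff=1.2)

# Candidate peaks combine a high density with a large distance to denser points.
ranked = sorted(zip(pts, rho, delta), key=lambda t: t[1] * t[2], reverse=True)
for p, r, d in ranked[:2]:
    print(p, r, round(d, 2))
```

    Ranking by the product of rho and delta is a common heuristic for picking peaks. In this toy data it selects the two duplicated positions and leaves the isolated point at (5, 5) far down the list.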

  8. Use of multiple cluster analysis methods to explore the validity of a community outcomes concept map.

    Science.gov (United States)

    Orsi, Rebecca

    2017-02-01

    Concept mapping is now a commonly-used technique for articulating and evaluating programmatic outcomes. However, research regarding validity of knowledge and outcomes produced with concept mapping is sparse. The current study describes quantitative validity analyses using a concept mapping dataset. We sought to increase the validity of concept mapping evaluation results by running multiple cluster analysis methods and then using several metrics to choose from among solutions. We present four different clustering methods based on analyses using the R statistical software package: partitioning around medoids (PAM), fuzzy analysis (FANNY), agglomerative nesting (AGNES) and divisive analysis (DIANA). We then used the Dunn and Davies-Bouldin indices to assist in choosing a valid cluster solution for a concept mapping outcomes evaluation. We conclude that the validity of the outcomes map is high, based on the analyses described. Finally, we discuss areas for further concept mapping methods research. Copyright © 2016 Elsevier Ltd. All rights reserved.
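
    The Dunn index mentioned in this abstract can be illustrated with a short Python sketch. The study itself used R; the code below is a stand-alone illustration on invented one-dimensional data, using the standard definition (smallest distance between points in different clusters divided by the largest within-cluster diameter), where larger values indicate more compact, better separated clusters.

```python
from itertools import combinations
from math import dist

def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    diameter = max(
        (dist(a, b) for c in clusters for a, b in combinations(c, 2)),
        default=0.0,
    )
    if diameter == 0:
        raise ValueError("all clusters are single points; index is undefined")
    separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    )
    return separation / diameter

# Two candidate partitions of the same six hypothetical points.
good = [[(0,), (1,), (2,)], [(10,), (11,), (12,)]]
bad = [[(0,), (1,)], [(2,), (10,)], [(11,), (12,)]]

print(dunn_index(good))  # → 4.0
print(dunn_index(bad))   # → 0.125
```

    Comparing such an index across candidate solutions is one way to choose among the partitions produced by different clustering methods.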

  9. A new method to assign galaxy cluster membership using photometric redshifts

    Science.gov (United States)

    Castignani, G.; Benoist, C.

    2016-11-01

    We introduce a new effective strategy to assign group and cluster membership probabilities Pmem to galaxies using photometric redshift information. Large dynamical ranges both in halo mass and cosmic time are considered. The method takes into account the magnitude distribution of both cluster and field galaxies as well as the radial distribution of galaxies in clusters using a non-parametric formalism, and relies on Bayesian inference to take photometric redshift uncertainties into account. We successfully test the method against 1208 galaxy clusters within redshifts z = 0.05-2.58 and masses 10^(13.29-14.80) M⊙ drawn from wide field simulated galaxy mock catalogs mainly developed for the forthcoming Euclid mission. Median purity and completeness values of and are reached for galaxies brighter than 0.25 L∗ within r200 of each simulated halo and for a statistical photometric redshift accuracy σ((zs-zp)/(1 + zs)) = 0.03. The mean values p̅=56% and c̅=93% are consistent with the median and have negligible sub-percent uncertainties. Accurate photometric redshifts (σ((zs-zp)/(1 + zs)) ≲ 0.05) and robust estimates for the cluster redshift and cluster center coordinates are required. The dependence of the assignments on photometric redshift accuracy, galaxy magnitude and distance from the halo center, and halo properties such as mass, richness, and redshift are investigated. Variations in the mean values of both purity and completeness are globally limited to a few percent. The largest departures from the mean values are found for galaxies associated with distant z ≳ 1.5 halos, faint ( 0.25 L∗) galaxies, and those at the outskirts of the halo (at cluster-centric projected distances r200) for which the purity is decreased, Δp ≃ 20% at most, with respect to the mean value. The proposed method is applied to derive accurate richness estimates. A statistical comparison between the true (Ntrue) vs. estimated richness (λ = ∑ Pmem) yields on average to unbiased

  10. Performance quantification of clustering algorithms for false positive removal in fMRI by ROC curves

    Directory of Open Access Journals (Sweden)

    André Salles Cunha Peres

    Full Text Available Abstract Introduction: Functional magnetic resonance imaging (fMRI) is a non-invasive technique that allows the detection of specific cerebral functions in humans based on hemodynamic changes. The contrast changes are about 5%, making visual inspection impossible. Thus, statistical strategies are applied to infer which brain region is engaged in a task. However, traditional methods such as the general linear model and cross-correlation use voxel-wise calculation, introducing many false positives. In this work we therefore tested post-processing clustering algorithms to reduce the false positives. Methods: In this study, three clustering algorithms (hierarchical clustering, k-means and self-organizing maps) were tested and compared for false-positive removal in the post-processing of cross-correlation analyses. Results: Our results showed that hierarchical clustering presented the best performance in removing false positives in fMRI, being 2.3 times more accurate than k-means and 1.9 times more accurate than self-organizing maps. Conclusion: Hierarchical clustering presented the best performance in false-positive removal because it uses the inconsistency coefficient threshold, while k-means and self-organizing maps use an a priori number of clusters (number of centroids and neurons); thus, hierarchical clustering avoids clustering scattered voxels, as the inconsistency coefficient threshold allows only voxels that are at a minimum distance to some cluster to be clustered.

  11. Dedicated Filter for Defects Clustering in Radiographic Image

    Science.gov (United States)

    Sikora, R.; Świadek, K.; Chady, T.

    2009-03-01

    Defect clusters such as linear or clustered porosity are in some cases even more important than single flaws. This paper presents two methods of defect clustering and an algorithm for calculating distances between flaws in a digital radiographic image. A dedicated lookup-table-based filter is used to calculate distances between objects in the specified range. Two functions were developed for defect clustering. The first is based on the MMD (Minimum Mean Distance) algorithm. The second uses hierarchical procedures for clustering defects of various types, shapes and sizes.

  12. Clustering for data mining a data recovery approach

    CERN Document Server

    Mirkin, Boris

    2005-01-01

    Often considered more as an art than a science, the field of clustering has been dominated by learning through examples and by techniques chosen almost through trial-and-error. Even the most popular clustering methods--K-Means for partitioning the data set and Ward's method for hierarchical clustering--have lacked the theoretical attention that would establish a firm relationship between the two methods and relevant interpretation aids. Rather than the traditional set of ad hoc techniques, Clustering for Data Mining: A Data Recovery Approach presents a theory that not only closes gaps in K-Mean

  13. Clustering method and representative feeder selection for the California solar initiative

    Energy Technology Data Exchange (ETDEWEB)

    Broderick, Robert Joseph; Williams, Joseph R.; Munoz-Ramos, Karina

    2014-02-01

    The screening process for DG interconnection procedures needs to be improved in order to increase the PV deployment level on the distribution grid. A significant improvement in the current screening process could be achieved by finding a method to classify the feeders in a utility service territory and determine the sensitivity of particular groups of distribution feeders to the impacts of high PV deployment levels. This report describes the utility distribution feeder characteristics in California for a large dataset of 8,163 feeders and summarizes the California feeder population, including the range of characteristics identified and those most important to hosting capacity. The report describes the set of feeders that are identified for modeling and analysis as well as feeders identified for the control group. The report presents a method for separating a utility's distribution feeders into unique clusters using the k-means clustering algorithm. An approach for choosing the feeder variables to be used in the clustering process is described, and a method is identified for determining the optimal number of representative clusters.

  14. DLTAP: A Network-efficient Scheduling Method for Distributed Deep Learning Workload in Containerized Cluster Environment

    Directory of Open Access Journals (Sweden)

    Qiao Wei

    2017-01-01

    Full Text Available Deep neural networks (DNNs) have recently yielded strong results on a range of applications. Training these DNNs using a cluster of commodity machines is a promising approach since training is time-consuming and compute-intensive. Furthermore, putting DNN tasks into containers of clusters would enable broader and easier deployment of DNN-based algorithms. Toward this end, this paper addresses the problem of scheduling DNN tasks in the containerized cluster environment. Efficiently scheduling data-parallel computation jobs like DNN training over containerized clusters is critical for job performance, system throughput, and resource utilization. It becomes even more challenging with complex workloads. We propose a scheduling method called Deep Learning Task Allocation Priority (DLTAP), which makes scheduling decisions in a distributed manner; each scheduling decision takes the aggregation degree of parameter server tasks and worker tasks into account, in particular to reduce cross-node network transmission traffic and, correspondingly, decrease the DNN training time. We evaluate the DLTAP scheduling method using a state-of-the-art distributed DNN training framework on 3 benchmarks. The results show that the proposed method reduces cross-node network traffic by 12% on average and decreases the DNN training time even with a cluster of low-end servers.

  15. AN EFFICIENT INITIALIZATION METHOD FOR K-MEANS CLUSTERING OF HYPERSPECTRAL DATA

    Directory of Open Access Journals (Sweden)

    A. Alizade Naeini

    2014-10-01

    Full Text Available K-means is the most frequently used partitional clustering algorithm in the remote sensing community. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of cluster centers. This problem worsens for high-dimensional data such as hyperspectral remotely sensed imagery. To tackle this problem, in this paper, the spectral signatures of the endmembers in the image scene are extracted and used as the initial positions of the cluster centers. For this purpose, in the first step, a Neyman–Pearson detection theory based eigen-thresholding method (i.e., the HFC method) is employed to estimate the number of endmembers in the image. Afterwards, the spectral signatures of the endmembers are obtained using the Minimum Volume Enclosing Simplex (MVES) algorithm. Finally, these spectral signatures are used to initialize the k-means clustering algorithm. The proposed method is implemented on a hyperspectral dataset acquired by the ROSIS sensor with 103 spectral bands over the Pavia University campus, Italy. For comparative evaluation, two other commonly used initialization methods (i.e., the Bradley & Fayyad (BF) and Random methods) are implemented and compared. The confusion matrix, overall accuracy and Kappa coefficient are employed to assess the methods’ performance. The evaluations demonstrate that the proposed solution outperforms the other initialization methods and can be applied for unsupervised classification of hyperspectral imagery for landcover mapping.
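
    The final step described in this abstract, starting k-means from supplied centers rather than random ones, can be sketched in Python as below. This is a minimal illustration of Lloyd's algorithm with caller-provided seeds. The two-dimensional points and the seeds are invented stand-ins for pixel spectra and endmember signatures, and the HFC and MVES steps that would produce the real seeds are not implemented.

```python
from math import dist

def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm started from caller-supplied centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical 2-D "spectra" forming three groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 0), (10, 1), (11, 0),
          (5, 8), (5, 9), (6, 8)]

# Hypothetical seeds, one per group, playing the role of endmember signatures.
seeds = [(0, 0), (10, 0), (5, 8)]

centers, labels = kmeans(points, seeds)
print(labels)
print(centers)
```

    Passing the seeds as an argument keeps the initialization strategy separate from the clustering loop, so different initializers can be compared on the same data.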

  16. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis

    DEFF Research Database (Denmark)

    Laursen, Jens; Milman, Nils; Pind, N.

    2014-01-01

    PROJECT: Meta-analysis of previous studies evaluating associations between content of elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. PROCEDURE: Normal liver samples from 45 Greenlandic......, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br. Zn with S and K. Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. CONCLUSIONS: In contrast to simple statistical methods, which analyses content of elements separately one by one, dual hierarchical...

  17. Unsupervised Learning —A Novel Clustering Method for Rolling Bearing Faults Identification

    Science.gov (United States)

    Kai, Li; Bo, Luo; Tao, Ma; Xuefeng, Yang; Guangming, Wang

    2017-12-01

    To promptly process massive fault data and automatically provide accurate diagnosis results, numerous studies have been conducted on intelligent fault diagnosis of rolling bearings. In these studies, supervised learning methods such as artificial neural networks, support vector machines and decision trees are commonly used. These methods can detect rolling bearing failures effectively, but achieving good detection results often requires a large number of training samples. For this reason, a novel clustering method is proposed in this paper. The method is able to find the correct number of clusters automatically. Its effectiveness is validated using datasets from rolling element bearings. The diagnosis results show that the proposed method can accurately detect the fault types of small samples. The diagnosis results also have relatively high accuracy for massive samples.

  18. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of form knowledge for pan-ethnic-group products primarily depends on a designer's subjective experience without user participation. The majority of studies primarily focus on the detection of the perceptual demands of consumers from the target product category. A form gene clustering method for pan-ethnic-group products based on emotional semantics is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer-aided product form clustering technology. A case study of form gene clustering for typical pan-ethnic-group products indicates that the method is feasible. This paper opens up a new direction for the future development of product form design, which improves the agility of the product design process in the era of Industry 4.0.

  19. A Hybrid Image Filtering Method for Computer-Aided Detection of Microcalcification Clusters in Mammograms

    Directory of Open Access Journals (Sweden)

    Xiaoyong Zhang

    2013-01-01

    Full Text Available The presence of microcalcification clusters (MCs) in a mammogram is a major indicator of breast cancer. Detection of MCs is one of the key issues for breast cancer control. In this paper, we present a highly accurate method based on morphological image processing and a wavelet transform technique to detect MCs in mammograms. The microcalcifications are first enhanced by multistructure-element morphological processing. Then, the microcalcification candidates are refined by a multilevel wavelet reconstruction approach. Finally, MCs are detected based on their distribution features. Experiments are performed on 138 clinical mammograms. The proposed method is capable of detecting 92.9% of true microcalcification clusters with an average of 0.08 false microcalcification clusters detected per image.

  20. Ensemble Classification for Anomalous Propagation Echo Detection with Clustering-Based Subset-Selection Method

    Directory of Open Access Journals (Sweden)

    Hansoo Lee

    2017-01-01

    Full Text Available Several types of non-precipitation echoes appear in radar images and disrupt the weather forecasting process. An anomalous propagation echo is an unwanted observation result similar to a precipitation echo. It occurs through radar-beam ducting because of the temperature, humidity distribution, and other complicated atmospheric conditions. Anomalous propagation echoes should be removed because they make weather forecasting difficult. In this paper, we suggest an ensemble classification method based on an artificial neural network and a clustering-based subset-selection method. This method allows us to implement an efficient classification method when a feature space has complicated distributions. By separating the input data into atomic and non-atomic clusters, each derived cluster will receive its own base classifier. In the experiments, we compared our method with a standalone artificial neural network classifier. The suggested ensemble classifier showed 84.14% performance, which was about 2% higher than that of the k-means clustering-based ensemble classifier and about 4% higher than the standalone artificial neural network classifier.

  1. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models...... the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace...... method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  2. Evaluation of sliding baseline methods for spatial estimation for cluster detection in the biosurveillance system

    Directory of Open Access Journals (Sweden)

    Leuze Michael

    2009-07-01

    Full Text Available Abstract Background The Centers for Disease Control and Prevention's (CDC's) BioSense system provides near-real time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Our study focused on finding useful anomalies at manageable alert rates according to available BioSense data history. Methods The study dataset included more than 3 years of daily counts of military outpatient clinic visits for respiratory and rash syndrome groupings. We applied four spatial estimation methods in implementations of space-time scan statistics cross-checked in Matlab and C. We compared the utility of these methods according to the resultant background cluster rate (a false alarm surrogate) and sensitivity to injected cluster signals. The comparison runs used a spatial resolution based on the facility zip code in the patient record and a finer resolution based on the residence zip code. Results Simple estimation methods that account for day-of-week (DOW) data patterns yielded a clear advantage both in background cluster rate and in signal sensitivity. A 28-day baseline gave the most robust results for this estimation; the preferred baseline is long enough to remove daily fluctuations but short enough to reflect recent disease trends and data representation. Background cluster rates were lower for the rash syndrome counts than for the respiratory counts, likely because of seasonality and the large scale of the respiratory counts. Conclusion The spatial estimation method should be chosen according to characteristics of the selected data streams. In this dataset with strong day-of-week effects, the overall best detection performance was achieved using subregion averages over a 28-day baseline stratified by weekday or weekend/holiday behavior. Changing the estimation method for particular scenarios involving

  3. Statistical method on nonrandom clustering with application to somatic mutations in cancer

    Directory of Open Access Journals (Sweden)

    Rejto Paul A

    2010-01-01

    Full Text Available Abstract Background Human cancer is caused by the accumulation of tumor-specific mutations in oncogenes and tumor suppressors that confer a selective growth advantage to cells. As a consequence of genomic instability and high levels of proliferation, many passenger mutations that do not contribute to the cancer phenotype arise alongside mutations that drive oncogenesis. While several approaches have been developed to separate driver mutations from passengers, few approaches can specifically identify activating driver mutations in oncogenes, which are more amenable to pharmacological intervention. Results We propose a new statistical method for detecting activating mutations in cancer by identifying nonrandom clusters of amino acid mutations in protein sequences. A probability model is derived using order statistics assuming that the location of amino acid mutations on a protein follows a uniform distribution. Our statistical measure is the difference between pairs of order statistics, which is equivalent to the size of an amino acid mutation cluster, and the probabilities are derived from exact and approximate distributions of the statistical measure. Using data in the Catalog of Somatic Mutations in Cancer (COSMIC) database, we have demonstrated that our method detects well-known clusters of activating mutations in KRAS, BRAF, PI3K, and β-catenin. The method can also identify new cancer targets as well as gain-of-function mutations in tumor suppressors. Conclusions Our proposed method is useful for discovering activating driver mutations in cancer by identifying nonrandom clusters of somatic amino acid mutations in protein sequences.
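
    The order-statistic measure described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes continuous positions that are uniform on [0, 1] rather than discrete residue indices, and the function names are hypothetical. Under that assumption, the span between the i-th and k-th ordered positions of n points follows a Beta(k - i, n - (k - i) + 1) distribution, so its mean is (k - i)/(n + 1). The simulation estimates how often a span as small as an observed one would arise by chance.

```python
import random


def span_between_order_stats(positions, i, k):
    """Return X_(k) - X_(i) for 1-based order-statistic indices i < k."""
    ordered = sorted(positions)
    return ordered[k - 1] - ordered[i - 1]


def empirical_p_value(observed_span, n, i, k, trials=20000, seed=0):
    """Estimate P(span <= observed_span) for n positions uniform on [0, 1]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        if span_between_order_stats(sample, i, k) <= observed_span:
            hits += 1
    return hits / trials


def mean_span(n, i, k, trials=20000, seed=1):
    """Monte Carlo mean of the span, to compare with (k - i) / (n + 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        total += span_between_order_stats(sample, i, k)
    return total / trials


# Illustrative values only: 10 mutations, span between the 3rd and 6th.
p_small_span = empirical_p_value(0.05, n=10, i=3, k=6)
simulated_mean = mean_span(n=10, i=3, k=6)
theoretical_mean = (6 - 3) / (10 + 1)
```

    A small estimated probability suggests that the observed cluster is tighter than a uniform placement of mutations would typically produce, which is the sense in which a cluster is called nonrandom here.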

  4. Tandem: A Context-Aware Method for Spontaneous Clustering of Dynamic Wireless Sensor Nodes

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Lombriser, C.; Havinga, Paul J.M.; Scholten, Johan; Tröster, G.

    Wireless sensor nodes attached to everyday objects and worn by people are able to collaborate and actively assist users in their activities. We propose a method through which wireless sensor nodes organize spontaneously into clusters based on a common context. Provided that the confidence of sharing

  5. Application of the cluster variation method to ordering in an interstitial solid solution

    DEFF Research Database (Denmark)

    Pekelharing, Marjon I.; Böttger, Amarante; Somers, Marcel A. J.

    1999-01-01

    The tetrahedron approximation of the cluster variation method (CVM) was applied to describe the ordering on the fcc interstitial sublattice of gamma-Fe[N] and gamma'-Fe4N1-x. A Lennard-Jones potential was used to describe the dominantly strain-induced interactions, caused by misfitting of the N a...

  6. Pseudo cluster randomization: a treatment allocation method to minimize contamination and selection bias.

    NARCIS (Netherlands)

    Borm, G.F.; Melis, R.J.F.; Teerenstra, S.; Peer, P.G.M.

    2005-01-01

    In some clinical trials, treatment allocation on a patient level is not feasible, and whole groups or clusters of patients are allocated to the same treatment. If, for example, a clinical trial is investigating the efficacy of various patient coaching methods and randomization is done on a patient

  7. Clustering self-organizing maps (SOM) method for human papillomavirus (HPV) DNA as the main cause of cervical cancer disease

    Science.gov (United States)

    Bustamam, A.; Aldila, D.; Fatimah, Arimbi, M. D.

    2017-07-01

    The Self-Organizing Maps (SOM) method is one of the most widely used clustering methods because of its robustness. This paper discusses the application of the SOM method to Human Papillomavirus (HPV) DNA, the main cause of cervical cancer, which is the most dangerous cancer in developing countries. We use 18 types of HPV DNA based on the newest complete genomes. Using the open-source program R, the clustering process separates the 18 HPV types into two clusters: two types fall in the first cluster and the other 16 in the second. The results are interpreted in terms of the malignancy of the virus (how difficult it is to cure): the two HPV types in the first cluster can be classified as tame HPV, while the 16 in the second cluster are classified as vicious HPV.

  8. Statistical properties of convex clustering

    OpenAIRE

    Tan, Kean Ming; Witten, Daniela

    2015-01-01

    In this manuscript, we study the statistical properties of convex clustering. We establish that convex clustering is closely related to single linkage hierarchical clustering and $k$-means clustering. In addition, we derive the range of the tuning parameter for convex clustering that yields a non-trivial solution. We also provide an unbiased estimator of the degrees of freedom, and provide a finite sample bound for the prediction error for convex clustering. We compare convex clustering to so...

  9. Morphologically tuned 3D/1D rutile TiO2 hierarchical hybrid microarchitectures engineered by one-step surfactant free hydrothermal method

    Energy Technology Data Exchange (ETDEWEB)

    Maria John, Maria Angelin Sinthiya [Crystal Growth and Thin Film Laboratory, Department of Physics and Nanotechnology, Faculty of Engineering and Technology, SRM University, Kattankulathur 603203, Tamil Nadu (India); Ramamurthi, K., E-mail: ramamurthi.k@ktr.srmuniv.ac.in [Crystal Growth and Thin Film Laboratory, Department of Physics and Nanotechnology, Faculty of Engineering and Technology, SRM University, Kattankulathur 603203, Tamil Nadu (India); Sethuraman, K. [School of Physics, Madurai Kamaraj University, Madurai 625021, Tamil Nadu (India); Ramesh Babu, R. [Crystal Growth and Thin Film Laboratory, School of Physics, Bharathidasan University, Tiruchirappalli 620024, Tamil Nadu (India)

    2017-05-31

    Highlights: • TiO2 1D-NRs are tuned to 3D/1D-HHMs by increasing growth temperature (first report). • TiO2 seeded glass substrates are used to reduce the lattice mismatch of TiO2 HHMs. • Growth temperature influences the structural, morphological and optical properties. • Possible growth mechanism is proposed for morphological changes. - Abstract: The present investigation reports on the surfactant free hydrothermal synthesis of morphologically tuned hierarchical hybrid rutile titanium oxide (TiO2) microarchitectures showing three dimensional microflower structures and cook pine tree like structures on the one dimensional nanorods formed over TiO2 seed layer coated glass substrates by tuning growth temperature. A TiO2 seed layer ∼100 nm thick was coated on the glass substrates employing the sol–gel spin coating method, and then rutile TiO2 microarchitectures were synthesized on the TiO2 seed layer by a one-step surfactant free hydrothermal method. Deposited samples were characterized by X-ray diffraction, scanning electron microscopy, energy dispersive spectroscopy, UV–vis spectroscopy and photoluminescence spectroscopy techniques. The influence of the growth temperature on the crystallinity, morphology and optical properties, along with the growth mechanism to achieve hierarchical microarchitectures, was investigated. The present work revealed that the structural, morphological and optical properties of the TiO2 hierarchical microarchitectures strongly depend on the growth temperature. Further, we propose a model for the possible cause of the morphological changes of rutile TiO2 microarchitectures as a function of growth temperature on the TiO2 seeded glass substrates.

  10. Partitioning clustering algorithms for protein sequence data sets

    Directory of Open Access Journals (Sweden)

    Fayech Sondes

    2009-04-01

    Full Text Available Abstract Background Genome-sequencing projects are currently producing an enormous amount of new sequences, causing protein sequence databases to grow rapidly. The unsupervised classification of these data into functional groups or families, clustering, has become one of the principal research objectives in structural and functional genomics. Computer programs to automatically and accurately classify sequences into families have become a necessity. A significant number of methods have addressed the clustering of protein sequences and most of them can be categorized in three major groups: hierarchical, graph-based and partitioning methods. Among the various sequence clustering methods in the literature, hierarchical and graph-based approaches have been widely used. Although partitioning clustering techniques are widely used in other fields, few applications have been found in the field of protein sequence clustering. It has not been fully demonstrated whether partitioning methods can be applied to protein sequence data and whether these methods can be efficient compared to the published clustering methods. Methods We developed four partitioning clustering approaches using the Smith-Waterman local-alignment algorithm to determine pair-wise similarities of sequences. Four different sets of protein sequences were used as evaluation data sets for the proposed methods. Results We show that these methods outperform several other published clustering methods in terms of correctly predicting a classifier and especially in terms of the correctness of the provided prediction. The software is available to academic users from the authors upon request.

  11. The potential of near-surface geophysical methods in a hierarchical monitoring approach for the detection of shallow CO2 seeps at geological storage sites

    Science.gov (United States)

    Sauer, U.; Schuetze, C.; Dietrich, P.

    2013-12-01

    The MONACO project (Monitoring approach for geological CO2 storage sites using a hierarchic observation concept) aims to find reliable monitoring tools that work on different spatial and temporal scales at geological CO2 storage sites. This integrative hierarchical monitoring approach based on different levels of coverage and resolutions is proposed as a means of reliably detecting CO2 degassing areas at ground surface level and for identifying CO2 leakages from storage formations into the shallow subsurface, as well as CO2 releases into the atmosphere. As part of this integrative hierarchical monitoring concept, several methods and technologies from ground-based remote sensing (Open-path Fourier-transform infrared (OP-FTIR) spectroscopy), regional measurements (near-surface geophysics, chamber-based soil CO2 flux measurement) and local in-situ measurements (using shallow boreholes) will either be combined or used complementary to one another. The proposed combination is a suitable concept for investigating CO2 release sites. This also presents the possibility of adopting a modular monitoring concept whereby our monitoring approach can be expanded to incorporate other methods in various coverage scales at any temporal resolution. The link between information obtained from large-scale surveys and local in-situ monitoring can be realized by sufficient geophysical techniques for meso-scale monitoring, such as geoelectrical and self-potential (SP) surveys. These methods are useful for characterizing fluid flow and transport processes in permeable near-surface sedimentary layers and can yield important information concerning CO2-affected subsurface structures. Results of measurements carried out a natural analogue site in the Czech Republic indicate that the hierarchical monitoring approach represents a successful multidisciplinary modular concept that can be used to monitor both physical and chemical processes taking place during CO2 migration and seepage. The

  12. A Combinational Clustering Based Method for cDNA Microarray Image Segmentation.

    Science.gov (United States)

    Shao, Guifang; Li, Tiejun; Zuo, Wangda; Wu, Shunxiang; Liu, Tundong

    2015-01-01

    Microarray technology plays an important role in drawing useful biological conclusions by analyzing thousands of gene expressions simultaneously. Especially, image analysis is a key step in microarray analysis and its accuracy strongly depends on segmentation. The pioneering works of clustering based segmentation have shown that k-means clustering algorithm and moving k-means clustering algorithm are two commonly used methods in microarray image processing. However, they usually face unsatisfactory results because the real microarray image contains noise, artifacts and spots that vary in size, shape and contrast. To improve the segmentation accuracy, in this article we present a combination clustering based segmentation approach that may be more reliable and able to segment spots automatically. First, this new method starts with a very simple but effective contrast enhancement operation to improve the image quality. Then, an automatic gridding based on the maximum between-class variance is applied to separate the spots into independent areas. Next, among each spot region, the moving k-means clustering is first conducted to separate the spot from background and then the k-means clustering algorithms are combined for those spots failing to obtain the entire boundary. Finally, a refinement step is used to replace the false segmentation and the inseparable ones of missing spots. In addition, quantitative comparisons between the improved method and the other four segmentation algorithms--edge detection, thresholding, k-means clustering and moving k-means clustering--are carried out on cDNA microarray images from six different data sets. Experiments on six different data sets, 1) Stanford Microarray Database (SMD), 2) Gene Expression Omnibus (GEO), 3) Baylor College of Medicine (BCM), 4) Swiss Institute of Bioinformatics (SIB), 5) Joe DeRisi's individual tiff files (DeRisi), and 6) University of California, San Francisco (UCSF), indicate that the improved approach is

  13. A Combinational Clustering Based Method for cDNA Microarray Image Segmentation.

    Directory of Open Access Journals (Sweden)

    Guifang Shao

    Full Text Available Microarray technology plays an important role in drawing useful biological conclusions by analyzing thousands of gene expressions simultaneously. Especially, image analysis is a key step in microarray analysis and its accuracy strongly depends on segmentation. The pioneering works of clustering based segmentation have shown that k-means clustering algorithm and moving k-means clustering algorithm are two commonly used methods in microarray image processing. However, they usually face unsatisfactory results because the real microarray image contains noise, artifacts and spots that vary in size, shape and contrast. To improve the segmentation accuracy, in this article we present a combination clustering based segmentation approach that may be more reliable and able to segment spots automatically. First, this new method starts with a very simple but effective contrast enhancement operation to improve the image quality. Then, an automatic gridding based on the maximum between-class variance is applied to separate the spots into independent areas. Next, among each spot region, the moving k-means clustering is first conducted to separate the spot from background and then the k-means clustering algorithms are combined for those spots failing to obtain the entire boundary. Finally, a refinement step is used to replace the false segmentation and the inseparable ones of missing spots. In addition, quantitative comparisons between the improved method and the other four segmentation algorithms--edge detection, thresholding, k-means clustering and moving k-means clustering--are carried out on cDNA microarray images from six different data sets. Experiments on six different data sets, 1) Stanford Microarray Database (SMD), 2) Gene Expression Omnibus (GEO), 3) Baylor College of Medicine (BCM), 4) Swiss Institute of Bioinformatics (SIB), 5) Joe DeRisi's individual tiff files (DeRisi), and 6) University of California, San Francisco (UCSF), indicate that the improved

  14. A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data

    DEFF Research Database (Denmark)

    Kent, Peter; Jensen, Rikke K; Kongsted, Alice

    2014-01-01

    intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed testing......BACKGROUND: There are various methodological approaches to identifying clinically important subgroups and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA...

  15. Spectral characterization of hierarchical network modularity and limits of modularity detection.

    Science.gov (United States)

    Sarkar, Somwrita; Henderson, James A; Robinson, Peter A

    2013-01-01

    Many real world networks are reported to have hierarchically modular organization. However, there exists no algorithm-independent metric to characterize hierarchical modularity in a complex system. The main results of the paper are a set of methods to address this problem. First, classical results from random matrix theory are used to derive the spectrum of a typical stochastic block model hierarchical modular network form. Second, it is shown that hierarchical modularity can be fingerprinted using the spectrum of its largest eigenvalues and gaps between clusters of closely spaced eigenvalues that are well separated from the bulk distribution of eigenvalues around the origin. Third, some well-known results on fingerprinting non-hierarchical modularity in networks automatically follow as special cases, thereby unifying these previously fragmented results. Finally, using these spectral results, it is found that the limits of detection of modularity can be empirically established by studying the mean values of the largest eigenvalues and the limits of the bulk distribution of eigenvalues for an ensemble of networks. It is shown that even when modularity and hierarchical modularity are present in a weak form in the network, they are impossible to detect, because some of the leading eigenvalues fall within the bulk distribution. This provides a threshold for the detection of modularity. Eigenvalue distributions of some technological, social, and biological networks are studied, and the implications of detecting hierarchical modularity in real world networks are discussed.

  16. Evaluation of sliding baseline methods for spatial estimation for cluster detection in the biosurveillance system.

    Science.gov (United States)

    Xing, Jian; Burkom, Howard; Moniz, Linda; Edgerton, James; Leuze, Michael; Tokars, Jerome

    2009-07-17

    The Centers for Disease Control and Prevention's (CDC's) BioSense system provides near-real time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Our study focused on finding useful anomalies at manageable alert rates according to available BioSense data history. The study dataset included more than 3 years of daily counts of military outpatient clinic visits for respiratory and rash syndrome groupings. We applied four spatial estimation methods in implementations of space-time scan statistics cross-checked in Matlab and C. We compared the utility of these methods according to the resultant background cluster rate (a false alarm surrogate) and sensitivity to injected cluster signals. The comparison runs used a spatial resolution based on the facility zip code in the patient record and a finer resolution based on the residence zip code. Simple estimation methods that account for day-of-week (DOW) data patterns yielded a clear advantage both in background cluster rate and in signal sensitivity. A 28-day baseline gave the most robust results for this estimation; the preferred baseline is long enough to remove daily fluctuations but short enough to reflect recent disease trends and data representation. Background cluster rates were lower for the rash syndrome counts than for the respiratory counts, likely because of seasonality and the large scale of the respiratory counts. The spatial estimation method should be chosen according to characteristics of the selected data streams. In this dataset with strong day-of-week effects, the overall best detection performance was achieved using subregion averages over a 28-day baseline stratified by weekday or weekend/holiday behavior. Changing the estimation method for particular scenarios involving different spatial resolution or other syndromes can yield further

  17. Antiferromagnetism in the Hubbard model using a cluster slave-spin method

    Science.gov (United States)

    Lee, Wei-Cheng; Lee, Ting-Kuo

    2017-09-01

    The cluster slave-spin method is introduced to systematically investigate the solutions of the Hubbard model including the symmetry-broken phases. In this method, the electron operator is factorized into a fermionic spinon describing the physical spin and a slave-spin describing the charge fluctuations. Following the $U(1)$ formalism derived by Yu and Si [Phys. Rev. B 86, 085104 (2012), 10.1103/PhysRevB.86.085104], it is shown that the self-consistent equations to explore various symmetry-broken density wave states can be constructed in general with a cluster of multiple slave-spin sites. We employ this method to study the antiferromagnetic (AFM) state in the single band Hubbard model with the two- and four-site clusters of slave spins. While the Hubbard gap, the charge gap due to the doubly occupied states, scales with the Hubbard interaction U as expected, the AFM gap $\Delta$, the gap in the spinon dispersion in the AFM state, exhibits a crossover from the weak- to strong-coupling behaviors as U increases. Our cluster slave-spin method reproduces not only the traditional mean-field behavior of $\Delta \sim U$ in the weak-coupling limit, but also the behavior of $\Delta \sim t^2/U$ predicted by the superexchange mechanism in the strong-coupling limit. In addition, the holon-doublon correlator as a function of U and doping x is also computed, which exhibits a strong tendency toward the holon-doublon binding in the strong coupling regime. We further show that the quasiparticle weight obtained by the cluster slave-spin method is in good agreement with the generalized Gutzwiller approximation in both AFM and paramagnetic states, and the results can be improved beyond the generalized Gutzwiller approximation as the cluster is enlarged from a single site to four sites. Our results demonstrate that the cluster slave-spin method can be a powerful tool to systematically investigate the strongly correlated system.

  18. Negative Sequence Droop Method based Hierarchical Control for Low Voltage Ride-Through in Grid-Interactive Microgrids

    DEFF Research Database (Denmark)

    Zhao, Xin; Firoozabadi, Mehdi Savaghebi; Quintero, Juan Carlos Vasquez

    2015-01-01

    In highly microgrid (MG) integrated distribution systems, problems such as a sudden cut out of the MGs due to grid faults may lead to adverse effects on the grid. As a consequence, ancillary services provided by MGs are preferred since they can make the MG a contributor to ride through the faults....... In this paper, a voltage support strategy based on negative sequence droop control, which regulates the positive/negative sequence active and reactive power flow by means of sending a proper voltage reference to the inner control loop, is proposed for the grid connected MGs to ride through voltage sags under...... complex line impedance conditions. In this case, the MGs should inject a certain amount of positive and negative sequence power to the grid so that the voltage quality at the load side can be maintained at a satisfactory level. A two layer hierarchical control strategy is proposed in this paper. The primary...

  19. IP2P K-means: an efficient method for data clustering on sensor networks

    Directory of Open Access Journals (Sweden)

    Peyman Mirhadi

    2013-03-01

    Full Text Available Many wireless sensor network applications require data gathering as one of the most important parts of their operations. There are increasing demands for innovative methods to improve energy efficiency and to prolong the network lifetime. Clustering is considered an efficient topology control method in wireless sensor networks, which can increase network scalability and lifetime. This paper presents a method, IP2P K-means (Improved P2P K-means), which uses efficient leveling in the clustering approach, reduces false labeling and restricts the necessary communication among various sensors, which saves more energy. The proposed method is examined in Network Simulator Ver.2 (NS2) and the preliminary results show that the algorithm works effectively and relatively more precisely.
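
    IP2P K-means builds on the standard k-means iteration. As a point of reference, the following is a minimal centralized sketch of that baseline (Lloyd's algorithm) in plain Python. It does not model the peer-to-peer message exchange, the leveling step, or the energy accounting described in the abstract, and the names `kmeans` and `readings`, as well as the sample values, are illustrative assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm: returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centers, labels


# Made-up two-dimensional sensor readings forming two obvious groups.
readings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(readings, k=2)
```

    In a sensor network, every assignment and update step implies communication, which is why variants such as the one in this record try to limit how much information the nodes exchange.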

  20. Using cluster analysis as a method of classification of the genus Salix L. representatives

    Directory of Open Access Journals (Sweden)

    М. В. Роїк

    2015-06-01

    Full Text Available Purpose. To study interactions among the representatives of the genus Salix L. through cluster analysis, and to form groups of closely related species and hybrid forms based on differences in the morphological parameters of leaves. Methods. Field, cluster analysis and tree graphics. Results. Willow species were grouped according to absolute parameters of the leaf, and three groups of clusters were identified. The degree of affinity between species was assessed using values of the Euclidean distance. Distinctive features of leaf parameters were defined: length of a leaf blade (Ll), distance (cm) between the leaf tip and its maximum width (SDmxT) and the distance (cm) between the leaf tip and the line of its width that corresponds to the length of the petiole (SLpT). Conclusions. Using the willow species collection as an example, diagnostically valuable quantitative parameters of leaves were revealed, the use of which allows willow species and hybrid forms to be identified through PC applications.
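
    The grouping step described here, Euclidean distances between leaf measurements followed by tree-style merging, can be sketched as agglomerative clustering. The example below is an illustration only: the linkage rule (average linkage) is an assumption, since the abstract does not name one, and the measurement values and the labels `form_1` to `form_5` are invented placeholders rather than data from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length measurement tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(
        euclidean(data[i], data[j]) for i in cluster_a for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(data, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    data maps a label to a tuple of measurements. Returns the final
    clusters and the merge history (the information a tree plot shows).
    """
    clusters = [[label] for label in data]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [
            c for idx, c in enumerate(clusters) if idx not in (a, b)
        ] + [merged]
    return clusters, merges


# Invented (Ll, SDmxT, SLpT) values in cm, for illustration only.
leaves = {
    "form_1": (10.0, 4.0, 1.0),
    "form_2": (10.5, 4.2, 1.1),
    "form_3": (6.0, 2.0, 0.5),
    "form_4": (6.2, 2.1, 0.6),
    "form_5": (15.0, 7.0, 2.0),
}
clusters, merges = agglomerate(leaves, n_clusters=3)
```

    Stopping at three clusters mirrors the three groups reported in the abstract; the merge history records the distance at which each pair of groups was joined.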

  1. Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models

    KAUST Repository

    Elsheikh, Ahmed H.

    2013-05-01

    A novel multi-modal parameter estimation algorithm is introduced. Parameter estimation is an ill-posed inverse problem that might admit many different solutions. This is attributed to the limited amount of measured data used to constrain the inverse problem. The proposed multi-modal model calibration algorithm uses an iterative stochastic ensemble method (ISEM) for parameter estimation. ISEM employs an ensemble of directional derivatives within a Gauss-Newton iteration for nonlinear parameter estimation. ISEM is augmented with a clustering step based on k-means algorithm to form sub-ensembles. These sub-ensembles are used to explore different parts of the search space. Clusters are updated at regular intervals of the algorithm to allow merging of close clusters approaching the same local minima. Numerical testing demonstrates the potential of the proposed algorithm in dealing with multi-modal nonlinear parameter estimation for subsurface flow models. © 2013 Elsevier B.V.

  2. Image Retrieval Based on Multiview Constrained Nonnegative Matrix Factorization and Gaussian Mixture Model Spectral Clustering Method

    Directory of Open Access Journals (Sweden)

    Qunyi Xie

    2016-01-01

    Full Text Available Content-based image retrieval has recently become an important research topic and has been widely used for managing images from repertories. In this article, we address an efficient technique, called MNGS, which integrates multiview constrained nonnegative matrix factorization (NMF) and Gaussian mixture model (GMM)-based spectral clustering for image retrieval. In the proposed methodology, the multiview NMF scheme provides competitive sparse representations of underlying images through decomposition of a similarity-preserving matrix that is formed by fusing multiple features from different visual aspects. In particular, the proposed method merges manifold constraints into the standard NMF objective function to impose an orthogonality constraint on the basis matrix and satisfy the structure preservation requirement of the coefficient matrix. To manipulate the clustering method on sparse representations, this paper has developed a GMM-based spectral clustering method in which the Gaussian components are regrouped in spectral space, which significantly improves the retrieval effectiveness. In this way, image retrieval of the whole database translates to a nearest-neighbour search in the cluster containing the query image. Simultaneously, this study investigates the proof of convergence of the objective function and the analysis of the computational complexity. Experimental results on three standard image datasets reveal the advantages that can be achieved with the proposed retrieval scheme.

  3. A Method for Context-Based Adaptive QRS Clustering in Real Time.

    Science.gov (United States)

    Castro, Daniel; Félix, Paulo; Presedo, Jesús

    2015-09-01

    Continuous follow-up of heart condition through long-term electrocardiogram monitoring is an invaluable tool for diagnosing some cardiac arrhythmias. In this context, providing tools for quickly locating alterations of normal conduction patterns is mandatory and still remains an open issue. This paper presents a real-time method for adaptive clustering of QRS complexes from multilead ECG signals that provides the set of QRS morphologies that appear during an ECG recording. The method processes the QRS complexes sequentially by grouping them into a dynamic set of clusters based on the information content of the temporal context. The clusters are represented by templates which evolve over time and adapt to the QRS morphology changes. Rules to create, merge, and remove clusters are defined, along with techniques for noise detection, in order to avoid the proliferation of clusters. To cope with beat misalignment, derivative dynamic time warping is used. The proposed method has been validated against the MIT-BIH Arrhythmia Database and the AHA ECG Database, showing a global purity of 98.56% and 99.56%, respectively. Results show that our proposal not only provides better results than previous offline solutions but also fulfills real-time requirements.
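    The record above mentions derivative dynamic time warping for handling beat misalignment. A minimal sketch of the idea follows. It assumes a plain first difference as the derivative estimate and an absolute-difference local cost, both of which are simplifications chosen for illustration and not details taken from the paper; the function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def first_difference(x):
    """Crude derivative estimate (an assumption of this sketch)."""
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def derivative_dtw(a, b):
    """DTW applied to the derivative estimates instead of raw samples."""
    return dtw_distance(first_difference(a), first_difference(b))
```

    Because the comparison is made on slopes, a constant baseline offset between two beats does not contribute to the distance, and the warping absorbs small shifts in time.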

  4. Smoothed Particle Inference: A Kilo-Parametric Method for X-ray Galaxy Cluster Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, John R.; Marshall, P.J.; /KIPAC, Menlo Park; Andersson, K.; /Stockholm U. /SLAC

    2005-08-05

    We propose an ambitious new method that models the intracluster medium in clusters of galaxies as a set of X-ray emitting smoothed particles of plasma. Each smoothed particle is described by a handful of parameters including temperature, location, size, and elemental abundances. Hundreds to thousands of these particles are used to construct a model cluster of galaxies, with the appropriate complexity estimated from the data quality. This model is then compared iteratively with X-ray data in the form of adaptively binned photon lists via a two-sample likelihood statistic and iterated via Markov Chain Monte Carlo. The complex cluster model is propagated through the X-ray instrument response using direct sampling Monte Carlo methods. Using this approach the method can reproduce many of the features observed in the X-ray emission in a less assumption-dependent way than traditional analyses, and it allows for a more detailed characterization of the density, temperature, and metal abundance structure of clusters. Multi-instrument X-ray analyses and simultaneous X-ray, Sunyaev-Zeldovich (SZ), and lensing analyses are a straightforward extension of this methodology. Significant challenges still exist in understanding the degeneracy in these models and the statistical noise induced by the complexity of the models.

  5. Learning Hierarchical Feature Extractors for Image Recognition

    Science.gov (United States)

    2012-09-01

    Learning Hierarchical Feature Extractors for Image Recognition, by Y-Lan Boureau. A dissertation submitted in partial fulfillment of the requirements... pooling for all weighting schemes. With average pooling, weighting by the square root of the cluster weight performs best. P = 16 configuration space

  6. A method for context-based adaptive QRS clustering in real-time

    OpenAIRE

    Castro, Daniel; Félix Lamas, Paulo; Rodríguez Presedo, Jesús María

    2014-01-01

    Continuous follow-up of heart condition through long-term electrocardiogram monitoring is an invaluable tool for diagnosing some cardiac arrhythmias. In this context, providing tools for quickly locating alterations of normal conduction patterns is mandatory and still remains an open issue. This work presents a real-time method for adaptive clustering of QRS complexes from multilead ECG signals that provides the set of QRS morphologies that appear during an ECG recording. The method processes the Q...

  7. Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data.

    Science.gov (United States)

    Weber, Lukas M; Robinson, Mark D

    2016-12-01

    Recent technological developments in high-dimensional flow cytometry and mass cytometry (CyTOF) have made it possible to detect expression levels of dozens of protein markers in thousands of cells per second, allowing cell populations to be characterized in unprecedented detail. Traditional data analysis by "manual gating" can be inefficient and unreliable in these high-dimensional settings, which has led to the development of a large number of automated analysis methods. Methods designed for unsupervised analysis use specialized clustering algorithms to detect and define cell populations for further downstream analysis. Here, we have performed an up-to-date, extensible performance comparison of clustering methods for high-dimensional flow and mass cytometry data. We evaluated methods using several publicly available data sets from experiments in immunology, containing both major and rare cell populations, with cell population identities from expert manual gating as the reference standard. Several methods performed well, including FlowSOM, X-shift, PhenoGraph, Rclusterpp, and flowMeans. Among these, FlowSOM had extremely fast runtimes, making this method well-suited for interactive, exploratory analysis of large, high-dimensional data sets on a standard laptop or desktop computer. These results extend previously published comparisons by focusing on high-dimensional data and including new methods developed for CyTOF data. R scripts to reproduce all analyses are available from GitHub (https://github.com/lmweber/cytometry-clustering-comparison), and pre-processed data files are available from FlowRepository (FR-FCM-ZZPH), allowing our comparisons to be extended to include new clustering methods and reference data sets. © 2016 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of ISAC.

  8. Data clustering theory, algorithms, and applications

    CERN Document Server

    Gan, Guojun; Wu, Jianhong

    2007-01-01

    Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results. The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB® programming languages.

  9. Star clusters in the Magellanic Clouds - I. Parametrization and classification of 1072 clusters in the LMC

    Science.gov (United States)

    Nayak, P. K.; Subramaniam, A.; Choudhury, S.; Indu, G.; Sagar, Ram

    2016-12-01

    We have introduced a semi-automated quantitative method to estimate the age and reddening of 1072 star clusters in the Large Magellanic Cloud (LMC) using the Optical Gravitational Lensing Experiment III survey data. This study brings out 308 newly parametrized clusters. In a first of its kind, the LMC clusters are classified into groups based on richness/mass as very poor, poor, moderate and rich clusters, similar to the classification scheme of open clusters in the Galaxy. A major cluster formation episode is found to happen at 125 ± 25 Myr in the inner LMC. The bar region of the LMC appears prominently in the age range 60-250 Myr and is found to have a relatively higher concentration of poor and moderate clusters. The eastern and the western ends of the bar are found to form clusters initially, which later propagates to the central part. We demonstrate that there is a significant difference in the distribution of clusters as a function of mass, using a movie based on the propagation (in space and time) of cluster formation in various groups. The importance of including the low-mass clusters in the cluster formation history is demonstrated. The catalogue with parameters, classification, and cleaned and isochrone fitted colour-magnitude diagrams of 1072 clusters, which are available as online material, can be further used to understand the hierarchical formation of clusters in selected regions of the LMC.

  10. ConsensusCluster: a software tool for unsupervised cluster discovery in numerical data.

    Science.gov (United States)

    Seiler, Michael; Huang, C Chris; Szalma, Sandor; Bhanot, Gyan

    2010-02-01

    We have created a stand-alone software tool, ConsensusCluster, for the analysis of high-dimensional single nucleotide polymorphism (SNP) and gene expression microarray data. Our software implements the consensus clustering algorithm and principal component analysis to stratify the data into a given number of robust clusters. The robustness is achieved by combining clustering results from data and sample resampling as well as by averaging over various algorithms and parameter settings to achieve accurate, stable clustering results. We have implemented several different clustering algorithms in the software, including K-Means, Partition Around Medoids, Self-Organizing Map, and Hierarchical clustering methods. After clustering the data, ConsensusCluster generates a consensus matrix heatmap to give a useful visual representation of cluster membership, and automatically generates a log of selected features that distinguish each pair of clusters. ConsensusCluster gives more robust and more reliable clusters than common software packages and, therefore, is a powerful unsupervised learning tool that finds hidden patterns in data that might shed light on its biological interpretation. This software is free and available from http://code.google.com/p/consensus-cluster .
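    The consensus matrix mentioned in the record above can be illustrated with a short sketch. This is a generic formulation of consensus clustering bookkeeping, not the ConsensusCluster implementation: it assumes each resampling run is supplied as a mapping from sampled item index to cluster label, and the function name is hypothetical.

```python
def consensus_matrix(n_items, runs):
    """Fraction of runs in which two items were co-clustered,
    counted only over the runs in which both items were sampled.

    runs: list of dicts mapping item index -> cluster label for one run.
    """
    together = [[0] * n_items for _ in range(n_items)]
    sampled = [[0] * n_items for _ in range(n_items)]
    for labels in runs:
        items = list(labels)
        for i in items:
            for j in items:
                sampled[i][j] += 1
                if labels[i] == labels[j]:
                    together[i][j] += 1
    # Pairs never sampled together get 0.0 by convention in this sketch.
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n_items)]
        for i in range(n_items)
    ]
```

    Entries close to 1 or 0 indicate pairs whose assignment is stable across resampling and algorithm settings; intermediate values flag ambiguous membership, which is what a heatmap of this matrix makes visible.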

  11. Applying clustering approach in predictive uncertainty estimation: a case study with the UNEEC method

    Science.gov (United States)

    Dogulu, Nilay; Solomatine, Dimitri; Lal Shrestha, Durga

    2014-05-01

    Within the context of flood forecasting, assessment of predictive uncertainty has become a necessity for most of the modelling studies in operational hydrology. There are several uncertainty analysis and/or prediction methods available in the literature; however, most of them rely on normality and homoscedasticity assumptions for model residuals occurring in reproducing the observed data. This study focuses on a statistical method that analyzes model residuals without making any such assumptions and is based on a clustering approach: Uncertainty Estimation based on local Errors and Clustering (UNEEC). The aim of this work is to provide a comprehensive evaluation of the UNEEC method's performance in view of the clustering approach employed within its methodology. This is done by analyzing normality of model residuals and comparing uncertainty analysis results (for 50% and 90% confidence level) with those obtained from uniform interval and quantile regression methods. An important part of the basis by which the methods are compared is analysis of data clusters representing different hydrometeorological conditions. The validation measures used are PICP, MPI, ARIL and NUE where necessary. A new validation measure linking prediction interval to the (hydrological) model quality - weighted mean prediction interval (WMPI) - is also proposed for comparing the methods more effectively. The case study is Brue catchment, located in the South West of England. A different parametrization of the method than its previous application in Shrestha and Solomatine (2008) is used, i.e. past error values in addition to discharge and effective rainfall are considered. The results show that UNEEC's notable characteristic in its methodology, i.e. applying clustering to data of predictors upon which catchment behaviour information is encapsulated, contributes to the increased accuracy of the method's results for varying flow conditions. Besides, classifying data so that extreme flow events are individually

  12. Hierarchically Structured Electrospun Fibers

    Directory of Open Access Journals (Sweden)

    Nicole E. Zander

    2013-01-01

    Full Text Available Traditional electrospun nanofibers have a myriad of applications ranging from scaffolds for tissue engineering to components of biosensors and energy harvesting devices. The generally smooth one-dimensional structure of the fibers has stood as a limitation to several interesting novel applications. Control of fiber diameter, porosity and collector geometry will be briefly discussed, as will more traditional methods for controlling fiber morphology and fiber mat architecture. The remainder of the review will focus on new techniques to prepare hierarchically structured fibers. Fibers with hierarchical primary structures—including helical, buckled, and beads-on-a-string fibers, as well as fibers with secondary structures, such as nanopores, nanopillars, nanorods, and internally structured fibers and their applications—will be discussed. These new materials with helical/buckled morphology are expected to possess unique optical and mechanical properties with possible applications for negative refractive index materials, highly stretchable/high-tensile-strength materials, and components in microelectromechanical devices. Core-shell type fibers enable a much wider variety of materials to be electrospun and are expected to be widely applied in the sensing, drug delivery/controlled release fields, and in the encapsulation of live cells for biological applications. Materials with a hierarchical secondary structure are expected to provide new superhydrophobic and self-cleaning materials.

  13. Using cluster analysis and a classification and regression tree model to develop cover types in the Sky Islands of southeastern Arizona

    Science.gov (United States)

    Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey

    2005-01-01

    The objective of this study was to develop a rule based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such a system we qualitatively and quantitatively compared a hierarchical (Ward’s) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups represented by...

  14. Using cluster analysis and a classification and regression tree model to develop cover types in the Sky Islands of southeastern Arizona [Abstract]

    Science.gov (United States)

    Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey

    2005-01-01

    The objective of this study was to develop a rule based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such a system we qualitatively and quantitatively compared a hierarchical (Ward’s) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups and plots...

  15. A Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

    Science.gov (United States)

    Huang, W.; Li, S.; Xu, S.

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people "say" in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and some new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms are specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the

  16. A THREE-STEP SPATIAL-TEMPORAL-SEMANTIC CLUSTERING METHOD FOR HUMAN ACTIVITY PATTERN ANALYSIS

    Directory of Open Access Journals (Sweden)

    W. Huang

    2016-06-01

    Full Text Available How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people “say” in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and some new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms are specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The

  17. Research on the method of information system risk state estimation based on clustering particle filter

    Science.gov (United States)

    Cui, Jia; Hong, Bei; Jiang, Xuepeng; Chen, Qinghua

    2017-05-01

    With the purpose of reinforcing correlation analysis of risk assessment threat factors, a dynamic assessment method of safety risks based on particle filtering is proposed, which takes threat analysis as the core. Based on the risk assessment standards, the method selects threat indicators, applies a particle filtering algorithm to calculate the influence weights of the threat indicators, and determines information system risk levels by combining these with state estimation theory. In order to improve the computational efficiency of the particle filtering algorithm, the k-means clustering algorithm is introduced into it. After clustering all particles, each centroid is treated as the representative of its cluster, which reduces the amount of computation. Experiments indicate that the method reasonably captures the mutual dependence and influence among risk elements. Under the circumstance of limited information, it provides a scientific basis for formulating a risk management and control strategy.
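    The abstract above describes clustering the particles with k-means and operating on the centroids instead of on every particle. A minimal sketch of that reduction step is given below. It assumes scalar particle states, a deterministic choice of initial centers, and that each representative carries the summed weight of its cluster; none of these details are stated in the abstract, and the function names are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalars with deterministic initial centers."""
    ordered = sorted(values)
    # Initial centers: evenly spaced picks from the sorted values.
    centers = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]

    def assign(v):
        return min(range(k), key=lambda c: abs(v - centers[c]))

    for _ in range(iterations):
        labels = [assign(v) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    # Final labels are recomputed so they match the returned centers.
    return centers, [assign(v) for v in values]


def compress_particles(states, weights, k):
    """Replace weighted particles by (centroid, summed weight) pairs."""
    centers, labels = kmeans_1d(states, k)
    representatives = []
    for c in range(k):
        total = sum(w for w, lab in zip(weights, labels) if lab == c)
        if total > 0:
            representatives.append((centers[c], total))
    return representatives
```

    Downstream filter updates would then loop over k representatives instead of all particles, which is where the saving in computation comes from.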

  18. Research on the method of information system risk state estimation based on clustering particle filter

    Directory of Open Access Journals (Sweden)

    Cui Jia

    2017-05-01

    Full Text Available With the purpose of reinforcing correlation analysis of risk assessment threat factors, a dynamic assessment method of safety risks based on particle filtering is proposed, which takes threat analysis as the core. Based on the risk assessment standards, the method selects threat indicators, applies a particle filtering algorithm to calculate the influence weights of the threat indicators, and determines information system risk levels by combining these with state estimation theory. In order to improve the computational efficiency of the particle filtering algorithm, the k-means clustering algorithm is introduced into it. After clustering all particles, each centroid is treated as the representative of its cluster, which reduces the amount of computation. Experiments indicate that the method reasonably captures the mutual dependence and influence among risk elements. Under the circumstance of limited information, it provides a scientific basis for formulating a risk management and control strategy.

  19. Clustering Multiple Sclerosis Subgroups with Multifractal Methods and Self-Organizing Map Algorithm

    Science.gov (United States)

    Karaca, Yeliz; Cattani, Carlo

    Magnetic resonance imaging (MRI) is the most sensitive method to detect chronic nervous system diseases such as multiple sclerosis (MS). In this paper, multifractal methods, namely Brownian motion Hölder regularity functions (polynomial, periodic (sine), exponential) for 2D images, were applied to MR brain images, aiming to easily identify distressed regions in MS patients. With these regions, we have proposed an MS classification based on the multifractal method by using the Self-Organizing Map (SOM) algorithm. Thus, we obtained a cluster analysis by identifying pixels from distressed regions in MR images through multifractal methods and by diagnosing subgroups of MS patients through artificial neural networks.

  20. An Energy-Efficient Cluster-Based Vehicle Detection on Road Network Using Intention Numeration Method

    Directory of Open Access Journals (Sweden)

    Deepa Devasenapathy

    2015-01-01

    Full Text Available The traffic in the road network is progressively increasing to a greater extent. Good knowledge of network traffic can minimize congestion using information pertaining to the road network obtained with the aid of communal callers, pavement detectors, and so on. Using these methods, low featured information is generated with respect to the user in the road network. Although the existing schemes obtain urban traffic information, they fail to calculate the energy drain rate of nodes and to locate equilibrium between the overhead and quality of the routing protocol, which poses a great challenge. Thus, an energy-efficient cluster-based vehicle detection in road network using the intention numeration method (CVDRN-IN) is developed. Initially, sensor nodes that detect a vehicle are grouped into separate clusters. Further, we approximate the strength of the node drain rate for a cluster using a polynomial regression function. In addition, the total node energy is estimated by taking the integral over the area. Finally, enhanced data aggregation is performed to reduce the amount of data transmission using a digital signature tree. The experimental performance is evaluated with the Dodgers loop sensor data set from the UCI repository, and the proposed method outperforms existing work on energy consumption, clustering efficiency, and node drain rate.

  1. An Efficient High Dimensional Cluster Method and its Application in Global Climate Sets

    Directory of Open Access Journals (Sweden)

    Ke Li

    2007-10-01

    Full Text Available Because of the development of modern-day satellites and other data acquisition systems, global climate research often involves an overwhelming volume and complexity of high dimensional datasets. As a data preprocessing and analysis method, clustering is playing an increasingly important role in such research. In this paper, we propose a spatial clustering algorithm that, to some extent, alleviates the problem of dimensionality in high dimensional clustering. The similarity measure of our algorithm is based on the number of top-k nearest neighbors that two grids share. The neighbors of each grid are computed based on the time series associated with each grid, and computing the nearest neighbors of an object is the most time consuming step. According to Tobler's "First Law of Geography," we add a spatial window constraint upon each grid to restrict the number of grids considered and greatly improve the efficiency of our algorithm. We apply this algorithm to a 100-year global climate dataset and partition the global surface into sub areas under various spatial granularities. Experiments indicate that our spatial clustering algorithm works well.
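    The similarity measure described in the abstract above, the number of top-k nearest neighbors two grids share, with candidate neighbors restricted to a spatial window, can be sketched as follows. The Euclidean distance between time series, the square window on integer grid coordinates, and all function and parameter names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_neighbors(series, coords, k, window):
    """For each grid cell, the set of its k nearest cells by time-series
    distance, searching only cells within `window` rows and columns."""
    neighbors = {}
    for i, (ri, ci) in enumerate(coords):
        candidates = [
            j for j, (rj, cj) in enumerate(coords)
            if j != i and abs(ri - rj) <= window and abs(ci - cj) <= window
        ]
        candidates.sort(key=lambda j: euclidean(series[i], series[j]))
        neighbors[i] = set(candidates[:k])
    return neighbors


def shared_neighbor_similarity(neighbors, i, j):
    """Number of top-k neighbors that cells i and j have in common."""
    return len(neighbors[i] & neighbors[j])
```

    The window bounds the candidate list for each cell, so the neighbor search cost depends on the window size and not on the total number of grid cells.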